Tutorial: Normalizing XML Documents
This tutorial describes how to use the XML Parser to normalize an XML document by
removing whitespace characters. Normalization makes it easy to compare XML documents.
1. Concepts
2. Design
3. Required Software
4. Setup
5. Implementation
6. Resources
7. Feedback
XML documents can contain whitespace characters, including spaces, tabs, carriage
returns, and linefeeds. When comparing two XML documents, it can be useful to remove
whitespace characters so you can work directly with the element and attribute values.
The process of removing whitespace characters is called normalization. A normalized
XML document is not the same as a canonical XML document; canonical XML is a more
rigorously defined format.
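As a rough illustration of the idea, the sketch below drops whitespace-only text nodes so that two documents differing only in indentation compare equal. It uses Python's standard xml.dom.minidom module rather than the XML Parser this tutorial is built around, and the function names strip_whitespace_nodes and normalize_xml, as well as the sample element and attribute names, are hypothetical. It assumes that "removing whitespace" means discarding text nodes that contain nothing but spaces, tabs, carriage returns, or linefeeds; whitespace inside meaningful text is left untouched.

```python
import xml.dom.minidom as minidom


def strip_whitespace_nodes(node):
    """Recursively remove child text nodes that contain only whitespace."""
    # Iterate over a copy, because removeChild mutates node.childNodes.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and child.data.strip() == "":
            node.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            strip_whitespace_nodes(child)


def normalize_xml(text):
    """Parse an XML string and return it with whitespace-only text removed."""
    doc = minidom.parseString(text)
    strip_whitespace_nodes(doc.documentElement)
    return doc.documentElement.toxml()


# Hypothetical sample data: the same document, indented and compact.
pretty = '<root>\n  <item id="1">This is the test</item>\n\t<item id="2">Value</item>\n</root>'
compact = '<root><item id="1">This is the test</item><item id="2">Value</item></root>'

# After normalization the two serializations can be compared directly.
print(normalize_xml(pretty) == normalize_xml(compact))
```

Comparing the serialized strings is only a simple check; it does not account for differences such as attribute order, which is one reason canonical XML is defined more strictly.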
The following table shows an example of an XML document before and after normalization.
Before Normalization
This is the test


After Normalization
This is the testValue
