
XML Namespaces : Universal Identification in XML Markup


Published on: Sunday 21st November 1999 By: Pankaj Kamthan

Introduction
- Motivation for XML Namespaces
XML Namespaces : Under the Hood
Applying XML Namespaces to Elements and Attributes
- Explicit Namespace Declarations
- Default Namespace Declarations
Authoring
- XML Namespace-Sensitive Well-Formedness and Validation
Applications of XML Namespaces
What's not in a Namespace
Conclusion
Acknowledgements
References
Appendix : Selected XML Namespace Names

Introduction

In the last year, XML has emerged as a universal syntax for marking up documents to be served and received on the Web. An XML document, as implied by its data model, consists of a tree of elements. Each element has an element type name and a set of attributes. Applications of XML, where a single XML document may contain elements and attributes that are defined by multiple languages, occur in various contexts. With the distributed nature of the Web, such documents, containing multiple markup vocabularies, pose problems of recognition and collision for processing software. This consideration requires that document constructs should have universal names, whose scope extends beyond their containing document. The mechanism of XML namespaces accomplishes this.

In this article, from the authoring viewpoint, we restrict ourselves to the discussion of the following questions:

What are XML namespaces and why is there a need for them?
How can we use XML namespaces, appropriately and efficiently?

We assume that the reader has some background in XML and, although not required, familiarity with some XML applications is recommended. A Technical Introduction to XML and the XML FAQ provide a good starting point.

Motivation for XML Namespaces

The next two examples illustrate the nature of the problem.

Example 1. Consider the following example:

<books>
<title>Book Database by Subject</title>
<typography>
  <author title="Dr" name="Knuth, Donald" />
  <book title="Digital Typography" isbn="1575860104" pages="720" price="$49.95" year="1999" />
  <publisher name="CSLI Publications" country="USA" />
</typography>
</books>

In this example, the name title occurs three times and the name name occurs twice within the markup. This creates potential conflicts and gives processing software insufficient information to handle each occurrence correctly.

Example 2. The following is a fragment of an XML document which is to be displayed using a CSS stylesheet:

<theatre>
  <reservation>
    <name html:class="font1">Kepler, Johannes</name>
    <seat class="A" html:class="VIP">101</seat>
    <name html:class="font2">Kamthan, Pankaj</name>
    <seat class="B" html:class="General">201</seat>
  </reservation>
</theatre>

In this case, the occurrences of the class attributes are semantically different. Since XML 1.0 does not provide a built-in way to declare "global" attributes, this again results in a potential conflict.

The XML namespace mechanism resolves these conflicts by extending the XML data model to allow element type names and attribute names to be qualified with a Uniform Resource Identifier (URI). Thus, a document that describes title of a person can use title qualified by one URI, and a document that describes title of books can use title qualified by another URI.
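
As a rough illustration of this idea, the following Python sketch uses the standard library's xml.etree.ElementTree module, which reports each name in the form {URI}local-name. The two namespace names (urn:example:people and urn:example:books) are made up for this illustration and do not come from any real vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace names, chosen only for this illustration.
DOC = """
<record xmlns:p="urn:example:people" xmlns:b="urn:example:books">
  <p:title>Dr</p:title>
  <b:title>Digital Typography</b:title>
</record>
"""

def universal_names(xml_text):
    """Return the expanded ({URI}local) name of every child of the root."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]

names = universal_names(DOC)
print(names)
```

Although both elements share the local name title, the parser hands the application two different universal names, so the two meanings can no longer be confused.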

The idea of XML namespaces has its predecessors. In the real world, address schemes are unique when identified in their entirety (starting from the name of the person to the country of residence). From a programming perspective, a Matlab program can include Fortran or C procedures, or a Java program can include a C procedure via the Java Native Interface (JNI). From the viewpoint of document publishing, XML namespaces have a distant similarity to the concept of Architectural Forms in SGML. For an early motivation for the namespace concept in extensible markup languages, see Web Architecture: Extensible Languages.

XML namespaces provide the following (overlapping) advantages:

Reusability. XML namespaces allow reuse of markup, by reusing elements and attributes which are well-understood and for which there is processing software available.
Modularity. Using elements and attributes from other standards results in modular documents.
Extensibility. XML namespaces (by the very nature of their purpose) provide extensibility to a language by incorporating elements and attributes from other vocabularies. Such extensibility may not only be desirable, but even necessary in some cases.

XML Namespaces : Under the Hood

An XML namespace is defined as follows:

An XML namespace is a collection of names, identified by a URI reference, which are used in XML documents as element types and attribute names.

In the following subsections, we will expand on each of the components in the definition.

The XML namespaces mechanism defines a mapping from an XML 1.0 tree, where element type names and attribute names are "local names," into a tree where element type names and attribute names can be "universal names." The mapping, as we see later, is based on the idea of a prefix.

Qualified Names

A qualified name in XML namespaces contains a single colon, separating the name into a namespace prefix and a local part. The prefix must be associated with a namespace URI reference in a namespace declaration. The namespace is identified by a URI, either a Uniform Resource Locator (URL) or a Uniform Resource Name (URN), but it does not matter what (if anything) it points to. URIs are used simply because they are globally unique across the Internet, and thus help produce identifiers that are universally unique. The prefix functions only as a shorthand placeholder for a namespace name, that is, the URI. Because URIs can contain characters that are not allowed in XML names, the prefix serves as a proxy for the URI. The appendix lists namespace names associated with some XML vocabularies.

Example 3. Here is an example of a qualified name serving as an element type:

<books xmlns:b="http://www.foo.com/bar">
  <!-- The 'publisher' element's namespace is http://www.foo.com/bar -->
  <b:publisher>Addison Wesley</b:publisher>
</books>

The attribute name prefix xmlns: is reserved for namespace declarations. Here it binds the prefix b to the namespace name http://www.foo.com/bar.

Example 4. Here is an example of a qualified name serving as an attribute name:

<books xmlns:b="http://www.foo.com/bar">
  <!-- The 'category' attribute's namespace is http://www.foo.com/bar -->
  <book b:category="research">Numerical Analysis of Partial Differential Equations</book>
</books>

The following constraints apply to prefixes in namespaces:

Prefixes beginning with the three-letter sequence x, m, l, in that order, in any case combination, are reserved for use by XML and XML-related specifications.
The namespace prefix, unless it is xml or xmlns, must have been declared in a namespace declaration attribute in either the start-tag of the element where the prefix is used or in an ancestor element.

Namespace Declarations

A namespace is declared using a family of reserved attributes. Such an attribute's name must either be xmlns or have xmlns: as a prefix. These attributes, like other XML attributes, may be provided either explicitly or by default. The attribute's value, a URI reference, is the namespace name identifying the namespace. The namespace name has the characteristic of being unique.

The namespace declaration applies to the element where it is specified and to all elements within the content of that element, unless overridden by another namespace declaration with the same attribute name.

Example 5. The following is an example of a namespace declaration, which associates the namespace prefix m with the namespace name http://www.w3.org/Math/MathML:

<apply xmlns:m="http://www.w3.org/Math/MathML">
  <!-- The 'm' prefix is bound to http://www.w3.org/Math/MathML
       for the 'apply' element and its contents -->
</apply>
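
The scoping rule described above can be sketched with Python's standard xml.etree.ElementTree module. The namespace names urn:example:first and urn:example:second are hypothetical; the inner declaration rebinds the prefix n only for the inner element and its content.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace names; the inner declaration rebinds the prefix 'n'.
DOC = """
<outer xmlns:n="urn:example:first">
  <n:item/>
  <inner xmlns:n="urn:example:second">
    <n:item/>
  </inner>
  <n:item/>
</outer>
"""

def item_namespaces(xml_text):
    """Collect the namespace name of every 'item' element in document order."""
    root = ET.fromstring(xml_text)
    result = []
    for elem in root.iter():
        if elem.tag.startswith("{"):
            uri, local = elem.tag[1:].split("}", 1)
            if local == "item":
                result.append(uri)
    return result

scopes = item_namespaces(DOC)
print(scopes)
```

The third item lies outside the inner element, so it falls back to the binding declared on the outer element.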

Uniqueness of Attributes

The following conditions apply to attributes. A tag should not contain two attributes which:

have identical names, or
have qualified names with the same local part and with prefixes which have been bound to namespace names that are identical.

Example 6. The next example shows how each of these conditions can be violated, resulting in an illegal element y:

<!-- http://www.foo.com is bound to n1 and n2 -->
<x xmlns:n1="http://www.foo.com" 
   xmlns:n2="http://www.foo.com" >
  <!-- y contradicts condition 1 -->
  <y a="1" a="2" />
  <!-- y contradicts condition 2 -->
  <y n1:a="1" n2:a="2" />
</x>
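
A quick way to check both conditions is to hand such markup to a namespace-aware parser. The sketch below assumes the expat-based parser behind Python's standard xml.etree.ElementTree module, which compares attributes by their expanded names. The third document, which binds n2 to a different (hypothetical) namespace name http://www.bar.com, is added only for contrast.

```python
import xml.etree.ElementTree as ET

def is_accepted(xml_text):
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Condition 1: two attributes with identical names.
same_name = '<x><y a="1" a="2"/></x>'

# Condition 2: same local part, prefixes bound to identical namespace names.
same_universal_name = (
    '<x xmlns:n1="http://www.foo.com" xmlns:n2="http://www.foo.com">'
    '<y n1:a="1" n2:a="2"/></x>'
)

# For contrast (hypothetical second URI): the prefixes now map to
# different namespace names, so the two attributes are distinct.
different_namespaces = (
    '<x xmlns:n1="http://www.foo.com" xmlns:n2="http://www.bar.com">'
    '<y n1:a="1" n2:a="2"/></x>'
)

results = [is_accepted(d) for d in
           (same_name, same_universal_name, different_namespaces)]
print(results)
```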

Applying XML Namespaces to Elements and Attributes

Explicit Namespace Declarations

With an explicit declaration, you can define a prefix to substitute for the full name of the namespace. You can then use this prefix to qualify elements belonging to that namespace. Explicit declarations are useful when a node contains elements from different namespaces.

Example 7. In the following explicit declaration, all elements beginning with b: or money: are considered to be from the namespace urn:BooksAreUs.org:BookInfo or urn:Finance:Money, respectively. This example also shows that multiple namespace prefixes can be declared as attributes of a single element.

<?xml version="1.0"?>
<books> 
  <b:book xmlns:b="urn:BooksAreUs.org:BookInfo"
          xmlns:money="urn:Finance:Money"> 
    <b:title>Digital Typography</b:title> 
    <b:price money:currency="US Dollar">49.95</b:price> 
  </b:book> 
</books>

Example 8. In this example, the elements prefixed with b are associated with a namespace whose name is urn:BooksAreUs.org:BookInfo, while those prefixed with h are associated with a namespace http://www.w3.org/TR/REC-html40 that is used as the namespace name for HTML.

<?xml version="1.0"?>
<h:html xmlns:h="http://www.w3.org/TR/REC-html40"
      xmlns:b="urn:BooksAreUs.org:BookInfo">
  <h:head>
    <h:title>Typography</h:title>
  </h:head>
  <h:body>
    <h:p>Welcome to the world of typography! Here is a book that you may find useful.</h:p>
    <b:title h:style="font-family: sans-serif;">Digital Typography</b:title> 
    <b:author>Donald Knuth</b:author>
  </h:body>
</h:html>

Default Namespace Declarations

A default namespace applies to the element where it is declared (if that element has no namespace prefix), and to all elements with no prefix within the content of that element. The default namespaces do not apply to attribute names.

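Example 9. In this example, the book element and the unprefixed elements within it (title, price) are from the namespace urn:BooksAreUs.org:BookInfo. The unprefixed attribute currency is not placed in that namespace, because a default namespace does not apply to attribute names.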

<?xml version="1.0"?>
<books> 
  <book xmlns="urn:BooksAreUs.org:BookInfo">
    <title>Digital Typography</title>
    <price currency="US Dollar">49.95</price>
  </book>
</books>
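
The names a namespace-aware parser reports for Example 9 can be inspected with a short Python sketch based on the standard xml.etree.ElementTree module. The prefix b in the lookup map is an arbitrary choice made here; it does not need to appear in the document.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<books>
  <book xmlns="urn:BooksAreUs.org:BookInfo">
    <title>Digital Typography</title>
    <price currency="US Dollar">49.95</price>
  </book>
</books>
"""

root = ET.fromstring(DOC)

# The prefix 'b' is local to this lookup; only the URI has to match.
ns = {"b": "urn:BooksAreUs.org:BookInfo"}
price = root.find("b:book/b:price", ns)

root_name = root.tag                 # declared outside the default namespace
element_name = price.tag             # picks up the default namespace
attribute_names = list(price.attrib) # unprefixed attribute stays unqualified
print(root_name, element_name, attribute_names)
```

The books element lies outside the scope of the declaration and is in no namespace, the price element carries the default namespace, and the currency attribute is reported with its local name only.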

Example 10. In this example, all unprefixed elements are, by default, from the namespace http://www.w3.org/TR/REC-html40 that is used as the namespace name for HTML. Unprefixed attributes, such as style, are not affected by the default namespace:

<?xml version="1.0"?>
<html xmlns="http://www.w3.org/TR/REC-html40"
      xmlns:b="urn:BooksAreUs.org:BookInfo">
  <head>
    <title>Typography</title>
  </head>
  <body>
    <p>Welcome to the world of typography! Here is a book that you may find useful.</p>
    <b:title style="font-family: sans-serif;">Digital Typography</b:title> 
    <b:author>Donald Knuth</b:author>
  </body>
</html>

The default namespace can be set to the empty string. If the URI reference in a default namespace declaration is empty, then unprefixed elements in the scope of the declaration are not considered to be in any namespace. This could be useful if we have declared a namespace initially, but want to "free" some of the elements from such a binding later.
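
The effect of an empty default namespace declaration can be sketched as follows, again with Python's standard xml.etree.ElementTree module. The namespace name urn:example:docs and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary; the note element resets the default namespace.
DOC = """
<doc xmlns="urn:example:docs">
  <para>in the default namespace</para>
  <note xmlns="">
    <para>in no namespace</para>
  </note>
</doc>
"""

# Element names in document order, as reported by the parser.
tags = [elem.tag for elem in ET.fromstring(DOC).iter()]
print(tags)
```

Names without a {URI} part belong to no namespace, which is the case for note and for the para element nested inside it.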

Authoring

XML documents using multiple vocabularies can be authored like any other XML documents. XML Spy is a commercial XML editor with support for various features often required in XML authoring, including XML namespace support for both elements and attributes. Previously-used namespaces are "preserved" for later use:

[Figure: XML Namespace Support in XML Spy]

These XML authoring environments, however, may not be "sensitive" to specific XML application fragments embedded in a document and treat them as being generic XML markup.

XML Namespace-Sensitive Well-Formedness and Validation

As a safe practice, all XML documents (with or without namespaces) that are authored should be checked for well-formedness. This is possible with any XML authoring environment that supports such a facility, for example, XML Spy, discussed above.

However, with an XML document using elements from different applications, validation becomes an issue. For a name "foo:bar," there is no standard way to validate that "bar" is a member of the namespace bound to the prefix "foo," since there is no standardized mechanism for describing the vocabulary of which "bar" may or may not be a member. For example, an occurrence of "xhtml:table" does not by itself imply that what is being processed is in fact an XHTML element; an assumption based on the prefix alone is at best a guess. Furthermore, namespace-sensitive validation would require associating each URI corresponding to a namespace name with some sort of schema (similar to a DTD) and being able to validate a document with respect to the schemas for all of the URIs. This is not yet possible, as DTD-based schemas have various limitations that prohibit it.
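
Why a prefix alone is only a guess can be shown with a small Python sketch using the standard xml.etree.ElementTree module. The XHTML namespace name is the one listed in the Appendix; urn:example:furniture is a made-up namespace name used to show a misleading prefix.

```python
import xml.etree.ElementTree as ET

def root_name(xml_text):
    """Return the expanded ({URI}local) name of the document element."""
    return ET.fromstring(xml_text).tag

# Different prefixes, same namespace name: the same universal name.
a = root_name('<xhtml:table xmlns:xhtml="http://www.w3.org/1999/xhtml"/>')
b = root_name('<h:table xmlns:h="http://www.w3.org/1999/xhtml"/>')

# Same prefix, different (hypothetical) namespace name: a different element.
c = root_name('<xhtml:table xmlns:xhtml="urn:example:furniture"/>')

print(a == b, a == c)
```

Software that needs to recognize a vocabulary should therefore compare namespace names, not prefixes.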

The XML Schema initiative aims to address the problems associated with XML DTDs and, as a result, to provide namespace-sensitive validation. It attempts to provide a mechanism similar to SGML Architectural Forms (without being limited to the constraints of DTD syntax). Also, in this regard, Simple API for XML (SAX) version 2 has added support for XML namespaces. This, however, should not be taken to imply that there is any direct relationship between namespaces and schemas. A namespace is an abstract object with no necessary association between itself and anything. In particular, there is no necessary association between a namespace and a schema.

Applications of XML Namespaces

In previous sections, we have already seen examples where XML can be used in conjunction with HTML. In this section, we provide some further scenarios.

XHTML with MathML and SVG

Example 11. A circle is one of the basic mathematical objects. Suppose we wish to include the symbolic as well as graphical representation of a circle in an Extensible HyperText Markup Language (XHTML) 1.0 document. Using an entirely XML approach, we could do that by representing the equation of a circle in Mathematical Markup Language (MathML), a language for expressing mathematical notation in XML, and the corresponding graphics in Scalable Vector Graphics (SVG), a language for describing two-dimensional graphics in XML. We embed both MathML and SVG markup in an XHTML 1.0 document with the help of the corresponding XML namespaces, as follows:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head><title>Equation of a Circle</title></head>
<body>
  <p>The equation of a circle:</p>
  <!-- MathML Content Markup of the Equation of a Circle -->
  <math xmlns="http://www.w3.org/1998/Math/MathML">
    <reln><eq/>
      <apply><plus/>
          <apply><power/><ci>x</ci><cn>2</cn></apply>
          <apply><power/><ci>y</ci><cn>2</cn></apply>
      </apply>
      <cn>1</cn>
    </reln>
  </math>
  <p>can be graphically represented as:</p>
  <!-- SVG Graphic of a Circle -->
  <svg xmlns="http://www.w3.org/Graphics/SVG/SVG-19991203.dtd" 
       width="250px" height="250px">
    <g><circle style="fill: none; stroke: black" cx="10" cy="10" r="100"/></g> 
  </svg> 
</body>
</html>

On a renderer (currently not in existence to the author's knowledge) that supports XHTML, MathML and SVG, this should result in the following output:

The equation of a circle:

can be graphically represented as:

[Figure: Circle Graphic]

The equation was authored with EzMath, and the image shown above was rendered with the CSIRO SVG viewer.
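
Even without such a renderer, a namespace-aware application can already tell the three vocabularies apart. The following Python sketch, based on the standard xml.etree.ElementTree module, groups element names by namespace name for a shortened document modelled on Example 11. The XHTML namespace name is taken from the Appendix, and the helper group_by_namespace is a name introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# A shortened compound document modelled on Example 11.
DOC = """
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <math xmlns="http://www.w3.org/1998/Math/MathML"><ci>x</ci></math>
    <svg xmlns="http://www.w3.org/Graphics/SVG/SVG-19991203.dtd">
      <circle cx="10" cy="10" r="100"/>
    </svg>
  </body>
</html>
"""

def group_by_namespace(xml_text):
    """Map each namespace name to the local names of its elements."""
    groups = {}
    for elem in ET.fromstring(xml_text).iter():
        if elem.tag.startswith("{"):
            uri, local = elem.tag[1:].split("}", 1)
        else:
            uri, local = "", elem.tag  # element in no namespace
        groups.setdefault(uri, []).append(local)
    return groups

groups = group_by_namespace(DOC)
for uri, names in groups.items():
    print(uri, names)
```

A dispatcher built on such a grouping could, in principle, hand each subtree to the processor responsible for its vocabulary.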

Translating XML to XHTML Using XSL

There may be times when a need arises to translate a set of XML files to XHTML 1.0, say, for presenting them to HTML 4.0 user agents. An efficient way of doing that is to use XSL Transformations (XSLT), the "transformational" part of the Extensible Stylesheet Language (XSL).

Example 12. Here is an example of using XSLT to create an XHTML 1.0 document from a given XML document.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="sales.xsl" type="text/xsl"?>
<sales>
  <department id="A"><revenue>100</revenue><profit>5</profit></department>
  <department id="B"><revenue>200</revenue><profit>15</profit></department>
</sales>

The XSL style sheet (sales.xsl) is:

<?xml version="1.0" encoding="UTF-8"?>
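<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
  xmlns="http://www.w3.org/1999/xhtml">
  <!-- Strip whitespace-only text nodes from the source and indent the result -->
  <xsl:strip-space elements="*"/>
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <html xml:lang="en" lang="en">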
      <head><title>Sales By Department</title></head>
      <body>
        <table align="center" cellpadding="5" border="1">
          <tr><th>Department</th><th>Revenue</th><th>Profit</th></tr> 
          <xsl:apply-templates /> 
        </table> 
      </body> 
    </html> 
  </xsl:template>
  <xsl:template match="sales"> 
    <xsl:apply-templates select="department">
      <xsl:sort select="profit" data-type="number" order="ascending"/> 
    </xsl:apply-templates>
  </xsl:template>
  <xsl:template match="department"> 
    <tr> 
      <td><xsl:value-of select="@id" /></td>
      <xsl:apply-templates select="revenue" />
      <xsl:apply-templates select="profit" />
    </tr>
  </xsl:template>
  <xsl:template match="revenue | profit"> 
    <td><xsl:apply-templates /></td>
  </xsl:template>
</xsl:stylesheet>

The files can be, for example, processed using XT under Windows 9x/NT using the XML document sales_1999.xml as input, applying the XSL stylesheet sales.xsl, and directing the output to sales_1999.html. The result, depending on the renderer, appears in the XHTML document as:

[Figure: Sales By Department : 1999]

Metadata with RDF

Resource Description Framework (RDF) is a foundation for processing metadata. It provides interoperability between applications that exchange machine-understandable information on the Web.

RDF requires the XML namespace facility to precisely associate each property with the schema that defines the property. Consider the following RDF statement:

W3C is the host of the resource http://www.w3.org/.

After identifying the RDF parts of the description,

PART OF STATEMENT	RDF ROLE
W3C	`Object`
host	`Property`
http://www.w3.org/	`Subject`

the markup, in the RDF serialization syntax, is:

<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:h="http://www.foo.com/consortiums">
  <rdf:Description about="http://www.w3.org/">
    <h:Host>W3C</h:Host>
  </rdf:Description>
</rdf:RDF>

where the elements with prefix h are associated with a namespace name http://www.foo.com/consortiums.
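
To see how the namespace ties the Host property to its defining vocabulary, the description can be read with a namespace-aware parser. The Python sketch below uses the standard xml.etree.ElementTree module on a well-formed copy of the markup. It is only an illustration of name resolution under that assumption, not an RDF parser.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:h="http://www.foo.com/consortiums">
  <rdf:Description about="http://www.w3.org/">
    <h:Host>W3C</h:Host>
  </rdf:Description>
</rdf:RDF>
"""

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

root = ET.fromstring(DOC)
triples = []
for desc in root.findall("{%s}Description" % RDF):
    subject = desc.get("about")  # the resource being described
    for prop in desc:
        # prop.tag is the universal name of the property: {URI}local
        triples.append((subject, prop.tag, prop.text))

print(triples)
```

The property is reported as {http://www.foo.com/consortiums}Host, so it cannot be mistaken for a Host property defined by some other schema.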

An immediate consequence of "universal identification" is accurate searchability: without namespaces, a search for an element name over a large set of documents would return irrelevant results, as the search program would not know which element we mean.

What's not in a Namespace

An examination of some of the available namespace names (see, for example, the Appendix) reveals the following issue: there is an apparently inconsistent pattern in namespace name assignment. Some specifications use a version number (for example, XSL and XSLT), some use a date (for example, RDF Syntax, RDF Schema), some use a year (MathML, XHTML), some use a DTD location (for example, SVG), and yet others use the URI of the specification itself (HTML 4.0, SMIL). There is an inconsistency even among the members of the same "family" (for example, RDF and MathML). Such an arrangement poses difficulties for authors, who cannot make an "educated" guess at the form of a namespace name. A unified direction on how URIs are assigned to namespaces and managed is needed.

Namespaces, though they provide a mechanism for unique identification of elements and attributes, do not have any implications for semantics (that is, they do not define what these elements and attributes are, or what they mean). Any inferences based on the meanings of namespace names are unreliable. For example, there may be several names in different namespaces that map to the same semantics, and conversely, a given name may have different semantics based on its context. There is no way to describe this using namespaces only.

Conclusion

XML Namespaces : Bridges of XML Applications

XML namespaces are an important step towards making XML applications coexist coherently and interoperate transparently without any potential conflict. They work as a "glue" that binds these standards together.

There are certain W3C standards for which a namespace mechanism does not exist. One of them is CSS, for which a namespace enhancement has been proposed in CSS3.

XML has permeated various diverse "islands" of knowledge: mathematics, graphics, multimedia, databases, and electronic commerce, to name a few. With the XML namespace "bridge," elements and attributes of one island can now travel freely to another.

The journey continues.

Acknowledgements

I would like to thank Hsueh-Ieng Pai and Martin Webb for various useful suggestions.

References

Namespaces in XML - Tim Bray, Dave Hollander, Andrew Layman (Editors). W3C Recommendation, January 14, 1999.
Extensible Markup Language (XML) 1.0 - Tim Bray, Jean Paoli, C. M. Sperberg-McQueen (Editors). W3C Recommendation, February 10, 1998.
XHTML™ 1.0: The Extensible HyperText Markup Language (A Reformulation of HTML 4.0 in XML 1.0) - Steve Pemberton, et al. (Editors). W3C Recommendation, January 26, 2000.
Web Architecture: Extensible Languages - Tim Berners-Lee, Dan Connolly (Editors). W3C Note, February 10, 1998.
XSL Transformations - James Clark (Editor). W3C Recommendation, November 16, 1999.
Associating Style Sheets with XML documents Version 1.0 - James Clark (Editor). W3C Recommendation, June 29, 1999.
The SGML/XML Web Page: Namespaces in XML - Robin Cover, OASIS.
XML Tutorial : Using XML Namespaces - Microsoft. MSDN Online Web Workshop.
XML Namespaces by Example - Tim Bray, XML.com.
XML Namespaces - James Clark. An alternate description of XML Namespaces.
XML Namespaces and Semantics - Paul Prescod, posting on XML-DEV mailing list.
Namespaces in XML and XHTML - Jon Bosak, posting on XML-DEV mailing list.
XML Spy - An XML editor for Windows 9x/NT that supports XML namespaces.

Appendix : Selected XML Namespace Names

APPLICATION	XML NAMESPACE NAME
HTML	`http://www.w3.org/TR/REC-html40` (This is an example of the fact that the concept of XML namespaces is not limited to XML.)
XHTML	`http://www.w3.org/1999/xhtml`
MathML	`http://www.w3.org/1998/Math/MathML`
RDF Syntax	`http://www.w3.org/1999/02/22-rdf-syntax-ns#`
RDF Schema	`http://www.w3.org/TR/1999/PR-rdf-schema-19990303#`
SMIL 1.0	`http://www.w3.org/TR/REC-smil`
SMIL Animation	`http://www.w3.org/TR/smil-animation10`
SVG	`http://www.w3.org/Graphics/SVG/SVG-19991203.dtd`
XSL Formatting Semantics Vocabulary	`http://www.w3.org/XSL/Format/1.0`
XSL Transformation Vocabulary	`http://www.w3.org/XSL/Transform/1.0`
