The current published W3C standard for Web pages is XHTML 1.1, and XHTML
2.0 is now in draft form (as at July 2003). HTML 4.01, published back in 1999,
was the last non-XML version of the Web page standard.
XHTML is best understood as HTML recast using the rules of XML. XML is not
really a language in itself, but a set of rules for creating mark-up languages.
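Among the XML rules are that tags are case-sensitive, must be correctly
nested, and must always be closed. So mark-up that mixes upper-case and
lower-case tag names, or leaves a tag unclosed, can be valid as HTML but is not
valid as XHTML. In XHTML, tag names must be lower-case and every opened tag
must have a matching close.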
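As a rough illustration of the pairing and case rules, the sketch below feeds two fragments to Python's standard XML parser. The fragments are made-up examples, not taken from the original article, and the helper name is_well_formed is hypothetical. Note that this only checks XML well-formedness, which is a necessary condition for valid XHTML, not a full validation against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if `markup` parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML-style mark-up (illustrative): mismatched tag case and an unclosed <br>.
html_style = "<P>First line<br>Second line</p>"

# XHTML-style mark-up (illustrative): lower-case paired tags, self-closed <br />.
xhtml_style = "<p>First line<br />Second line</p>"

print(is_well_formed(html_style))   # False
print(is_well_formed(xhtml_style))  # True
```

An HTML browser would render both fragments the same way; only the second one survives an XML parser.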
But one of the lesser-known differences between HTML and XHTML is that
attribute values within tags (such as the href attribute within a link <a> tag)
must use "entities" for special reserved characters. For example,
within mark-up, & has a special meaning. Greater than and less than signs
(> and <) also have a special meaning, as they are used to define tags.
A simple example of an "entity" is:
&gt;
which is the code to specifically display a greater than sign in the HTML
output. The use of an ampersand followed by a code followed by a semi-colon is
the method that has been used for entities since the inception of HTML. The term
"escape code" is a more common way of describing these codes. So
"&nbsp;" is the escape code for a non-breaking space, and
"&amp;" is the escape code for an ampersand.
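To see how escape codes map to characters, here is a small sketch using Python's standard html module. It is only an illustration of the ampersand-code-semicolon convention described above; the sample strings are made up.

```python
import html

# Decoding: turn escape codes back into the characters they stand for.
print(html.unescape("&gt;"))          # >
print(html.unescape("&amp;"))         # &
print(repr(html.unescape("&nbsp;")))  # '\xa0' (the non-breaking space character)

# Encoding: replace reserved characters with their escape codes.
print(html.escape("1 < 2 & 3 > 2"))   # 1 &lt; 2 &amp; 3 &gt; 2
```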
In the stricter world of XHTML, escape codes must also be used in tag
attributes. This ends up being important in link and image tags, where the
target URL includes an ampersand. For example:
<img src="logo.gif" alt="A&B Ltd">
is valid as HTML, but it is not valid as XHTML. In XHTML, the equivalent
mark-up would be:
<img src="logo.gif" alt="A&amp;B Ltd"></img>
So if you want to follow the XHTML specification to the letter, start using
escape codes where needed in your attributes!
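If your pages are generated by a script, the escaping can be done for you. The sketch below uses quoteattr from Python's standard xml.sax.saxutils module to build the image tag from the example above; the helper name img_tag and the query-string URL are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def img_tag(src: str, alt: str) -> str:
    """Build an XHTML-style <img> tag with escaped, quoted attribute values."""
    return f"<img src={quoteattr(src)} alt={quoteattr(alt)} />"


tag = img_tag("logo.gif", "A&B Ltd")
print(tag)  # <img src="logo.gif" alt="A&amp;B Ltd" />

# An XML parser accepts the escaped tag and hands back the original text.
element = ET.fromstring(tag)
print(element.get("alt"))  # A&B Ltd

# The same applies to URLs with query strings (hypothetical URL).
print(quoteattr("page.php?id=1&lang=en"))  # "page.php?id=1&amp;lang=en"

# The unescaped form is rejected by an XML parser.
try:
    ET.fromstring('<img src="logo.gif" alt="A&B Ltd" />')
except ET.ParseError as error:
    print("not well-formed:", error)
```

The reader of the page never sees the escape code: the parser converts &amp; back to a plain ampersand before the attribute is used.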
To check that your Web pages are truly XHTML compliant, you can use the free
validator provided by the W3C.