* T&B <[EMAIL PROTECTED]> [2007-01-06 13:05]:
> When using SQLite's HTML output mode it converts some
> characters to HTML code, such as:
> 
> & -> &amp;
> < -> &lt;
> 
> But doesn't for other characters, such as:
> 
> > -> &gt;
> " -> &quot;
> ' -> &apos;
> © -> &copy;     (copyright symbol)
> all other non-ascii characters
> 
> See the translation tables at:
> http://www.w3schools.com/tags/ref_entities.asp
> http://www.w3.org/MarkUp/html3/latin1.html
> 
> Is this a bug, or are the first two all that are needed in
> reality, despite the spec?

Not only in reality, but also per the spec. `"` and `'` have to
be escaped only in attribute values (because they are the
attribute value delimiters), and `>` has to be escaped only in
XML (because a literal `]]>` sequence is not allowed in XML
character data). Neither applies to the HTML output of the SQLite
shell.
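
To make the distinction concrete, here is a rough sketch in
Python. The function names `escape_text` and `escape_attr` are
made up for illustration; this is not the SQLite shell's actual
code, only the same rule expressed briefly, and it assumes
attribute values are delimited by double quotes.

```python
def escape_text(s: str) -> str:
    """Escape a string for use as HTML element content.

    Only `&` and `<` are replaced. The ampersand goes first so
    that the `&` in the generated `&lt;` is not escaped again.
    """
    return s.replace("&", "&amp;").replace("<", "&lt;")


def escape_attr(s: str) -> str:
    """Escape a string for use inside a double-quoted attribute.

    In addition to the element-content rules, the delimiter
    itself has to be escaped. (Assumption: the attribute is
    written with double quotes, so `'` can stay as it is.)
    """
    return escape_text(s).replace('"', "&quot;")


# Element content: `>` and the quotes pass through untouched.
print("<TD>" + escape_text('AT&T: 1 < 2 > 0, "ok"') + "</TD>")

# Attribute value: the double quote must not end the value early.
print('<TD title="' + escape_attr('say "hi" & bye') + '">')
```

The first call leaves `>` and the quotes alone, which is all that
element content requires; only the attribute case needs more.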

No other characters ever *need* to be represented as entities,
since the character model of HTML documents is Unicode, not
ASCII. Escaping such a character is necessary only when the
document encoding cannot represent it directly. If the SQLite
output is in the same encoding as the HTML document, then you
need not use entities for any characters other than the two that
the SQLite shell already escapes.
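
As an illustration of when references become necessary, here is
a small Python sketch. The helper name `encode_for_html` is
hypothetical, and it assumes the text has already had `&` and `<`
escaped. It relies on the standard `xmlcharrefreplace` error
handler, which substitutes a numeric character reference only for
characters the target encoding cannot represent.

```python
def encode_for_html(text: str, encoding: str) -> bytes:
    """Encode already-escaped text for an HTML document.

    Characters the encoding can represent are written directly;
    anything else becomes a numeric reference such as `&#169;`.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


sample = "© 2007"

# UTF-8 covers all of Unicode, so nothing is replaced.
print(encode_for_html(sample, "utf-8"))

# ASCII cannot represent the copyright sign, so it is replaced.
print(encode_for_html(sample, "ascii"))

# Latin-1 has the copyright sign but not the euro sign.
print(encode_for_html("©€", "latin-1"))
```

With a UTF-8 document the text goes out unchanged; only the
narrower encodings force the use of references.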

The basics of encodings and character sets are described in this
article:

    The Absolute Minimum Every Software Developer Absolutely,
    Positively Must Know About Unicode and Character Sets (No
    Excuses!)
    http://www.joelonsoftware.com/articles/Unicode.html

If you have never read anything about the basics of charsets, you
should really read it.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>
