David Starner wrote:

> 
> Character set information must go along with every non-Latin-1
> webpage already, and most word processor formats already carry along
> huge quantities of data, such that just adding the information
> shouldn't be hard at all. 
>  

The charset declaration in an HTML header is just one line,
e.g. charset=utf-8.  The concern was that someone might
expect, as an extreme example, TUS 3.0 in its entirety to be
included in every file.

Since the PUA is part of Unicode, it is covered when the 
character set is specified as utf-8 in HTML.
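For reference, such a declaration is a single line in the
document head.  (A sketch using the HTML 4 syntax current at
the time; newer HTML revisions may offer a shorter form.)

```html
<!-- Tells the browser the character encoding of this document;
     UTF-8 covers all of Unicode, including the PUA. -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
```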

> Intelligent software caches the file and loads it up from the cache;
> the number of distinct uses for the PUA any one person will run
> across is probably low enough to cache every one permanently. Dumb
> software will do the TeX thing and say "File not found. Please enter
> alternate PUA reference for 'Klingon at http://www.kli.org/klingon.xml':".  
> Note that there's already precedent in XML for stuff like this; XML
> includes a URL to find the doctype that's needed to validate it. 
> 
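The XML mechanism mentioned above looks roughly like this (a
sketch; the element name and DTD URL are placeholders, not a
real registry):

```xml
<!-- The system identifier is a URL where a validating parser
     can fetch the DTD needed to validate the document. -->
<!DOCTYPE greeting SYSTEM "http://example.com/greeting.dtd">
<greeting>Hello, world!</greeting>
```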

My impression is that the typical Klingon user (if they
used the Klingon script rather than the romanization) might
well have dozens or even hundreds of files using the ConScript
PUA encoding.  This could be true of any PUA user group.
The user files could also come in many formats: *.TXT, *.HTM, 
*.DBF, *.EML, etc.   

Rather than specifying a structure requiring caches and on-line
sessions, it might be better just to leave things be and let 
authors and users work out implementation issues privately.

Common sense should indicate to a publisher that some kind of
information about the encoding, or a pointer to it, would be a
good idea.

Best regards,

James Kass.
