Geoff Hutchison <[EMAIL PROTECTED]> wrote:
>On Mon, 6 Jan 2003, Mike Holderness wrote:
>> Some HTML 4.1 entities (&par;, &pound;...) will be included 
>> in excerpts as single characters and therefore won't display 
>> in Opera or Mozilla. 
>
>At the moment, the code can deal with some HTML entities, but as no one
>has stepped forward to deal with some of the newer standards, especially
>with regard to entities, there are undoubtedly a great many problems like
>this.

Errr... maybe next month? (This month's new language looks 
set to be Python.)

>> Will others (&ldquo; etc) continue to be included in 
>> excerpts as &amp;ldquo; (& a m p ; l d q u o ;)?
>
>Any HTML entity that is not part of the recognized list will show this
>bug. If you have a suggestion as to which entities should be transformed
>into the appropriate localized character set (i.e. accents) and which
>should be ignored, please let us know or point us to an appropriate URL.

I'll be back. Somewhere I have a table of those I've 
needed (in Linotron 340 typesetter format :-)

Excuse me for clarifying my thoughts by writing messages, 
but: it seems that there are two questions:

1) Updating the table of entities to translate to characters

2) Whether or not to translate characters outside the 7-bit
gamut back to &entity; references when outputting excerpts, 
for maximum compatibility with browser quirks. 
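
To make the two questions concrete, here is a minimal sketch in Python. It is not ht://Dig code: the function names decode_entities and encode_non_ascii are hypothetical, and it assumes the HTML 4.01 entity table that ships in Python's standard html.entities module is an acceptable stand-in for the table in (1).

```python
import re
from html.entities import name2codepoint, codepoint2name

# Matches named references such as &pound; or &ldquo;
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def decode_entities(text):
    """Question (1): translate known named entities to characters.

    Names missing from the table are left exactly as written.
    """
    def repl(match):
        codepoint = name2codepoint.get(match.group(1))
        return chr(codepoint) if codepoint is not None else match.group(0)
    return _ENTITY.sub(repl, text)

def encode_non_ascii(text):
    """Question (2): turn characters outside 7-bit ASCII back into references.

    Uses a named entity where the table has one, else a numeric reference.
    ASCII characters, including & and <, pass through untouched.
    """
    out = []
    for ch in text:
        codepoint = ord(ch)
        if codepoint < 128:
            out.append(ch)
        elif codepoint in codepoint2name:
            out.append("&%s;" % codepoint2name[codepoint])
        else:
            out.append("&#%d;" % codepoint)
    return "".join(out)
```

With a table like this, &pound; and &ldquo; both decode, while a name the table does not know (such as &par;) passes through unchanged.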

I can see where (1) done The Right Way could lead off into 
infinite work, since TRW would seem to be the ongoing W3C 
discussion of fully-normali?ed Unicode (e.g., having 
translated "e&acute;" to two Unicode characters, then 
translate it to the single character represented by "&eacute;"). 
[see http://www.w3.org/TR/charmod/ - forgive me if there's a 
previous discussion here not revealed by a search for 
"normal* Unicode"]
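
The composition step can be sketched with Python's standard unicodedata module. One assumption to flag: this treats the accent in my example as the combining acute (U+0301). The HTML 4.01 &acute; entity is actually the spacing acute (U+00B4), which canonical normalization does not fold into the preceding letter.

```python
import unicodedata

# "e" followed by COMBINING ACUTE ACCENT (U+0301): two code points
decomposed = "e\u0301"

# NFC composes the pair into the single character U+00E9,
# the same character that &eacute; stands for
composed = unicodedata.normalize("NFC", decomposed)

# "e" followed by the spacing ACUTE ACCENT (U+00B4) is not a
# canonical sequence, so NFC leaves it as two characters
spacing = unicodedata.normalize("NFC", "e\u00b4")
```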

But a reasonable patch should be a bookkeeping exercise that 
someone like me can handle. Given time.

(2) raises, er, a philosophical question...

>I (for one) didn't know that there even was an HTML 4.1 standard. Have a
>URL?

Sorry, my thinko for 4.01. 

Mike



_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
