Hi guys,

   I write as a European user. We are replacing the former ISO LATIN 1
(ISO-8859-1) character set with the ISO LATIN 9 charset (ISO-8859-15).

   I guess the only way to solve the euro problem is to handle an ISO
LATIN 9 translation, maybe through a configuration parameter such as
translate_latin9.

   As far as the euro sign is concerned, in ISO LATIN 9 it sits at code
point 164 (0xA4), which in ISO LATIN 1 was used for the 'currency' sign.
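
   For reference, here is a small sketch of the byte values where the two
charsets disagree, with the Unicode code point each one has in ISO LATIN 9.
The struct, the table name and the latin9_to_unicode helper are only
illustrative names of mine and do not exist in ht://Dig.

```cpp
#include <cassert>

// Byte values whose meaning differs between ISO-8859-1 and ISO-8859-15,
// with the Unicode code point each one maps to in ISO-8859-15.
struct Latin9Diff {
    unsigned char byte;
    char32_t      unicode;
    const char   *name;
};

static const Latin9Diff kLatin9Diffs[] = {
    {0xA4, 0x20AC, "EURO SIGN"},
    {0xA6, 0x0160, "LATIN CAPITAL LETTER S WITH CARON"},
    {0xA8, 0x0161, "LATIN SMALL LETTER S WITH CARON"},
    {0xB4, 0x017D, "LATIN CAPITAL LETTER Z WITH CARON"},
    {0xB8, 0x017E, "LATIN SMALL LETTER Z WITH CARON"},
    {0xBC, 0x0152, "LATIN CAPITAL LIGATURE OE"},
    {0xBD, 0x0153, "LATIN SMALL LIGATURE OE"},
    {0xBE, 0x0178, "LATIN CAPITAL LETTER Y WITH DIAERESIS"},
};

// Hypothetical helper: Unicode code point of a byte read as ISO-8859-15.
char32_t latin9_to_unicode(unsigned char c) {
    for (const Latin9Diff &d : kLatin9Diffs) {
        if (d.byte == c) {
            return d.unicode;
        }
    }
    // Every other byte has the same meaning as in ISO-8859-1, whose byte
    // values coincide with the first 256 Unicode code points.
    return c;
}
```

   So byte 164 read as ISO LATIN 9 gives U+20AC, while a byte such as 163
(the pound sign) is returned unchanged.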

   Looking at the HtSGMLCodec.cc file, if the user wants the ISO
LATIN 9 charset, we should change, for instance, this row:

  myTextFromString =
"&nbsp;|&iexcl;|&cent;|&pound;|&curren;|&yen;|&brvbar;|&sec

to:

  myTextFromString =
"&nbsp;|&iexcl;|&cent;|&pound;|&euro;|&yen;|&brvbar;|&sec

  and do the same for the other 7 characters that differ.

  It should not be hard to implement a feature that switches to the
right charset at run time, depending on the configuration.

  There are two possible solutions:

1) ignore the other characters and substitute the &curren; entity with
&euro; - it is not perfect but may work in most cases;
2) add the translate_latin9 configuration attribute and handle it in
the HtSGMLCodec.cc file.
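
  To make solution 2 more concrete, here is a rough sketch of how the
entity list could be chosen at run time. It is not taken from the ht://Dig
sources: CodecOptions, replace_all and entity_list are hypothetical names,
and I assume translate_latin9 would be a boolean attribute. Only the slot
for byte 164 is switched; the other differing characters would need the
same treatment.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the proposed boolean configuration attribute;
// in practice its value would come from the configuration file.
struct CodecOptions {
    bool translate_latin9 = false;
};

// Replace every occurrence of `from` in `list` with `to`.
// `from` must not be empty.
static std::string replace_all(std::string list, const std::string &from,
                               const std::string &to) {
    assert(!from.empty());
    for (std::string::size_type pos = list.find(from);
         pos != std::string::npos;
         pos = list.find(from, pos + to.size())) {
        list.replace(pos, from.size(), to);
    }
    return list;
}

// Return the '|'-separated entity list to use for the high characters.
// With translate_latin9 off, the ISO LATIN 1 list is returned unchanged;
// with it on, the slot for byte 164 names the euro entity instead.
std::string entity_list(const std::string &latin1_list,
                        const CodecOptions &opts) {
    if (!opts.translate_latin9) {
        return latin1_list;
    }
    return replace_all(latin1_list, "&curren;", "&euro;");
}
```

  Solution 1 amounts to always taking the second branch, whatever the
configuration says.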

  I am not aware of any bugs or malfunctions that these approaches
might cause. Please tell me if you can think of any.

Ciao ciao,
-Gabriele



On Sat, 2004-05-29 at 03:00, Lachlan Andrew wrote:
> Greetings Gabriele,
> 
> Someone just reported this as a bug...
> 
> I suggested replacing the "&not;" entity by a "&euro;" entity (see 
> attached patch).  That is a real hack, since the "&#xxx;" 
> representation of "&not;" will be incorrectly displayed as a euro 
> sign instead of a not, but it should do until we get unicode support.
> 
> Should we commit this hack, or just leave it as an optional patch?
> 
> Lachlan
> 
> On Tue, 24 Feb 2004 11:22 pm, Gabriele Bartolini wrote:
> > Ciao guys,
> >
> >    I have tested the attributes that are part of the
> > 'internationalisation' task and, as far as I can tell, they all
> > work with my locale settings ([EMAIL PROTECTED]).
> >
> >    The only problem concerns the correct translation of the euro
> > character, whose HTML entity is &euro; and which is part of the
> > LATIN 9 charset.
> >
> >     Indeed, this character, widely used and not only in the
> > European Community countries, is currently not encoded/decoded
> > during the digging and searching phases, so a terrifying "&euro;"
> > string shows up in the results.
> >
> >    Any ideas or workarounds? I found an interesting discussion of
> > this topic at this URL: http://www.cs.tut.fi/~jkorpela/latin9.html
> >
> > Ciao and thanks,
> > -Gabriele
> >
> >
> >
> > _______________________________________________
> > ht://Dig Developer mailing list:
> > [EMAIL PROTECTED]
> > List information (subscribe/unsubscribe, etc.)
> > https://lists.sourceforge.net/lists/listinfo/htdig-dev


