Hi guys, I am writing as a European user. We are replacing the former ISO LATIN 1 (ISO-8859-1) character set with the ISO LATIN 9 charset (ISO-8859-15).
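
To make the difference between the two charsets concrete, here is a minimal sketch of the byte values where ISO-8859-15 departs from ISO-8859-1. It is only an illustration, not ht://Dig code: the names Latin9Difference, kLatin9Differences and decodeByte are hypothetical, and it assumes a C++14 (or later) compiler.

```cpp
// Sketch only: hypothetical names, not taken from HtSGMLCodec.cc.
#include <cstdint>

// One byte value whose meaning changes between the two charsets.
struct Latin9Difference {
    std::uint8_t byte;   // byte value, identical in both charsets
    char32_t latin9;     // Unicode code point it stands for in ISO-8859-15
};

// The eight positions where ISO-8859-15 differs from ISO-8859-1.
constexpr Latin9Difference kLatin9Differences[] = {
    {0xA4, 0x20AC},  // currency sign   -> euro sign
    {0xA6, 0x0160},  // broken bar      -> S with caron
    {0xA8, 0x0161},  // diaeresis       -> s with caron
    {0xB4, 0x017D},  // acute accent    -> Z with caron
    {0xB8, 0x017E},  // cedilla         -> z with caron
    {0xBC, 0x0152},  // one quarter     -> OE ligature
    {0xBD, 0x0153},  // one half        -> oe ligature
    {0xBE, 0x0178},  // three quarters  -> Y with diaeresis
};

// Decode one byte to a Unicode code point.
// With latin9 == false every byte maps to the code point of the same
// value, which is how ISO-8859-1 is defined.
constexpr char32_t decodeByte(std::uint8_t byte, bool latin9) {
    if (latin9) {
        for (const Latin9Difference &d : kLatin9Differences) {
            if (d.byte == byte) return d.latin9;
        }
    }
    return static_cast<char32_t>(byte);
}
```

For example, decodeByte(0xA4, true) yields U+20AC (the euro sign), while decodeByte(0xA4, false) yields U+00A4 (the currency sign). Bytes outside the table, such as 0xA3 (pound sign), decode the same way in both charsets.
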
I guess the only way to solve the euro problem is to handle an ISO LATIN 9
translation, perhaps through a configuration parameter such as
translate_latin9. As far as the euro sign is concerned, in ISO LATIN 9 it is
code 164, which in ISO LATIN 1 was used for the 'currency' sign.

Looking at the HtSGMLCodec.cc file, if the user wants the ISO LATIN 9 charset
we should change, for instance, this row:

    myTextFromString = "&nbsp;|&iexcl;|&cent;|&pound;|&curren;|&yen;|&brvbar;|&sec

with:

    myTextFromString = "&nbsp;|&iexcl;|&cent;|&pound;|&euro;|&yen;|&brvbar;|&sec

and do the same for the other seven characters that differ between the two
charsets. It should not be hard to implement a feature that switches to the
right charset at run time, depending on the configuration.

There are two possible solutions:

1) ignore the other characters and substitute the &curren; entity with
   &euro; - it is not perfect, but it may work in most cases;
2) add the translate_latin9 configuration attribute and handle it in the
   HtSGMLCodec.cc file.

I am not aware of any bugs or malfunctions these approaches might cause.
Please tell me if you think of any.

Ciao ciao,
-Gabriele

On Sat, 2004-05-29 at 03:00, Lachlan Andrew wrote:
> Greetings Gabriele,
>
> Someone just reported this as a bug...
>
> I suggested replacing the "&not;" entity by a "&euro;" entity (see
> attached patch). That is a real hack, since the "&#xxx;"
> representation of "&not;" will be incorrectly displayed as a euro
> sign instead of a not, but it should do until we get unicode support.
>
> Should we commit this hack, or just leave it as an optional patch?
>
> Lachlan
>
> On Tue, 24 Feb 2004 11:22 pm, Gabriele Bartolini wrote:
> > Ciao guys,
> >
> > I have tested the attributes that are part of the
> > 'internationalisation' task and they are all successful according
> > to me and my locale settings ([EMAIL PROTECTED]).
> >
> > The only problem regards the correct translation of the euro
> > character, which is an HTML entity (&euro;) that is part of the
> > LATIN 9 charset.
> >
> > Indeed, this character, which is widely used and not only in the
> > European Community countries, is not currently imploded/exploded in
> > the digging and searching phases, returning a terrifying "&euro;"
> > string in the results.
> >
> > Any ideas and workarounds? I found an interesting discussion of
> > this topic at this URL: http://www.cs.tut.fi/~jkorpela/latin9.html
> >
> > Ciao and thanks,
> > -Gabriele

_______________________________________________
ht://Dig Developer mailing list: [EMAIL PROTECTED]
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev
