RE: Xalan-C + ICU: windows-1251 encoding troubles

Dimitry Chernyshov 24 Jan 2002 18:31:19 -0000

Yes, David, your second guess is correct.
I haven't checked whether the numeric sequences are correct - I'm just worried
that old browsers would not recognise them at all...

Anyway, old browsers understand KOI8.

Thanks!

Best,
Dimitry

-----Original Message-----
From: David N Bertoni/CAM/Lotus [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 24, 2002 9:24 PM
To: [email protected]
Subject: RE: Xalan-C + ICU: windows-1251 encoding troubles



Hi Dimitry,

Just to be clear, are you saying there's something wrong with the file?  What
I mean is, are we writing incorrect numeric references?  If so, then we need
to make this a higher priority, and you should file a bug report.

If not, then I don't think it's accurate to say we don't "support"
Win-1251.  An HTML or XML file with correct numeric character references is
logically equivalent to a file which contains the actual characters.
Perhaps you're struggling with browsers that don't behave correctly with
numeric character references?
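
One way to check whether the references are correct is to decode them and compare the result with the expected text. The following is a minimal Python sketch, not part of Xalan-C; the title string is copied from the output quoted in the original report, and the variable names are illustrative only.

```python
import html

# Title text as it appears in the transformed output from the original report.
escaped_title = (
    "&#1058;&#1080;&#1087;&#1072; "
    "&#1074;&#1080;&#1085;&#1076;&#1099; "
    "&#1073;&#1083;&#1080;&#1085;!"
)

# html.unescape resolves numeric (and named) character references.
decoded_title = html.unescape(escaped_title)
print(decoded_title)

# Every decoded character should be representable in windows-1251;
# encode() raises UnicodeEncodeError if one is not.
encoded_title = decoded_title.encode("windows-1251")
print(len(encoded_title), "bytes in windows-1251")
```

If the decoded text matches what the stylesheet was supposed to produce, the references themselves are correct and the file is equivalent to one containing the literal characters.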

Dave



"Dimitry Chernyshov" <[EMAIL PROTECTED]n.ru>
01/24/2002 10:19 AM
To: <[email protected]>
cc:
Subject: RE: Xalan-C + ICU: windows-1251 encoding troubles






I see...

Well, it's not so urgent - fortunately we can still use KOI8...
However, it would be very nice to have win-1251 support, because it's the
most popular encoding for the Russian language.

Thanks for the explanation, though!

Best,
Dimitry Chernyshov,
Technology Group Managing Director,
Polar Design
--------------------------
[EMAIL PROTECTED]
http://www.polardesign.com
phone/fax: +7 (095) 363 0708

-----Original Message-----
From: David N Bertoni/CAM/Lotus [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 24, 2002 8:59 PM
To: [email protected]
Subject: Re: Xalan-C + ICU: windows-1251 encoding troubles



This is not surprising.  Currently, Xalan-C has a pretty brain-dead
algorithm for determining whether or not to write the actual character or a
numeric character reference.  The problem is that checking each character
is horribly expensive, so we just punt on things > 256 in many cases.
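
The trade-off described above can be illustrated with a small Python sketch. This is not Xalan-C's code: the function names and the cutoff of 256 are assumptions made for illustration. The first function escapes everything at or above a fixed code point; the second asks the target encoding about each character and escapes only what it cannot represent.

```python
def escape_by_threshold(text: str, limit: int = 256) -> str:
    """Cheap heuristic (hypothetical): escape every code point >= limit."""
    return "".join(ch if ord(ch) < limit else f"&#{ord(ch)};" for ch in text)


def escape_by_encoding(text: str, encoding: str) -> bytes:
    """Per-character check: escape only what the encoding cannot represent.

    Python's built-in "xmlcharrefreplace" error handler emits a decimal
    numeric character reference for each unencodable character.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


sample = "Типа 日"
print(escape_by_threshold(sample))
print(escape_by_encoding(sample, "windows-1251"))
```

With the threshold approach the Cyrillic letters are escaped even though windows-1251 can represent them; with the per-character check only the character outside the encoding is escaped.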

I'd like to do something about it, but it's not the highest priority right
now.  Unless you can actually determine that we're not emitting the correct
numeric character reference, there's nothing wrong with doing it the way
we're doing it.  You can always post-process the file yourself if you
object to the references.
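
A post-processing step of this kind might look like the following Python sketch. It is an illustration under stated assumptions, not a Xalan-C feature: the function name is hypothetical, and the input is assumed to be ASCII markup in which non-ASCII text appears only as numeric character references. References to ASCII characters (such as `&#60;`) are left alone so that markup is not changed.

```python
import re

# Matches decimal (&#1058;) and hexadecimal (&#x422;) character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def inline_numeric_refs(markup: str, encoding: str = "windows-1251") -> bytes:
    """Replace numeric references with literal characters where possible."""

    def replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Keep ASCII references (they may protect markup) and invalid values.
        if code_point < 128 or code_point > 0x10FFFF:
            return match.group(0)
        ch = chr(code_point)
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # Not representable in the target encoding: keep the reference.
            return match.group(0)
        return ch

    # Raises UnicodeEncodeError if the input already contains literal
    # characters that the target encoding cannot represent.
    return _NUMERIC_REF.sub(replace, markup).encode(encoding)


result = inline_numeric_refs("<title>&#1058;&#1080;&#1087;&#1072; &#60;b&#62;</title>")
print(result.decode("windows-1251"))
```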

I'll bump this up on the list of things to work on for the next release.

Dave



"Dimitry Chernyshov" <[EMAIL PROTECTED]n.ru>
01/24/2002 08:32 AM
To: <[email protected]>
cc: (bcc: David N Bertoni/CAM/Lotus)
Subject: Xalan-C + ICU: windows-1251 encoding troubles





Hi!

After I rebuilt Xalan-c1_3 + Xerces-c1_6_0 + ICU 2.0, Xalan works well with
different encodings.
However, I've encountered one pretty strange problem.

If an XSL file sets the xsl:output encoding to "windows-1251" (<xsl:output
method="html" encoding="windows-1251"/>), the result of transforming some XML
contains numeric character references instead of the characters themselves.
For example:

<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=windows-1251">
<title>&#1058;&#1080;&#1087;&#1072; &#1074;&#1080;&#1085;&#1076;&#1099; &#1073;&#1083;&#1080;&#1085;!</title>

However, if the source XML is encoded in windows-1251 and the XSL file sets
the output encoding to, say, KOI8-R, everything works just fine: Xalan (ICU, I
guess) converts win-1251 to KOI8-R correctly...

Any thoughts?

Thanks in advance,
Dimitry Chernyshov,
Technology Group Managing Director,
Polar Design
--------------------------
[EMAIL PROTECTED]
http://www.polardesign.com
phone/fax: +7 (095) 363 0708
