Re: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread Nick Bastin
On Mar 8, 2004, at 3:12 PM, [EMAIL PROTECTED] wrote: Yes, I was confused by the fact you said an XML to XML transformation worked correctly, but XML to HTML did not. Clearly, they must have been with different data sets, so the comparison was not relevant. Well, *we* didn't think they were differ

Re: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread david_n_bertoni
> On Mar 8, 2004, at 2:18 PM, [EMAIL PROTECTED] wrote: > > Yes, I was confused by the fact you said an XML to XML transformation worked correctly, but XML to HTML did not. Clearly, they must have been with different data sets, so the comparison was not relevant. > Well, *we* d

Re: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread Nick Bastin
On Mar 8, 2004, at 2:18 PM, [EMAIL PROTECTED] wrote: Yes, I was confused by the fact you said an XML to XML transformation worked correctly, but XML to HTML did not. Clearly, they must have been with different data sets, so the comparison was not relevant. Well, *we* didn't think they were diffe

Re: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread david_n_bertoni
> On Mar 8, 2004, at 11:39 AM, [EMAIL PROTECTED] wrote: > > "The html output method may output a character using a character entity reference, if one is defined for it in the version of HTML that the output method is using." > > Many XSLT processors do this, not just Xa

Re: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread Nick Bastin
On Mar 8, 2004, at 11:39 AM, [EMAIL PROTECTED] wrote: "The html output method may output a character using a character entity reference, if one is defined for it in the version of HTML that the output method is using." Many XSLT processors do this, not just Xalan-C, so I'm not sure why you th

Re: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread Keith Rogers
Actually, David, it did apply - if you think you've got UTF-8 but really don't, then the output won't be what you expect. We're just talking semantics here. I've had a [u]string class (for about 4 years now) that encapsulates both UTF-8 and UTF-16, because it seemed like the X/X transcoders leak

Re: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread david_n_bertoni
> > This just recently happened when I was creating a Xerces text node, and the DOM_String (Xerces 1.6!) was constructed with a char* that pointed to UTF-8, instead of a wchar_t* pointing to UTF-16. What happens is that Xerces interprets char* as a *multibyte* character set, an

Re: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread david_n_bertoni

RE: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread Fish Christopher G Contr ESC/ACU OL1
's we're big and don't care umbrella. hope it is useful, Christopher -----Original Message----- From: Keith Rogers [mailto:[EMAIL PROTECTED]] Sent: Monday, March 08, 2004 8:45 AM To: xalan-c-users@xml.apache.org Subject: Re: Avoiding the escaping UTF-8 unicode text No questio

Re: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread Keith Rogers
No question - XMLSpy.  Been using it for about 3 1/2 years.  You need a Unicode font installed, of course.   Don't know if this is your problem, but what happens in Windows UTF-8, e.g., in Notepad, is that the file starts with the UTF-8 encoding bytes (actually, UTF-16 flags FFFE or FEFF encoded as
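
The byte-order-mark point in the snippet above can be illustrated with a small sketch. It is not taken from the thread: it assumes Python, and the helper name `strip_utf8_bom` is hypothetical. It relies only on the fact that the UTF-8 byte order mark is U+FEFF encoded as the three bytes EF BB BF, which editors such as Notepad may write at the start of a file.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present.

    Hypothetical helper for illustration; not part of Xerces or Xalan.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# U+FEFF encoded as UTF-8 is the three-byte sequence EF BB BF.
bom = "\ufeff".encode("utf-8")
sample = bom + b"<doc/>"
print(bom.hex())               # hex form of the three BOM bytes
print(strip_utf8_bom(sample))  # the payload without the mark
```

Whether a given parser version tolerates a leading BOM is a separate question; checking the first three bytes of the input is a cheap way to rule it in or out.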

Re: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread Nick Bastin
On Mar 7, 2004, at 7:42 PM, Keith Rogers wrote: All of our file XML is UTF-8 input, and I haven't seen any problems with direct file transforms using Xerces 1.6/Xalan 1.3 or Xerces 2.3/Xalan 1.6.  I never saw a reason for ICU, since all of our stuff is UTF-8 (or UTF-16), so don't build with it. 

Re: Avoiding the escaping UTF-8 unicode text

2004-03-08 Thread Keith Rogers
All of our file XML is UTF-8 input, and I haven't seen any problems with direct file transforms using Xerces 1.6/Xalan 1.3 or Xerces 2.3/Xalan 1.6.  I never saw a reason for ICU, since all of our stuff is UTF-8 (or UTF-16), so don't build with it.  Like I said, the only time I saw (what should have

Re: Avoiding the escaping UTF-8 unicode text

2004-03-07 Thread Nick Bastin
On Mar 7, 2004, at 5:11 PM, Keith Rogers wrote: Not sure what statement you're having the problem with, but if you've got xsl:output's charset set to UTF-8, and using disable-output-escaping="yes" (e.g., in xsl:value-of or xsl:text), and still see it, then when I've seen this problem, it turned

Re: Avoiding the escaping UTF-8 unicode text

2004-03-07 Thread Keith Rogers
Not sure what statement you're having the problem with, but if you've got xsl:output's charset set to UTF-8, and using disable-output-escaping="yes" (e.g., in xsl:value-of or xsl:text), and still see it, then when I've seen this problem, it turned out that the data wasn't actually UTF-8.   This ju
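
One way to test the "data wasn't actually UTF-8" hypothesis before involving the XSLT processor is to try a strict decode of the raw input bytes. The following is a minimal sketch in Python, not something posted in the thread; the function name `is_valid_utf8` is an assumption made for illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Report whether data decodes cleanly as strict UTF-8.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# "é" encoded as UTF-8 is two bytes (C3 A9) and passes the check.
print(is_valid_utf8("\u00e9".encode("utf-8")))
# The same character in Latin-1 is the single byte E9, which is not valid UTF-8.
print(is_valid_utf8(b"\xe9"))
```

A passing check only shows the bytes are well-formed UTF-8; it does not prove that the code handing them to the parser declared them as UTF-8.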

Avoiding the escaping UTF-8 unicode text

2004-03-07 Thread Nick Bastin
Xalan is output-escaping UTF-8 text that should most definitely NOT be escaped in HTML output. It's escaping all of the character 'bytes' as if they were characters themselves. Is there something that has to be specifically set in the stylesheet to avoid this? It seems to me that it should k
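
The symptom described in this message, each UTF-8 byte being escaped as if it were a character of its own, can be reproduced outside Xalan. The sketch below is an illustration in Python under stated assumptions, not Xalan's actual serializer logic: `escape_non_ascii` is a hypothetical stand-in for an output method that writes non-ASCII characters as numeric character references, and the Latin-1 decode stands in for any step that treats UTF-8 bytes as one character per byte.

```python
def escape_non_ascii(text: str) -> str:
    """Write every non-ASCII character as a decimal character reference.

    Hypothetical stand-in for a serializer's escaping step.
    """
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


utf8_bytes = "\u00e9".encode("utf-8")  # "é" as the two bytes C3 A9

decoded_as_utf8 = utf8_bytes.decode("utf-8")       # one character, U+00E9
decoded_as_latin1 = utf8_bytes.decode("latin-1")   # two characters, U+00C3 and U+00A9

print(escape_non_ascii(decoded_as_utf8))    # &#233;
print(escape_non_ascii(decoded_as_latin1))  # &#195;&#169;
```

If the HTML output shows pairs of references like the second line for what should be a single character, the bytes were most likely misread as a single-byte encoding somewhere before serialization.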