I have a problem with a doc containing (sample):

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE musiccollection SYSTEM "music.dtd">
<musiccollection>
  <album medium="LP" label="Philips" ident="SAL 3686">
    <performers artist="Various"/>
    <entry>
      <composer>Bru&#269;i</composer>
      <work>Simfonia Lestá</work>
    </entry>
  </album>
</musiccollection>

The (mildly) exotic letter is the Eastern European c with a caron (an
upside-down circumflex) above it. When I push the doc through a Java program
using SAX and Java's built-in Xerces, intending to get HTML out, "&#269;"
arrives at the "characters" call-back as a single character (most likely
a "?", though I haven't checked). I've tried various odds and ends, such as
using "&ccaron;" with and without an <!ENTITY> declaration instead of
the "&#269;", but to no useful effect. I'd like the #269 to come through
somehow, as it is recognised by browsers. Is there some simple change I can
make to the doc, perhaps a different "encoding" or a reference to some W3C
spec?
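
For illustration, here is a minimal sketch of the kind of SAX handler described above. It is an assumption about the setup, not the poster's actual code. A conforming parser resolves &#269; to the single Java char U+010D before calling "characters", so the reference never reaches the handler as text. A "?" typically shows up only later, when that char is written out in an encoding that cannot represent it. One option is to re-escape everything outside ASCII as a numeric character reference when producing the HTML. The names NcrDemo, NcrHandler and toHtmlText are hypothetical, and the code needs Java 5 or later (StringBuilder, Character.codePointAt).

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NcrDemo {

    /** Hypothetical handler: collects text, writing non-ASCII as &#NNN; references. */
    public static class NcrHandler extends DefaultHandler {
        public final StringBuilder out = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            int limit = start + length;
            int i = start;
            while (i < limit) {
                // Read a whole code point so surrogate pairs stay together.
                int cp = Character.codePointAt(ch, i, limit);
                if (cp == '&') {
                    out.append("&amp;");
                } else if (cp == '<') {
                    out.append("&lt;");
                } else if (cp == '>') {
                    out.append("&gt;");
                } else if (cp < 0x80) {
                    out.append((char) cp);
                } else {
                    // Anything outside ASCII becomes a decimal character reference.
                    out.append("&#").append(cp).append(';');
                }
                i += Character.charCount(cp);
            }
        }
    }

    /** Parses the XML string and returns its text content, escaped for HTML. */
    public static String toHtmlText(String xml) throws Exception {
        NcrHandler handler = new NcrHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<entry><composer>Bru&#269;i</composer></entry>";
        System.out.println(toHtmlText(xml)); // prints Bru&#269;i
    }
}
```

Because the output is then pure ASCII, it survives whatever encoding the HTML is finally written in. The alternative, under the same assumptions, would be to leave the characters alone and write the HTML through a Writer opened explicitly as UTF-8.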
I've poked around in the Java source but probably can't see the wood for
the trees. OS is Windows XP SP2, with Java 1.4, 1.5 or 1.6 (tried 'em all).
I also have stand-alone Xerces 1.4.4 and 2.8.1, and the latter does the same
as the built-in Java parsers. OTOH, the #269 comes through an XSL script
under the Saxon 7.9 Java product perfectly.
Rgds, GFStC.
- Entities Graeme St.Clair
- Re: Entities Filozof71
- Re: Entities Graeme St.Clair
- Re: Entities Eric J. Schwarzenbach
- Re: Entities Filozof71
- Re: Entities keshlam
- Re: Entities Klaus Malorny
- Re: Entities Graeme St.Clair
- Re: Entities Michael Glavassevich
