I have a problem with a document containing the following (sample):
 
 <?xml version="1.0" encoding="ISO-8859-1"?>
 <!DOCTYPE musiccollection SYSTEM "music.dtd">
 <musiccollection>
 
 <album medium="LP" label="Philips" ident="SAL 3686">
 <performers artist="Various"/>
 <entry>
 <composer>Bru&#269;i</composer>
 <work>Simfonia Lestá</work>
 </entry>
 </album>
 
 </musiccollection>
 
The (mildly) exotic letter is the Eastern European c with a caron (an upside-down circumflex) above it.  As I push it through a Java program using SAX and Java's built-in Xerces, intending to get HTML out, "&#269;" arrives at the "characters" callback as a single character (most likely a ?, though I haven't checked).  I've tried various odds and ends, like using "&ccaron;" with and without an <!ENTITY> declaration instead of the "&#269;", but to no useful effect.  I'd like the #269 to come through somehow, as it is recognised by browsers.  Is there some simple change I can make to the doc, perhaps a different "encoding" or a reference to some W3C spec?
 
I've poked around in the Java source but probably can't see the wood for the trees.  OS is Windows XP SP2, with Java 1.4 or 1.5 or 1.6 (tried 'em all).  I also have stand-alone Xerces 1.4.4 and 2.8.1, and the latter does the same as the Javas.  OTOH, the #269 comes through an XSL script under the Saxon 7.9 Java product perfectly.
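
For illustration, here is a minimal sketch of the kind of behaviour I'm after. It assumes the parser has already resolved the reference into the real character by the time "characters" is called, and that the handler writes anything outside ASCII back out as a numeric character reference. The names CaronEscapeSketch, EscapingHandler and escapeText are made up for this example and are not part of Xerces or SAX. It needs Java 1.5 or later.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: re-escape non-ASCII text seen by a SAX handler.
class CaronEscapeSketch {

    /** Collects character data, writing non-ASCII code points as &#NNN;. */
    static class EscapingHandler extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            int end = start + length;
            for (int i = start; i < end; ) {
                // Read a whole code point so surrogate pairs stay together.
                // (A pair split across two callbacks is not handled here.)
                int cp = Character.codePointAt(ch, i, end);
                i += Character.charCount(cp);
                if (cp == '&') {
                    out.append("&amp;");
                } else if (cp == '<') {
                    out.append("&lt;");
                } else if (cp > 127) {
                    out.append("&#").append(cp).append(';');
                } else {
                    out.append((char) cp);
                }
            }
        }
    }

    /** Parses the XML text and returns its character data, re-escaped. */
    static String escapeText(String xml) {
        try {
            EscapingHandler handler = new EscapingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints: Bru&#269;i
        System.out.println(escapeText("<composer>Bru&#269;i</composer>"));
    }
}
```

Elements and attributes are ignored here; only the text passed to "characters" is re-escaped, so real HTML output would need the tags written as well.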
 
Rgds, GFStC.
 
 
