It's not that people don't bother to answer you, but a lot of people here don't have any experience with Shift_JIS encoding. As a Norwegian I have the same problem: non-Scandinavians can hardly reproduce problems revolving around Scandinavian characters.
 
When it comes to your string problem, there can be several sources. First of all, you can test the DOM by feeding it a string that has been created from bytes with a declared encoding (here bytes is the raw byte[] holding a text such as "æ e trønder æ å"), like:
 
new String( bytes ); decodes with the platform default encoding, will not work on all JDKs/platforms

new String( bytes, "UTF-16" ); decodes with the declared encoding, will work on most sane JDKs/platforms
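A minimal sketch of the declared-encoding idea, assuming nothing beyond the plain JDK. The class and method names (DeclaredEncodingSketch, encode, decode) are made up for illustration. Note that the String constructor that takes an encoding name expects a byte[], and that the literal is written with Unicode escapes so it does not depend on the encoding the compiler assumes for the source file.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, only for illustration.
final class DeclaredEncodingSketch {

    // Encode text into bytes using an explicitly named charset.
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    // Decode bytes using an explicitly named charset instead of the platform default.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file pure ASCII.
        String norwegian = "\u00e6 e tr\u00f8nder \u00e6 \u00e5";

        byte[] utf16 = encode(norwegian, "UTF-16");

        // Same charset on both sides: the text survives the round trip.
        System.out.println(norwegian.equals(decode(utf16, "UTF-16")));

        // Different charset when decoding: the text comes back changed.
        System.out.println(norwegian.equals(decode(utf16, "ISO-8859-1")));
    }
}
```

The same two helpers can be called with "Shift_JIS" to check whether a given JDK ships that charset at all.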

Try to create all your strings with Shift_JIS forced, just in case. Second, find out whether StringWriter supports Shift_JIS. As far as I know, StringWriter works on chars and strings, so it should cope with Shift_JIS as long as every string fed to it was decoded as Shift_JIS. Lastly, there are some problems with the PrintWriter that the servlet API uses to return serialized content to the browser. Try to serialize to a file instead of to the browser; if the file comes out correctly in Shift_JIS, then you should look up fixes/gotchas regarding Shift_JIS and JSP, as Cocoon uses the JSP mechanism to send the response back to the user.
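To illustrate the StringWriter point, here is a rough sketch, assuming only standard JDK classes and a JDK that ships the Shift_JIS charset. WriterEncodingSketch and writeEncoded are made-up names. It shows that a StringWriter holds chars and loses nothing, while the encoding only comes into play when chars are turned into bytes (a file, a servlet response). Encoding into a charset that lacks the characters is one common way to end up with question marks.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper class, only for illustration.
final class WriterEncodingSketch {

    // Write text through a Writer that encodes with the named charset and
    // return the resulting bytes, as they would land in a file or a response.
    static byte[] writeEncoded(String text, String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, Charset.forName(charsetName))) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Three hiragana characters, written as escapes to keep the source ASCII.
        String japanese = "\u3042\u306a\u305f";

        // StringWriter only stores chars, so no encoding is involved here.
        StringWriter sw = new StringWriter();
        sw.write(japanese);
        System.out.println(japanese.equals(sw.toString()));

        // Shift_JIS can represent the text, so decoding the bytes gives it back.
        byte[] sjis = writeEncoded(japanese, "Shift_JIS");
        System.out.println(japanese.equals(new String(sjis, Charset.forName("Shift_JIS"))));

        // ISO-8859-1 cannot represent it, so the encoder substitutes '?'.
        byte[] latin1 = writeEncoded(japanese, "ISO-8859-1");
        System.out.println(new String(latin1, Charset.forName("ISO-8859-1")));
    }
}
```

If the bytes written with an explicit Shift_JIS writer are fine but the browser output is not, the loss is probably happening in whatever writer the servlet container hands out.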

The best place to start looking is the Xalan FAQs and docs, because the XML and HTML serializers use the Xalan implementations.
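As a hedged illustration of how the serializer's output encoding matters, the sketch below uses the standard JAXP API (javax.xml.parsers and javax.xml.transform) rather than the Cocoon 1.8.2 or Xerces classes from the original mail, so treat it as an analogy and not as Cocoon's actual code path. SerializerEncodingSketch and its methods are made-up names. The assumption is that a serializer asked for an encoding that cannot represent a character falls back to numeric character references, which would explain the numbers in the HTML source.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper class, only for illustration.
final class SerializerEncodingSketch {

    // Build a one-element document holding the given text.
    static Document sample(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("label")).setTextContent(text);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serialize a DOM to bytes in the requested output encoding.
    static byte[] serialize(Document doc, String encoding) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parse XML from bytes; the parser takes the encoding from the XML declaration.
    static Document parse(byte[] xmlBytes) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = sample("\u3042\u306a\u305f");

        // Output restricted to ASCII: print it to see how the characters are written.
        byte[] ascii = serialize(doc, "US-ASCII");
        System.out.println(new String(ascii, StandardCharsets.US_ASCII));

        // Output in Shift_JIS: parsing the bytes again should give the same text.
        byte[] sjis = serialize(doc, "Shift_JIS");
        System.out.println(parse(sjis).getDocumentElement().getTextContent().equals("\u3042\u306a\u305f"));
    }
}
```

Passing bytes to the parser and taking bytes from the serializer keeps every encoding decision explicit, which is usually easier to debug than going through String.getBytes() with the platform default.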

Kind regards, Karl Øie

 

-----Original Message-----
From: Arun.N [mailto:[EMAIL PROTECTED]]
Sent: 12 December 2001 12:48
To: [EMAIL PROTECTED]
Subject: Re: urgent encoding problem...

Hi all,
            First of all, I thank everybody for not bothering to reply. I corrected the second and the third problems. If the list is still alive and anyone cares to give me a solution for the first problem, please do reply.
Thanks,
Arun.N
 
----- Original Message -----
From: Arun.N
Sent: Tuesday, December 11, 2001 1:31 PM
Subject: urgent encoding problem...

Hi all,
            I have some problems with XSP pages and encoding. When I try to display Shift_JIS-encoded characters, they are not displayed properly.
When I hard-code the Japanese characters it works properly, for example in this XSP page:
 
<?xml version="1.0" encoding="Shift_JIS"?>
<?cocoon-process type="xsp"?>
<?cocoon-process type="xslt"?>
<?xml-stylesheet href="xsl/viewMail-to-html.xsl" type="text/xsl" ?>
<xsp:page
  language="java"
  encoding="Shift_JIS"
  xmlns:xsp="http://www.apache.org/1999/XSP/Core"
  xmlns:request="http://www.apache.org/1999/XSP/Request"
  xmlns:util="http://www.apache.org/1999/XSP/Util"
 >
<page>
   <title>melpo View Mail</title>
  <body>
        <label>‚ ‚È‚½‚ÌPC‚Ì’†‚̃[ƒ‹ƒNƒ‰ƒCƒAƒ“ƒg‚ªÄŠJ‚³‚ê‚Ü‚µ‚½B </label>
    </body>
</page>
</xsp:page>
 
The rendered HTML looks fine and the characters display properly, but the source of the HTML shows:
<html>
    <body>
    &#12354;&#12394;&#12383;&#12398;PC&#12398;&#20013;&#12398;&#12513;&#12540;&#12523;&#12463;&#12521;&#12452;&#12450;&#12531;&#12488;&#12364;&#20877;&#38283;&#12373;&#12428;&#12414;&#12375;&#12383;&#12290;
    </body>
</html>
<!-- This page was served in 2278 milliseconds by Cocoon 1.8.2 -->
 
But why are the characters converted into numbers? The problem I have here is that this consumes more bytes, so if the device limits the size of the page source then it is a problem. If the characters were left as they are, the page source would consume fewer bytes.
 
 
The second problem is that when I dynamically include XML in my XSP page it does not work, but the same string works fine when it is hard-coded in the XSP page.
<?xml version="1.0" encoding="Shift_JIS"?>
<?cocoon-process type="xsp"?>
<?cocoon-process type="xslt"?>
<?xml-stylesheet href="xsl/viewMail-to-html.xsl" type="text/xsl" ?>
<xsp:page
  language="java"
  encoding="Shift_JIS"
  xmlns:xsp="http://www.apache.org/1999/XSP/Core"
  xmlns:request="http://www.apache.org/1999/XSP/Request"
  xmlns:util="http://www.apache.org/1999/XSP/Util"
 >
<page>
   <title>melpo View Mail</title>
  <body>
        <xsp:logic>
             String xml = (String) request.getAttribute("xml");
            <xsp:content>
               <util:include-expr><util:expr>xml</util:expr></util:include-expr>  // this will append an xml string like <label>‚ ‚È‚½‚ÌPC‚Ì’†‚̃[J‚³‚ê‚Ü‚µ‚½B </label>          
            </xsp:content>
        </xsp:logic>
    </body>
</page>
</xsp:page>
 
I am getting an error:
 
org.xml.sax.SAXException: An invalid XML character (Unicode: 0x13) was found in the element content of the document. [FATAL ERROR] [File: "null" Line: 1 Column: 109] (nested exception: org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x13) was found in the element content of the document.)
But the same string works fine if it is hard-coded, because when I hard-code it, the XSP page converts all the characters to those numbers when it gets compiled. Whenever the string is dynamically included, it does not work.
 
And the third problem is:
    when I load a string into a DOM and then get the string back, the encoding information is gone. The characters are displayed as ???????????
        String fullXml = "<?xml version=\"1.0\" encoding=\"Shift_JIS\"?><Response><Message>Mail Client in your PC has been ログアウト Restarted エキサイト : 翻訳>利用規約 xxx </Message></Response>";
 
      DOMParser parser = new DOMParser();
      InputStream is = new ByteArrayInputStream(fullXml.getBytes());
      InputSource isource=new InputSource(is);
      parser.parse(isource);
      Document xmlDoc= parser.getDocument();       //created an dom
 ------------ doing some manipulation ------------------
      OutputFormat    format  = new OutputFormat( xmlDoc );   //Serialize DOM
      StringWriter  stringOut = new StringWriter();           //Writer will be a String
      XMLSerializer    serial = new XMLSerializer( stringOut, format );
      serial.asDOMSerializer();                               // As a DOM Serializer
      serial.serialize( xmlDoc.getDocumentElement() );
      String returnXML = stringOut.toString();  // got back the xml as String.
 
Now if I display the string returnXML, all the Japanese characters are gone. The output is only "???????????".
 
Can any of you please give a solution for these problems, as it is very urgent for me? I have been trying to solve these issues for the past 2 days and have searched the mail archives, but I was not able to find a solution.
 
Thanks in advance
 
regards,
Arun.N,
