Hi all,

It appears to be a consequence of bug 25594.
I have a workaround involving a hack to StreamGenerator.

Cheers,
John

> -----Original Message-----
> From: [EMAIL PROTECTED]
> Sent: 08 July 2005 12:12
> To: users@cocoon.apache.org
> Subject: RE: Stream Generator / uploading UTF-8 encoded chinese files
>
> > Hi,
> >
> > you can configure the encoding like this :
> > Did you configure the <form-encoding> in web.xml ?
> > Did you try using the action : setCharacterEncoding (at the start of
> > your pipeline) ?
> >
> > Did you open your document with Ultraedit to see what's the encoding ?
> >
> > Lionel
> >
> > Bazeley, John wrote:
> >
> > >Hi all,
> > >
> > >I'm trying to use the stream generator to upload XML files that
> > >are UTF-8 encoded and contain Chinese characters. Source system
> > >is Windows XP and Cocoon is v2.1.7 running on Solaris 9 / Java
> > >1.4.2. Whether I use my own pipeline with curl uploading the file
> > >or the /samples/stream/process-order pipeline, the results are
> > >the same: the file is returned to me with all the Chinese
> > >characters mangled ('od' shows all the Chinese characters have
> > >been converted to 357 277 275).
> > >
> > >I have inserted debug into the stream generator and the XML
> > >serialiser, and both think they are using UTF-8 encoding.
> > >
> > >Why is my document getting corrupted? What am I doing wrong?
> > >
> > >The source document has 'encoding="UTF-8"' in the <?xml ... string,
> > >and IE and Firefox both display it correctly and tell me the
> > >encoding is UTF-8, so I am inclined to believe the document is
> > >correctly encoded.
> > >
> > >All suggestions are welcome.
> > >
> > >Thanks, John
>
> Some more information for the record that I did not post earlier:
>
> I'm using the version of Jetty that comes bundled with Cocoon 2.1.7 as
> the servlet container.
> Debug has ascertained that the uploaded file gets saved to disk
> correctly, so the corruption happens some time after that.
>
> I have updated the servlet jar to 2.3, and that did not make things
> any better.
>
> My minimal pipeline is:
>
> <map:match pattern="john/text">
>   <map:generate type="stream">
>     <map:parameter name="generate-attributes" value="true"/>
>     <map:parameter name="form-name" value="my_xmlfile"/>
>   </map:generate>
>   <map:serialize type="text"/>
> </map:match>
>
> and as I stated earlier, the corruption occurs using the sample
> uploader too.
>
> In my sitemap, I have the text serialiser set to utf-8 thus:
>
> <map:serializer logger="sitemap.serializer.text"
>     mime-type="text/plain" name="text" pool-max="20"
>     src="org.apache.cocoon.serialization.TextSerializer">
>   <encoding>UTF-8</encoding>
> </map:serializer>
>
> Thanks for any help,
> --
> John
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
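
For reference, the octal bytes 357 277 275 reported by 'od' are EF BF BD in hex, which is the UTF-8 encoding of U+FFFD, the Unicode replacement character. That pattern usually means the bytes were decoded with a charset that could not represent them and were then written back out as UTF-8. The Java sketch below only illustrates that general mechanism; it is not taken from Cocoon, StreamGenerator, or bug 25594, and the choice of US-ASCII as the wrong charset is an assumption made for the demonstration. The class and method names are hypothetical, and it uses java.nio.charset.StandardCharsets, which needs Java 7 or later (newer than the Java 1.4.2 mentioned in the thread).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how decoding UTF-8 bytes with the wrong
// charset and re-encoding as UTF-8 produces the bytes EF BF BD (octal 357 277 275).
class ReplacementCharDemo {

    // Encode the text as UTF-8, decode those bytes as US-ASCII (the assumed
    // "wrong" charset), then encode the resulting string as UTF-8 again.
    static byte[] roundTripThroughAscii(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        // Each byte >= 0x80 is invalid in US-ASCII and becomes U+FFFD.
        String misdecoded = new String(utf8, StandardCharsets.US_ASCII);
        return misdecoded.getBytes(StandardCharsets.UTF_8);
    }

    // Format bytes as space-separated three-digit octal values, similar to 'od'.
    static String toOctal(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%03o", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d"; // a single Chinese character
        // Correct UTF-8 bytes of the character
        System.out.println(toOctal(chinese.getBytes(StandardCharsets.UTF_8)));
        // Bytes after the lossy decode/encode round trip
        System.out.println(toOctal(roundTripThroughAscii(chinese)));
    }
}
```

Once the replacement character has been substituted, the original bytes cannot be recovered, so the place to look is whichever step first turns the uploaded bytes into characters.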