Hi Greg,

If I understand your question correctly, the answer is actually no. You can't just put binary data into your XML and have it 'do the right thing', as binary data will contain sequences of bytes that are not legal (in ISO-8859-1, UTF-8 or any encoding I know of). So the serializer will merrily chug along until it hits one of these and, most likely, convert the unknown sequence into a '?'.
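To illustrate why raw binary does not survive being treated as text, here is a minimal sketch. It assumes a modern JDK with java.util.Base64 (not something available when this thread was written), and the class and method names are made up for the example. It says nothing about what XMLSerializer itself does with such bytes; it only shows that a byte -> text -> byte round trip is lossy for arbitrary data but lossless once the data is Base64 encoded.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class BinaryRoundTrip {
    // Treats raw bytes as UTF-8 text and converts back; invalid sequences
    // are replaced during decoding, so the original bytes are lost.
    static byte[] viaUtf8(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Encodes raw bytes as Base64 text and decodes back; this is lossless.
    static byte[] viaBase64(byte[] raw) {
        String text = Base64.getEncoder().encodeToString(raw);
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // The first bytes of a typical JPEG file; 0xFF is never valid in UTF-8.
        byte[] raw = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0, 0x00, 0x10};
        System.out.println(Arrays.equals(raw, viaUtf8(raw)));   // prints false
        System.out.println(Arrays.equals(raw, viaBase64(raw))); // prints true
    }
}
```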
There might be some other options, but from past posts on the list I'd say that base64 encoding is your best bet.

Chris

-----Original Message-----
From: Greg Hess [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 17, 2003 2:02 PM
To: [EMAIL PROTECTED]
Subject: RE: General tips help, NEWBIE?

Thanks, it all helps :-), I am getting there and will hopefully soon be able to state my problems properly.

My Base64 encoder has methods to encode the image byte[] to a String and then decode the String back to a byte[] for my io. It has been working well so far. My main concern is with the text I am encoding and decoding. If I set the OutputFormat as follows, will I be able to construct my DOM without encoding my text strings, since they will automatically be encoded in the serialization process?

    OutputFormat of = new OutputFormat("XML", "ISO-8859-1", true);
    XMLSerializer ser = new XMLSerializer(System.out, of);

Then, when I parse the InputStream that contains the serialized DOM, will all the text nodes that I read automatically be decoded when I call getNodeValue()?

Many thanks,
Greg

-----Original Message-----
From: Constantine Georges [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 17, 2003 4:42 PM
To: [EMAIL PROTECTED]
Subject: Re: General tips help, NEWBIE?

Greg:

I don't know how familiar you are with the Base64 encoding, but, basically, here are the characters that would come out:

                    Table 1: The Base64 Alphabet

     Value Encoding  Value Encoding  Value Encoding  Value Encoding
         0 A            17 R            34 i            51 z
         1 B            18 S            35 j            52 0
         2 C            19 T            36 k            53 1
         3 D            20 U            37 l            54 2
         4 E            21 V            38 m            55 3
         5 F            22 W            39 n            56 4
         6 G            23 X            40 o            57 5
         7 H            24 Y            41 p            58 6
         8 I            25 Z            42 q            59 7
         9 J            26 a            43 r            60 8
        10 K            27 b            44 s            61 9
        11 L            28 c            45 t            62 +
        12 M            29 d            46 u            63 /
        13 N            30 e            47 v
        14 O            31 f            48 w         (pad) =
        15 P            32 g            49 x
        16 Q            33 h            50 y

The concept behind Base64 is to use four 6-bit characters to encode three actual 8-bit bytes.
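The "four 6-bit characters for three bytes" idea can be sketched in a few lines. This is an illustration only, not the encoder anyone in this thread is using: the class name Base64Group and the method encodeGroup are hypothetical, padding for inputs that are not a multiple of three bytes is left out, and java.util.Base64 (a modern JDK class) is used only as a cross-check.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Group {
    // The 64-character alphabet from Table 1, indexed by 6-bit value.
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Encodes exactly three bytes (24 bits) as four Base64 characters.
    static String encodeGroup(byte b0, byte b1, byte b2) {
        // Pack the three bytes into one 24-bit integer, masking off sign extension.
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        char[] out = new char[4];
        out[0] = ALPHABET.charAt((bits >> 18) & 0x3F); // top 6 bits
        out[1] = ALPHABET.charAt((bits >> 12) & 0x3F);
        out[2] = ALPHABET.charAt((bits >> 6) & 0x3F);
        out[3] = ALPHABET.charAt(bits & 0x3F);         // bottom 6 bits
        return new String(out);
    }

    public static void main(String[] args) {
        byte[] man = "Man".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encodeGroup(man[0], man[1], man[2]));     // prints TWFu
        System.out.println(Base64.getEncoder().encodeToString(man)); // prints TWFu
    }
}
```

The three bytes of "Man" split into the 6-bit values 19, 22, 5 and 46, which the table maps to T, W, F and u.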
Anyway, as you can see, there are no XML special characters in the Base64 vocabulary (<, >, or &), so the entire image could just be added as the value of any element or attribute.

I think the main issue, then, is in your Base64 encoder and its interface. Does it return a byte[], or does it return a String? If it returns a byte[], you're going to have to be careful, since Java's characters are 16 bits, and not 8 (i.e., you shouldn't just rely on new String(imgbuf), which decodes the bytes using the platform's default charset). Since the interface for setting an Element or Attribute value uses Java String objects, it becomes a question of taking the Base64 output and widening the 8-bit bytes to 16-bit chars. Really, it's simple to do -- a brute-force implementation would be to construct a StringBuffer, iterate across the array, and just strbuf.append((char) imgbuf[i]). Then, as you're building the DOM, set some element or attribute's value to the String value of the buffer:

    StringBuffer strbuf = new StringBuffer();
    // dump the image into the buffer here...
    // (imgbuf holds the Base64-encoded bytes, which are all 7-bit ASCII)
    for (int i = 0; i < imgbuf.length; i++) {
        strbuf.append((char) imgbuf[i]);
    }
    Element img = doc.createElement("image");
    Text value = doc.createTextNode(strbuf.toString());
    img.appendChild(value);
    doc.appendChild(img);

And, when you go to get your original image back again, just down-cast the characters in the string to bytes and feed the byte array to the Base64 decoder. If, of course, your Base64 encoder and decoder work with Java Strings natively, then this is a non-issue.

From my experience, I would say be careful with XMLSerializer -- it breaks lines if they are too long unless you specify in the output format object that it should preserve space. Also, if you're working with namespaces at all, it can put some junk in your serialized file in terms of namespace prefixes. In terms of initially setting it up, here's what I always do:

    OutputFormat of = new OutputFormat("XML", "UTF-8", true);
    XMLSerializer ser = new XMLSerializer(System.out, of);

Does any of this help?
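For reference, the whole round trip described above (Base64 bytes widened to chars, stored in a text node, then narrowed and decoded again) might look like the following sketch. It assumes a modern JDK: java.util.Base64 stands in for whatever encoder is really in use, the standard JAXP DocumentBuilderFactory builds the DOM, and the class and method names (ImageInDom, widen, narrow, buildDocument, readImage) are invented for the example.

```java
import java.util.Arrays;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class ImageInDom {
    // Widens Base64 output bytes to Java chars; safe because Base64 is pure ASCII.
    static String widen(byte[] base64Bytes) {
        StringBuilder sb = new StringBuilder(base64Bytes.length);
        for (byte b : base64Bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    // Narrows the chars of a Base64 string back to bytes for the decoder.
    static byte[] narrow(String base64Text) {
        byte[] out = new byte[base64Text.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) base64Text.charAt(i);
        }
        return out;
    }

    // Builds a document whose root <image> element holds the encoded image.
    static Document buildDocument(byte[] image) throws Exception {
        byte[] encoded = Base64.getEncoder().encode(image);
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element img = doc.createElement("image");
        img.appendChild(doc.createTextNode(widen(encoded)));
        doc.appendChild(img);
        return doc;
    }

    // Reads the text of the root element and decodes it back to raw bytes.
    static byte[] readImage(Document doc) {
        String text = doc.getDocumentElement().getTextContent();
        return Base64.getDecoder().decode(narrow(text));
    }

    public static void main(String[] args) throws Exception {
        byte[] image = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0, 0x00, 0x10};
        Document doc = buildDocument(image);
        System.out.println(Arrays.equals(image, readImage(doc))); // prints true
    }
}
```

If the encoder already hands back a String, widen and narrow drop out entirely and the text node can be set directly.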
C

"Greg Hess" <[EMAIL PROTECTED]>
04/17/2003 03:45 PM
Please respond to xerces-j-user

To: "Xerces User Mail list" <[EMAIL PROTECTED]>
cc: (bcc: Constantine Georges/Towers Perrin)
Subject: General tips help, NEWBIE?

Hi All,

I would greatly appreciate any help that you could provide in helping me complete my task. I am new to Xerces and XML parsing, and I am having difficulty understanding how I should be implementing this functionality.

I have a web form that allows a user to enter text (which may contain reserved characters) and upload an image; both are used to populate a remote web page. My Servlet (an Action, as I am using Struts) takes the text and the image byte[] and creates an XML document containing the data. As I build the DOM I use a Base64 encoding tool to encode my text and image byte[]. I then send the XML DOM to a remote Servlet to be handled. I do this via HTTP POST, using the XMLSerializer to write the DOM to the Servlet's OutputStream:

    XMLSerializer serializer = new XMLSerializer(os, null);
    serializer.serialize(doc);

I have been looking at the OutputFormat and wondering if I could just set the encoding of the OutputFormat and have my text nodes encoded by the XMLSerializer. If so, could I still use the Base64 encoder to encode my image byte[]?

Any help much appreciated,
Greg

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]