Hi Robert,
I presume you're a victim of the same syndrome as we had.
I have written this somewhere already but I can't find it anymore, so here is the issue:
- the form's encoding type (the enctype attribute, sent as the Content-Type header) only allows values like multipart/form-data... nothing in it states the encoding of the characters, in particular how to convert the Unicode character ò to some %xx value (making it %F2 would mean using ISO-8859-1).
- what can browsers do? Either ask the user (some browsers have this in their preferences) or just use the same encoding as the page they received, which is generally the wise choice...
- what can server containers do? Well... they don't know, they have no clue which browser page all this was coming from, so they just convert the bytes to a string, mapping %F2 to ò, which gives very weird results if UTF-8 was used...
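To make the mismatch concrete, here is a minimal sketch (the class and method names are made up for illustration) of what happens when ò is sent as UTF-8 bytes but the receiving side assumes ISO-8859-1:

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Decode raw bytes the way a container would if it assumed ISO-8859-1.
    static String decodeAsLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u00f2"; // the character ò

        // In ISO-8859-1 this is the single byte 0xF2 (hence %F2);
        // in UTF-8 it is the two bytes 0xC3 0xB2.
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        System.out.println(latin1.length + " byte vs " + utf8.length + " bytes");

        // Wrong guess on the server side: one character becomes two.
        String garbled = decodeAsLatin1(utf8);
        System.out.println(garbled.length() + " chars after the wrong decode");
    }
}
```

The two resulting characters are U+00C3 and U+00B2, which is the kind of "very weird result" described above.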
We do everything in UTF-8; Russian, French, and math characters were our interest.
Our solution, once we had worked out what Tomcat was doing, was as follows: write a little converter that contains an InputStreamReader(pig,"UTF-8") and read from there, with pig defined to be something like a ByteArrayInputStream(request.getParameter("xx").getBytes("ISO-8859-1")) (this assumes the container decoded the parameter as ISO-8859-1).
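A rough sketch of such a converter, not our actual code: the class name ParamRecoder and the method recode are hypothetical, and it assumes the container decoded the bytes as ISO-8859-1 while the browser really sent UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ParamRecoder {
    // Takes a parameter value that was wrongly decoded as ISO-8859-1
    // and re-reads its original bytes as UTF-8.
    static String recode(String misdecoded) {
        // ISO-8859-1 maps each char 0..255 back to exactly one byte,
        // so this recovers the bytes the browser sent.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(raw), StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

In a servlet this would be called as something like recode(request.getParameter("xx")). It can only help if the original bytes survived the trip; a character that was already replaced by "?" before it reached the server cannot be recovered this way.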
Since then, we're happy. But one day, one should file a bug on the HTML specification...
Hope that helps.
Paul
Robert Priest wrote:
and the following does not help:
try {
    fileName = new String(cd.substring(start + 10, end).trim().getBytes("UTF-8"));
} catch (java.io.UnsupportedEncodingException uee) {
}
-----Original Message-----
From: Robert Priest [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 17, 2003 11:19 AM
To: '[EMAIL PROTECTED]'
Subject: [FileUpload] Unicode Encoding for a Form
Hello all,
I have a simple html form which has an <INPUT TYPE="FILE"/> field in it.
Now when I select a file whose name contains Scandinavian characters (such as umlauts), the name is not being URL encoded properly before being sent. As a result, my JSP page, which accepts posts of files via the FileUpload package, is not interpreting the file name correctly.
First, has anyone seen this problem? And does anyone have a solution for it?
For example, if I select a file say:
filename="C:\Documents and Settings\Robert.Priest\Desktop\���.txt"
what is sent in the request is:
C:\Documents and Settings\Robert.Priest\Desktop\???.txt"
and what you see if you call FileItem.getName() is:
C:\Documents and Settings\Robert.Priest\Desktop\???.txt
So the method FileUploadBase.getFileName(Map /* String, String */ headers)
does not see the correct filename when it executes:
if (start != -1 && end != -1) { fileName = cd.substring(start + 10, end).trim(); }
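For reference, the extraction step can be sketched as a standalone method. This is a reconstruction for illustration only, not the actual FileUpload source; the class name and the way start and end are computed are assumptions based on the snippet above.

```java
public class FileNameSketch {
    // Pulls the value of filename="..." out of a Content-Disposition header.
    static String extractFileName(String cd) {
        // 'filename="' is 10 characters long, hence the start + 10 offset.
        int start = cd.indexOf("filename=\"");
        int end = (start == -1) ? -1 : cd.indexOf('"', start + 10);
        if (start != -1 && end != -1) {
            return cd.substring(start + 10, end).trim();
        }
        return null; // no filename parameter found
    }

    public static void main(String[] args) {
        String cd = "form-data; name=\"upload\"; filename=\"???.txt\"";
        System.out.println(extractFileName(cd));
    }
}
```

The method only cuts out characters that are already in the header string, so if the header arrives containing "???", there is nothing left for it to decode.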
The following is the multipart request that IE sends for such a file (with umlauts in the name):
------------------------------
POST /jsp/upload.jsp HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword, application/x-shockwave-flash, */*
Referer: http://localhost:8080/roberttest/rptest.html
Accept-Language: en-us
Content-Type: multipart/form-data; boundary=---------------------------7d39eb58029a
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Host: localhost:2000
Content-Length: 349
Connection: Keep-Alive
Cache-Control: no-cache
