RE: [fileupload] UTF-8 encoding issue
How to recreate the issue:

To recreate the problem, put an HTML text input into a form. When filling out the form in a browser on a Windows machine, type ALT-0225 (which enters the character á), then submit the form. The value will not be retrieved correctly.

The only workaround I could come up with was this:

    if (fileItem.isFormField()) {
        String value = fileItem.getString("UTF-8");
    }

Unfortunately, this assumes that the data is always UTF-8. It seems to me that the file upload library should account for the encoding in the FileItem method getString(). But since the encoding used there is always "ISO-8859-1", there is no other way to work around this issue.

There are many questions on Stack Overflow about this issue, so it seems to be a common problem. UTF-8 encoding is very common these days, especially at universities.

Thanks,

Lance

-----Original Message-----
From: Bernd Eckenfels [mailto:e...@zusammenkunft.net]
Sent: Monday, August 01, 2016 2:18 PM
To: Commons Users List <user@commons.apache.org>
Subject: Re: [fileupload] UTF-8 encoding issue

Hello,

HTTP headers are essentially ASCII, especially for things like the boundary. The arguments (filename) might be quoted-printable (QP) encoded, but not all browsers like that.

Regards
Bernd

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org
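The symptom described in the message above can be reproduced without any servlet or upload library. The following is only a minimal sketch, assuming the browser submits the form as UTF-8; the class and method names are hypothetical and are not part of Commons FileUpload. It shows that the two UTF-8 bytes for á turn into two wrong characters when decoded as ISO-8859-1, and that decoding with UTF-8 (which is what the getString("UTF-8") workaround asks for) restores the original text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not call Commons FileUpload.
final class CharsetMismatchDemo {

    // Bytes for the field value, assuming the form is submitted as UTF-8.
    static byte[] encodeAsBrowser(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // ISO-8859-1 maps every byte to exactly one character, so a two-byte
    // UTF-8 sequence is turned into two unrelated characters.
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Decoding with the charset that was used for encoding restores the text.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = encodeAsBrowser("\u00e1");                // the character typed with ALT-0225
        System.out.println(raw.length);                        // 2
        System.out.println(decodeLatin1(raw).length());        // 2
        System.out.println(decodeUtf8(raw).equals("\u00e1"));  // true
    }
}
```

If the browser actually submits the form in a different charset, the same mismatch appears in the other direction, which is why hardcoding "UTF-8" in the workaround is only safe when the form encoding is known.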
Re: [fileupload] UTF-8 encoding issue
Hello,

HTTP headers are essentially ASCII, especially for things like the boundary. The arguments (filename) might be quoted-printable (QP) encoded, but not all browsers like that.

Regards
Bernd

On Mon, 1 Aug 2016 12:10:24 -0700, Gary Gregory <garydgreg...@gmail.com> wrote:

> I can see that in
> org.apache.commons.fileupload.FileUploadBase.getBoundary(String)
> we have:
>
>     boundary = boundaryStr.getBytes("ISO-8859-1");
>
> Should that be:
>
>     boundary = boundaryStr.getBytes(headerEncoding);
>
> ?
>
> Gary
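Bernd's point about the boundary can be checked with plain Java. This is a small sketch under the assumption that the boundary string is pure ASCII; the boundary value and the class name below are made up for illustration. For such a string, getBytes("ISO-8859-1") and getBytes("UTF-8") produce identical bytes, so the choice of charset in that one call would make no difference. The two encodings diverge only once a non-ASCII character is involved.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; it does not call Commons FileUpload.
final class BoundaryEncodingDemo {

    // True when the string encodes to the same bytes in both charsets.
    // This holds for every string that contains only ASCII characters.
    static boolean sameBytesInBothCharsets(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(latin1, utf8);
    }

    public static void main(String[] args) {
        // Made-up ASCII boundary value, for illustration only.
        String boundary = "----ExampleFormBoundaryABC123";
        System.out.println(sameBytesInBothCharsets(boundary)); // true
        System.out.println(sameBytesInBothCharsets("\u00e1")); // false
    }
}
```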
Re: [fileupload] UTF-8 encoding issue
I can see that in org.apache.commons.fileupload.FileUploadBase.getBoundary(String) we have:

    boundary = boundaryStr.getBytes("ISO-8859-1");

Should that be:

    boundary = boundaryStr.getBytes(headerEncoding);

?

Gary

On Mon, Aug 1, 2016 at 11:01 AM, Campbell, Lance <la...@illinois.edu> wrote:

> There is still an issue.
> I had a typo in my email. There should not have been the line
> resp.setContentType("UTF-8");
>
> Sorry.

--
E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
RE: [fileupload] UTF-8 encoding issue
There is still an issue. I had a typo in my email. There should not have been the line

    resp.setContentType("UTF-8");

Sorry.

-----Original Message-----
From: Campbell, Lance [mailto:la...@illinois.edu]
Sent: Monday, August 01, 2016 12:51 PM
To: 'user@commons.apache.org' <user@commons.apache.org>
Subject: [fileupload] UTF-8 encoding issue

[...]

I have read online to do the following:

    req.setCharacterEncoding("UTF-8");
    resp.setContentType("UTF-8");
    resp.setCharacterEncoding("UTF-8");

[...]
[fileupload] UTF-8 encoding issue
Commons File Upload 1.3.2

I am using Commons FileUpload version 1.3.2 via servlets on Apache Tomcat 8. All of my servlets work with UTF-8 except when I am using the Commons FileUpload library, which seems to be setting the encoding to "ISO-8859-1". I have set both the request and response headers to UTF-8. I have also set the Java VM to use UTF-8.

How can I get around this issue?

I have read online to do the following:

    req.setCharacterEncoding("UTF-8");
    resp.setContentType("UTF-8");
    resp.setCharacterEncoding("UTF-8");

I have also read to set this on the form:

    accept-charset="UTF-8"

It seems like your code is hardcoding the encoding to "ISO-8859-1" in the class FileUploadBase.

Why not allow us to set the encoding through a method, and use "ISO-8859-1" only as a fallback?

Thanks,

Lance Campbell
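The fallback behaviour proposed at the end of this message could look roughly like the sketch below. It is not the library's implementation: FieldDecoder and decode are hypothetical names, and the sketch assumes the caller already has the raw bytes of a form field plus an optional charset name (in a servlet, that name might come from request.getCharacterEncoding()). When no charset name is given, it falls back to ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Commons FileUpload.
final class FieldDecoder {

    // Used only when the caller does not supply an encoding.
    static final Charset FALLBACK = StandardCharsets.ISO_8859_1;

    // Decodes raw field bytes with the given charset name, or with the
    // fallback when the name is null or empty. Charset.forName throws an
    // unchecked exception if the name is illegal or unsupported.
    static String decode(byte[] raw, String charsetName) {
        Charset charset = (charsetName == null || charsetName.isEmpty())
                ? FALLBACK
                : Charset.forName(charsetName);
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        byte[] raw = "\u00e1".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(raw, "UTF-8").equals("\u00e1")); // true
        System.out.println(decode(raw, null).length());            // 2
    }
}
```

Keeping ISO-8859-1 as the fallback leaves the behaviour unchanged for callers that never set an encoding, which is the compatibility argument made in the message.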