How to recreate the issue:
To recreate the problem, put an HTML text input into a form. When filling out 
the form in a browser on a Windows machine, type ALT-0225 (which enters "á"). 
Then submit the form. The value will not be retrieved correctly. The only 
workaround I could come up with was to do this:

if (fileItem.isFormField())
{
  String value = fileItem.getString("UTF-8");
}

But unfortunately this assumes that the data is always UTF-8. It seems to me 
that the file upload library should account for the encoding in the FileItem 
method ".getString()". But since the encoding there is always "ISO-8859-1", 
there is no way to work around this issue other than passing the encoding 
explicitly as above. There are many questions on Stack Overflow about this 
issue, so it seems to be a common problem for people. UTF-8 encoding is very 
common these days, especially at universities.
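
To illustrate the behaviour I am describing and the fallback I am asking for, 
here is a minimal sketch using only the JDK. FieldDecoder, resolve and decode 
are hypothetical names for illustration, not part of Commons FileUpload, and 
the sketch assumes the submitted bytes are available as a byte array:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Commons FileUpload.
final class FieldDecoder {
    // Fallback matching the "ISO-8859-1" default discussed in this thread.
    static final Charset FALLBACK = StandardCharsets.ISO_8859_1;

    // Use the declared encoding when it is present and supported,
    // otherwise fall back to ISO-8859-1.
    static Charset resolve(String declaredEncoding) {
        if (declaredEncoding != null && !declaredEncoding.trim().isEmpty()) {
            try {
                return Charset.forName(declaredEncoding.trim());
            } catch (IllegalArgumentException e) {
                // Unknown or malformed charset name: use the fallback below.
            }
        }
        return FALLBACK;
    }

    // Decode the raw bytes of a form field with the resolved charset.
    static String decode(byte[] raw, String declaredEncoding) {
        return new String(raw, resolve(declaredEncoding));
    }

    public static void main(String[] args) {
        // "á" (U+00E1) as a browser would send it in a UTF-8 form: 0xC3 0xA1.
        byte[] wire = "\u00e1".getBytes(StandardCharsets.UTF_8);

        // No declared encoding: each byte becomes its own character
        // (U+00C3 followed by U+00A1), which is the garbled value.
        String garbled = decode(wire, null);

        // Declared UTF-8: the two bytes decode to the single character "á".
        String correct = decode(wire, "UTF-8");

        System.out.println(garbled.length() + " chars vs " + correct.length() + " char");
    }
}
```

In a servlet, something like FieldDecoder.decode(fileItem.get(), 
req.getCharacterEncoding()) would presumably play the same role as the 
hardcoded "UTF-8" above, assuming the request encoding has been set.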

Thanks,

Lance

-----Original Message-----
From: Bernd Eckenfels [mailto:e...@zusammenkunft.net] 
Sent: Monday, August 01, 2016 2:18 PM
To: Commons Users List <user@commons.apache.org>
Subject: Re: [fileupload] UTF-8 encoding issue

Hello,

HTTP headers are essentially ASCII, especially for things like the boundary. 
The arguments (filename) might be quoted-printable (QP), but not all browsers 
like that.
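
A small JDK-only sketch of why the charset used for the boundary should not 
matter as long as the boundary is pure ASCII. BoundaryBytes and sameBytes are 
hypothetical names, and the boundary string is a made-up example value:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not taken from Commons FileUpload.
final class BoundaryBytes {
    // True when the string encodes to identical bytes in ISO-8859-1 and UTF-8.
    static boolean sameBytes(String s) {
        return Arrays.equals(
                s.getBytes(StandardCharsets.ISO_8859_1),
                s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Made-up ASCII boundary value, for illustration only.
        String boundary = "----ExampleFormBoundary7MA4YWxk";

        // ASCII characters map to the same single byte in both charsets.
        System.out.println(sameBytes(boundary));

        // A non-ASCII character such as U+00E1 does not: one byte in
        // ISO-8859-1 (0xE1) versus two bytes in UTF-8 (0xC3 0xA1).
        System.out.println(sameBytes("\u00e1"));
    }
}
```

So the choice of charset only starts to matter for non-ASCII header content, 
such as a filename.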

Regards
Bernd

On Mon, 1 Aug 2016 12:10:24 -0700,
Gary Gregory <garydgreg...@gmail.com> wrote:

> I can see that
> in org.apache.commons.fileupload.FileUploadBase.getBoundary(String)
> we have:
> 
>   boundary = boundaryStr.getBytes("ISO-8859-1");
> 
> Should that be:
> 
>   boundary = boundaryStr.getBytes(headerEncoding);
> 
> ?
> 
> Gary
> 
> On Mon, Aug 1, 2016 at 11:01 AM, Campbell, Lance <la...@illinois.edu>
> wrote:
> 
> > There is still an issue.
> > I had a typo in my email.  There should not have been the line 
> > resp.setContentType("UTF-8");
> >
> > Sorry.
> >
> >
> >
> > -----Original Message-----
> > From: Campbell, Lance [mailto:la...@illinois.edu]
> > Sent: Monday, August 01, 2016 12:51 PM
> > To: 'user@commons.apache.org' <user@commons.apache.org>
> > Subject: [fileupload] UTF-8 encoding issue
> >
> > Commons File Upload 1.3.2
> >
> > I am using Commons FileUpload version 1.3.2 in servlets on 
> > Apache Tomcat 8.  All of my servlets work with UTF-8 except when 
> > I am using the Commons FileUpload library.  It seems to be setting 
> > the encoding to "ISO-8859-1".  I have set both the request and 
> > response headers to UTF-8.  I have also set the Java VM to use 
> > UTF-8.
> >
> > How can I get around this issue?
> >
> > I have read online to do the following:
> > req.setCharacterEncoding("UTF-8");
> > resp.setContentType("UTF-8");
> > resp.setCharacterEncoding("UTF-8");
> >
> > I have also read to set this at the form:
> >
> > accept-charset="UTF-8"
> >
> > It seems like your code is hardcoding the encoding to "ISO-8859-1"
> > in the class FileUploadBase.
> >
> > Why not allow us to set the encoding in a method and then use 
> > "ISO-8859-1" as a fallback?
> >
> > Thanks,
> >
> > Lance Campbell
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
> > For additional commands, e-mail: user-h...@commons.apache.org

