Joe,

Thought the following bugs might be useful input to you...
There is an alternative solution suggested in Bug 23255, which was to use a filter to set the character encoding if it has not been set by the browser:

http://issues.apache.org/bugzilla/show_bug.cgi?id=23255

Bugs 29824 and 29668 discuss request encoding in general, not just for multipart requests, so maybe a wider solution is required:

http://issues.apache.org/bugzilla/show_bug.cgi?id=29824
http://issues.apache.org/bugzilla/show_bug.cgi?id=29668

Also, does the new "acceptCharset" attribute on <html:form> help at all with this problem?

Niall

----- Original Message -----
From: "Joe Germuska" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Saturday, August 28, 2004 11:19 PM
Subject: Changing how CommonsMultipartRequestHandler handles text parameters?

> Hi, there:
>
> I've recently had a request to add a file upload field to a form.
> It seems as though when the form's "enctype" value is changed to
> "multipart/form-data", then string values from non-file fields are no
> longer being processed with the correct character encoding.
>
> Looking more closely, the problem seems to land in
> CommonsMultipartRequestHandler, where in addTextParameter(...), it
> assumes a default encoding of "ISO-8859-1" if
> request.getCharacterEncoding() returns null.
>
> Now, for years, I've been wishing that request.getCharacterEncoding()
> would return correct values, but it never seems to return anything
> but null. If I'm doing something wrong, please let me know. (I'm
> using Tomcat 4.1.x for my local development and testing, but the
> problem is also manifesting in a QA deployment environment using
> JBoss 3.2.5/Tomcat 5.x.)
>
> By changing the assumed encoding from ISO-8859-1 to UTF-8 (in
> CommonsMultipartRequestHandler), I solve the problem. Obviously,
> that's not a good permanent solution. I have a plan, but I thought
> I'd air it and see if anyone has an opinion before I do it.
>
> The plan is to add a configuration property to ControllerConfig and
> an equivalent property to the ModuleConfig interface and the
> ModuleConfigImpl class. Then, CommonsMultipartRequestHandler would
> consult the ModuleConfig to see if a character encoding had been
> specified, and if so, it would use it rather than ISO-8859-1.
>
> I was ready to just do this, but a few things gave me pause enough to
> run it by the dev-list.
>
> 1) I don't love changing the ModuleConfig interface. However, odds
> seem slim that there are implementations of the interface which don't
> extend ModuleConfigImpl, so the impact will probably be low.
> 2) I'm not sure what to name the property. At first, I was going to
> call it "parameterValueEncoding" but then I was concerned that people
> would believe that it would affect the handling of non-multipart
> request parameters. Perhaps "multipartParameterEncoding," but would
> people then think that it had any impact on the file part?
>
> Now that I've written this out, I'm inclined to add a String
> property, "multipartParameterEncoding", to ControllerConfig and
> ModuleConfig and use that as the way to externally control it. But
> now that I've written this out, I might as well send it to the list
> and see if there's any feedback before I do it.
>
> Joe
>
> PS a slightly more radical change would be to add a
> "MultipartHandlerConfig" as a child of the ControllerConfig and
> centralize the multipart config values (now four, I think, counting
> the implementation class) but that would be more work, maybe not for
> much payoff...
>
> --
> Joe Germuska
> [EMAIL PROTECTED]
> http://blog.germuska.com
> "In fact, when I die, if I don't hear 'A Love Supreme,' I'll turn
> back; I'll know I'm in the wrong place."
> - Carlos Santana
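
For illustration, the fallback discussed in the quoted message can be sketched in plain Java, without the Servlet or Struts APIs. This is only a rough sketch under stated assumptions, not the actual CommonsMultipartRequestHandler code: the class MultipartEncodingSketch, the methods resolveEncoding and decodeTextParameter, and the configuredEncoding argument (standing in for the proposed "multipartParameterEncoding" property) are all hypothetical names. It shows why UTF-8 bytes come out garbled under the ISO-8859-1 default, and how a configured encoding would change the result when the request reports none.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; none of these names exist in Struts.
class MultipartEncodingSketch {

    // The default the message says is assumed when the request reports no encoding.
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // requestEncoding stands in for the value of request.getCharacterEncoding();
    // configuredEncoding stands in for the proposed configuration property.
    static String resolveEncoding(String requestEncoding, String configuredEncoding) {
        if (requestEncoding != null) {
            return requestEncoding;      // the request declared an encoding
        }
        if (configuredEncoding != null) {
            return configuredEncoding;   // fall back to the configured value
        }
        return DEFAULT_ENCODING;         // last resort: the current default
    }

    // Decode the raw bytes of a text part using the resolved encoding.
    static String decodeTextParameter(byte[] raw, String requestEncoding,
                                      String configuredEncoding) {
        String encoding = resolveEncoding(requestEncoding, configuredEncoding);
        return new String(raw, Charset.forName(encoding));
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent, as a browser would send it in UTF-8.
        byte[] raw = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // No request encoding, nothing configured: each UTF-8 byte becomes one char.
        System.out.println(decodeTextParameter(raw, null, null).length());    // prints 5

        // No request encoding, UTF-8 configured: the original four characters.
        System.out.println(decodeTextParameter(raw, null, "UTF-8").length()); // prints 4
    }
}
```

The sketch lets a declared request encoding win over the configured value. The filter approach from Bug 23255 would instead set the request encoding up front, so that the first branch is taken.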
