Are you using Tomcat?  If you search the list archives, you'll probably turn
up other postings on this topic.  I can't remember all the ins and outs, but
as I understand it, browsers generally don't state the character encoding in
their requests to Tomcat (ours were POST requests), and Tomcat by default
assumes ISO-8859-1 encoding for the submitted form data.  If your browser
actually sent the data as UTF-8, anything outside plain ASCII comes out as
garbage.  I found a webapp filter class that works with Tomcat to force the
character encoding (or rather Tomcat's perception of the encoding) to UTF-8
for all requests, to match all of our pages and forms.  I can send you our
filter and the web.xml entries separately; if you want them, just e-mail
me.  There may be better solutions now with Tomcat 5, but we have just kept
on using this old filter, and it seems to do the trick.

(Now that I think about it, this filter may be, or may have been, supplied
with the Tomcat source distribution.  Look for the
SetCharacterEncodingFilter class.)
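
To make the mismatch concrete, here is a minimal, self-contained sketch of
what I believe is going on.  It is not Tomcat code: the class and method
names (FormEncodingDemo, encodeAsBrowser, decodeAsServer) are made up for
illustration, and it only uses java.net.URLEncoder and java.net.URLDecoder
to imitate a browser submitting a form value as UTF-8 and a server decoding
it with one charset or another.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; it imitates form encoding and decoding, nothing more.
final class FormEncodingDemo {

    // Percent-encode a value as a browser would for a form on a UTF-8 page.
    static String encodeAsBrowser(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Decode the submitted value using the charset the server assumes.
    static String decodeAsServer(String encoded, String assumedCharset) {
        try {
            return URLDecoder.decode(encoded, assumedCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + assumedCharset, e);
        }
    }

    // Show non-ASCII characters as \\uXXXX so the console charset cannot confuse things.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c < 0x20 || c > 0x7E) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+2019 is the right single quotation mark that &rsquo; stands for.
        String original = "Patient\u2019s Page";
        String sent = encodeAsBrowser(original);

        System.out.println("sent on the wire:      " + sent);
        System.out.println("decoded as UTF-8:      " + escape(decodeAsServer(sent, "UTF-8")));
        // One character has turned into three: this is the garbage described above.
        System.out.println("decoded as ISO-8859-1: " + escape(decodeAsServer(sent, "ISO-8859-1")));
    }
}
```

In a servlet container, the counterpart of passing "UTF-8" to the decoder
is calling request.setCharacterEncoding("UTF-8") before any request
parameter is read, which, as far as I understand it, is essentially all
that a filter of this kind does.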

-Christopher



"Kist, Paul" <[EMAIL PROTECTED]com>
06/23/2004 04:03 PM
Please respond to users

To:       "'[EMAIL PROTECTED]'" <[EMAIL PROTECTED]>
cc:
Subject:  UTF-8 and Encoding Problems




Here's one I haven't seen before:

I have an HTML form with the following input tag:

<input onclick="updateFields(this);" name="select" value="Patient&rsquo;s Page" type="radio">

The &rsquo; is supposed to translate to this form of the single quote:  ’

However, this string is being stored in the database, as well as in the
object bean, with jumbled-up characters.  Somewhere in the submit, the
integrity of this character is being lost.

I added some debug statements to the flowscript that handles the submit,
and this is the value of that radio button when it is selected:

"PatientΓ??s Page"

I also tried changing the &rsquo; to the actual character itself, and to
the hex representation, but the result is the same.  Has anyone seen
anything like this, and would you know what is happening?

-Paul



