Unfortunately that doesn't work for what I am using it for. I am stepping through the code to work out what it is doing, but haven't sorted it out as yet.
This is the example string I am using: &#12451;&#12475;&#12481;&#12434;&#12484; (which renders as ィセチをツ). This is the result of a request.getParameter(name), with each section bounded by '&' and ';'. The convertURL method actually converts this to "451475481434484"; this is after I edited the different bounding characters. Before I edited it, it just ended up with the original string (&#12451;&#12475;&#12481;&#12434;&#12484;).

I know that the integer (12451) within &#12451; represents the correct character in Java, because if I get the Integer.intValue() value of the string (12451) and cast it to a char, I get the correct character.

I then string the request together, e.g. http://bla.bla.bla/blah.do?character=????? Then I do a response.sendRedirect(string) and use the above string as the redirect parameter. This then goes through the normal Tomcat chain.doFilter function, and somewhere along the way it must hit a snag, because it converts the string to "?????"

-----Original Message-----
From: Evan Child [mailto:[EMAIL PROTECTED]]
Sent: Friday, 12 April 2002 10:28 AM
To: 'Tomcat Users List'
Subject: RE: Foreing Character encoding from jsp form (Character Encoding doesn't work)

I found it somewhere on the Internet, which I cannot now remember, but here it is. We've been using it for the past couple of months, and it appears to work well.

Good luck,
Evan

    public static String convertURLEncodedUTF8Str(String s) {
        if (s == null) {
            return "";
        }
        StringBuffer sbuf = new StringBuffer();
        int l = s.length();
        int ch = -1;
        int b, sumb = 0;
        for (int i = 0, more = -1; i < l; i++) {
            /* Get next byte b from URL segment s */
            switch (ch = s.charAt(i)) {
            case '%':
                ch = s.charAt(++i);
                int hb = (Character.isDigit((char) ch)
                        ? ch - '0'
                        : 10 + Character.toLowerCase((char) ch) - 'a') & 0xF;
                ch = s.charAt(++i);
                int lb = (Character.isDigit((char) ch)
                        ? ch - '0'
                        : 10 + Character.toLowerCase((char) ch) - 'a') & 0xF;
                b = (hb << 4) | lb;
                break;
            case '+':
                b = ' ';
                break;
            default:
                b = ch;
            }
            /* Decode byte b as UTF-8, sumb collects incomplete chars */
            if ((b & 0xc0) == 0x80) {            // 10xxxxxx (continuation byte)
                sumb = (sumb << 6) | (b & 0x3f); // Add 6 bits to sumb
                if (--more == 0)
                    sbuf.append((char) sumb);    // Add char to sbuf
            } else if ((b & 0x80) == 0x00) {     // 0xxxxxxx (yields 7 bits)
                sbuf.append((char) b);           // Store in sbuf
            } else if ((b & 0xe0) == 0xc0) {     // 110xxxxx (yields 5 bits)
                sumb = b & 0x1f;
                more = 1;                        // Expect 1 more byte
            } else if ((b & 0xf0) == 0xe0) {     // 1110xxxx (yields 4 bits)
                sumb = b & 0x0f;
                more = 2;                        // Expect 2 more bytes
            } else if ((b & 0xf8) == 0xf0) {     // 11110xxx (yields 3 bits)
                sumb = b & 0x07;
                more = 3;                        // Expect 3 more bytes
            } else if ((b & 0xfc) == 0xf8) {     // 111110xx (yields 2 bits)
                sumb = b & 0x03;
                more = 4;                        // Expect 4 more bytes
            } else /*if ((b & 0xfe) == 0xfc)*/ { // 1111110x (yields 1 bit)
                sumb = b & 0x01;
                more = 5;                        // Expect 5 more bytes
            }
            /* We don't test if the UTF-8 encoding is well-formed */
        }
        return sbuf.toString();
    }

-----Original Message-----
From: Lee Chin Khiong [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 11, 2002 6:23 PM
To: 'Tomcat Users List'
Subject: RE: Foreing Character encoding from jsp form (Character Encoding doesn't work)

Yes, can I have it too. Thanks.

-----Original Message-----
From: Evan Child [mailto:[EMAIL PROTECTED]]
Sent: Friday, April 12, 2002 8:21 AM
To: 'Tomcat Users List'
Subject: RE: Foreing Character encoding from jsp form (Character Encoding doesn't work)

What browser are you using to submit the form?

Before you start getting parameters, you need to do a request.setCharacterEncoding("UTF-8"); I couldn't understand from below if you're already doing that. That assumes you ultimately want the characters to end up in a UTF-8 encoding.
If the browser URL-encodes the parameters (for example, if this is an HTTP GET request), you'll need to get a decoder to decode that and convert it into regular UTF-8. I have a decoder in Java, if you want it.

Thanks,
Evan

-----Original Message-----
From: Steve Vanspall [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 11, 2002 6:27 PM
To: Tomcat Users List
Subject: Foreing Character encoding from jsp form (Character Encoding doesn't work)

Hi there,

I am having a problem with reading foreign characters from a form. I am trying to make it so that Chinese characters can be entered. When I enter them, they are received by the request in the form &#23495;&#34054; etc...

I have set the character encoding and filter for UTF-8 in web.xml. I know that it goes through the filter, but the output is the same, presumably because it reads each character in as '&' '#' '2' '3' '4' '9' '5' ';'. Seeing these characters as regular ASCII characters, it doesn't try to change them.

All my pages are set to UTF-8 character encoding.

I have altered the filter code myself to intercept the filter and recursively replace these codes, basically converting the integers one by one into chars. Two problems arise from this.

1. When I then string them together using a StringBuffer/String, I get a string of '??????'. This is also how it is entered into the database (which is set to UTF-8 encoding also).
2. Surely there is a better way to do this.
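
The entity-by-entity conversion described above could be sketched roughly as follows. This is only an illustration, not the filter code from this thread: the class name NumericEntityDecoder and the method name decode are made up for the example. It assumes the parameter arrives as decimal references of the form &#NNNNN;, and that the JVM provides java.util.regex and Character.toChars (Java 5 or later).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for this example only.
final class NumericEntityDecoder {

    // Matches decimal numeric character references such as &#23495;
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d{1,7});");

    /** Replaces each decimal reference with the character it names; all other text is kept. */
    static String decode(String s) {
        if (s == null) {
            return "";
        }
        Matcher m = DECIMAL_REF.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            // References outside the Unicode range are left exactly as they were.
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decode("&#12451;&#12475;&#12481;&#12434;&#12484;");
        // Print the UTF-16 code units in hex so the result does not depend
        // on the console's character encoding.
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < decoded.length(); i++) {
            if (i > 0) {
                hex.append(' ');
            }
            hex.append(Integer.toHexString(decoded.charAt(i)));
        }
        System.out.println(hex); // prints "30a3 30bb 30c1 3092 30c4"
    }
}
```

The example prints code points rather than the decoded text itself, because writing the string to a console or response whose encoding cannot represent those characters may still show '?' even when the String in memory is correct.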
Can anybody help me here,

Thanks in advance
Steve Vanspall

--
To unsubscribe: <mailto:[EMAIL PROTECTED]>
For additional commands: <mailto:[EMAIL PROTECTED]>
Troubles with the list: <mailto:[EMAIL PROTECTED]>