utf-8 with tomcat 5: second round

2004-07-04 Thread Asher Tarnopolski
hi folks, 
i've published a question about it a couple of days ago, but didn't get any responses.
i've tried some things i found in bugzilla, but they didn't help. so, i wanna try to 
get your help once more.
once more about my problem: 
i try to send utf-8 encoded parameters in the POST body, but they arrive encoded in ISO...
this worked perfectly with tomcat 4.0.x.
from the info i got from a developer at bugzilla i learned that the difference
between tc4.0 and tc5 that causes the change is actually in the coyote http1.1
connector. there is an attribute called useBodyEncodingForURI which was set to
"true" in tc4, but became "false" in tc5.
setting it to "true" together with <%@ page pageEncoding="UTF-8" %> and
<%request.setCharacterEncoding("UTF-8");%> is supposed to make the difference.
i made the change, the jsp tags are in the code and coyote settings look like this now:






but this doesn't help! another request to bugzilla didn't help either: i was told that
this is not a bug in tomcat, so they are not going to deal with the question. well, maybe
it's not a tomcat bug, but it must be some kind of bug.
any ideas?

my testing code comes here:



<%@ page contentType="text/html; charset=utf-8"%>
<%@ page pageEncoding="utf-8"%>




 





 
<%
request.setCharacterEncoding("UTF-8");

if(request.getParameter("source")!=null)
{ 
  out.println(request.getParameter("source").length()+"");
 
  out.println(request.getParameter("source"));
 
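  StringBuffer sb = new StringBuffer();
  // escape each '&' so that character references show up literally in the browser
  for(int i=0; i<request.getParameter("source").length(); i++)
  {
    if(request.getParameter("source").charAt(i) == '&')
      sb.append("&amp;");
    else
      sb.append(request.getParameter("source").charAt(i));
  }
  out.println(""+ sb.toString());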
}
%>
 







RE: utf-8 with tomcat 5: second round

2004-07-04 Thread Mark Thomas
Asher,

A few questions...

What do you put in the text box on the form and what output do you see?

Are you really using "" or do you mean
?

When I did my test I copied your UTF-8 character from the bugzilla report and
pasted it into the text box. I was seeing question marks in the output until I
added the <%@ page pageEncoding="UTF-8"%> directive. The test was on XP (as per the bug
report) and I assume you used IE as the browser.

The URI encoding is a red herring in this case. Because you are using post it is
only the request encoding that matters.
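
As a rough illustration of why the request encoding is what matters for a POST body, here is a minimal standalone sketch (not Tomcat code). It assumes the browser submitted the Hebrew letter with code point 1488 (U+05D0) as its two percent-encoded UTF-8 bytes; the class name and the sample value are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class PostBodyDecodingDemo {
    // Decode one percent-encoded form value using the given request encoding.
    static String decodeValue(String rawValue, String requestEncoding) {
        try {
            return URLDecoder.decode(rawValue, requestEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed raw value: U+05D0 (decimal 1488) sent as the UTF-8 bytes D7 90.
        String raw = "%D7%90";

        String asUtf8 = decodeValue(raw, "UTF-8");
        String asLatin1 = decodeValue(raw, "ISO-8859-1");

        System.out.println(asUtf8.length());        // 1
        System.out.println((int) asUtf8.charAt(0)); // 1488
        System.out.println(asLatin1.length());      // 2
    }
}
```

The same bytes give one character when read as UTF-8 and two unrelated characters when read as ISO-8859-1, so checking the length of the parameter is a quick way to tell which decoding was applied.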

The full text of my test JSP is below.

Mark

<%@ page language="java" import="java.lang.*,java.util.*" %>
<%@ page pageEncoding="UTF-8" %>


 





 
<%
request.setCharacterEncoding("UTF-8");

if(request.getParameter("source")!=null)
{ 
  out.println(request.getParameter("source").length()+"");
 
  out.println(request.getParameter("source"));
 
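  StringBuffer sb = new StringBuffer();
  // escape each '&' so that character references show up literally in the browser
  for(int i=0; i<request.getParameter("source").length(); i++)
  {
    if(request.getParameter("source").charAt(i) == '&')
      sb.append("&amp;");
    else
      sb.append(request.getParameter("source").charAt(i));
  }
  out.println(""+ sb.toString());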
}
%>
 



 

> -Original Message-----
> From: Asher Tarnopolski [mailto:[EMAIL PROTECTED] 
> Sent: Sunday, July 04, 2004 6:25 PM
> To: [EMAIL PROTECTED]
> Subject: utf-8 with tomcat 5: second round 
> 
> hi folks, 
> i've published a question about it a couple of days ago, but 
> didn't get any responses.
> i've tried some things i found in bugzilla, but they didn't 
> help. so, i wanna try to get your help once more.
> once more about my problem: 
> i try to send utf-8 encoded parameters in POST body, but they 
> arrived encoded in ISO...
> this worked perfectly with tomcat 4.0.x. 
> from the info i've got from a developer at bugzilla i learned 
> that the difference between tc4.0 and tc5  
> that causes the change is actually in coyote http1.1 
> connector. there is an  attribute
> called useBodyEncodingForURI which was set to "true" in tc4, 
> but became "false" in tc5.
> setting it to "true" together with <%@ page 
> pageEncoding="UTF-8" %> and 
> <%request.setCharacterEncoding("UTF-8");%> will make the difference.
> i made the change, the jsp tags are in the code and coyote 
> settings look like this now:
> 
> 
> 
> maxThreads="150" minSpareThreads="25" 
> maxSpareThreads="75"
>enableLookups="false" redirectPort="8443" 
> acceptCount="100"
>debug="0" connectionTimeout="2"
>useBodyEncodingForURI="true"
>disableUploadTimeout="true" />
> 
> 
> but this doesn't help! another request to bugzilla didn't 
> help either, i was told that this is not a bug in tomcat,
> so they are not going to deal with the question. well, may be 
> it's not a tomcat bug, but it should be some kind of bug.
> any ideas?
> 
> my testing code comes here:
> 
> 
> 
> <%@ page contentType="text/html; charset=utf-8"%>
> <%@ page pageEncoding="utf-8"%>
> 
> 
> 
> 
>  
> 
> 
> 
> 
> 
>  
> <%
> request.setCharacterEncoding("UTF-8");
> 
> if(request.getParameter("source")!=null)
> { 
>   out.println(request.getParameter("source").length()+"");
>  
>   out.println(request.getParameter("source"));
>  
>   StringBuffer sb = new StringBuffer();
>   for(int i=0; i<request.getParameter("source").length(); i++)
>   {
> if(request.getParameter("source").charAt(i) == '&')
>   sb.append("&amp;");
> else
>   sb.append(request.getParameter("source").charAt(i));
>  
>   }
>   out.println(""+ sb.toString());
> }
> %>
>  
> 
> 
> 
> 
> 
> 



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: utf-8 with tomcat 5: second round

2004-07-04 Thread Asher Tarnopolski
hey mark, thanks for the response.
i ran the code i pasted below.
for example, i enter one hebrew letter. its unicode
code point is 1488.
on tc 4.0.xx i get the following results:

7 (the length of its utf-8 code)
 א (the letter itself in utf-8 encoding)
 &#1488;(same as above parsed to be visible in browser)

in tc 5 i get this:
1(which already lets me know that this is not really utf-8)
the entered hebrew letter
the entered hebrew letter (nothing is parsed, so the '&' sign wasn't even met)
this is it.




Re: utf-8 with tomcat 5: second round

2004-07-05 Thread M.Hockings
Hi Asher,
It looks like you are using Struts?  If so then setting the encoding in 
the response is too late as the Struts runtime has already set it.

Look into using a filter (that is what I do) for your webapp, I expect 
that should solve your problem.

You can Google about for more on utf-8 and Struts.
http://www.anassina.com/struts/i18n/i18n.html
Good luck
Mike


Re: utf-8 with tomcat 5: second round

2004-07-05 Thread Asher Tarnopolski
sorry, no struts are involved.

Re: utf-8 with tomcat 5: second round

2004-07-05 Thread M.Hockings
Hmm, OK, still try the filter tho as I still expect that setting the 
char encoding where you have it in the .jsp will be too late.  Before 
using the filter (with struts) I was using a controller servlet 
(non-struts) that set the encoding first thing.
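
To illustrate the ordering concern, here is a small self-contained sketch. `ModelRequest` is a hypothetical stand-in, not Tomcat's request class: it only mimics the servlet rule that setCharacterEncoding() has no effect once the parameters have been read. A filter or controller servlet helps simply because it runs before anything else touches the parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Simplified, hypothetical model of a request that parses its parameter lazily.
// It is NOT Tomcat's implementation; it only mimics the ordering rule.
class ModelRequest {
    private final String rawValue;          // percent-encoded value of "source"
    private String encoding = "ISO-8859-1"; // default when nothing is set
    private String parsed;                  // cached after the first read

    ModelRequest(String rawValue) {
        this.rawValue = rawValue;
    }

    void setCharacterEncoding(String enc) {
        // Changes the field, but a value that is already cached is not re-decoded.
        this.encoding = enc;
    }

    String getParameter() {
        if (parsed == null) {
            try {
                parsed = URLDecoder.decode(rawValue, encoding);
            } catch (UnsupportedEncodingException e) {
                throw new IllegalArgumentException(e);
            }
        }
        return parsed;
    }
}

class EncodingOrderDemo {
    public static void main(String[] args) {
        ModelRequest early = new ModelRequest("%D7%90");
        early.setCharacterEncoding("UTF-8");  // set first, as a filter would
        System.out.println(early.getParameter().length()); // 1

        ModelRequest late = new ModelRequest("%D7%90");
        late.getParameter();                  // something reads the parameter first
        late.setCharacterEncoding("UTF-8");   // too late, the value is already decoded
        System.out.println(late.getParameter().length());  // 2
    }
}
```

Whether this is what happens in a given webapp depends on what runs before the JSP, so treat it as a way to reason about the symptom rather than a diagnosis.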

I run UTF-8 through TC4, TC5 with no changes to the TC config at all.
Mike


RE: utf-8 with tomcat 5: second round

2004-07-05 Thread Mark Thomas
This is exactly what should happen. You are working with characters, not bytes,
hence you see a length of 1 for a single UTF-8 character.
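
A short standalone sketch of the difference (class and method names are made up for the example). It also shows that the numeric character reference for code point 1488 is 7 characters long, which is one plausible reading of the "7" reported on 4.0.x: the browser may have been submitting a character reference rather than the letter itself.

```java
import java.nio.charset.StandardCharsets;

class CharsVersusBytesDemo {
    // Number of Java chars in the string (what String.length() reports).
    static int charCount(String s) {
        return s.length();
    }

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String letter = "\u05D0";     // the Hebrew letter, code point 1488
        String reference = "&#1488;"; // numeric character reference for it

        System.out.println(charCount(letter));     // 1
        System.out.println(utf8ByteCount(letter)); // 2
        System.out.println(charCount(reference));  // 7
    }
}
```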

Mark

> -Original Message-
> From: Asher Tarnopolski [mailto:[EMAIL PROTECTED] 
> Sent: Sunday, July 04, 2004 11:18 PM
> To: Tomcat Users List
> Subject: Re: utf-8 with tomcat 5: second round 
> 
> hey mark, thanks for response.
> i run the code i pasted below.
> for example, i enter one hebrew letter. it's utf
> code is 1488.
> on tc 4.0.xx i get the following results:
> 
> 7 (the length of its utf-8 code)
>  א (the letter itself in utf-8 encoding)
>  &#1488;(same as above parsed to be visible in browser)
> 
> in tc 5 i get this:
> 1(which already lets me know that this is not really utf-8)
> the entered hebrew letter
> the entered hebrew letter (nothing is parsed, so '&' signed 
> wasn't even met)
> this is it.
> 
> > > -Original Message-
> > > From: Asher Tarnopolski [mailto:[EMAIL PROTECTED]
> > > Sent: Sunday, July 04, 2004 6:25 PM
> > > To: [EMAIL PROTECTED]
> > > Subject: utf-8 with tomcat 5: second round
> > >
> > > hi folks,
> > > i've published a question about it a couple of days ago, but
> > > didn't get any responses.
> > > i've tried some things i found in bugzilla, but they didn't
> > > help. so, i wanna try to get your help once more.
> > > once more about my problem:
> > > i try to send utf-8 encoded parameters in POST body, but they
> > > arrived encoded in ISO...
> > > this worked perfectly with tomcat 4.0.x.
> > > from the info i've got from a developer at bugzilla i learned
> > > that the difference between tc4.0 and tc5
> > > that causes the change is actually in coyote http1.1
> > > connector. there is an  attribute
> > > called useBodyEncodingForURI which was set to "true" in tc4,
> > > but became "false" in tc5.
> > > setting it to "true" together with <%@ page
> > > pageEncoding="UTF-8" %> and
> > > <%request.setCharacterEncoding("UTF-8");%> will make the 
> difference.
> > > i made the change, the jsp tags are in the code and coyote
> > > settings look like this now:
> > >
> > > 
> > > 
> > >  > >maxThreads="150" minSpareThreads="25"
> > > maxSpareThreads="75"