Re: utf-8 with tomcat 5: second round

2004-07-05 Thread M.Hockings
Hi Asher,
It looks like you are using Struts?  If so then setting the encoding in 
the response is too late as the Struts runtime has already set it.

Look into using a filter (that is what I do) for your webapp, I expect 
that should solve your problem.

You can Google about for more on utf-8 and Struts.
http://www.anassina.com/struts/i18n/i18n.html
Good luck
Mike
Asher Tarnopolski wrote:
hey mark, thanks for the response.
i ran the code i pasted below.
for example, i enter one hebrew letter. its utf
code is 1488.
on tc 4.0.xx i get the following results:
7 (the length of its utf-8 code)
 &#1488; (the letter itself in utf-8 encoding)
 &amp;#1488; (same as above, escaped to be visible in the browser)
in tc 5 i get this:
1 (which already lets me know that this is not really utf-8)
the entered hebrew letter
the entered hebrew letter (nothing is escaped, so the '&' sign wasn't even met)
this is it.
- Original Message -
From: Mark Thomas [EMAIL PROTECTED]
To: 'Tomcat Users List' [EMAIL PROTECTED]; 'Asher
Tarnopolski' [EMAIL PROTECTED]
Sent: Sunday, July 04, 2004 8:46 PM
Subject: RE: utf-8 with tomcat 5: second round

Asher,
A few questions...
What do you put in the text box on the form and what output do you see?
Are you really using <form act="/tests/utf.jsp" method="post"> or do you mean
<form action="/tests/utf.jsp" method="post">?
When I did my test I copied your UTF-8 character from the bugzilla report and
pasted it into the text box. I was seeing question marks in the output until I
added the <%@page pageEncoding="UTF-8"%>. The test was on XP (as per the bug
report) and I assume you used IE as the browser.
The URI encoding is a red herring in this case. Because you are using post it is
only the request encoding that matters.
The full text of my test JSP is below.
Mark
<%@ page language="java" import="java.lang.*,java.util.*" %>
<%@ page pageEncoding="UTF-8" %>
<html>
<body>
<form action="bug29900.jsp" method="post">
<input type="text" name="source">
<input type="submit">
</form>
<p>
<%
request.setCharacterEncoding("UTF-8");
if(request.getParameter("source")!=null)
{
  out.println(request.getParameter("source").length()+"<p>");
  out.println(request.getParameter("source"));
  StringBuffer sb = new StringBuffer();
  for(int i=0; i<request.getParameter("source").length(); i++)
  {
    if(request.getParameter("source").charAt(i) == '&')
      sb.append("&amp;");
    else
      sb.append(request.getParameter("source").charAt(i));
  }
  out.println("<p>"+ sb.toString());
}
%>
</body>
</html>

-Original Message-
From: Asher Tarnopolski [mailto:[EMAIL PROTECTED]
Sent: Sunday, July 04, 2004 6:25 PM
To: [EMAIL PROTECTED]
Subject: utf-8 with tomcat 5: second round
hi folks,
i've published a question about it a couple of days ago, but
didn't get any responses.
i've tried some things i found in bugzilla, but they didn't
help. so, i wanna try to get your help once more.
once more about my problem:
i try to send utf-8 encoded parameters in POST body, but they
arrived encoded in ISO...
this worked perfectly with tomcat 4.0.x.
from the info i've got from a developer at bugzilla i learned
that the difference between tc4.0 and tc5
that causes the change is actually in coyote http1.1
connector. there is an  attribute
called useBodyEncodingForURI which was set to true in tc4,
but became false in tc5.
setting it to true together with <%@ page
pageEncoding="UTF-8" %> and
<%request.setCharacterEncoding("UTF-8");%> will make the difference.
i made the change, the jsp tags are in the code and coyote
settings look like this now:
<code>
<!-- Define a non-SSL Coyote HTTP/1.1 Connector on port 8080 -->
<Connector port="8080"
  maxThreads="150" minSpareThreads="25"
  maxSpareThreads="75"
  enableLookups="false" redirectPort="8443"
  acceptCount="100"
  debug="0" connectionTimeout="2"
  useBodyEncodingForURI="true"
  disableUploadTimeout="true" />
</code>
but this doesn't help! another request to bugzilla didn't
help either, i was told that this is not a bug in tomcat,
so they are not going to deal with the question. well, may be
it's not a tomcat bug, but it should be some kind of bug.
any ideas?
my testing code comes here:
<code>
<%@page contentType="text/html; charset=utf-8"%>
<%@page pageEncoding="utf-8"%>
<html>
<head>
</head>
<body>
<form act="/tests/utf.jsp" method="post">
<input type="text" name="source">
<input type="submit">
</form>
<p>
<%
request.setCharacterEncoding("UTF-8");
if(request.getParameter("source")!=null)
{
  out.println(request.getParameter("source").length()+"<p>");
  out.println(request.getParameter("source"));
  StringBuffer sb = new StringBuffer();
  for(int i=0; i<request.getParameter("source").length(); i++)
  {
    if(request.getParameter("source").charAt(i) == '&')
      sb.append("&amp;");
    else
      sb.append(request.getParameter("source").charAt(i));
  }
  out.println("<p>"+ sb.toString());
}
%>
</body>
</html>
</code>

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED

Re: utf-8 with tomcat 5: second round

2004-07-05 Thread Asher Tarnopolski
sorry, no struts are involved.


Re: utf-8 with tomcat 5: second round

2004-07-05 Thread M.Hockings
Hmm, OK, still try the filter tho as I still expect that setting the 
char encoding where you have it in the .jsp will be too late.  Before 
using the filter (with struts) I was using a controller servlet 
(non-struts) that set the encoding first thing.

I run UTF-8 through TC4, TC5 with no changes to the TC config at all.
Mike
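
The "too late" point can be sketched in a few lines. The servlet API documents that request.setCharacterEncoding() has no effect once parameters have already been read, which is why a filter (or a controller servlet that sets the encoding first thing) is the usual place for the call. The sketch below only models that rule: FakeRequest and encodingFilter are hypothetical stand-ins invented for illustration, not servlet classes, and the single-parameter request is a simplifying assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class EncodingOrderDemo {

    // Hypothetical stand-in for a servlet request: the value is decoded lazily,
    // on the first parameter read, with whatever encoding is set at that moment.
    static class FakeRequest {
        private final String rawValue;           // percent-encoded value of the one parameter
        private String encoding = "ISO-8859-1";  // assumed default when no charset is given
        private String decoded;                  // cached after the first read

        FakeRequest(String rawValue) {
            this.rawValue = rawValue;
        }

        void setCharacterEncoding(String enc) {
            // Mirrors the documented servlet rule: ignored once parameters were read.
            if (decoded == null) {
                encoding = enc;
            }
        }

        String getParameter() {
            if (decoded == null) {
                try {
                    decoded = URLDecoder.decode(rawValue, encoding);
                } catch (UnsupportedEncodingException e) {
                    throw new IllegalArgumentException(e);
                }
            }
            return decoded;
        }
    }

    // Hypothetical stand-in for a filter: runs before any other code touches the request.
    static void encodingFilter(FakeRequest request) {
        request.setCharacterEncoding("UTF-8");
    }

    // The encoding is set first (what a filter guarantees), then the parameter is read.
    static int lengthWhenSetFirst(String raw) {
        FakeRequest request = new FakeRequest(raw);
        encodingFilter(request);
        return request.getParameter().length();
    }

    // Something reads the parameter first; the later setCharacterEncoding is ignored.
    static int lengthWhenSetTooLate(String raw) {
        FakeRequest request = new FakeRequest(raw);
        request.getParameter();
        request.setCharacterEncoding("UTF-8");
        return request.getParameter().length();
    }

    public static void main(String[] args) {
        String raw = "%D7%90"; // UTF-8 bytes of the Hebrew letter alef, percent-encoded
        System.out.println(lengthWhenSetFirst(raw));   // 1
        System.out.println(lengthWhenSetTooLate(raw)); // 2
    }
}
```

In a real webapp the same call would live in a class implementing the servlet Filter interface, mapped to the application's URLs in web.xml so that it runs ahead of every JSP and servlet.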

RE: utf-8 with tomcat 5: second round

2004-07-05 Thread Mark Thomas
This is exactly what should happen. You are working with characters not bytes
hence you see 1 UTF-8 character.

Mark
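
The characters-versus-bytes distinction can be checked with plain Java, independent of Tomcat. The sketch below assumes the letter entered was Hebrew alef (code point 1488, U+05D0) and uses StandardCharsets, which needs a newer JDK than the one current when this thread was written. It also shows one plausible reading of the "7" reported on tc 4.0.xx: the string "&#1488;" is seven characters long, which is what arrives when a browser submits a numeric character reference instead of the letter itself.

```java
import java.nio.charset.StandardCharsets;

public class CharsVersusBytes {

    // Number of UTF-16 code units in the string; this is what String.length() reports.
    static int charCount(String s) {
        return s.length();
    }

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String alef = "\u05D0";        // Hebrew letter alef, code point 1488
        String reference = "&#1488;";  // numeric character reference for the same letter

        System.out.println(charCount(alef));       // 1
        System.out.println(utf8ByteCount(alef));   // 2
        System.out.println(charCount(reference));  // 7
    }
}
```

A length of 1 from request.getParameter("source").length() is therefore consistent with the parameter having been decoded correctly as a single character.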


RE: utf-8 with tomcat 5: second round

2004-07-04 Thread Mark Thomas
Asher,

A few questions...

What do you put in the text box on the form and what output do you see?

Are you really using <form act="/tests/utf.jsp" method="post"> or do you mean
<form action="/tests/utf.jsp" method="post">?

When I did my test I copied your UTF-8 character from the bugzilla report and
pasted it into the text box. I was seeing question marks in the output until I
added the <%@page pageEncoding="UTF-8"%>. The test was on XP (as per the bug
report) and I assume you used IE as the browser.

The URI encoding is a red herring in this case. Because you are using post it is
only the request encoding that matters.

The full text of my test JSP is below.

Mark

<%@ page language="java" import="java.lang.*,java.util.*" %>
<%@ page pageEncoding="UTF-8" %>
<html>
<body>

<form action="bug29900.jsp" method="post">
<input type="text" name="source">
<input type="submit">
</form>
<p>

<%
request.setCharacterEncoding("UTF-8");

if(request.getParameter("source")!=null)
{
  out.println(request.getParameter("source").length()+"<p>");

  out.println(request.getParameter("source"));

  StringBuffer sb = new StringBuffer();
  for(int i=0; i<request.getParameter("source").length(); i++)
  {
    if(request.getParameter("source").charAt(i) == '&')
      sb.append("&amp;");
    else
      sb.append(request.getParameter("source").charAt(i));
  }
  out.println("<p>"+ sb.toString());
}
%>

</body>
</html>
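
Why the request encoding is the setting that matters for a POST can be sketched outside the container. A browser sends form fields percent-encoded in the body, and the server has to pick a charset to turn those bytes back into characters. The example assumes the submitted value is Hebrew alef (U+05D0), sent as the UTF-8 bytes D7 90; the class and method names are illustrative only.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class PostBodyDecoding {

    // Decode one percent-encoded form value using the named charset.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "%D7%90"; // UTF-8 bytes of U+05D0, percent-encoded

        String asUtf8 = decode(raw, "UTF-8");
        String asLatin1 = decode(raw, "ISO-8859-1");

        System.out.println(asUtf8.length());         // 1
        System.out.println((int) asUtf8.charAt(0));  // 1488
        System.out.println(asLatin1.length());       // 2
    }
}
```

Decoded as UTF-8 the two bytes become the single character 1488; decoded as ISO-8859-1 they become two unrelated characters, which is the symptom described as parameters "arriving encoded in ISO".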

 




Re: utf-8 with tomcat 5: second round

2004-07-04 Thread Asher Tarnopolski
hey mark, thanks for the response.
i ran the code i pasted below.
for example, i enter one hebrew letter. its utf
code is 1488.
on tc 4.0.xx i get the following results:

7 (the length of its utf-8 code)
 &#1488; (the letter itself in utf-8 encoding)
 &amp;#1488; (same as above, escaped to be visible in the browser)

in tc 5 i get this:
1 (which already lets me know that this is not really utf-8)
the entered hebrew letter
the entered hebrew letter (nothing is escaped, so the '&' sign wasn't even met)
this is it.
