One problem I have with the patches (and I sent a
reply to Pilho Kim the first time he posted the
patches) is the impact on performance.

The least we can do is to detect whether the encoding
is compatible with ASCII (or Unicode) and call
getBytes() only if it isn't. The method has a big
cost on all VMs - try it if you don't believe me.
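
A minimal sketch of that idea in Java, assuming a modern JDK with
java.nio.charset. The class AsciiFastPath and its methods are
hypothetical names used for illustration, not Tomcat code. It probes
each charset once to see whether it encodes the 128 ASCII characters
as themselves, and falls back to getBytes() only when the charset is
not compatible or the string contains non-ASCII characters. Whether
this is actually faster depends on the VM and should be measured.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration; not part of Tomcat.
final class AsciiFastPath {
    // Cache of probe results so each charset is probed only once.
    private static final Map<Charset, Boolean> COMPATIBLE = new ConcurrentHashMap<>();

    // True if the charset encodes every ASCII char as the same single byte.
    static boolean isAsciiCompatible(Charset cs) {
        return COMPATIBLE.computeIfAbsent(cs, AsciiFastPath::probe);
    }

    private static boolean probe(Charset cs) {
        if (!cs.canEncode()) {
            return false;
        }
        char[] ascii = new char[128];
        for (int i = 0; i < ascii.length; i++) {
            ascii[i] = (char) i;
        }
        byte[] encoded = new String(ascii).getBytes(cs);
        if (encoded.length != ascii.length) {
            return false; // e.g. UTF-16 uses two bytes per char
        }
        for (int i = 0; i < encoded.length; i++) {
            if (encoded[i] != (byte) i) {
                return false;
            }
        }
        return true;
    }

    // Converts s to bytes, using getBytes() only when the fast path does not apply.
    static byte[] toBytes(String s, Charset cs) {
        if (isAsciiCompatible(cs)) {
            int n = s.length();
            byte[] out = new byte[n];
            int i = 0;
            while (i < n) {
                char c = s.charAt(i);
                if (c > 0x7F) {
                    break; // non-ASCII char: give up on the fast path
                }
                out[i++] = (byte) c;
            }
            if (i == n) {
                return out;
            }
        }
        return s.getBytes(cs);
    }
}
```

With this shape, a call such as
AsciiFastPath.toBytes("Content-Type", StandardCharsets.UTF_8) never
reaches getBytes(), while a string containing Korean text still goes
through the full encoder.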

In tomcat3.3, OutputBuffer tries to solve exactly
that problem (char->byte) and seems to work well,
but we need more work to convert header values, etc.
(i.e. for 3.3 the patch shouldn't be needed; the
problem is supposed to be solved by OutputBuffer).

This is a very delicate subject from the point of view
of performance, and I spent a lot of time tuning
tomcat, so I would like to review any patch on encoding
before it is committed.

Costin


--- Kazuhiro Kazama <[EMAIL PROTECTED]> wrote:
> From: Pilho Kim <[EMAIL PROTECTED]>
> Subject: Re: My patches for Tomcat 3.2 wrt multibyte characters
> Date: Thu, 7 Dec 2000 11:30:42 +0900 (KST)
> Message-ID: <[EMAIL PROTECTED]>
> > This is your big fault.
> > You should know the real meaning of default in API.
> > What is the default encoding in JVM?
> 
> The default character encoding in the JVM is set by the
> "file.encoding" property.
> 
> You may aim to serve Korean content in a Korean locale or
> language environment. In short, only one language!
> 
> But it is common to serve content in multiple languages (e.g.
> English, Japanese, Chinese, etc.) from one tomcat. A
> well-internationalized servlet container will be able to serve
> them in that situation by specifying the charset (and language)
> of each!
> 
> In conclusion, the following patch seems good!
> 
> src/share/org/apache/tomcat/core/BufferedServletOutputStream.java
> 
> 
> But the following patches aren't appropriate or need
> discussion.
> 
> 1)
> src/share/org/apache/tomcat/adapter/HttpAdapter.java
> src/share/org/apache/tomcat/service/http/HttpRequestAdapter.java
> 
> (I am not commenting on the cache-control patch for the same
> files.)
> 
> These patches convert to the native encoding with getBytes()
> (= getBytes(CharToByteConverter.getDefault())). But the HTTP
> specification uses quoted-string for header values (though I
> don't know whether that part of the spec is reasonable).
> 
> 2)
> src/share/org/apache/tomcat/core/Constants.java
> src/share/org/apache/tomcat/context/DefaultCMSetter.java
> 
> DefaultCMSetter must resolve an encoding problem
> internally.
> 
> 3)
> org/apache/jasper/compiler/Compiler.java
> 
> JSP's default charset is ISO-8859-1. Another patch will be
> welcome (see the following comments).
> 
>       // This seems to be a reasonable point to scan the JSP file
>       // for a 'contentType' directive. If it found then the set
>       // the value of 'jspEncoding to reflect the value specified.
>       // Note: if (true) is convenience programming. It can be
>       // taken out once we have a more efficient method.
> 
> And my colleague (Mr. KAREZAKI) is checking the rest of your
> patches. He will post his comments to tomcat-dev.
> 
> Kazuhiro Kazama ([EMAIL PROTECTED])
> NTT Network Innovation Laboratories

