RE: Intro/question possible buglet with Content-Type and Charsets - now more of an RFC

2003-12-19 Thread Greg . Cope
Hello All,

I know it may be considered rude to reply to my own post, but it seems to
have fallen on deaf ears.

In trying to solve the problem I seem to be going round in circles looking
for a fix within the Tomcat source. Can someone point me in the right
direction?

Any clues as to how to proceed would be most welcome.  Should I reopen the
bug, as it would appear that the supplied fix does not work?

Not wishing to put anyone's back up. Thanks,

Greg


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: Intro/question possible buglet with Content-Type and Charsets - now more of an RFC

2003-12-18 Thread Greg . Cope
Hi All,

(We may be barking up the wrong tree here, so if so please point me in the
right direction)

This is still causing us issues - as IE fails to parse a charset when it is
tacked on to Content-Type: application/vnd.ms-excel

It would appear that the charset is being tacked onto the Content-Type in
the setContentType method of
catalina/src/share/org/apache/catalina/connector/ResponseBase.java whenever
it is not supplied in the Content-Type itself (the method looks for a ';').

The encoding can never be null as it is extracted from the locale in the
setLocale method below it.

I understand this to mean that the charset will always be tacked on
irrespective of the Type.

However:

I cannot find an explicit reference to not defining a charset for binary
Types, but I cannot see why you would want to.

HTTP 1.1 implies that there is a default charset for text Types (makes
sense) (http://www.w3.org/Protocols/rfc2068/rfc2068):

'When no explicit charset parameter is provided by the sender, media
subtypes of the text type are defined to have a default charset value of
ISO-8859-1' 

I take this to mean that it is fair enough to add it to text/* Types.

RFC 1341 (http://www.faqs.org/rfcs/rfc1341.html) states that:

'2.a.  A text Content-Type value, which can be used to represent textual
information in a number of character sets and formatted text
description languages in a standardized manner.'

But there is no mention of charsets for application Types:

'2.c.  An application Content-Type value, which can be used to transmit
application data or binary data, and hence, among other uses, to
implement an electronic mail file transfer service.'

What I would suggest is a small if wrapper that only adds a default charset
when the Content-Type starts with text/

Pseudocode below (not tested):

###
catalina/src/share/org/apache/catalina/connector/ResponseBase.java

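    public void setContentType(String type) {

        if (isCommitted())
            return;

        if (included)
            return; // Ignore any call from an included servlet

        this.contentType = type;
        if (type.indexOf(';') >= 0) {
            // Parameters were supplied: take the charset from them,
            // falling back to the HTTP default if none is present
            encoding = RequestUtil.parseCharacterEncoding(type);
            if (encoding == null)
                encoding = "ISO-8859-1";
        } else {
            // No parameters supplied: only append a default charset
            // for text/* types, never for binary types
            if (encoding != null && type.startsWith("text/"))
                this.contentType = type + ";charset=" + encoding;
        }

    }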
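
For anyone wanting to try the rule outside of Tomcat, here is a minimal
standalone sketch of the same idea. The class name ContentTypeDefaults and the
method withDefaultCharset are made up for illustration and are not part of
Tomcat; the case-insensitive match on "text/" is also an assumption on my
part, as the snippet above does a plain startsWith.

```java
import java.util.Locale;

// Hypothetical helper, not part of Tomcat: shows the proposed rule in isolation.
final class ContentTypeDefaults {

    private ContentTypeDefaults() {
    }

    /**
     * Returns the Content-Type value to send. A default charset is appended
     * only when the caller supplied no parameters (no ';') and the type is
     * a text/* type. Everything else is returned unchanged.
     */
    static String withDefaultCharset(String type, String encoding) {
        if (type == null) {
            return null;
        }
        // A ';' means parameters (possibly a charset) were already supplied.
        if (type.indexOf(';') >= 0) {
            return type;
        }
        // Assumption: media type names are matched case-insensitively.
        boolean isText =
            type.trim().toLowerCase(Locale.ENGLISH).startsWith("text/");
        if (encoding != null && isText) {
            return type + ";charset=" + encoding;
        }
        return type;
    }

    public static void main(String[] args) {
        System.out.println(withDefaultCharset("text/html", "ISO-8859-1"));
        System.out.println(
            withDefaultCharset("application/vnd.ms-excel", "ISO-8859-1"));
        System.out.println(
            withDefaultCharset("text/plain; charset=UTF-8", "ISO-8859-1"));
    }
}
```

With this rule application/vnd.ms-excel goes out untouched, which is the
behaviour IE appears to need, while text/html still gets the locale-derived
charset.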

Regards,

Greg


 -----Original Message-----
 From: Tim Funk [mailto:[EMAIL PROTECTED]
 Sent: 16 December 2003 18:09
 To: Tomcat Developers List
 Subject: Re: Intro/question possible buglet with Content-Type and
 Charsets .
 
 
 Yeah, nagoya.apache.org seems down. Hopefully it will be back soon. The bug
 has good detail of what and how to fix.
 
 -Tim
 
 [EMAIL PROTECTED] wrote:
 
  Thanks Tim,
  
  Having a little trouble getting anything from bugzilla; nagoya.apache.org
  seems to be having a little trouble!
  
  Looking in the archives for this id, I see that someone has a 4.1.29 patch
  and a compiled class, but I cannot see either the email address or the
  content via the archive.
  
  Ho hum
  
  Thanks for the pointer.
  
  Greg
  
  
  
  
 -----Original Message-----
 From: Tim Funk [mailto:[EMAIL PROTECTED]
 Sent: 16 December 2003 12:31
 To: Tomcat Developers List
 Subject: Re: Intro/question possible buglet with Content-Type and
 Charsets.
 
 
 http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24970
 
 [EMAIL PROTECTED] wrote:
 
 Hi All,
 
 Quick intro, and then a question:
 
 We use Tomcat to host Java web applications at our location.  My client
 requires us to follow very strict rules for deploying software, which means
 it can be a documentation-intensive process (evidence gathering, IQPs etc.).
 So we rarely upgrade, as it is quite a lot of work.  Luckily Tomcat is
 excellent and rarely needs upgrading or patching.
 
 Now the question:
 
 Tomcat 4.1.29 seems to insist on adding a charset to the Content-Type, even
 if a Content-Type has been set using response.setContentType or similar
 (without a charset).  Tomcat 5 seems to do something similar, judging from
 http://www.mail-archive.com/[EMAIL PROTECTED]/msg49015.html, but I think it
 fails to check whether the Content-Type is a text one (HTML) and adds it for
 any Content-Type, which would appear not to be right IMHO.
 
 Without wishing to appear rude :-) I need to change this behaviour and
 remove the insertion of the charset for non-text Content-Types, e.g.:
 
 application/vnd.ms-excel
 
 
 