Re: Encode::MIME::=?UTF-8?Q?Header=20my=202=C2=A2?=

Nick Ing-Simmons Mon, 07 Oct 2002 00:49:47 -0700

Dan Kogai <[EMAIL PROTECTED]> writes:
>
>That one I am not sure.  I got mails of the opposite opinions asking 
>for strict RFC 2047 compliance (in Jcode), especially when line folding 
>was concerned.  So I made Encode::MIME::Header RFC 2047 compliant.  But 
>I agree that =20 instead of '_' may be too much.  Nevertheless, =20 is 
>exactly what RFC 2047 recommends;
>
>RFC 2047
>>  As a consequence, unencoded white space
>>    characters (such as SPACE and HTAB) are FORBIDDEN within an
>>    'encoded-word'.


I must re-read the RFC, but I think I am saying "don't encode multiple
ASCII words as one UTF-8 word".

>> For example, the character sequence
>>
>>       =?iso-8859-1?q?this is some text?=
>>
>>    would be parsed as four 'atom's, rather than as a single 'atom' (by
>>    an RFC 822 parser) or 'encoded-word' (by a parser which understands
>>    'encoded-words').  The correct way to encode the string "this is
>>    some text" is to encode the SPACE characters as well, e.g.
>>
>>       =?iso-8859-1?q?this=20is=20some=20text?=

But likewise a traditional RFC 822 Subject line

Subject: This is some text

_is_ 4 words, whereas

Subject: =?iso-8859-1?q?this=20is=20some=20text?=

is one word.
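
To make the word count concrete, here is a minimal sketch in Python (not the Perl module under discussion). It assumes that splitting on white space is a fair stand-in for how an RFC 822 parser sees atoms in an unstructured header, and it uses the standard library's email.header.decode_header to show what the single encoded-word carries.

```python
from email.header import decode_header

plain = "This is some text"
encoded = "=?iso-8859-1?q?this=20is=20some=20text?="

# Approximation: treat white-space-separated tokens as the "words"
# an RFC 822 parser would see in an unstructured header.
plain_words = plain.split()
encoded_words = encoded.split()

# decode_header returns (bytes, charset) pairs for encoded-words.
raw, charset = decode_header(encoded)[0]
decoded_text = raw.decode(charset)

print(len(plain_words), len(encoded_words), decoded_text)
```

The plain form yields four tokens, the encoded form yields one, and that one token decodes back to the whole phrase.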

>>
>>    (3) 8-bit values which correspond to printable ASCII characters
>>        other than "=", "?", and "_" (underscore), MAY be represented
>>        as those characters.  (But see section 5 for restrictions.)  In
>>        particular, SPACE and TAB MUST NOT be represented as themselves
>>        within encoded words.
>
>With this understood,
>
>> Suggestions:
>>  - leave ASCII or even iso-8859-1 sequences as such
>
>Only ASCII printable was allowed so I have to decline this one.  

Printable ASCII would solve most of my issues - my memory of the RFC was
that iso-8859-1 was the "default" - if it is only ASCII then fine.
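
A rough sketch of what "leave printable ASCII alone" could look like, in Python rather than Perl and not taken from Encode::MIME::Header. The function names are hypothetical, and the set of characters left unescaped inside an encoded-word is a deliberately conservative assumption. Only words containing non-ASCII characters are wrapped; everything else passes through untouched.

```python
import re

# Conservative assumption: characters left unescaped inside a Q encoded-word.
SAFE = set(b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!*+-/")

def q_encode(data: bytes) -> str:
    """Q-encode bytes, escaping everything outside SAFE as =XX."""
    return "".join(chr(b) if b in SAFE else "=%02X" % b for b in data)

def encode_non_ascii_words(text: str, charset: str = "utf-8") -> str:
    """Hypothetical helper: wrap only the words that need encoding."""
    out = []
    # The capturing group keeps the white-space separators in the result.
    for token in re.split(r"(\s+)", text):
        if token.isascii():
            out.append(token)  # ASCII words and white space stay as-is
        else:
            out.append("=?%s?Q?%s?=" % (charset, q_encode(token.encode(charset))))
    return "".join(out)

print(encode_non_ascii_words("Header my 2\u00a2"))
```

One caveat: RFC 2047 decoders ignore white space between two adjacent encoded-words, so a real implementation would have to fold the separating space into one of them when two non-ASCII words are neighbours. This sketch does not do that.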

>'MIME-Q' is already implemented that way.  Bottom line is that I do not 
>want to give up RFC 2047 conformance.

Neither do I.

>
>>  - wrap sequences of ch > 0xff in whichever of 'Q' or 'B' is shorter
>>    (do both encodings and throw one away).
>
>I'll consider this one instead.  This one at least does not breach RFC 
>2047.
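
For illustration, "do both encodings and throw one away" might be sketched like this in Python (again not the Perl implementation). The helper names are hypothetical, the safe-character set is a conservative assumption, and the 75-character limit RFC 2047 puts on an encoded-word is not handled here.

```python
import base64

# Conservative assumption: characters left unescaped inside a Q encoded-word.
SAFE = set(b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!*+-/")

def q_encode(data: bytes) -> str:
    """Q-encode bytes, escaping everything outside SAFE (incl. SPACE) as =XX."""
    return "".join(chr(b) if b in SAFE else "=%02X" % b for b in data)

def b_encode(data: bytes) -> str:
    """B-encode bytes (plain base64)."""
    return base64.b64encode(data).decode("ascii")

def shorter_encoded_word(text: str, charset: str = "utf-8") -> str:
    """Hypothetical helper: build both forms, keep the shorter (Q wins ties)."""
    data = text.encode(charset)
    q = "=?%s?Q?%s?=" % (charset, q_encode(data))
    b = "=?%s?B?%s?=" % (charset, b_encode(data))
    return q if len(q) <= len(b) else b

print(shorter_encoded_word("2\u00a2"))
print(shorter_encoded_word("Header my"))
```

Mostly-ASCII text tends to come out shorter in 'Q', while text that is mostly non-ASCII tends to come out shorter in 'B', since 'Q' spends three characters on every escaped byte.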
>
>> Are patches in that direction likely to be accepted or do I build
>> a MIME-Smart on top ?
>
>As I said, Encode::MIME::Header has those restrictions;
>
>* the Encode API
>* RFC 2047
>
>This is very restrictive considering the nature of MIME Header 
>Encoding.  Surprisingly the name space Encode::MIME itself remains 
>empty and maybe we can make use of it....

I probably will - there are a whole slew of Encode-oid issues with the
body part of MIME.


>
>Dan the Encode Maintainer
-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/
