from:"Brian Grayson"

Re: Mutt guessing wrong encoding for outgoing PDFs?

2002-09-10 Thread Brian Grayson


On Mon, Sep 09, 2002 at 09:42:21AM -0700, Michael Elkins wrote:
> Brian Grayson wrote:
> > I downloaded 1.4 on Friday just to see, and the same problem
> > occurs.  The fundamental problem is once the CTE code sees a
> > nonzero value of lobin, it goes into quoted, regardless of
> > whether hibin is nonzero.  The following patch does the right
> > thing for my testcase here, but I don't know if there's a good
> > reason why the lobin/quotable check currently ignores whether
> > there are any hibins or not.
> >
> > After a bit of inspection, the file rep.5k has hibins and
> > _no_ lobins, and hence goes properly into 8bit encoding.  But
> > the file rep1k has a lobin (0x0b at offset 0x340, for example),
> > so it short-circuits into quoted-printable.  Try mailing the
> > base64-encoded version of that to yourself, and it should
> > choose quotable, even in 1.4.
>
> Thanks for the extra info.  I looked into this more closely, and I see
> that there are a couple of factors that come into play in this
> situation.  First, I noticed that your PDF attachment was labeled
> improperly as text/plain.  This is not so bad in itself, but that
> piece of code that checks for which transfer encoding to use assumes
> that it really is text, which is a problem.  Since there was no
> extension to the file, Mutt fell back into making a guess as to whether
> or not the file was of type text/plain or application/octet-stream.  Mutt
> guessed text/plain because it saw only a few lobins in the file.
> However, Mutt failed to notice that there were bare CRs in the file when
> choosing the transfer encoding.  The attached patch checks info->binary
> even for the text/plain case.  I just tested this and it correctly chose
> base64 encoding for the file.
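
For readers following the lobin/hibin/bare-CR terminology, here is a minimal sketch of the kind of byte scan being discussed. It is not Mutt's actual code: the struct, the function name scan_buffer, and the exact definitions (lobin = control bytes other than TAB, LF and CR; hibin = bytes with the high bit set; binary = a CR not followed by LF) are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical counters, loosely modeled on the lobin/hibin/binary
 * fields named in this thread; not Mutt's real CONTENT struct. */
struct scan_info {
    size_t lobin;   /* control bytes other than TAB, LF, CR (assumed definition) */
    size_t hibin;   /* bytes with the high bit set */
    int    binary;  /* set when a bare CR (CR not followed by LF) is seen */
};

/* Scan a buffer once and classify each byte. */
static struct scan_info scan_buffer(const unsigned char *buf, size_t len)
{
    struct scan_info info = {0, 0, 0};

    for (size_t i = 0; i < len; i++) {
        unsigned char ch = buf[i];

        if (ch == '\r') {
            /* A CR at the very end, or one not followed by LF, is "bare". */
            if (i + 1 >= len || buf[i + 1] != '\n')
                info.binary = 1;
        } else if (ch & 0x80) {
            info.hibin++;
        } else if ((ch < 32 && ch != '\t' && ch != '\n') || ch == 127) {
            info.lobin++;
        }
    }
    return info;
}
```

Under these assumed definitions, the first bytes of the PDF in this thread ("%PDF-1.2", a bare CR, then "%" and four 8-bit bytes) would set binary and count four hibins with no lobins, which is the situation where a text-oriented encoding choice becomes risky.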

  Argggh!  I found out the fundamental problem.  It's not with
the encoding type -- quoted-printable should be fine even in
the presence of 8-bit characters (right?), except we have an Exchange
server as our mail server.  The Exchange mail server is
apparently un-encoding the quoted-printable attachment, and
then re-encoding it buggily.

  I visually verified this by telnet'ing to the SMTP port, and
cut-and-pasting a MIME mail with a quoted-printable attachment.
If I send that mail to \bgrayson, I get different results than
if the mail goes through our Exchange server.  So it appears to
me that Exchange goes into the mail message and mucks around,
and manages to also corrupt some mail while it's in there.

  For example, I sent (and received from \bgrayson):
%PDF-1.2=0D%=E2=E3=CF=D3=0D=0A317 0 obj=0D =0D/Linearized 1 =0D/O 319 =0D=/H [ 
728 767 ] =0D/L 363450 =0D/E 62838 =0D/N 100 =0D/T 356991 =0D =0Dend=
obj=0D xref=0D317 16 =  

  When I let Exchange touch it, I end up with:
%PDF-1.2=0D%=E2=E3=CF=D3
317 0 obj=0D =0D/Linearized 1 =0D/O 319 =0D/H [ 728 767 ] =0D/L =
363450 =0D/E 62838 =0D/N 100 =0D/T 356991 =0D =0Dendobj=0D=

  So, for a solution, is there an easy way for me to tell mutt,
"Never use quoted-printable, because the world unfortunately has
Exchange servers"?  Has anyone else seen this problem?

  Thanks.  And sorry about the wild goose chase -- I didn't
realize until now that quoted-printable should be able to
handle arbitrary binaries without corruption (at least I
_think_ it should be able to do so).
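
As a rough check of that belief, here is a minimal sketch of a quoted-printable round trip. It is not Mutt's encoder: the function names qp_encode and qp_decode are made up for illustration, and the encoder is deliberately more conservative than the samples above (it writes every byte outside '!'..'~', including spaces, CR and LF, as =XX). The decoder only handles what this encoder produces.

```c
#include <stddef.h>

/* Encode len bytes as quoted-printable. Everything except printable
 * ASCII '!'..'~' (and '=' itself) is written as =XX, so CR, LF and 8-bit
 * bytes never appear raw. Soft line breaks ("=\r\n") keep each line at
 * or under 76 characters. Returns bytes written, or 0 if out is too small. */
static size_t qp_encode(const unsigned char *in, size_t len, char *out, size_t cap)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0, col = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char ch = in[i];
        int literal = (ch >= 33 && ch <= 126 && ch != '=');
        size_t need = literal ? 1 : 3;

        if (col + need > 75) {              /* leave room for the soft-break '=' */
            if (o + 3 > cap) return 0;
            out[o++] = '='; out[o++] = '\r'; out[o++] = '\n';
            col = 0;
        }
        if (o + need > cap) return 0;
        if (literal) {
            out[o++] = (char)ch;
        } else {
            out[o++] = '=';
            out[o++] = hex[ch >> 4];
            out[o++] = hex[ch & 0x0F];
        }
        col += need;
    }
    return o;
}

/* Value of one uppercase hex digit, or -1 if c is not one. */
static int hexval(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode text produced by qp_encode. Returns bytes written, or 0 if out is too small. */
static size_t qp_decode(const char *in, size_t len, unsigned char *out, size_t cap)
{
    size_t o = 0;

    for (size_t i = 0; i < len; ) {
        if (in[i] == '=' && i + 2 < len && in[i + 1] == '\r' && in[i + 2] == '\n') {
            i += 3;                         /* soft line break: emits nothing */
        } else if (in[i] == '=' && i + 2 < len &&
                   hexval(in[i + 1]) >= 0 && hexval(in[i + 2]) >= 0) {
            if (o >= cap) return 0;
            out[o++] = (unsigned char)(hexval(in[i + 1]) * 16 + hexval(in[i + 2]));
            i += 3;
        } else {
            if (o >= cap) return 0;
            out[o++] = (unsigned char)in[i++];
        }
    }
    return o;
}
```

Feeding all 256 byte values through qp_encode and then qp_decode should give back the original bytes, which supports the point that the encoding itself is binary-safe as long as nothing in transit decodes and re-encodes the body differently.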

  (Microsoft just lost more respect from me.  Which is amazing,
since I didn't think there was any more to lose!)

  Brian

Re: Mutt guessing wrong encoding for outgoing PDFs?

2002-09-09 Thread Brian Grayson


On Fri, Sep 06, 2002 at 11:21:09PM -0700, Michael Elkins wrote:
> Brian Grayson wrote:
> > Hm.  I have 1.2.5 source locally, and it looks like in
> > mutt_set_encoding() in sendlib.c, the following logic may be
> > faulty:
>
> I just noticed that you are using an extremely ancient version of Mutt
> (0.95).  Please try using Mutt 1.4, which is the current stable version.
> The logic for picking the CTE is much more complex now, and it should
> address your issue.

  I downloaded 1.4 on Friday just to see, and the same problem
occurs.  The fundamental problem is once the CTE code sees a
nonzero value of lobin, it goes into quoted, regardless of
whether hibin is nonzero.  The following patch does the right
thing for my testcase here, but I don't know if there's a good
reason why the lobin/quotable check currently ignores whether
there are any hibins or not.

  After a bit of inspection, the file rep.5k has hibins and
_no_ lobins, and hence goes properly into 8bit encoding.  But
the file rep1k has a lobin (0x0b at offset 0x340, for example),
so it short-circuits into quoted-printable.  Try mailing the
base64-encoded version of that to yourself, and it should
choose quotable, even in 1.4.

  Brian
-- 
Brian Grayson, SysPerf (System Performance, Modeling, and Simulation)
Somerset Design Center
Motorola
Austin, TX


--- sendlib.c   Sat Apr 20 02:25:49 2002
+++ sendlib.c.mod   Fri Sep  6 21:27:18 2002
@@ -1196,10 +1196,12 @@
   if (b->type == TYPETEXT)
   {
 char *chsname = mutt_get_body_charset (send_charset, sizeof (send_charset), b);
-if ((info->lobin && strncasecmp (chsname, "iso-2022", 8)) || info->linemax > 990 
|| (info->from && option (OPTENCODEFROM)))
-  b->encoding = ENCQUOTEDPRINTABLE;
-else if (info->hibin)
+if (info->hibin)
+{
   b->encoding = option (OPTALLOW8BIT) ? ENC8BIT : ENCQUOTEDPRINTABLE;
+}
+else if ((info->lobin && strncasecmp (chsname, "iso-2022", 8)) || info->linemax > 
+990 || (info->from && option (OPTENCODEFROM)))
+  b->encoding = ENCQUOTEDPRINTABLE;
 else
   b->encoding = ENC7BIT;
   }

gbnet.net [was Re: After-editing hook?]

2002-09-06 Thread Brian Grayson


On Fri, Sep 06, 2002 at 04:28:27AM -0700, David T-G wrote:
> Jeff --
>
> BTW, you should use the @mutt.org address for the mutt-users list rather
> than the @gbnet address.  Yes, the gbnet address leaks out now and again
> (I don't really know how, but think it might be digest-related), but
> we're trying to get it squashed once and for all.

  I think I sent my post to gbnet.net because once I subscribed,
the welcome message mentioned [EMAIL PROTECTED], and
so I just sent to [EMAIL PROTECTED], since I'm used to
majordomo-run lists.  Someone might want to tweak the welcome
message to remove all references to [EMAIL PROTECTED] and
[EMAIL PROTECTED] if you truly want to hide that domain
name.

  Brian
-- 
Brian Grayson, SysPerf (System Performance, Modeling, and Simulation)
[EMAIL PROTECTED]
Somerset Design Center
Motorola
Austin, TX

Re: Mutt guessing wrong encoding for outgoing PDFs?

2002-09-06 Thread Brian Grayson


  I'm attaching three files:

  rep1k (quoted -- this is what came up).  This is the first 1K
of the PDF that misbehaved, and it decided wrong.

  rep.5k (8bit -- this is what came up, so it guessed right).
This is only the first 512 bytes of the PDF.

  rep1k, forced to use base64 encoding, so that you can see
what's _really_ in there, and so you can play around yourself.

  Let me know if you need more info!

  Also, for my own education, which file contains the guessing
code?

  Thanks!

  Brian
-- 
Brian Grayson, SysPerf (System Performance, Modeling, and Simulation)
[EMAIL PROTECTED]
Somerset Design Center
Motorola
Austin, TX


%PDF-1.2
%âãÏÓ
317 0 obj
 
/Linearized 1 
/O 319 
/H [ 728 767 ] 
/L 363450 
/E 62838 
/N 100 
/T 
356991 
 
endobj
 xref
317 16 
16 0 n 
000671 0 n 
001495 0 n 
001653 0 n 
001885 0 n 
001996 0 n 
002102 0 n 
002283 0 n 
002335 0 n 
004243 0 n 
004351 0 n 
004457 0 n 
061520 0 n 
061598 0 n 
000728 0 n 
001473 0 n 
trailer

/Size 333
/Info 305 0 R 
/Root 318 0 R 
/Prev 356980 
/ID[f420f46189f89a9d08ec59e2f57273f3f420f46189f89a9d08ec59e2f57273f3]

startxref
0
%%EOF

318 0 obj
 
/Type /Catalog 
/Pages 307 0 R 
 
endobj
331 0 obj
 /S 
1193 /Filter /FlateDecode /Length 332 0 R  
stream
HÜTMhA~ÉîÌ¼]hX¨¤ADëAR¥Ö¦ÁUB«Ù`J~Þ¢ibÛ6M×^ì!  {«xS/ZDÄC±hKñ§XXZ©*
Î6Ý¤'¯vvç½÷ýÌ
Ì`@v(×8ÿÇÖ_ä.¶;RÊtsdIìq.°Ý8Kñ³kòSÅÊÚTqYÀ¦i?».é9î·   
 ùû5Å3Ò.:ÃÚ%#;èá=ÇI)))°{ÄäÔ+4õ

%PDF-1.2
%âãÏÓ
317 0 obj
 
/Linearized 1 
/O 319 
/H [ 728 767 ] 
/L 363450 
/E 62838 
/N 100 
/T 
356991 
 
endobj
 xref
317 16 
16 0 n 
000671 0 n 
001495 0 n 
001653 0 n 
001885 0 n 
001996 0 n 
002102 0 n 
002283 0 n 
002335 0 n 
004243 0 n 
004351 0 n 
004457 0 n 
061520 0 n 
061598 0 n 
000728 0 n 
001473 0 n 
traile

Re: Mutt guessing wrong encoding for outgoing PDFs?

2002-09-06 Thread Brian Grayson


  Hm.  I have 1.2.5 source locally, and it looks like in
mutt_set_encoding() in sendlib.c, the following logic may be
faulty:

static void mutt_set_encoding (BODY *b, CONTENT *info)
{
   if (b->type == TYPETEXT)
   {
      if (info->lobin || info->linemax > 990 || (info->from &&
          option (OPTENCODEFROM)))
         b->encoding = ENCQUOTEDPRINTABLE;
      else if (info->hibin)
         b->encoding = option (OPTALLOW8BIT) ?
            ENC8BIT : ENCQUOTEDPRINTABLE;
      else
         b->encoding = ENC7BIT;
   }
   ...
}

  Note that if hibin is greater than zero, but lobin is also
greater than zero, we'll use quoted-printable.

  Shouldn't it be something more like:

   if (info->hibin) {
      b->encoding = option (OPTALLOW8BIT) ? ENC8BIT : ENCQUOTEDPRINTABLE;
   } else if (info->lobin || info->linemax > 990 ||
         (info->from && option (OPTENCODEFROM)))
   {
      b->encoding = ENCQUOTEDPRINTABLE;
   }
   else
      b->encoding = ENC7BIT;
   ...

   That is, if we have 8-bit characters, don't even consider
quoted unless OPTALLOW8BIT is false.
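
To compare the two orderings side by side, here is a small standalone sketch. It only mirrors the branch order of the two snippets above; the struct, enum, and function names are hypothetical stand-ins, and the two option() lookups are replaced by plain int parameters so the logic can be exercised outside Mutt.

```c
/* Hypothetical stand-ins for the encodings and the CONTENT fields
 * used in the snippets above; not Mutt's real definitions. */
enum enc { ENC_7BIT, ENC_8BIT, ENC_QP };

struct content {
    int lobin;    /* count of low control bytes */
    int hibin;    /* count of 8-bit bytes */
    int linemax;  /* longest line length */
    int from;     /* nonzero if a line starts with "From " */
};

/* Branch order as in the 1.2.5 snippet: lobin is checked first. */
static enum enc choose_current(const struct content *c, int allow8bit, int encode_from)
{
    if (c->lobin || c->linemax > 990 || (c->from && encode_from))
        return ENC_QP;
    if (c->hibin)
        return allow8bit ? ENC_8BIT : ENC_QP;
    return ENC_7BIT;
}

/* Branch order as in the proposed change: hibin is checked first. */
static enum enc choose_proposed(const struct content *c, int allow8bit, int encode_from)
{
    if (c->hibin)
        return allow8bit ? ENC_8BIT : ENC_QP;
    if (c->lobin || c->linemax > 990 || (c->from && encode_from))
        return ENC_QP;
    return ENC_7BIT;
}
```

For a body with both hibin and lobin set and 8-bit allowed, choose_current picks quoted-printable while choose_proposed picks 8bit. One side effect worth noting: with the proposed order, a body that has 8-bit bytes and a line longer than 990 characters would also be sent as 8bit, which may be one reason the existing code checks the other conditions first.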

  Brian
--
Brian Grayson, SysPerf (System Performance, Modeling, and Simulation)
[EMAIL PROTECTED]
Somerset Design Center
Motorola
Austin, TX

Mutt guessing wrong encoding for outgoing PDFs?

2002-09-05 Thread Brian Grayson


  I didn't see this in the FAQ or in a search of the archives, so
just point me to the right spot if this is an FAQ that I missed
somehow.

  When sending some PDFs, mutt is incorrectly guessing that
'quoted printable' is sufficient -- the PDF in question doesn't
contain 8-bit characters in the first several dozen lines, and
I'm guessing mutt only scans the first several before making
its choice?  Using the wrong encoding causes CRLF etc. to be
munged deep in the 8-bit characters, leading to corrupted PDFs.

  Is there any way to control mutt's behavior to say 'always send
PDF files as base64', sort of like a reverse mailcap, or to make
it check more thoroughly?

  Thanks!

  Brian
-- 
Brian Grayson, SysPerf (System Performance, Modeling, and Simulation)
[EMAIL PROTECTED]
Somerset Design Center
Motorola
Austin, TX
