[Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-09-28 Thread Ken Hornstein
So, some backstory and explanation.

For representing 8-bit characters in email headers, the encoding used
is defined in RFC 2047.  You've probably seen that at some point; it
looks like:

=?UTF-8?Q?Hi!_=F0=9F=92=A9?=

Those can be used in only a few places: in "text" in a Subject or Comments
header, a MIME body part field where the field body is defined as "*text"
(such as Content-Description ... and really, that's the only one), or
as a replacement for a "word" in an email address in a place where an
email address exists.

Specifically, RFC 2047 says:

+ An 'encoded-word' MUST NOT be used in parameter of a MIME
  Content-Type or Content-Disposition field, or in any structured
  field body except within a 'comment' or 'phrase'.
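As a rough illustration of the encoded-word format (a sketch using Python's standard email.header module, not nmh code; the helper name decode_encoded_words is made up for this example), the Subject-style value shown earlier can be decoded like this:

```python
from email.header import decode_header, make_header


def decode_encoded_words(value: str) -> str:
    """Decode any RFC 2047 encoded-words found in an unstructured header value."""
    # decode_header() splits the value into (bytes, charset) chunks;
    # make_header() joins them back into a single Unicode string.
    return str(make_header(decode_header(value)))


# The example from the message: Q encoding, UTF-8 charset.
# In Q encoding "_" stands for a space and "=XX" for one octet.
subject = decode_encoded_words("=?UTF-8?Q?Hi!_=F0=9F=92=A9?=")
print(subject)
```

This is only appropriate for the places where encoded-words are permitted; the quoted rule above is what excludes MIME parameters.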

For MIME parameters, they used an alternate encoding defined by
RFC 2231.  That looks like:

name*=utf-8''Hi!%F0%9F%92%A9

(There's more if you have a long parameter value, but you get the idea.)
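For comparison, here is a minimal sketch of decoding a single RFC 2231 extended parameter value of the form charset'language'percent-encoded-octets. It is written against Python's standard library; the function name decode_rfc2231_value is hypothetical, and continuations (name*0*, name*1*, ...) are deliberately not handled.

```python
from urllib.parse import unquote_to_bytes


def decode_rfc2231_value(value: str) -> str:
    """Decode one RFC 2231 extended value: charset'language'%XX-encoded octets."""
    # The first two apostrophes delimit the charset and the (often empty) language.
    charset, _language, encoded = value.split("'", 2)
    # Percent-decoding yields raw octets, which are then interpreted in the charset.
    # Falling back to us-ascii for an empty charset is an assumption of this sketch.
    return unquote_to_bytes(encoded).decode(charset or "us-ascii")


# The parameter value from the message: name*=utf-8''Hi!%F0%9F%92%A9
filename = decode_rfc2231_value("utf-8''Hi!%F0%9F%92%A9")
print(filename)
```

Note that the two encodings share nothing syntactically: one uses =XX inside =?...?= delimiters, the other uses %XX after a charset prefix.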

So, incompatible encoding.  Fine.  Nmh has supported RFC 2047 encoding
for _decode_ for a long time; for 1.6 we added 2047 encoding, and support
for RFC 2231 for both encoding and decoding.

However ... nothing is ever simple.  Specifically, there was a patch
contributed (but later reverted) that enabled RFC 2047 decoding for
some MIME parameters.

The exact issue is that some MUAs will use RFC 2047 encoding
for a filename that contains 8-bit characters when creating a
Content-Disposition field.  This was a problem with older versions of
Outlook (like pre-2007), Lotus/IBM Notes (which I was surprised to
discover was still a thing), but most troublesome, RFC 2047 encoding is
ALSO used when you attach a filename with 8-bit characters when you use
the web interface for Gmail.  If you Google "rfc 2047 vs rfc2231" you
can get an idea of what happened (Chrome and Thunderbird support it for
decode, and Google uses that as justification for keeping it ... and
Chrome and Thunderbird don't want to disable that support, because Gmail
still uses it.  Argh).

I am torn as to what to do here.  It feels somehow wrong to support this
for decode natively, but I'm not completely convinced of that.  We have
a number of email programs that get this wrong, including a very popular
one.  This might be something perfect for mhfixmsg to deal with.  What
do others think?

--Ken

___
Nmh-workers mailing list
Nmh-workers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/nmh-workers


Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-09-28 Thread David Levine
Ken wrote:

> The exact issue is that some MUAs will use RFC 2047 encoding
> for a filename that contains 8-bit characters when creating a
> Content-Disposition field.

> I am torn as to what to do here.  It feels somehow wrong to support this
> for decode natively, but I'm not completely convinced of that.  We have
> a number of email programs that get this wrong, including a very popular
> one.  This might be something perfect for mhfixmsg to deal with.  What
> do others think?

Sounds like a job for mhfixmsg, I'll look into it for 1.7.  It would
probably go into the small category of things that it fixes
unconditionally.

The question remains of whether mhstore should decode 2047-encoded
filenames natively.  It'd be friendly and it's very unlikely that what
looks like an encoded string isn't.  On the other hand, running mhfixmsg
shouldn't be prohibitive.

David



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-09-28 Thread Ralph Corderoy
Hi Ken,

> RFC 2047 encoding is ALSO used when you attach a filename with 8-bit
> characters when you use the web interface for Gmail.  If you Google
> "rfc 2047 vs rfc2231" you can get an idea of what happened (Chrome and
> Thunderbird support it for decode, and Google uses that as
> justification for keeping it ... and Chrome and Thunderbird don't want
> to disable that support, because Gmail still uses it.  Argh).

That Chrome decodes the broken encoding isn't a reason for Gmail to
continue to produce it if Chrome decodes the correct content too?  Is
there somewhere where a Googler states that's the reason to keep
producing RFC-violating content?

We could do with a friendly Googler as an earpiece.  :-)  The ones I
know are all Xooglers now.  I wonder if Go's
https://golang.org/pkg/net/mail/#ReadMessage cares.

I'm all for balking at handling it, similar to

mhshow: "multipart/mixed" type in message 9740 must be encoded in
7bit, 8bit, or binary, per RFC 2045 (6.4).  One workaround is to
manually edit the file and change the "Quoted-printable"
Content-Transfer-Encoding to one of those.  For now, continuing...

BTW, that "continuing..." means skipping that one particular
multipart/mixed type in that message?  It half implies "dealing with it
anyway, sigh".

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-09-28 Thread Ken Hornstein
>The question remains of whether mhstore should decode 2047-encoded
>filenames natively.  It'd be friendly and it's very unlikely that what
>looks like an encoded string isn't.  On the other hand, running mhfixmsg
>shouldn't be prohibitive.

I know; I'm torn about this.  I mean, yeah, mhfixmsg will take care of it.
But still ... ugh; if the brokenness wasn't so widespread, I would say no.
Not sure.  Thoughts from anyone else?

--Ken



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-09-28 Thread Ken Hornstein
>That Chrome decodes the broken encoding isn't a reason for Gmail to
>continue to produce it if Chrome decodes the correct content too?  Is
>there somewhere where a Googler states that's the reason to keep
>producing RFC-violating content?

https://bugzilla.mozilla.org/show_bug.cgi?id=601933

This was talking specifically about HTTP header field parameters, but
the same issues apply.  Here's a note from that thread:

-There are web sites that indeed use RFC 2047 encoding (GMail, probably
-other Google services, and maybe more...); and one explanation for it
-is that it "works" in both Firefox and Chrome and thus does allow a
-single code path.

>I'm all for balking at handling it, similar to

Well, we don't balk right now.  It still gets parsed correctly.  It's
just when you go to save it, the filename is the RFC 2047 encoded name.
Which, AFAIK, is still a valid (if awkward) Unix filename.

--Ken



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-01 Thread David Levine
I wrote:

> Ken wrote:
>
> > The exact issue is that some MUAs will use RFC 2047 encoding
> > for a filename that contains 8-bit characters when creating a
> > Content-Disposition field.
>
> > I am torn as to what to do here.  It feels somehow wrong to support this
> > for decode natively, but I'm not completely convinced of that.  We have
> > a number of email programs that get this wrong, including a very popular
> > one.  This might be something perfect for mhfixmsg to deal with.
>
> Sounds like a job for mhfixmsg, I'll look into it for 1.7.

Done.

David



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-02 Thread Earl Hood
On Wed, Sep 28, 2016 at 1:11 PM, Ken Hornstein wrote:

>>The question remains of whether mhstore should decode 2047-encoded
>>filenames natively.  It'd be friendly and it's very unlikely that what
>>looks like an encoded string isn't.  On the other hand, running mhfixmsg
>>shouldn't be prohibitive.
>
> I know; I'm torn about this.  I mean, yeah, mhfixmsg will take care of it.
> But still ... ugh; if the brokenness wasn't so widespread, I would say no.
> Not sure.  Thoughts from anyone else?

I experienced this problem years ago with my project.  I ended up
implementing 2047 decoding for filename parameter since it appears others
are unable to read specifications.

Google has no excuse for generating such data, but as you note in your OP,
other MUAs have been doing it for a long time and from vendors that are
notorious for not following specs properly.  I do not know how many MUAs
support RFC 2231.

I do not recommend blanket 2047 decoding for parameter data.  Just limit it
to parameters associated with a filename.

--ewh



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-02 Thread Ken Hornstein
>Google has no excuse for generating such data, but as you note in your OP,
>other MUAs have been doing it for a long time and from vendors that are
>notorious for not following specs properly.  I do not know how many MUAs
>support RFC 2231.

AFAICT, with the exception of older versions of Outlook (like before 2007)
and Lotus Notes, pretty much "everybody" can decode RFC 2231 correctly.
And I believe that "most" modern MUAs (including nmh! :-) ) will generate
it.  Some people will generate both:

http://www.igaware.com/blog/attachments-converted-to-dat-files-when-sending-from-zarafa-solved/

Personally that seems mega-bozo to me, as I'm not sure what's supposed to
happen if you include two parameters of the same name (even if one
is encoded, and one isn't).

>I do not recommend blanket 2047 decoding for parameter data.  Just limit it
>to parameters associated with a filename.

I find this argument convincing; thoughts from others?

--Ken



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-02 Thread David Levine
Ken wrote:

> >[Earl:]
> >I do not recommend blanket 2047 decoding for parameter data.  Just limit it
> >to parameters associated with a filename.
>
> I find this argument convincing; thoughts from others?

That's what I implemented in mhfixmsg:  just Content-Type name and
Content-Disposition filename.
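The policy described here (decode RFC 2047 only in the Content-Type name and Content-Disposition filename parameters, never in other parameters) can be sketched as follows. This is an illustration in Python, not the mhfixmsg implementation; fix_param, FILENAME_PARAMS and ENCODED_WORD are names invented for the example, and it only handles a value that is exactly one encoded-word.

```python
import re
from email.header import decode_header, make_header

# Only these (header field, parameter) pairs are treated as filename-like.
FILENAME_PARAMS = {
    ("content-type", "name"),
    ("content-disposition", "filename"),
}

# A value consisting of exactly one RFC 2047 encoded-word (B or Q encoding).
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def fix_param(field: str, name: str, value: str) -> str:
    """Return value with RFC 2047 decoded, but only for filename-like parameters."""
    if (field.lower(), name.lower()) not in FILENAME_PARAMS:
        return value  # no blanket decoding of other parameters
    if not ENCODED_WORD.fullmatch(value):
        return value  # does not look like an encoded-word; leave it alone
    return str(make_header(decode_header(value)))


print(fix_param("Content-Disposition", "filename", "=?UTF-8?Q?Hi!_=F0=9F=92=A9?="))
print(fix_param("Content-Type", "charset", "=?UTF-8?Q?Hi!_=F0=9F=92=A9?="))
```

The second call returns its input unchanged, which is the point of restricting the workaround to the two parameters that carry a filename.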

David



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-02 Thread Ken Hornstein
>That's what I implemented in mhfixmsg:  just Content-Type name and
>Content-Disposition filename.

I think we're all fine with that; I'm wondering if we should see if it's
an RFC 2047-encoded filename and just decode it automatically.

--Ken



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-02 Thread Valdis . Kletnieks
On Sun, 02 Oct 2016 20:50:35 -0400, Ken Hornstein said:

> I think we're all fine with that; I'm wondering if we should see if it's
> an RFC 2047-encoded filename and just decode it automatically.

I see plenty of sources of heartburn if somebody sends a filename with
Hebrew characters in whatever 8859-foo (-8?), to somebody in a UTF-8
environment.  Unlike textual data intended to be read, where "decode to
recipient's locale" makes sense, when it's a filename things get stickier,
because there can be external references (indexes, etc) that point at a
filename in a particular encoding - or even the 2047-encoded string as the
filename. :)

Adding to the fun, Unix-y filenames don't *have* a locale...
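To make the charset concern concrete, here is a small Python sketch. The four octets are an assumed example, the word "shalom" in ISO-8859-8; they are not taken from any real message. The same name has a different byte sequence once it is converted to UTF-8, so anything that refers to the file by its original octets no longer matches.

```python
# Four ISO-8859-8 octets: shin, lamed, vav, final mem.
raw = b"\xf9\xec\xe5\xed"

# Interpreting the octets in the sender's charset gives the intended text...
text = raw.decode("iso-8859-8")

# ...but writing that text out in a UTF-8 environment produces different bytes.
utf8 = text.encode("utf-8")

print(text)
print(raw.hex(), "->", utf8.hex())
print("same bytes on disk:", raw == utf8)
```

Since a Unix filename is just a byte string, both forms are valid names, and they name different files.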




Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-02 Thread Paul Vixie



Ken Hornstein wrote:

>> That's what I implemented in mhfixmsg:  just Content-Type name and
>> Content-Disposition filename.
>
> I think we're all fine with that; I'm wondering if we should see if it's
> an RFC 2047-encoded filename and just decode it automatically.


that would follow the principle of least astonishment, considering all 
the broken UI's our users will have seen or used before they found MH, 
and considering how widely broken this protocol element has become.


+1.

--
P Vixie




Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-03 Thread Ralph Corderoy
Hi,

Valdis wrote:
> Ken wrote:
> > I think we're all fine with that; I'm wondering if we should see if
> > it's an RFC 2047-encoded filename and just decode it automatically.

I think my -1 has already been counted.

> I see plenty of sources of heartburn if somebody sends a filename with
> Hebrew characters in whatever 8859-foo (-8?), to somebody in a UTF-8
> environment.

Acting like other MUA's on this doesn't match nmh's behaviour on other
transgressions and I'd prefer the wrong encoding not to be swept under
the carpet.  nmh users are often savvy enough that they can chase back
to the creator, e.g. FLOSS PHP library, but only once they become aware
of the problem.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-03 Thread Ken Hornstein
>I see plenty of sources of heartburn if somebody sends a filename with
>Hebrew characters in whatever 8859-foo (-8?), to somebody in a UTF-8
>environment.  Unlike textual data intended to be read, where "decode
>to recipient's locale" makes sense, when it's a filename things get
>stickier, because there can be external references (indexes, etc) that
>point at a filename in a particular encoding - or even the 2047-encoded
>string as the filename. :)

I am not sure this is within the scope of nmh; I mean, it's a general
problem that exists even if you use RFC 2231 encoding.  Also ... I'm
unclear from your response if you are a +1 or -1 on the idea of nmh
automatically decoding RFC-2047 encoded filenames, which was my original
query :-)

Regarding filenames lacking locale (really, I think you mean character
set), that is no longer true.  I see a number of network filesystems
start to enforce UTF-8 (really, the only sane choice) and more Unix-like
operating systems are doing the same for local filesystems (MacOS and
Solaris are the ones that I'm aware of).  Really, I think that's where
things will ultimately end up.

--Ken



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-03 Thread Ken Hornstein
>Acting like other MUA's on this doesn't match nmh's behaviour on other
>transgressions and I'd prefer the wrong encoding not to be swept under
>the carpet.  nmh users are often savvy enough that they can chase back
>to the creator, e.g. FLOSS PHP library, but only once they become aware
>of the problem.

Sigh.  Since I did that, I am sympathetic to that argument, but the specific
case you're talking about (marking an enclosing multipart with q-p) is
not exactly the same here.

For comparison, yes, they're both explicit RFC violations.  But the
former introduced a real ambiguity; should the enclosing parts be
encoded with q-p?  The latter ... well, not so much.  And while some
MUAs have been slowly fixing things (see: Outlook) and others are clearly
fringe players (Lotus Notes), Gmail is kind of a dominant player.  And
it's clear from the stuff I've read online that people with much more
clout than I have tried, and the people in charge of Gmail simply Don't
Give a Shit.

Again, it sticks in my craw that we have to do this, but all indications
are that this is unfortunately rather common.

--Ken



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-03 Thread Lyndon Nerenberg

> On Oct 3, 2016, at 7:53 AM, Ken Hornstein wrote:
> 
> Again, it sticks in my craw that we have to do this, but all indications
> are that this is unfortunately rather common.

Can't we just pull off the Cyrus IMAP way of replacing all the invalid octets 
in the decoded stream with Xs? That makes a valid filename while (deliberately) 
ignoring the encoding brain damage.


Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-04 Thread Ken Hornstein
>Can't we just pull off the Cyrus IMAP way of replacing all the invalid
>octets in the decoded stream with Xs? That makes a valid filename while
>(deliberately) ignoring the encoding brain damage.

I don't think you're understanding the problem.  There are no invalid
octets here; the issue is whether or not we perform an automatic decode.

--Ken



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-04 Thread Ralph Corderoy
Hi Ken,

> I don't think you're understanding the problem.  There are no invalid
> octets here; the issue is whether or not we perform an automatic
> decode.

But now Lyndon's brought it up,

Content-Type: inode/x-empty; name*=UTF-8''%41%00%42
Content-Disposition: attachment; filename*=UTF-8''%41%00%42

`mhstore -auto' creates `./A'.  Perhaps the RFCs rule out %00?  But then
again, we're talking about crap that doesn't follow the RFCs.  If it's
%41%2F%42 then `A/B' is created if A exists, so that seems OK.
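The `./A' result can be reproduced in miniature. The sketch below is plain Python, not nmh code. It percent-decodes the two parameter values and then mimics what a NUL-terminated C string would retain, which is everything up to the first zero octet.

```python
from urllib.parse import unquote_to_bytes

# Percent-decoding the RFC 2231 value gives three octets, the middle one NUL.
decoded = unquote_to_bytes("%41%00%42")

# Code that treats the result as a C string sees only the part before the NUL.
as_c_string = decoded.split(b"\x00", 1)[0]

# The second case from the message: %2F is "/", so the name contains a path.
with_slash = unquote_to_bytes("%41%2F%42")

print(decoded, as_c_string, with_slash)
```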

BTW, file(1) might be happy with inode/x-empty, and nmh stores an empty
file, but I wonder if other systems complain they don't know what to do?

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-04 Thread Ken Hornstein
>Content-Type: inode/x-empty; name*=UTF-8''%41%00%42
>Content-Disposition: attachment; filename*=UTF-8''%41%00%42
>
>`mhstore -auto' creates `./A'.  Perhaps the RFCs rule out %00?  But then
>again, we're talking about crap that doesn't follow the RFCs.  If it's
>%41%2F%42 then `A/B' is created if A exists, so that seems OK.

Sigh. We use a lot of C strings, so we're not so great on handling
embedded NULs.  It's one of those things that is simultaneously hard
to fix, and AFAICT not worth it.  Let me ask you, Ralph ... what do you
WANT to happen here?

--Krn



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-04 Thread David Levine
I almost hate to bring this up, but . . . mhstore already asks users
(if isatty).  We could do something like:

the filename of =?...?= is encoded, save as unencoded [...] instead? [y/n]

And define what to do if the response is no.

If not a tty, we're back to the question.  Safer to fail, friendlier to
decode.

David



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-05 Thread Earl Hood
On Tue, Oct 4, 2016 at 7:44 AM, Ralph Corderoy wrote:

> Content-Type: inode/x-empty; name*=UTF-8''%41%00%42
> Content-Disposition: attachment; filename*=UTF-8''%41%00%42
>
> `mhstore -auto' creates `./A'.  Perhaps the RFCs rule out %00?  But then
> again, we're talking about crap that doesn't follow the RFCs.  If it's
> %41%2F%42 then `A/B' is created if A exists, so that seems OK.

It is not okay.  Filenames specified in email are considered informative only
since there are security implications of blindly using what is provided.

For any file nmh creates based on email parameter input, it should run it
through a sanitizer to remove any characters deemed invalid and remove any
pathname components.  For example, what if I have:

  Content-Type: application/octet-stream
  Content-Disposition: attachment; filename="/etc/passwd"

or relative pathname attacks using "../.."?

I do not recall if nmh checks if a file with same name already exists.

If we are to be security conscious, filename parameter should be ignored,
with files stored based on content-type, or at a minimum, just use the
filename parameter extension.  An option can be provided to specify that the
filename parameter be honored, but even then, only use the basename after it
has been sanitized.

--ewh



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-05 Thread Earl Hood
On Tue, Oct 4, 2016 at 8:57 AM, David Levine wrote:

> If not a tty, we're back to the question.  Safer to fail, friendlier to
> decode.

Decode.

How often are real files with "=?...?=" in their names encountered?

--ewh



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-06 Thread David Levine
Earl wrote:

> For any file nmh creates based on email parameter input, it should run it
> through a sanitizer to remove any characters deemed invalid and remove any
> pathname components.

For security reasons, this filename will be ignored if it begins
with the character '/', '.', '|', or '!', or if it contains the
character '%'.

> For example, what if I have:
>
>   Content-Type: application/octet-stream
>   Content-Disposition: attachment; filename="/etc/passwd"
>
> or relative pathname attacks using "../.."?

The /etc/passwd or relative pathname will be ignored, and a name of
the form message#.part#.subtype will be used instead (assuming no
profile override).
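The rule quoted above, together with the fallback name, can be sketched like this. It is a Python illustration of the rule as stated in this message, not the mhstore source. The function names are hypothetical, the fallback format is one reading of "message#.part#.subtype", and rejecting an empty name is an extra assumption.

```python
from typing import Optional


def acceptable(name: str) -> bool:
    """Apply the quoted rule: reject names starting with / . | ! or containing %."""
    # An empty name is also rejected (an assumption of this sketch).
    return bool(name) and name[0] not in "/.|!" and "%" not in name


def choose_name(suggested: Optional[str], msg: int, part: str, subtype: str) -> str:
    """Use the suggested filename if acceptable, else a synthesized default."""
    if suggested is not None and acceptable(suggested):
        return suggested
    # Assumed rendering of the "message#.part#.subtype" form described above.
    return f"{msg}.{part}.{subtype}"


print(choose_name("/etc/passwd", 9740, "2", "octet-stream"))
print(choose_name("../../x", 9740, "2", "octet-stream"))
print(choose_name("report.pdf", 9740, "2", "pdf"))
```

A leading '.' check covers "../.." style relative paths as well as dotfiles, which is why both examples from the earlier message end up with the synthesized name.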

> I do not recall if nmh checks if a file with same name already exists.

It can, starting with 1.6, using the mhstore(1) -clobber switch.

> If we are to be security conscious, filename parameter should be ignored,
> with files stored based on content-type, or at a minimum, just use the
> filename parameter extension.  An option can be provided to specify that the
> filename parameter be honored, but even then, only use the basename after it
> has been sanitized.

Yup, we're there.  The mhstore switch you refer to is -auto; the
default is -noauto.

mhstore also has an -outfile switch, so the user can specify a
particular filename (to store all selected content).

David



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-06 Thread David Levine
Earl wrote:

> Decode.

I'm leaning that way, too.

> How often are real files with "=?...?=" in their names encountered?

Often enough for some.

David



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-06 Thread Ralph Corderoy
Hi Earl,

> > If not a tty, we're back to the question.  Safer to fail, friendlier
> > to decode.
>
> Decode.  How often are real files with "=?...?=" in their names
> encountered?

In your other recent email you said "If we are to be security conscious"
and I think that's the right default stance.

I can't think of a way of exploiting having a filename with the wrong
encoding being decoded anyway, but I prefer to start with allowing
nothing and working out what to add than the other way around.  The
email may be seen at other MUAs that display the filenames differently,
but the unpacking left to nmh without checking.  The attachments may
overwrite one another or not depending whether the MUA sticks to the
RFCs, and so unpacking multiple times with different MUAs could give
different results.  Even if no exploit, there's obviously room for
confusion, and that's inevitable if other MUAs don't follow the RFCs.

If we do the right thing by the RFCs then we can justify it, have the
high ground, and point to mhfixmsg(1) with the user realising they need
to tread carefully.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-06 Thread Lyndon Nerenberg

> On Oct 6, 2016, at 5:20 AM, David Levine wrote:
> 
>> For example, what if I have:
>> 
>>  Content-Type: application/octet-stream
>>  Content-Disposition: attachment; filename="/etc/passwd"
>> 
>> or relative pathname attacks using "../.."?
> 
> The /etc/passwd or relative pathname will be ignored, and a name of
> the form message#.part#.subtype will be used instead (assuming no
> profile override).

I think this is very wrong behaviour.

Filenames in the attachment meta-data are suggestions.  But they can be very 
valid suggestions, and shouldn't be ignored for arbitrary reasons.

But leading paths must be ignored, as security dictates.

The safest course of action is:

1) Take the basename of the suggested filename.

2) Perform an exclusive open+create of the filename.

2a) If the file exists, and we are interactive, prompt for a replacement name 
(or to overwrite); else (2c)

2b) If the as-encoded filename results in an error from the underlying open() 
call, report the error and fall through.

2c) Synthesize a unique name, write to that, and report the name.
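The non-interactive path through the steps above (basename, exclusive create, fall through to a synthesized name) might look roughly like the following Python sketch. It is not nmh code: the function name store, the "attachment-" prefix and the 0o600 mode are assumptions, and the interactive prompt of step 2a is left out.

```python
import os
import tempfile


def store(suggested: str, data: bytes, directory: str) -> str:
    """Store data under the basename of suggested, never overwriting a file."""
    name = os.path.basename(suggested)  # step 1: drop any leading path
    if name not in ("", ".", ".."):
        path = os.path.join(directory, name)
        try:
            # Step 2: O_EXCL makes the create fail if the file already exists.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except (OSError, ValueError) as err:
            # Steps 2a (non-interactive) and 2b: report, then fall through.
            # ValueError covers names with an embedded NUL in Python.
            print(f"cannot create {path!r}: {err}")
        else:
            with os.fdopen(fd, "wb") as out:
                out.write(data)
            return path
    # Step 2c: synthesize a unique name, write to it, and report it.
    fd, path = tempfile.mkstemp(prefix="attachment-", dir=directory)
    with os.fdopen(fd, "wb") as out:
        out.write(data)
    print(f"stored as {path!r}")
    return path
```

Letting the open call decide whether a name is valid matches the point that the filesystem, not the mail program, is the authority on acceptable filenames.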

--lyndon


Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-06 Thread Lyndon Nerenberg

> On Oct 6, 2016, at 7:36 PM, Lyndon Nerenberg wrote:
> 
> 2) Perform an exclusive open+create of the filename.
> 
> 2a) If the file exists, and we are interactive, prompt for a replacement name 
> (or to overwrite); else (2c)
> 
> 2b) If the as-encoded filename results in an error from the underlying open() 
> call, report the error and fall through.
> 
> 2c) Synthesize a unique name, write to that, and report the name.

Sorry, I was not at all clear about this.  I am proposing NO decoding whatsoever
of any incorrectly encoded file name.  Case (2b) above avoids any issues
with filenames that are invalid for the implementation.  And we can't count on
the old POSIX static semantics for those.  As Ken pointed out, ZFS filesystems
have a switch that enforces UTF-8 compliance.  Or not.  It's not up to us to
judge "or not."  open(2) determines the validity of the proposed filename.

--lyndon
 


Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-06 Thread David Levine
Lyndon wrote:

> > On Oct 6, 2016, at 5:20 AM, David Levine wrote:
> >
> > The /etc/passwd or relative pathname will be ignored, and a name of
> > the form message#.part#.subtype will be used instead (assuming no
> > profile override).
>
> I think this is very wrong behaviour.
>
> Filenames in the attachment meta-data are suggestions.  But they can be very 
> valid suggestions, and shouldn't be ignored for arbitrary reasons.

I don't think they are.

> But leading paths must be ignored, as security dictates.
>
> The safest course of action is:
>
> 1) Take the basename of the suggested filename.

But I wouldn't consider the likely result with filename=/foo/bar/README
to be safest.

> 2) Perform an exclusive open+create of the filename.
>
> 2a) If the file exists, and we are interactive, prompt for a replacement name 
> (or to overwrite); else (2c)

That can be configured with -clobber ask, but that's not the default for
(decades of) historical precedent.

I don't think we should change the default here.  It's easy enough for
users to override.

David



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-06 Thread Lyndon Nerenberg

> On Oct 6, 2016, at 8:11 PM, David Levine wrote:
> 
> But I wouldn't consider the likely result with filename=/foo/bar/README
> to be safest.

Why not?  If there is no "README" file, create it.  If there is, prompt for a 
replacement if stdin is a tty, else synthesize a unique replacement name and be 
done with it.


Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-07 Thread David Levine
Lyndon wrote:

> > On Oct 6, 2016, at 8:11 PM, David Levine wrote:
> > 
> > But I wouldn't consider the likely result with filename=/foo/bar/README
> > to be safest.
>
> Why not?  If there is no "README" file, create it.  If there is, prompt for a 
> replacement if stdin is a tty, else synthesize a unique replacement name and 
> be done with it.

It wouldn't be safest because I would risk overwriting README in the
current directory.  That's not what I expect.

In any case, I don't think that we should change the mhstore
defaults because that might break scripts as well as user
expectations.  Those include the default of -noauto.  You can
override those defaults in your profile (I do) to get pretty close
to what you ask for.  Though there isn't a basename, it should be
possible to support that with formatting strings that pipe the content
to a script that runs basename(1) on %a.

To get back to the question about RFC 2047 encoded name parameters,
I'll just add a note to the mhstore man page for users to use mhfixmsg
to get around that.

David



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-07 Thread Ralph Corderoy
Hi Krn,

> > Content-Type: inode/x-empty; name*=UTF-8''%41%00%42
> > Content-Disposition: attachment; filename*=UTF-8''%41%00%42
>
> Sigh. We use a lot of C strings, so we're not so great on handling
> embedded NULs.  It's one of those things that is simultaneously hard to
> fix, and AFAICT not worth it.  Let me ask you, Ralph ... what do you
> WANT to happen here?
> 
> --Krn

How about...  Detect any decoding that produces a NUL up with which our
string representation cannot put.  Stop at that point.  If the user
really wants to proceed, and they probably don't once the dodgy filename
is drawn to their attention, have them drop -auto or alter the email.
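A sketch of that "detect and stop" behaviour, in Python rather than nmh's C, with the exception and function names invented for the example: percent-decode the value, and refuse to go further if the result contains a NUL.

```python
from urllib.parse import unquote_to_bytes


class UnsafeFilename(ValueError):
    """Raised when a decoded filename cannot be represented as a C string."""


def decode_filename_octets(encoded: str) -> bytes:
    """Percent-decode a parameter value, refusing any result that contains NUL."""
    octets = unquote_to_bytes(encoded)
    if b"\x00" in octets:
        # Stop here instead of silently truncating at the NUL.
        raise UnsafeFilename(f"decoded filename contains NUL: {encoded!r}")
    return octets


try:
    decode_filename_octets("%41%00%42")
except UnsafeFilename as err:
    print(err)
```

The caller would then surface the message to the user, who can decide whether to store the part without -auto or edit the message.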

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-10 Thread Ralph Corderoy
Hi Ken,

> ...Gmail is kind of a dominant player.  And it's clear from the stuff
> I've read online that people with much more clout than I have tried,
> and the people in charge of Gmail simply Don't Give a Shit.
>
> Again, it sticks in my craw that we have to do this, but all
> indications are that this is unfortunately rather common.

That's not the attitude that built an empire!  :-)  I've submitted
Feedback within Gmail spelling out the problem and asking for contact.
If I don't hear anything in a few days then I'll try some other avenues,
hammering home the same message.

(I should also check that the actual SMTP from a Gmail machine has the
fault in that manner and it isn't being re-written from something else
that's wrong, e.g. UTF-8, just for completeness.)

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters

2016-10-10 Thread Ken Hornstein
>That's not the attitude that built an empire!  :-)  I've submitted
>Feedback within Gmail spelling out the problem and asking for contact.
>If I don't hear anything in a few days then I'll try some other avenues,
>hammering home the same message.
>
>(I should also check that the actual SMTP from a Gmail machine has the
>fault in that manner and it isn't being re-written from something else
>that's wrong, e.g. UTF-8, just for completeness.)

Sigh.  I admire your energy, Ralph ... it's just that at this point,
I lack the energy and I don't share your optimism.  I mean, fight the
good fight by all means.  Please let us know how it goes.

And AFAICT, it's only a problem with the webmail front end; that's what
I tested that had the problem.

--Ken
