[Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
So, some backstory and explanation.

For representing 8-bit characters in email headers, the encoding used is defined in RFC 2047. You've probably seen that at some point; it looks like:

    =?UTF-8?Q?Hi!_=F0=9F=92=A9?=

Those can be used in only a few places: in "text" in a Subject or Comments header, a MIME body part field where the field body is defined as "*text" (such as Content-Description ... and really, that's the only one), or as a replacement for a "word" in an email address in a place where an email address exists. Specifically, RFC 2047 says:

    + An 'encoded-word' MUST NOT be used in parameter of a MIME Content-Type or Content-Disposition field, or in any structured field body except within a 'comment' or 'phrase'.

For MIME parameters, they used an alternate encoding defined by RFC 2231. That looks like:

    name*=utf-8''Hi!%F0%9F%92%A9

(There's more if you have a long parameter value, but you get the idea.) So, incompatible encoding. Fine.

Nmh has supported RFC 2047 encoding for _decode_ for a long time; for 1.6 we added 2047 encoding, and support for RFC 2231 for both encoding and decoding.

However ... nothing is ever simple. Specifically, there was a patch contributed (but later reverted) that enabled RFC 2047 decoding for some MIME parameters. The exact issue is that some MUAs will use RFC 2047 encoding for a filename that contains 8-bit characters when creating a Content-Disposition field. This was a problem with older versions of Outlook (like pre-2007) and Lotus/IBM Notes (which I was surprised to discover was still a thing), but most troublesome, RFC 2047 encoding is ALSO used when you attach a file whose name contains 8-bit characters using the web interface for Gmail. If you Google "rfc 2047 vs rfc2231" you can get an idea of what happened (Chrome and Thunderbird support it for decode, and Google uses that as justification for keeping it ... and Chrome and Thunderbird don't want to disable that support, because Gmail still uses it. Argh).
I am torn as to what to do here. It feels somehow wrong to support this for decode natively, but I'm not completely convinced of that. We have a number of email programs that get this wrong, including a very popular one. This might be something perfect for mhfixmsg to deal with. What do others think? --Ken ___ Nmh-workers mailing list Nmh-workers@nongnu.org https://lists.nongnu.org/mailman/listinfo/nmh-workers
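For readers who want to see the two encodings side by side, here is a minimal Python sketch (not nmh code; nmh is written in C) that decodes the two example strings from the message above. The helper names are made up for illustration, and the RFC 2231 helper assumes a single-segment extended value with no continuations.

```python
from email.header import decode_header
from urllib.parse import unquote_to_bytes


def decode_rfc2047(value: str) -> str:
    """Decode a string that may contain RFC 2047 encoded-words."""
    parts = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            # Encoded-words (and any text mixed with them) come back as bytes.
            parts.append(data.decode(charset or "ascii"))
        else:
            # A value with no encoded-words comes back as a plain str.
            parts.append(data)
    return "".join(parts)


def decode_rfc2231(value: str) -> str:
    """Decode a single RFC 2231 extended value: charset'language'pct-encoded."""
    charset, _language, encoded = value.split("'", 2)
    return unquote_to_bytes(encoded).decode(charset or "ascii")


if __name__ == "__main__":
    print(decode_rfc2047("=?UTF-8?Q?Hi!_=F0=9F=92=A9?="))
    print(decode_rfc2231("utf-8''Hi!%F0%9F%92%A9"))
```

Note that the two examples are not byte-for-byte the same text: in the Q encoding an underscore stands for a space, so the RFC 2047 example decodes with a space after "Hi!" and the RFC 2231 example does not.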
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Ken wrote:
> The exact issue is that some MUAs will use RFC 2047 encoding for a filename that contains 8-bit characters when creating a Content-Disposition field.
>
> I am torn as to what to do here. It feels somehow wrong to support this for decode natively, but I'm not completely convinced of that. We have a number of email programs that get this wrong, including a very popular one. This might be something perfect for mhfixmsg to deal with. What do others think?

Sounds like a job for mhfixmsg, I'll look into it for 1.7. It would probably go into the small category of things that it fixes unconditionally.

The question remains of whether mhstore should decode 2047-encoded filenames natively. It'd be friendly and it's very unlikely that what looks like an encoded string isn't. On the other hand, running mhfixmsg shouldn't be prohibitive.

David
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Hi Ken,

> RFC 2047 encoding is ALSO used when you attach a filename with 8-bit characters when you use the web interface for Gmail. If you Google "rfc 2047 vs rfc2231" you can get an idea of what happened (Chrome and Thunderbird support it for decode, and Google uses that as justification for keeping it ... and Chrome and Thunderbird don't want to disable that support, because Gmail still uses it. Argh).

That Chrome decodes the broken encoding isn't a reason for Gmail to continue to produce it if Chrome decodes the correct content too? Is there somewhere where a Googler states that's the reason to keep producing RFC-violating content?

We could do with a friendly Googler as an earpiece. :-) The ones I know are all Xooglers now.

I wonder if Go's https://golang.org/pkg/net/mail/#ReadMessage cares.

I'm all for balking at handling it, similar to

    mhshow: "multipart/mixed" type in message 9740 must be encoded in 7bit, 8bit, or binary, per RFC 2045 (6.4). One workaround is to manually edit the file and change the "Quoted-printable" Content-Transfer-Encoding to one of those. For now, continuing...

BTW, that "continuing..." means skipping that one particular multipart/mixed type in that message? It half implies "dealing with it anyway, sigh".

--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> The question remains of whether mhstore should decode 2047-encoded filenames natively. It'd be friendly and it's very unlikely that what looks like an encoded string isn't. On the other hand, running mhfixmsg shouldn't be prohibitive.

I know; I'm torn about this. I mean, yeah, mhfixmsg will take care of it. But still ... ugh; if the brokenness wasn't so widespread, I would say no. Not sure. Thoughts from anyone else?

--Ken
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> That Chrome decodes the broken encoding isn't a reason for Gmail to continue to produce it if Chrome decodes the correct content too? Is there somewhere where a Googler states that's the reason to keep producing RFC-violating content?

https://bugzilla.mozilla.org/show_bug.cgi?id=601933

This was talking specifically about HTTP header field parameters, but the same issues apply. Here's a note from that thread:

- There are web sites that indeed use RFC 2047 encoding (GMail, probably
- other Google services, and maybe more...); and one explanation for it
- is that it "works" in both Firefox and Chrome and thus does allow a
- single code path.

> I'm all for balking at handling it, similar to

Well, we don't balk right now. It still gets parsed correctly. It's just when you go to save it, the filename is the RFC 2047 encoded name. Which, AFAIK, is still a valid (if awkward) Unix filename.

--Ken
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
I wrote:
> Ken wrote:
> > The exact issue is that some MUAs will use RFC 2047 encoding for a filename that contains 8-bit characters when creating a Content-Disposition field.
> >
> > I am torn as to what to do here. It feels somehow wrong to support this for decode natively, but I'm not completely convinced of that. We have a number of email programs that get this wrong, including a very popular one. This might be something perfect for mhfixmsg to deal with.
>
> Sounds like a job for mhfixmsg, I'll look into it for 1.7.

Done.

David
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
On Wed, Sep 28, 2016 at 1:11 PM, Ken Hornstein wrote:
>> The question remains of whether mhstore should decode 2047-encoded filenames natively. It'd be friendly and it's very unlikely that what looks like an encoded string isn't. On the other hand, running mhfixmsg shouldn't be prohibitive.
>
> I know; I'm torn about this. I mean, yeah, mhfixmsg will take care of it. But still ... ugh; if the brokenness wasn't so widespread, I would say no. Not sure. Thoughts from anyone else?

I experienced this problem years ago with my project. I ended up implementing 2047 decoding for the filename parameter since it appears others are unable to read specifications.

Google has no excuse for generating such data, but as you note in your OP, other MUAs have been doing it for a long time and from vendors that are notorious for not following specs properly. I do not know how many MUAs support RFC 2231.

I do not recommend blanket 2047 decoding for parameter data. Just limit it to parameters associated with a filename.

--ewh
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> Google has no excuse for generating such data, but as you note in your OP, other MUAs have been doing it for a long time and from vendors that are notorious for not following specs properly. I do not know how many MUAs support RFC 2231.

AFAICT, with the exception of older versions of Outlook (like before 2007) and Lotus Notes, pretty much "everybody" can decode RFC 2231 correctly. And I believe that "most" modern MUAs (including nmh! :-) ) will generate it. Some people will generate both:

http://www.igaware.com/blog/attachments-converted-to-dat-files-when-sending-from-zarafa-solved/

Personally that seems mega-bozo to me, as I'm not sure what's supposed to happen if you include two parameters of the same name (even if one is encoded, and one isn't).

> I do not recommend blanket 2047 decoding for parameter data. Just limit it to parameters associated with a filename.

I find this argument convincing; thoughts from others?

--Ken
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Ken wrote:
> > [Earl:]
> > I do not recommend blanket 2047 decoding for parameter data. Just limit it to parameters associated with a filename.
>
> I find this argument convincing; thoughts from others?

That's what I implemented in mhfixmsg: just Content-Type name and Content-Disposition filename.

David
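The restriction described above (decode RFC 2047 only for Content-Type name and Content-Disposition filename) can be sketched in Python as follows. This is an illustration of the policy, not the mhfixmsg implementation; the function name and the "leave it alone on any decoding problem" fallback are assumptions.

```python
from email.header import decode_header

# The only (header, parameter) pairs for which RFC 2047 decoding is attempted.
FILENAME_PARAMS = {
    ("content-type", "name"),
    ("content-disposition", "filename"),
}


def fix_param(header: str, param: str, value: str) -> str:
    """Return value with RFC 2047 decoded, but only for filename-like parameters."""
    if (header.lower(), param.lower()) not in FILENAME_PARAMS:
        return value
    if not (value.startswith("=?") and value.endswith("?=")):
        return value  # does not look like an encoded-word
    try:
        pieces = [
            data.decode(charset or "ascii") if isinstance(data, bytes) else data
            for data, charset in decode_header(value)
        ]
    except (LookupError, UnicodeDecodeError):
        return value  # unknown charset or bad bytes: leave the value untouched
    return "".join(pieces)
```

For example, fix_param("Content-Disposition", "filename", "=?UTF-8?Q?r=C3=A9sum=C3=A9.txt?=") yields the decoded name, while the same string passed as some other parameter is returned unchanged.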
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> That's what I implemented in mhfixmsg: just Content-Type name and Content-Disposition filename.

I think we're all fine with that; I'm wondering if we should see if it's an RFC 2047-encoded filename and just decode it automatically.

--Ken
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
On Sun, 02 Oct 2016 20:50:35 -0400, Ken Hornstein said:
> I think we're all fine with that; I'm wondering if we should see if it's an RFC 2047-encoded filename and just decode it automatically.

I see plenty of sources of heartburn if somebody sends a filename with Hebrew characters in whatever 8859-foo (-8?), to somebody in a UTF-8 environment. Unlike textual data intended to be read, where "decode to recipient's locale" makes sense, when it's a filename things get stickier, because there can be external references (indexes, etc) that point at a filename in a particular encoding - or even the 2047-encoded string as the filename. :)

Adding to the fun, Unix-y filenames don't *have* a locale...
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Ken Hornstein wrote:
>> That's what I implemented in mhfixmsg: just Content-Type name and Content-Disposition filename.
>
> I think we're all fine with that; I'm wondering if we should see if it's an RFC 2047-encoded filename and just decode it automatically.

that would follow the principle of least astonishment, considering all the broken UI's our users will have seen or used before they found MH, and considering how widely broken this protocol element has become. +1.

--
P Vixie
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Hi,

Valdis wrote:
> Ken wrote:
> > I think we're all fine with that; I'm wondering if we should see if it's an RFC 2047-encoded filename and just decode it automatically.

I think my -1 has already been counted.

> I see plenty of sources of heartburn if somebody sends a filename with Hebrew characters in whatever 8859-foo (-8?), to somebody in a UTF-8 environment.

Acting like other MUA's on this doesn't match nmh's behaviour on other transgressions and I'd prefer the wrong encoding not to be swept under the carpet. nmh users are often savvy enough that they can chase back to the creator, e.g. FLOSS PHP library, but only once they become aware of the problem.

--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> I see plenty of sources of heartburn if somebody sends a filename with Hebrew characters in whatever 8859-foo (-8?), to somebody in a UTF-8 environment. Unlike textual data intended to be read, where "decode to recipient's locale" makes sense, when it's a filename things get stickier, because there can be external references (indexes, etc) that point at a filename in a particular encoding - or even the 2047-encoded string as the filename. :)

I am not sure this is within the scope of nmh; I mean, it's a general problem that exists even if you use RFC 2231 encoding. Also ... I'm unclear from your response if you are a +1 or -1 on the idea of nmh automatically decoding RFC-2047 encoded filenames, which was my original query :-)

Regarding filenames lacking locale (really, I think you mean character set), that is no longer true. I see a number of network filesystems start to enforce UTF-8 (really, the only sane choice) and more Unix-like operating systems are doing the same for local filesystems (MacOS and Solaris are the ones that I'm aware of). Really, I think that's where things will ultimately end up.

--Ken
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> Acting like other MUA's on this doesn't match nmh's behaviour on other transgressions and I'd prefer the wrong encoding not to be swept under the carpet. nmh users are often savvy enough that they can chase back to the creator, e.g. FLOSS PHP library, but only once they become aware of the problem.

Sigh. Since I did that, I am sympathetic to that argument, but the specific case you're talking about (marking an enclosing multipart with q-p) is not exactly the same as this one. For comparison, yes, they're both explicit RFC violations. But the former introduced a real ambiguity; should the enclosing parts be encoded with q-p? The latter ... well, not so much.

And while some MUAs have been slowly fixing things (see: Outlook), and others are clearly fringe players (Lotus Notes), Gmail is kind of a dominant player. And it's clear from the stuff I've read online that people with much more clout than I have tried, and the people in charge of Gmail simply Don't Give a Shit.

Again, it sticks in my craw that we have to do this, but all indications are that this is unfortunately rather common.

--Ken
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> On Oct 3, 2016, at 7:53 AM, Ken Hornstein wrote:
>
> Again, it sticks in my craw that we have to do this, but all indications are that this is unfortunately rather common.

Can't we just pull off the Cyrus IMAP way of replacing all the invalid octets in the decoded stream with Xs? That makes a valid filename while (deliberately) ignoring the encoding brain damage.
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> Can't we just pull off the Cyrus IMAP way of replacing all the invalid octets in the decoded stream with Xs? That makes a valid filename while (deliberately) ignoring the encoding brain damage.

I don't think you're understanding the problem. There are no invalid octets here; the issue is whether or not we perform an automatic decode.

--Ken
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Hi Ken,

> I don't think you're understanding the problem. There are no invalid octets here; the issue is whether or not we perform an automatic decode.

But now Lyndon's brought it up,

    Content-Type: inode/x-empty; name*=UTF-8''%41%00%42
    Content-Disposition: attachment; filename*=UTF-8''%41%00%42

`mhstore -auto' creates `./A'. Perhaps the RFCs rule out %00? But then again, we're talking about crap that doesn't follow the RFCs. If it's %41%2F%42 then `A/B' is created if A exists, so that seems OK.

BTW, file(1) might be happy with inode/x-empty, and nmh stores an empty file, but I wonder if other systems complain they don't know what to do?

--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> Content-Type: inode/x-empty; name*=UTF-8''%41%00%42
> Content-Disposition: attachment; filename*=UTF-8''%41%00%42
>
> `mhstore -auto' creates `./A'. Perhaps the RFCs rule out %00? But then again, we're talking about crap that doesn't follow the RFCs. If it's %41%2F%42 then `A/B' is created if A exists, so that seems OK.

Sigh. We use a lot of C strings, so we're not so great on handling embedded NULs. It's one of those things that is simultaneously hard to fix, and AFAICT not worth it. Let me ask you, Ralph ... what do you WANT to happen here?

--Krn
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
I almost hate to bring this up, but . . .

mhstore already asks users (if isatty). We could do something like:

    the filename of =?...?= is encoded, save as unencoded [...] instead? [y/n]

And define what to do if the response is no.

If not a tty, we're back to the question. Safer to fail, friendlier to decode.

David
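A rough Python sketch of the prompt idea above, purely to make the control flow concrete. It is not mhstore code; the function name, the `ask` hook (there so the prompt can be replaced in tests), and the `decode_when_unattended` switch standing in for the undecided non-tty policy are all assumptions, as is treating a "no" answer as "keep the encoded name".

```python
import sys
from typing import Callable, Optional


def choose_name(
    encoded: str,
    decoded: str,
    interactive: Optional[bool] = None,
    ask: Callable[[str], str] = input,
    decode_when_unattended: bool = True,
) -> str:
    """Pick between the as-received (encoded) name and its decoded form."""
    if interactive is None:
        interactive = sys.stdin.isatty()
    if interactive:
        answer = ask(
            f"the filename of {encoded} is encoded, "
            f"save as unencoded {decoded} instead? [y/n] "
        )
        return decoded if answer.strip().lower().startswith("y") else encoded
    # Not a tty: this is exactly the open question in the thread
    # ("safer to fail, friendlier to decode"), so it is left as a switch.
    return decoded if decode_when_unattended else encoded
```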
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
On Tue, Oct 4, 2016 at 7:44 AM, Ralph Corderoy wrote:
> Content-Type: inode/x-empty; name*=UTF-8''%41%00%42
> Content-Disposition: attachment; filename*=UTF-8''%41%00%42
>
> `mhstore -auto' creates `./A'. Perhaps the RFCs rule out %00? But then again, we're talking about crap that doesn't follow the RFCs. If it's %41%2F%42 then `A/B' is created if A exists, so that seems OK.

It's not okay. Filenames specified in email are considered informative only, since there are security implications of blindly using what is provided.

For any file nmh creates based on email parameter input, it should run it through a sanitizer to remove any characters deemed invalid and remove any pathname components.

For example, what if I have:

    Content-Type: application/octet-stream
    Content-Disposition: attachment; filename="/etc/passwd"

or relative pathname attacks using "../.."?

I do not recall if nmh checks if a file with the same name already exists.

If we are to be security conscious, the filename parameter should be ignored, with files stored based on content-type, or at a minimum, just use the filename parameter extension. An option can be provided to specify that the filename parameter be honored, but even then, only use the basename after it has been sanitized.

--ewh
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
On Tue, Oct 4, 2016 at 8:57 AM, David Levine wrote:
> If not a tty, we're back to the question. Safer to fail, friendlier to decode.

Decode. How often are real files with "=?...?=" in their names encountered?

--ewh
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Earl wrote:
> For any file nmh creates based on email parameter input, it should run it through a sanitizer to remove any characters deemed invalid and remove any pathname components.

For security reasons, this filename will be ignored if it begins with the character '/', '.', '|', or '!', or if it contains the character '%'.

> For example, what if I have:
>
> Content-Type: application/octet-stream
> Content-Disposition: attachment; filename="/etc/passwd"
>
> or relative pathname attacks using "../.."?

The /etc/passwd or relative pathname will be ignored, and a name of the form message#.part#.subtype will be used instead (assuming no profile override).

> I do not recall if nmh checks if a file with same name already exists.

It can, starting with 1.6, using the mhstore(1) -clobber switch.

> If we are to be security conscious, filename parameter should be ignored, with files stored based on content-type, or at a minimum, just use the filename parameter extension. An option can be provided to specify that the filename parameter be honored, but even then, only use the basename after it has been sanitized.

Yup, we're there. The mhstore switch you refer to is -auto; the default is -noauto. mhstore also has an -outfile switch, so the user can specify a particular filename (to store all selected content).

David
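The filename rule quoted above can be written down as a small Python sketch. It models only what the message states (reject names beginning with '/', '.', '|' or '!' or containing '%', then fall back to message#.part#.subtype); the function names are hypothetical and this is not how mhstore is actually structured.

```python
def usable_suggested_name(name: str) -> bool:
    """Apply the rule from the message: reject risky suggested filenames."""
    if not name:
        return False
    if name[0] in "/.|!":   # absolute paths, dotfiles and "../", pipes, shell escapes
        return False
    if "%" in name:         # e.g. leftover percent-escapes
        return False
    return True


def store_name(suggested: str, msgnum: int, part: str, subtype: str) -> str:
    """Use the suggested name if acceptable, else message#.part#.subtype."""
    if usable_suggested_name(suggested):
        return suggested
    return f"{msgnum}.{part}.{subtype}"
```

With these definitions, store_name("/etc/passwd", 9740, "2", "octet-stream") falls back to "9740.2.octet-stream", and a suggestion such as "../../x" is rejected because it begins with '.'.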
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Earl wrote:
> Decode.

I'm leaning that way, too.

> How often are real files with "=?...?=" in their names encountered?

Often enough for some.

David
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Hi Earl,

> > If not a tty, we're back to the question. Safer to fail, friendlier to decode.
>
> Decode. How often are real files with "=?...?=" in their names encountered?

In your other recent email you said "If we are to be security conscious" and I think that's the right default stance. I can't think of a way of exploiting having a filename with the wrong encoding being decoded anyway, but I prefer to start with allowing nothing and working out what to add than the other way around.

The email may be seen at other MUAs that display the filenames differently, but the unpacking left to nmh without checking. The attachments may overwrite one another or not depending on whether the MUA sticks to the RFCs, and so unpacking multiple times with different MUAs could give different results.

Even if there's no exploit, there's obviously room for confusion, and that's inevitable if other MUAs don't follow the RFCs. If we do the right thing by the RFCs then we can justify it, have the high ground, and point to mhfixmsg(1) with the user realising they need to tread carefully.

--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> On Oct 6, 2016, at 5:20 AM, David Levine wrote:
>
>> For example, what if I have:
>>
>> Content-Type: application/octet-stream
>> Content-Disposition: attachment; filename="/etc/passwd"
>>
>> or relative pathname attacks using "../.."?
>
> The /etc/passwd or relative pathname will be ignored, and a name of the form message#.part#.subtype will be used instead (assuming no profile override).

I think this is very wrong behaviour.

Filenames in the attachment meta-data are suggestions. But they can be very valid suggestions, and shouldn't be ignored for arbitrary reasons.

But leading paths must be ignored, as security dictates.

The safest course of action is:

1) Take the basename of the suggested filename.

2) Perform an exclusive open+create of the filename.

2a) If the file exists, and we are interactive, prompt for a replacement name (or to overwrite); else (2c)

2b) If the as-encoded filename results in an error from the underlying open() call, report the error and fall through.

2c) Synthesize a unique name, write to that, and report the name.

--lyndon
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> On Oct 6, 2016, at 7:36 PM, Lyndon Nerenberg wrote:
>
> 2) Perform an exclusive open+create of the filename.
>
> 2a) If the file exists, and we are interactive, prompt for a replacement name (or to overwrite); else (2c)
>
> 2b) If the as-encoded filename results in an error from the underlying open() call, report the error and fall through.
>
> 2c) Synthesize a unique name, write to that, and report the name.

Sorry, I was not at all clear about this. I am proposing NO decoding whatsoever of any incorrectly encoded file name. Case (2b) above avoids any issues with filenames that are invalid for the implementation. And we can't count on the old POSIX static semantics for those. As Ken pointed out, ZFS filesystems have a switch that enforces UTF-8 compliance. Or not. It's not up to us to judge "or not." open(2) determines the validity of the proposed filename.

--lyndon
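Steps 1 through 2c of the proposal above can be sketched in Python roughly as follows. This is an illustration only, not nmh code: the function name is hypothetical, the interactive prompt of step 2a is left out (the sketch always behaves as if non-interactive), and the synthesized name comes from tempfile.mkstemp rather than any nmh naming scheme.

```python
import os
import sys
import tempfile


def store_attachment(suggested: str, data: bytes, directory: str = ".") -> str:
    """Store data under the basename of suggested, never overwriting a file."""
    name = os.path.basename(suggested)          # step 1: drop any leading path
    if name and name not in (".", ".."):
        path = os.path.join(directory, name)
        try:
            # step 2: exclusive create; fails if the file already exists
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except (OSError, ValueError) as err:
            # steps 2a (non-interactive) and 2b: let open() judge the name,
            # report, and fall through. ValueError covers an embedded NUL.
            print(f"cannot create {path!r}: {err}", file=sys.stderr)
        else:
            with os.fdopen(fd, "wb") as out:
                out.write(data)
            return path
    # step 2c: synthesize a unique name and report it to the caller
    fd, path = tempfile.mkstemp(prefix="attachment.", dir=directory)
    with os.fdopen(fd, "wb") as out:
        out.write(data)
    return path
```

The point of O_EXCL here is that the existence check and the create are a single operation, so there is no window in which another process could create the file between a separate "does it exist?" test and the open.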
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Lyndon wrote:
> > On Oct 6, 2016, at 5:20 AM, David Levine wrote:
> >
> > The /etc/passwd or relative pathname will be ignored, and a name of the form message#.part#.subtype will be used instead (assuming no profile override).
>
> I think this is very wrong behaviour.
>
> Filenames in the attachment meta-data are suggestions. But they can be very valid suggestions, and shouldn't be ignored for arbitrary reasons.

I don't think they are.

> But leading paths must be ignored, as security dictates.
>
> The safest course of action is:
>
> 1) Take the basename of the suggested filename.

But I wouldn't consider the likely result with filename=/foo/bar/README to be safest.

> 2) Perform an exclusive open+create of the filename.
>
> 2a) If the file exists, and we are interactive, prompt for a replacement name (or to overwrite); else (2c)

That can be configured with -clobber ask, but that's not the default for (decades of) historical precedent. I don't think we should change the default here. It's easy enough for users to override.

David
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> On Oct 6, 2016, at 8:11 PM, David Levine wrote:
>
> But I wouldn't consider the likely result with filename=/foo/bar/README to be safest.

Why not? If there is no "README" file, create it. If there is, prompt for a replacement if stdin is a tty, else synthesize a unique replacement name and be done with it.
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Lyndon wrote:
> > On Oct 6, 2016, at 8:11 PM, David Levine wrote:
> >
> > But I wouldn't consider the likely result with filename=/foo/bar/README to be safest.
>
> Why not? If there is no "README" file, create it. If there is, prompt for a replacement if stdin is a tty, else synthesize a unique replacement name and be done with it.

It wouldn't be safest because I would risk overwriting README in the current directory. That's not what I expect.

In any case, I don't think that we should change the mhstore defaults because that might break scripts as well as user expectations. Those include the default of -noauto.

You can override those defaults in your profile (I do) to get pretty close to what you ask for. Though there isn't a basename, it should be possible to support that with formatting strings that pipe the content to a script that runs basename(1) on %a.

To get back to the question about RFC 2047 encoded name parameters, I'll just add a note to the mhstore man page for users to use mhfixmsg to get around that.

David
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Hi Krn,

> > Content-Type: inode/x-empty; name*=UTF-8''%41%00%42
> > Content-Disposition: attachment; filename*=UTF-8''%41%00%42
>
> Sigh. We use a lot of C strings, so we're not so great on handling embedded NULs. It's one of those things that is simultaneously hard to fix, and AFAICT not worth it. Let me ask you, Ralph ... what do you WANT to happen here?
>
> --Krn

How about... Detect any decoding that produces a NUL up with which our string representation cannot put. Stop at that point. If the user really wants to proceed, and they probably don't once the dodgy filename is drawn to their attention, have them drop -auto or alter the email.

--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
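The "detect a decoded NUL and stop" suggestion above could look something like this in Python. It is a sketch of the check only, under the assumption of a single-segment RFC 2231 extended value; the exception and function names are invented for the example, and nmh itself would need the equivalent test in C before the bytes reach a NUL-terminated string.

```python
from urllib.parse import unquote_to_bytes


class UnsafeFilename(ValueError):
    """Raised when a decoded parameter value cannot be used as a filename."""


def decode_extended_filename(value: str) -> str:
    """Decode charset'language'pct-encoded, refusing any embedded NUL."""
    charset, _language, encoded = value.split("'", 2)
    raw = unquote_to_bytes(encoded)
    if b"\x00" in raw:
        # A C string would silently end here, turning A<NUL>B into "A".
        raise UnsafeFilename(f"decoded filename contains NUL: {value!r}")
    return raw.decode(charset or "us-ascii")
```

Given the example from the thread, "UTF-8''%41%00%42" raises UnsafeFilename instead of quietly producing "A", while "UTF-8''%41%2F%42" still decodes to "A/B" and would need the separate path checks discussed earlier.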
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
Hi Ken,

> ...Gmail is kind of a dominant player. And it's clear from the stuff I've read online that people with much more clout than I have tried, and the people in charge of Gmail simply Don't Give a Shit.
>
> Again, it sticks in my craw that we have to do this, but all indications are that this is unfortunately rather common.

That's not the attitude that built an empire! :-) I've submitted Feedback within Gmail spelling out the problem and asking for contact. If I don't hear anything in a few days then I'll try some other avenues, hammering home the same message.

(I should also check that the actual SMTP from a Gmail machine has the fault in that manner and it isn't being re-written from something else that's wrong, e.g. UTF-8, just for completeness.)

--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
Re: [Nmh-workers] RFC 2047 vs RFC 2231 encoding for MIME parameters
> That's not the attitude that built an empire! :-) I've submitted Feedback within Gmail spelling out the problem and asking for contact. If I don't hear anything in a few days then I'll try some other avenues, hammering home the same message.
>
> (I should also check that the actual SMTP from a Gmail machine has the fault in that manner and it isn't being re-written from something else that's wrong, e.g. UTF-8, just for completeness.)

Sigh. I admire your energy, Ralph ... it's just that at this point, I lack the energy and I don't share your optimism. I mean, fight the good fight by all means. Please let us know how it goes.

And AFAICT, it's only a problem with the webmail front end; that's what I tested that had the problem.

--Ken