Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
>Sure, but reading this list for several years I have the impression, >that more effort goes into finding consensus than on writing code. I guess that's a fair criticism, but I think that's unavoidable with a software project in this situation. Remember that the ORIGINAL MH was developed in 1979. So we've got a software package that's over 30 years old; that's ... what, 3 lifetimes in terms of Internet time? :-) The original people aren't involved anymore. Well, okay, John Romine is still around (nice to hear from you!), and we've seen messages from Norman Shapiro not too long ago. Jerry Peek is a lurker who pops up now and then. Richard Coleman, the guy who forked nmh, isn't doing that anymore. So, the question THEN becomes ... who's in charge of nmh? Is it me? Is it Jon Steinhart? Is it Peter Maydell? We're all listed as nmh admins on the savannah web pages, but are we "in charge"? I think if one of us wanted to be in charge, the others wouldn't fight it too much. I think (but I don't want to speak for the others) that our personalities don't work that way; each of us wants nmh to succeed, but nmh isn't quite important enough to us to devote the time/energy to be the guy who is willing to adhere to a vision and occasionally run roughshod over the objections of others (because, in my mind, that really is what being in charge means: occasionally you're going to have to be the asshole). So, we're in a situation where nmh works in a certain way; we want to change it, but we don't know how that change will affect others. So we solicit feedback and try to build a consensus for a change. Because in my mind, nmh really belongs to everyone; while I may (occasionally) shepherd it in a particular direction, I want it to succeed so that others find it useful, and to do that I need to make sure that other people find it useful as well. 
Also, given the fact that (n)mh is 30+ years old, people have strong opinions about the way it should work; I do try to be sensitive to that, and I think others feel the same way as well. So it's a balancing act between the way MH currently works versus the way it should work, and of course not everyone agrees on the way it SHOULD work. Add to that the fact that most of us have lives not in front of a computer, and we're working on nmh in our free time, and people tend to be focused on what is important to them personally rather than the long-term vision. >All I wanted to say is: This situation encourages people to think of >work arounds instead of reporting bugs. So the fact that no bugs were >reported is very weak evidence that everybody agrees on the current >behaviour. I for one understand that. I don't necessarily think that people are happy with the way nmh works, but that's why we're having these conversations - to figure out how nmh SHOULD work. I think that's an inefficiency we have to live with, unless someone wants to step up and be the occasional asshole. --Ken ___ Nmh-workers mailing list Nmh-workers@nongnu.org http://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [really non-ASCII message bodies ]
Peter Maydell wrote: > Has there been much of interest gone in since 1.3 to merit a 1.4? If not, we could still release a 1.3.1. I'm a fan of the release early, release often principle. > I did the 'turn handle, release nmh' work last time round but I'm > currently horribly busy so am unlikely to be able to do that before > next year. If somebody else is willing to do it I don't object. I'm happy to volunteer to turn the handle this time if you're currently busy but a release now is deemed to be a useful thing to do. So, is there anything half finished currently in git? Oliver ___ Nmh-workers mailing list Nmh-workers@nongnu.org http://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [really non-ASCII message bodies ]
Ken Hornstein wrote: >>That's what (as I understand it) Jon's patch handles (I still use the >>latest released version of nmh, which predates all this stuff ... there >>hasn't been a new release (since 1.3) for a long time now...) and makes a >>standards conforming message. > >You know, this has been bugging me. Should we cut a new release? I kinda >think it's long-overdue. > >I'm thinking that what we have now (modulo any autoconf cleanup, which >I've been itching to work on) should be 1.4, and whatever comes out of >this MIME discussion should be post-1.4. Or do people want to include >this new MIME work? Has there been much of interest gone in since 1.3 to merit a 1.4? I certainly agree that this MIME rework should be either all in the next release or not in it at all -- we don't want to accidentally release half-a-change. I did the 'turn handle, release nmh' work last time round but I'm currently horribly busy so am unlikely to be able to do that before next year. If somebody else is willing to do it I don't object. -- PMM ___ Nmh-workers mailing list Nmh-workers@nongnu.org http://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
> >Well, I always considered the current behaviour a bug, but I didn't say > >anything, because the nmh-way of software development seemed pretty > >inefficient and I didn't want to look at the code myself either. > > We have a "way" of software development? When did that happen? :-) > > In all seriousness ... what, exactly, do you mean? I guess our current > way is, "Anyone who's interested, please contribute!". I don't see how > that's really much different than other open-source packages. Sure, but reading this list for several years I have the impression, that more effort goes into finding consensus than on writing code. Don't get me wrong: So far I wrote one patch for nmh which eventually got accepted (some much edited version anyway). I learned a lot from this about programming in C and I really appreciate all the constructive feedback. All I wanted to say is: This situation encourages people to think of work arounds instead of reporting bugs. So the fact that no bugs were reported is very weak evidence that everybody agrees on the current behaviour. Harald ___ Nmh-workers mailing list Nmh-workers@nongnu.org http://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [really non-ASCII message bodies ]
> Even for attachments, as I understand it, that's keyed off a pseudo-header > added to the components file (and so appears in the draft), right? Do we > really need a switch to enable that? I'm (again) all for backwards > compatibility, but is there any serious belief that people are really > adding "Attachment:" (whatever it is, "MH-Attach:" might be better) headers > to their messages and expecting that to be delivered? And yes, I know > non-standard headers are OK, but we have non-switch-enabled locally invented > headers used for this kind of purpose already (like fcc) - another, > especially if given a MH specific name, should be harmless. It would be > simpler to just do the processing and not require a switch (switches that > we more or less tell people that "everyone should have this in their profile" > are just dumb...) Let me explain how the code works here. There needed to be some way to track attachments for a draft between programs because of the modular nature of nmh. I suppose that I could have used a shadow file, but instead I used a legal "X" header. However, there is no way that I could 100% guarantee that any X header wouldn't collide with something that some user was doing since that namespace is uncontrolled. That's why the name of the header that tracks attachments is settable. All attachment headers are stripped off by send before the message is sent. Send examines the headers and changes the draft into a MIME message that includes the attachments specified by those headers. I can see where one could claim that this approach was overkill and could have just been hardwired. This was my first contribution to nmh and it was a rather large change in functionality. It was clear from reading the mailing list that breaking things was bad, so I went about this in a safe way. Had I received this input a decade ago I might have done things differently. But, I didn't. 
I am aware of users with custom environments who build drafts with these attachment headers. I don't see anything so compelling in changing the name of the header field name that it's worth breaking things for these users. It's done. Hindsight is wonderful but at some point you gotta let go. Jon ___ Nmh-workers mailing list Nmh-workers@nongnu.org http://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [really non-ASCII message bodies ]
>That's what (as I understand it) Jon's patch handles (I still use the >latest released version of nmh, which predates all this stuff ... there >hasn't been a new release (since 1.3) for a long time now...) and makes a >standards conforming message. You know, this has been bugging me. Should we cut a new release? I kinda think it's long-overdue. I'm thinking that what we have now (modulo any autoconf cleanup, which I've been itching to work on) should be 1.4, and whatever comes out of this MIME discussion should be post-1.4. Or do people want to include this new MIME work? --Ken ___ Nmh-workers mailing list Nmh-workers@nongnu.org http://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Peter Maydell wrote: > Ralph Corderoy wrote: > > I agree. I've never used -attach, instead sticking with > > #-directives, and yet still send UTF-8 bodies with £ in text/plain > > emails with no attachments by mistake. This bug doesn't depend on > > using -attach. > > Yes. Perhaps we should make the condition be "do this if the incoming > draft does not already have MIME headers" ? That way if the user has > script-type solutions that pass an already-sensible draft to send we > don't mess it up. Good point. Cheers, Ralph. ___ Nmh-workers mailing list Nmh-workers@nongnu.org http://lists.nongnu.org/mailman/listinfo/nmh-workers
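For what it's worth, the condition Peter proposes ("do this if the incoming draft does not already have MIME headers") could be sketched roughly as below. This is only an illustration, not code from nmh: the function name draft_has_mime_headers() is made up, the draft is assumed to be available as one NUL-terminated string, and only an empty line is treated as the end of the headers. strncasecmp() comes from POSIX <strings.h>.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/*
 * Hypothetical helper, not part of nmh.
 * Return 1 if the header block of the draft already carries a
 * MIME-Version or Content-* field, 0 otherwise.  Scanning stops at the
 * first empty line, which this sketch assumes ends the headers.
 */
static int
draft_has_mime_headers(const char *draft)
{
    const char *line = draft;

    assert(draft != NULL);

    while (*line != '\0' && *line != '\n') {
        const char *eol = strchr(line, '\n');

        /* Header field names compare case-insensitively. */
        if (strncasecmp(line, "MIME-Version:", 13) == 0 ||
            strncasecmp(line, "Content-", 8) == 0)
            return 1;
        if (eol == NULL)
            break;              /* last line, no trailing newline */
        line = eol + 1;
    }
    return 0;
}
```

With a check like this, send could leave an already-MIME draft alone and only add headers when the function returns 0.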
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Ralph Corderoy wrote: >Peter Maydell wrote: >> (In fact I think we should go ahead and change the behaviour for >> not-plain-ASCII bodies even if the user didn't pass the -attach >> switch, but I'm guessing I might get argued down on that.) > >I agree. I've never used -attach, instead sticking with #-directives, >and yet still send UTF-8 bodies with =C2=A3 in text/plain emails with no >attachments by mistake. This bug doesn't depend on using -attach. Yes. Perhaps we should make the condition be "do this if the incoming draft does not already have MIME headers" ? That way if the user has script-type solutions that pass an already-sensible draft to send we don't mess it up. >> if (*p != '\t' && (*p >= 127 || *p < 32)) { >> non_ascii = 1; > >In that case, would plumping for character literals be nicer? > >if (*p < ' ' && *p != '\t' || *p > '~') { >non_ascii = 1; I don't care much either way on the style, I just want the logic to be right. (Also I'd just like to note that I had to manually edit out some quoted-printable encoding before I could send this mail with nmh, in a passing demonstration of our poor MIME handling :-)) -- PMM
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Peter Maydell wrote: > (In fact I think we should go ahead and change the behaviour for > not-plain-ASCII bodies even if the user didn't pass the -attach > switch, but I'm guessing I might get argued down on that.) I agree. I've never used -attach, instead sticking with #-directives, and yet still send UTF-8 bodies with £ in text/plain emails with no attachments by mistake. This bug doesn't depend on using -attach. > if (*p != '\t' && (*p >= 127 || *p < 32)) { > non_ascii = 1; In that case, would plumping for character literals be nicer? if (*p < ' ' && *p != '\t' || *p > '~') { non_ascii = 1; Cheers, Ralph.
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [really non-ASCII message bodies ]
In this discussion people (other than perhaps Jon, though he hasn't said this explicitly) have just been assuming that if the e-mail body of a message contains data that is not ascii, then it must be some other character set, because after all, all e-mail is text ... In the days of MIME, that's simply not true, and while it is unlikely that anyone is going to use prompter, or even some other editor, to produce a jpeg file, there's nothing to prevent a script producing a file with a jpeg body, and 822 headers, and handing that to nmh to process. We might prefer such a script to generate all the right headers, but MH really doesn't like it if we attempt to tell it mime info in the components file (or the draft) - it insists on adding that itself, so not doing all the content type processing before calling nmh processes is understandable. Now there is nothing at all illegal about this - even ignoring that "illegal" is the wrong word to use in any case, "non-conforming" would be the correct term. The standards don't apply to what the user feeds nmh, that's locally defined "anything goes" territory. What matters is what nmh hands to the MTA (and even more, what the local MTA passes on to its peer). There if we simply send an 822 (old style, non-MIME) message with arbitrary binary content we have a non-conforming message. That's what (as I understand it) Jon's patch handles (I still use the latest released version of nmh, which predates all this stuff ... there hasn't been a new release (since 1.3) for a long time now...) and makes a standards conforming message. 
It obviously has no way of knowing what the data that it detects as non-ascii is (or not without extra information from the sender), so "application/octet-stream" sounds to me as if it is the perfect choice (along with either QP or B64 encoding to handle the body format) to indicate "here is stuff, but I have no idea what it means, work it out for yourself" - which for many users of this kind of procedure would probably be adequate. I'm certainly not arguing that we should keep this behaviour, and certainly not as the default - I expect that real users of binary message bodies that are not text are so rare that, even if there are any at all, updating them would not be a huge problem (provided the change notes for the next nmh release make it clear this has happened). However, I don't think we should give up the ability to simply send an e-mail where the body is image/jpeg or whatever - there's no requirement that there be any text in the body of the message at all, even though most MUA's simply assume that, and require a multipart to include anything that is not text. MH should be better than that, being just as good as "most MUA's" is a fairly grievous insult IMO. And while retaining the # language of mhbuild, or something equivalent, is essential to enable truly general messages to be created, expecting to use that for trivial tasks is, I agree, asking too much - and requiring explicit mime processing at the whatnow stage should only be necessary when the full mhbuild procedure is to be invoked. (Do recall that when this was added, MIME messages were rare, and lots of users didn't like them - most MUA's had no way to display them, not even as "good" as nmh does now - and so wanting that processing was very unusual. These days, almost every message should comply with at least basic mime formats.) 
My suggestion to handle general bodies is to allow a switch that sets the MIME content-type of the message (defaulting to text/plain) - and then base all the other decisions off that. If (as a result of the default, or by being explicitly set) we get a text/* content-type, then we can attempt to work out the charset involved, and add the proper indicator. On the other hand, if someone really wants to send an application/octet-stream, then let's allow them to do that, or if they want to send image/jpeg or audio/whatever they should be able to do that too (a message that is entirely audio/* could even be handled by "show" by playing it through the local system's speakers, assuming that's possible - implementing voice-email) I also don't believe that this processing should be keyed off some -attach switch - as a way to simplify adding an attachment to a message (incidentally, if given twice, can we have two attachments, or is there some other way to do that?) it sounds OK, but for charset processing? For text messages, the right thing should be done regardless of whether there's any plan or intent to add attachments, and using a switch "-attach" in the profile to mean "encode my text correctly" is bizarre... I'm all for backwards compatibility, but only backwards compatible for correct behaviour, keeping all the existing bugs should not be required (though I think there are environments where even that is expected.) Even for attachments, as I understand it, that's keyed off a pseudo-header added to the components file (and so appears in th
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
>> The current behavior was the best idea that I had >> at the time and nobody has said anything about it until recently. I don't >> mind it changing, but I don't want to all of a sudden get complaints from >> people who were counting on this behavior. > >Well, I always considered the current behaviour a bug, but I didn't say >anything, because the nmh-way of software development seemed pretty >inefficient and I didn't want to look at the code myself either. We have a "way" of software development? When did that happen? :-) In all seriousness ... what, exactly, do you mean? I guess our current way is, "Anyone who's interested, please contribute!". I don't see how that's really much different than other open-source packages. Also, as long as we're on the subject ... if people want to submit patches to the list, perhaps formatting them with git format-patch (since, hey, I went to the whole trouble to convert everything to git) might be worthwhile. Or perhaps just doing that when everyone agrees on the code would make things easier (because the patch could be processed with git am). --Ken ___ Nmh-workers mailing list Nmh-workers@nongnu.org http://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
On Tue, 07 Dec 2010 12:03:38 CST, Earl Hood said: > On Tue, Dec 7, 2010 at 11:10 AM, Jon Steinhart wrote: > > I understand that my attachment system does not handle non-ASCII message > > bodies, but again, that's because non-ASCII message bodies are not "legal". > > Please cite an RFC that says non-ASCII bodies are not legal. > > With MIME, you have the Content-Transfer-Encoding field, which allows > for 8bit. And then, if you have a Content-Type type that supports > charset parameter, you can "legally" have a body that is non-ASCII. A MIME message that has a Mime-Version: and appropriate C-T-E: headers can certainly be non-ASCII. What's illegal is sending non-ASCII *without* such headers (which is what nmh has been doing in the past).
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Jon Steinhart:
> The current behavior was the best idea that I had
> at the time and nobody has said anything about it until recently. I don't
> mind it changing, but I don't want to all of a sudden get complaints from
> people who were counting on this behavior.

Well, I always considered the current behaviour a bug, but I didn't say anything, because the nmh-way of software development seemed pretty inefficient and I didn't want to look at the code myself either.

So, when I found out that nmh was sending illegal mails (the timestamps of my filesystem tell me it was 25 Jun 2008), I just "fixed" this by adding three lines to the components file:

    Content-Type: text/plain; charset="iso-8859-15"
    MIME-Version: 1.0
    Content-Transfer-Encoding: 8bit

(which of course I have to remove if I actually want to attach a file to a mail)

I guess I'm not the only one silently applying some workaround, but I also guess that many more people unknowingly send illegal mail. So unless somebody on this mailing list states that he actually needs the current behaviour, I think it is very safe to assume that the complaints you fear will never be made.

Given the number of mails wasted on this seemingly obvious question now, I really regret not filing a bug report 2.5 years ago.

BTW, I also use nmh exclusively for my mail, except for the two times/year when I actually need to send signed/encrypted mails.

Personally I think that nmh should rather abort than silently send illegal mails.

Harald
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
markus schnalke wrote: >[2010-12-07 12:33] Jon Steinhart >Examples for what gets generated from mail *body text*: Thanks for doing this and saving me the effort ;-) >The old code generates ... > >... for ASCII: > > Content-Type: text/plain; name="sendKi9x7j"; x-unix-mode="0644"; > charset="us-ascii" > Content-ID: <4962.128958967...@argentina.foo> > Content-Description: ASCII text > > foo > >... for non-ASCII (only if at least one attachment is present): > > Content-Type: application/octet-stream; name="sendbRaV8T"; > x-unix-mode="0644" > Content-ID: <5209.128958999...@argentina.foo> > Content-Description: UTF-8 Unicode text > Content-Transfer-Encoding: base64 > > d2l0aCBKb24 These are definitely just wrong -- we shouldn't be specifying name and x-unix-mode for the body text (and base64ing when we could q-p is a bit unfriendly). -- PMM ___ Nmh-workers mailing list Nmh-workers@nongnu.org http://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Jon, sorry for the harsh mail. I really was in a rage. :-/ In a recent mail, you said: > The current behavior was the best idea that I had > at the time and nobody has said anything about it until recently. I really don't blame you for what we have; quite the opposite: I am very grateful for what we have. [2010-12-07 22:45] markus schnalke > > I ask other people to take a look and express their opinion. Thanks to everyone speaking up. meillo
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
[2010-12-07 12:33] Jon Steinhart
> Still trying to understand this. Decided to finally look at the code instead
> of relying on my fading memory.

:-) Thanks.

> The existing code takes a non-ASCII message body and sends it as an attachment
> of type application/octet-stream.
>
> Your patch changes this behavior so that it is sent as type text/plain with
> the appropriately chosen character set.

Both correct.

> In order to do this, you test the message body for non-ASCII characters in
> attach(). If you find any, you write an entry directly into the composition
> file instead of calling make_mime_composition_file_entry().

Correct.

> This is changing existing behavior if I understand it correctly.

Behavior in what gets generated, yes, but that is necessary when fixing things.

Examples for what gets generated from mail *body text*:

The old code generates ...

... for ASCII:

    Content-Type: text/plain; name="sendKi9x7j"; x-unix-mode="0644";
        charset="us-ascii"
    Content-ID: <4962.128958967...@argentina.foo>
    Content-Description: ASCII text

    foo

... for non-ASCII (only if at least one attachment is present):

    Content-Type: application/octet-stream; name="sendbRaV8T";
        x-unix-mode="0644"
    Content-ID: <5209.128958999...@argentina.foo>
    Content-Description: UTF-8 Unicode text
    Content-Transfer-Encoding: base64

    d2l0aCBKb24

With my patch such MIME parts are generated ...

... for ASCII:

    Content-Type: text/plain; charset="us-ascii"
    Content-ID: <5048.128958978...@argentina.foo>

    foo

... for non-ASCII:

    Content-Type: text/plain; charset="UTF-8"
    Content-ID: <5260.128959006...@argentina.foo>
    Content-Transfer-Encoding: quoted-printable

    Umlauts: =C3=A4 and =C3=B6 and =C3=BC.

The function make_mime_composition_file_entry() gives us nothing but information we don't need/want (temp file names, file permissions) and it definitely does not use the best possible CT and CTE for the body text.

> This is fine with me provided that users must explicitly enable the change
> using an option.
An option to activate a fix???

> Now that I'm actually looking at the code, I would suggest an option
> (choose a better name) of binary-body-content-type. You could change the
> make_mime_composition_file_entry() line
>
> content_type = binary ? "application/octet-stream" : "text/plain";
>
> to replace the "application/octet-stream" with the option value, or the
> existing value as a default if the option is not specified.
>
> A user wanting this behavior would have a profile entry of
>
> binary-body-content-type: text/plain
>
> I think that this would be a simpler code change that would accomplish your
> goal.

Sorry, but I really think you don't get the point. We don't need config options if we already have the facilities to do it right automatically. Why should any user need to tell nmh what content type to use for the text he writes? The mhbuild facility can already find out which is appropriate.

My patch also distinguishes between mail text and attachments, for which different things are relevant. Your suggestion above does not do this and would then use binary-body-content-type for any non-ASCII attachment as well.

I read your mails and ask myself if you really read what I write and if you have had a look into the code we are talking about. I very much value the work you did for nmh, and you can see that this patch builds on what you created, but now it may be the point to either have a close look at the problem and code or step back and let other people talk. I do think you don't understand the situation and the relevant code well enough (presumably due to lack of time).

Of course, I may be wrong. Currently, however, it rather seems to me as if it's not me who is not understanding the whole thing. I really spent much time in the code and doing tests. And I did my best explaining everything.

I ask other people to take a look and express their opinion.

meillo
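To make the "Umlauts: =C3=A4 and =C3=B6 and =C3=BC." example above a bit more concrete, here is a minimal sketch of the quoted-printable transformation. It is not the encoder nmh or mhbuild uses; qp_encode() is a made-up name, and the sketch deliberately leaves out soft line breaks at 76 columns and the special handling of trailing whitespace that a conforming encoder needs.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not part of nmh.
 * Quoted-printable encode src into dst (dstsize bytes, NUL included).
 * Returns the encoded length, or -1 if dst is too small.
 * Simplifications of this sketch: no soft line breaks, no
 * trailing-whitespace handling; tab and newline pass through as-is.
 */
static int
qp_encode(const char *src, char *dst, size_t dstsize)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t srclen, i, o = 0;

    assert(src != NULL && dst != NULL);

    if (dstsize == 0)
        return -1;

    srclen = strlen(src);
    for (i = 0; i < srclen; i++) {
        /* Work on unsigned bytes so values >= 0x80 are not negative. */
        unsigned char c = (unsigned char)src[i];
        int literal = (c == '\n' || c == '\t' ||
                       (c >= 32 && c <= 126 && c != '='));
        size_t need = literal ? 1 : 3;

        if (o + need + 1 > dstsize)     /* keep room for the NUL */
            return -1;
        if (literal) {
            dst[o++] = (char)c;
        } else {
            dst[o++] = '=';
            dst[o++] = hex[c >> 4];
            dst[o++] = hex[c & 0x0F];
        }
    }
    dst[o] = '\0';
    return (int)o;
}
```

Feeding it the UTF-8 bytes of the umlaut line reproduces the body shown in the patched output; mostly-ASCII text stays readable, which is why q-p is friendlier than base64 for message text.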
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
chad wrote: >On Dec 7, 2010, at 12:48 PM, Ken Hornstein wrote: >> You know I'm all for backwards compatibility and everything, but >> I'm wondering ... did the previous behavior actually make sense? Can >> people argue that it was desirable or correct? Or was the previous >> behavior actually wrong, and this is really fixing a long-standing >> bug? > >It was a bug. The only suggestion that it wasn't a bug is instead saying >that it was `illegal' (which it wasn't... it just usually was a bad idea). I agree; we should just change the behaviour here. (In particular the previous behaviour would have differed in how it handled the body depending on whether there was an attachment or not, which suggests to me that it's just not a case anybody has cared about before now.) (In fact I think we should go ahead and change the behaviour for not-plain-ASCII bodies even if the user didn't pass the -attach switch, but I'm guessing I might get argued down on that.) I do think the "is the body not plain ASCII?" check is not quite right. I think that the presence of special characters (most notably ESC) ought to also trigger MIMEification. Otherwise we will not do the right thing for Japanese character sets like shift-JIS. (Yes, I do care about this, it's not just idle nitpickery.) I would suggest if (*p != '\t' && (*p >= 127 || *p < 32)) { non_ascii = 1; ie encode unless it's in the printable ascii range or space or tab. -- PMM
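The check Peter describes (encode unless every byte is printable ASCII, space, or tab) might look like the following when written as a standalone function. This is a sketch only: body_needs_mime() is a made-up name, the body is assumed to be in one buffer, and CR and LF are let through as an extra assumption, since a whole-buffer scan would otherwise flag every line ending.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of nmh.
 * Return 1 if the body holds any byte outside printable ASCII, tab,
 * and line endings (so ESC and 8-bit bytes both count); else return 0.
 */
static int
body_needs_mime(const char *body, size_t len)
{
    size_t i;

    assert(body != NULL || len == 0);

    for (i = 0; i < len; i++) {
        /* Go through unsigned char so bytes >= 0x80 are not negative. */
        unsigned char c = (unsigned char)body[i];

        if (c == '\t' || c == '\n' || c == '\r')
            continue;
        if (c < 32 || c >= 127)
            return 1;
    }
    return 0;
}
```

An ISO-2022-style body that is all 7-bit but starts its escape sequences with ESC is caught by the c < 32 branch, which is the case the plain "is it 8-bit?" test misses.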
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
On Dec 7, 2010, at 12:48 PM, Ken Hornstein wrote: > You know I'm all for backwards compatibility and everything, but > I'm wondering ... did the previous behavior actually make sense? Can > people argue that it was desirable or correct? Or was the previous > behavior actually wrong, and this is really fixing a long-standing > bug? It was a bug. The only suggestion that it wasn't a bug is instead saying that it was `illegal' (which it wasn't... it just usually was a bad idea). nmh was used so infrequently in places that aren't 7-bit clean that none of the people who noticed complained; they just binned it with the host of non-i18n'd software and moved on. This is generalizing a bit from a small sample, but I'd be astonished if we found more than a dozen people who use mh exclusively for all their email. ;) *Chad
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
> >The existing code takes a non-ASCII message body and sends it as an > >attachment > >of type application/octet-stream. > > > >Your patch changes this behavior so that it is sent as type text/plain with > >the > >appropriately chosen character set. > > You know I'm all for backwards compatibility and everything, but > I'm wondering ... did the previous behavior actually make sense? Can > people argue that it was desirable or correct? Or was the previous > behavior actually wrong, and this is really fixing a long-standing > bug? Because if we decide that the previous behavior is a bug, then > I don't think an explicit enable option for this change makes sense; I'd > prefer that the new behavior be the default. > > (I am personally on the fence regarding whether or not the previous > behavior is a bug). > > --Ken Don't disagree with you. The current behavior was the best idea that I had at the time and nobody has said anything about it until recently. I don't mind it changing, but I don't want to all of a sudden get complaints from people who were counting on this behavior. Maybe that number is 0, but I have no way of knowing. I don't care that much, so if you think compatibility isn't an issue here that's fine with me. If the default behavior were to change I would add a "binary" flag to make_mime_composition_file_entry() so that the body didn't have to be scanned for non-ASCII characters twice. Jon
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
>The existing code takes a non-ASCII message body and sends it as an attachment >of type application/octet-stream. > >Your patch changes this behavior so that it is sent as type text/plain with the >appropriately chosen character set. You know I'm all for backwards compatibility and everything, but I'm wondering ... did the previous behavior actually make sense? Can people argue that it was desirable or correct? Or was the previous behavior actually wrong, and this is really fixing a long-standing bug? Because if we decide that the previous behavior is a bug, then I don't think an explicit enable option for this change makes sense; I'd prefer that the new behavior be the default. (I am personally on the fence regarding whether or not the previous behavior is a bug). --Ken
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Still trying to understand this. Decided to finally look at the code instead
of relying on my fading memory. Sorry if some of my earlier memory-based
comments were off-base.

Please let me know if my understanding of your proposed patch is correct.

The existing code takes a non-ASCII message body and sends it as an
attachment of type application/octet-stream.

Your patch changes this behavior so that it is sent as type text/plain with
the appropriately chosen character set.

In order to do this, you test the message body for non-ASCII characters in
attach(). If you find any, you write an entry directly into the composition
file instead of calling make_mime_composition_file_entry().

This is changing existing behavior if I understand it correctly. This is
fine with me provided that users must explicitly enable the change using an
option.

Now that I'm actually looking at the code, I would suggest an option (choose
a better name) of binary-body-content-type. You could change the
make_mime_composition_file_entry() line

    content_type = binary ? "application/octet-stream" : "text/plain";

to replace the "application/octet-stream" with the option value, or the
existing value as a default if the option is not specified. A user wanting
this behavior would have a profile entry of

    binary-body-content-type: text/plain

I think that this would be a simpler code change that would accomplish your
goal.

Jon
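
A rough sketch of the selection described in the message above, in C. The
helper name body_content_type() and the idea of passing the option value in
as a parameter are assumptions made for illustration; how nmh would actually
read the binary-body-content-type profile entry is not shown in this thread.
Rejecting values without a '/' is also an assumption of this sketch, not
something the message asks for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: pick the content type for the message body.
 *
 * binary        non-zero if the body contains non-ASCII bytes
 * option_value  value of the proposed binary-body-content-type profile
 *               entry, or NULL if the user did not set it
 */
static const char *
body_content_type(int binary, const char *option_value)
{
    if (!binary)
        return "text/plain";

    /* Only accept values that at least look like "type/subtype". */
    if (option_value != NULL && strchr(option_value, '/') != NULL)
        return option_value;

    /* Existing behavior stays the default when the option is absent. */
    return "application/octet-stream";
}
```

With no profile entry, a non-ASCII body keeps the existing
application/octet-stream type, so nothing changes for users who have not
opted in.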
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Hi,

markus schnalke wrote:
> > BTW, I would suggest using isascii() rather than (*p > 127 || *p < 0).
>
> I just kept what you once wrote. ;-P
> But, yes, you are right.

Given it's `char *p', *p may be unsigned on some systems, e.g. ARM, and a
compiler could warn about testing whether it's negative, so isascii() is
much nicer. :-)

Cheers,
Ralph.
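
The signedness point can be sketched in C as follows. This is only an
illustration: has_non_ascii() is a hypothetical helper, not a function from
nmh, and it uses a cast to unsigned char rather than isascii() so that it
also compiles where isascii() is not declared. With the cast, the test no
longer depends on whether plain char is signed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if the NUL-terminated string contains
 * any byte outside the 7-bit ASCII range, 0 otherwise.
 *
 * Reading the bytes as unsigned char makes the comparison behave the
 * same whether plain char is signed (e.g. x86) or unsigned (e.g. ARM).
 */
static int
has_non_ascii(const char *s)
{
    const unsigned char *p;

    assert(s != NULL);

    for (p = (const unsigned char *) s; *p != '\0'; p++) {
        if (*p > 127)
            return 1;
    }
    return 0;
}
```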
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
On Tue, Dec 7, 2010 at 11:10 AM, Jon Steinhart wrote:
> I understand that my attachment system does not handle non-ASCII message
> bodies, but again, that's because non-ASCII message bodies are not "legal".

Please cite an RFC that says non-ASCII bodies are not legal.

With MIME, you have the Content-Transfer-Encoding field, which allows for
8bit. And then, if you have a Content-Type that supports a charset
parameter, you can "legally" have a body that is non-ASCII.

Note, way back when MIME was first defined, it was probably not good
practice to use the 8bit CTE, since MTAs were not friendly with 8bit data,
but today that is less likely to be a problem.

--ewh
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
[2010-12-07 09:10] Jon Steinhart
> > [2010-12-07 08:27] Jon Steinhart
> > > Sounds good. Is the patch that you sent out complete? I don't see an
> > > option that enables/disables this behavior and I think that there
> > > should be one.
> >
> > I believe it's correct that there is no switch.
> >
> > If one wants to deactivate it, do not specify -attach.
> >
> > If -attach is given, I believe the changes are fixes for broken
> > behavior. Your attachment system lacks some awareness for non-ASCII
> > text, which you probably don't deal with much. This is improved with
> > my patch.
>
> OK. I think that there should be a switch. I guess it bugs me to see the
> character-by-character examination of the message body on by default.

The char-by-char examination is ugly, yes.

> I understand that my attachment system does not handle non-ASCII message
> bodies, but again, that's because non-ASCII message bodies are not "legal".
> I think that you have justified extending nmh to handle "illegal" message
> bodies. I'm just nitpicking on the implementation details.

With MIMEification, they are legal. I want nmh to convert illegal draft
messages to legal messages. Currently nmh sends illegal messages if the
user composes them. With my patch, nmh takes care to send only legal
messages.

Programs should support humans if possible.

> Could you please explain again how you get the character set information
> for non-ASCII message bodies? Sorry that I didn't save your original
> message on this. I seem to recall that you got it from the profile; I
> would rather see you get this from the LANG environment variable.

I just leave it up to buildmimeproc to find out. :-) We don't need to do it
in several places. I only say it's text/plain but nothing about the
encoding.

The man page of mhbuild(1) says:

    If a text content contains any 8-bit characters (characters with
    the high bit set) and the character set is not specified as above,
    then mhbuild will assume the character set is of the type given by
    the environment variable MM_CHARSET. If this environment variable
    is not set, then the character set will be labeled as "x-unknown".

    If a text content contains only 7-bit characters and the character
    set is not specified as above, then the character set will be
    labeled as "us-ascii".

This information is probably outdated, but it generally makes the point;
the code is probably already better (with respect to MM_CHARSET).

> BTW, I would suggest using isascii() rather than (*p > 127 || *p < 0).

I just kept what you once wrote. ;-P
But, yes, you are right.

meillo
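
The labeling rule quoted from the mhbuild(1) man page can be sketched in C
roughly as below. This is an illustration of the rule as the man page
states it, not mhbuild's actual code: default_charset() is a hypothetical
name, and the MM_CHARSET value is passed in as a parameter (a caller would
presumably pass getenv("MM_CHARSET")). It covers only the case where no
charset was specified explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: choose the charset label for a text part whose
 * charset was not given explicitly, following the man page wording.
 *
 * body        NUL-terminated text content
 * mm_charset  value of MM_CHARSET, or NULL if it is not set
 */
static const char *
default_charset(const char *body, const char *mm_charset)
{
    const unsigned char *p;

    assert(body != NULL);

    for (p = (const unsigned char *) body; *p != '\0'; p++) {
        if (*p & 0x80) {
            /* 8-bit content: trust MM_CHARSET if it is set. */
            if (mm_charset != NULL && *mm_charset != '\0')
                return mm_charset;
            return "x-unknown";
        }
    }

    /* Only 7-bit characters were found. */
    return "us-ascii";
}
```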
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
> [2010-12-07 08:27] Jon Steinhart
> > Sounds good. Is the patch that you sent out complete? I don't see an
> > option that enables/disables this behavior and I think that there
> > should be one.
>
> I believe it's correct that there is no switch.
>
> If one wants to deactivate it, do not specify -attach.
>
> If -attach is given, I believe the changes are fixes for broken
> behavior. Your attachment system lacks some awareness for non-ASCII
> text, which you probably don't deal with much. This is improved with
> my patch.
>
> meillo

OK. I think that there should be a switch. I guess it bugs me to see the
character-by-character examination of the message body on by default.

I understand that my attachment system does not handle non-ASCII message
bodies, but again, that's because non-ASCII message bodies are not "legal".
I think that you have justified extending nmh to handle "illegal" message
bodies. I'm just nitpicking on the implementation details.

Could you please explain again how you get the character set information
for non-ASCII message bodies? Sorry that I didn't save your original
message on this. I seem to recall that you got it from the profile; I
would rather see you get this from the LANG environment variable.

BTW, I would suggest using isascii() rather than (*p > 127 || *p < 0).

Thanks,
Jon
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
[2010-12-07 08:27] Jon Steinhart
> Sounds good. Is the patch that you sent out complete? I don't see an
> option that enables/disables this behavior and I think that there should
> be one.

I believe it's correct that there is no switch.

If one wants to deactivate it, do not specify -attach.

If -attach is given, I believe the changes are fixes for broken
behavior. Your attachment system lacks some awareness for non-ASCII
text, which you probably don't deal with much. This is improved with
my patch.

meillo
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Sounds good. Is the patch that you sent out complete? I don't see an option
that enables/disables this behavior and I think that there should be one.

Jon
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Hoi,

discussing these things hasn't always been easy, but the points and
arguments became clear and we have now reached some kind of consensus.
For me, the discussion has been worthwhile.

[2010-12-03 11:33] Jon Steinhart
>
> o Nobody objects to markus addressing this issue. The objections are that
>   his implementation breaks things, and handling illegal body content is
>   not a compelling enough reason for breaking things.

> So, I think that enough has been said on this topic. markus, can you
> outline for us an implementation that doesn't break things? I think that
> everyone will bless your changes if you do.

I agree with you here. Hence I created a new patch that concentrates
on the fourth case, explained in the other mail.

AFAICS it does not break anything. Let me explain:

As it only modifies your attachment system now, everything is the same
if -attach is not specified.

If -attach is specified, the situation is as follows:

    no attachment hdr + body contains only ASCII  -> sent as is
    attachment hdr    + body contains only ASCII  -> MIMEified
    attachment hdr    + body contains non-ASCII   -> MIMEified
    no attachment hdr + body contains non-ASCII   -> MIMEified

The fourth case is different.

Additionally, the body text will be sent with a correct MIME type in
any case. Currently it is sent as application/octet-stream in the
third case.

The relation to `mime' at the whatnow prompt:

One surely wants to unset automimeproc when using -attach.

Running `mime' at the whatnow prompt is usually not needed, as Jon's
attachment system handles it automatically.

Collisions only occur if an attachment header is present in the mail
and one runs `mime' at the whatnow prompt. Whether the body text
contains non-ASCII chars is irrelevant; it works as expected in both
cases.

As long as one does not add attachment headers to a specific draft,
one is able to use any mhbuild directives (/^#/) when running `mime'
at the whatnow prompt afterwards.

Further work:

The documentation currently does not cover my changes. There is not much
to change, and I would like to do that if the proposed changes are
accepted.

More complex MIME structures than ``text followed by attachments'' are
not possible with Jon's attachment system. (Nor are they with most
MUAs.) One needs to create them with mhbuild directives and run `mime'
manually. (For forwarding messages, see below.)

Jon's attachment system still needs mhshow-suffix- entries or it will
be really dumb. This is something that should be covered separately,
maybe by a conceptual redesign (automatic detection, mailcap, ...).

Forwarding messages in MIME format could be added to Jon's system in a
way similar to what I proposed initially. I believe this would be
possible without breaking stuff. We would need to add -attach to
forw(1).

meillo

> P.S. I'm trying to honor the way that your name appears in your mail
> header. Do you really want it to be "markus" or should it be "Markus"?

Usually, I prefer ``meillo'' because that's a nearly unique identifier.
If you want to use my real name, I don't care if you spell it in
lower-case or with capital `M'.

More important is honoring my work by mentioning my name in the
ChangeLog or commit messages.
;-)

diff --git a/uip/sendsbr.c b/uip/sendsbr.c
index 57ef007..8f5f2e1 100644
--- a/uip/sendsbr.c
+++ b/uip/sendsbr.c
@@ -196,6 +196,7 @@ attach(char *attachment_header_field_name, char *draft_file_name,
     int c;                  /* current character for body copy */
     int has_attachment;     /* draft has at least one attachment */
     int has_body;           /* draft has a message body */
+    int non_ascii;          /* msg body contains non-ASCII chars */
     int length;             /* length of attachment header field name */
     char *p;                /* miscellaneous string pointer */
 
@@ -228,29 +229,36 @@ attach(char *attachment_header_field_name, char *draft_file_name,
     if (strncasecmp(field, attachment_header_field_name, length) == 0 && field[length] == ':')
         has_attachment = 1;
 
-    if (has_attachment == 0)
-        return (DONE);
-
     /*
-     * We have at least one attachment. Look for at least one non-blank line
-     * in the body of the message which indicates content in the body.
+     * Check if body contains at least one non-blank char (= not empty)
+     * and if it contains non-ASCII chars (= need MIME).
+     * We MIMEify the message also if the body contains non-ASCII text.
      */
 
     has_body = 0;
+    non_ascii = 0;
 
     while (get_line() != EOF) {
         for (p = field; *p != '\0'; p++) {
-            if (*p != ' ' && *p != '\t') {
+            if (*p != ' ' && *p != '\t')
                 has_body = 1;
+            if (*p > 127 || *p < 0) {
+                non_ascii = 1;
                 break;
             }
         }
-
-        if (has_body)
+        if (non_ascii)
             break;
     }
 
     /*
+     * Bail out if there are no attachments and only ASCII text.
+     * This means we don't need to convert it to MIME.
+     */
+    if (!has_attachment && non_ascii == 0)
+        return (DONE);
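
Stripped of the file handling, the decision that the patch changes can be
sketched as two small predicates in C. This is a simplified illustration,
not code from uip/sendsbr.c: the function names are made up here, and the
two flags stand in for the has_attachment and non_ascii variables in the
diff. Only the case "no attachment header, non-ASCII body" differs between
the two.

```c
#include <assert.h>

/* Before the patch: only an attachment header triggers MIME conversion. */
static int
needs_mime_before_patch(int has_attachment, int non_ascii)
{
    (void) non_ascii;   /* the body's content was not considered */
    return has_attachment != 0;
}

/* With the patch: a non-ASCII body alone is enough to trigger it. */
static int
needs_mime_with_patch(int has_attachment, int non_ascii)
{
    return has_attachment != 0 || non_ascii != 0;
}

/* Walk through the four cases listed in the message. */
static void
check_four_cases(void)
{
    assert(needs_mime_with_patch(0, 0) == 0);   /* sent as is */
    assert(needs_mime_with_patch(1, 0) == 1);   /* MIMEified */
    assert(needs_mime_with_patch(1, 1) == 1);   /* MIMEified */
    assert(needs_mime_with_patch(0, 1) == 1);   /* MIMEified (new) */
    assert(needs_mime_before_patch(0, 1) == 0); /* the old gap */
}
```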
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Jon Steinhart wrote:
> Let me try to summarize here because there seems to be lots of energy here
> for commentary but little for things that move us forward:
>
>  o Many people still exclusively use nmh, some have drifted away. 'Nuf
>    said on this topic.
>
>  o There is a general consensus that backward compatibility is important
>    and that changes shouldn't break things unless there is a compelling
>    reason.
>
>  o The particular thing that markus and Ralph want to address is how
>    illegal body content is handled. Fixing this would be convenient.

Yes, I want this fixed too, and I have an opinion on how it should be
fixed.

>  o markus is willing to write code. Cool!
>
>  o Nobody objects to markus addressing this issue. The objections are that
>    his implementation breaks things, and handling illegal body content is
>    not a compelling enough reason for breaking things.
>
>  o At least one suggestion (mine) has been floated on a way to implement
>    this in a way that does not break things.

Two suggestions -- mine as well. I think it's mostly complementary
to yours rather than at cross purposes; some of the aspects it's fixing
are different.

-- PMM
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
> markus schnalke wrote:
> > (4) no attachments & non-ASCII -> no MIME; the message contains
> >     non-ASCII chars; the recipient can not know which charset the
> >     non-ASCII chars had been.
> >
> > To cover case 4, one needs to run mime at the whatnow prompt manually.
>
> Yes, and I find that annoying. Most of the time I want to send non-MIME
> ASCII emails and I do that just fine. Sometimes my English email will
> mention money, £42, and if I forget to "mime" at whatnow I send a broken
> email. So far, markus' approach of catching this fourth case sounds
> good. What's currently happening is wrong AFAICS.
>
> Cheers,
> Ralph.

Let me try to summarize here because there seems to be lots of energy here
for commentary but little for things that move us forward:

 o Many people still exclusively use nmh, some have drifted away. 'Nuf said
   on this topic.

 o There is a general consensus that backward compatibility is important
   and that changes shouldn't break things unless there is a compelling
   reason.

 o The particular thing that markus and Ralph want to address is how
   illegal body content is handled. Fixing this would be convenient.

 o markus is willing to write code. Cool!

 o Nobody objects to markus addressing this issue. The objections are that
   his implementation breaks things, and handling illegal body content is
   not a compelling enough reason for breaking things.

 o At least one suggestion (mine) has been floated on a way to implement
   this in a way that does not break things.

So, I think that enough has been said on this topic. markus, can you
outline for us an implementation that doesn't break things? I think that
everyone will bless your changes if you do.

Jon

P.S. I'm trying to honor the way that your name appears in your mail
     header. Do you really want it to be "markus" or should it be "Markus"?
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
markus schnalke wrote:
> (4) no attachments & non-ASCII -> no MIME; the message contains
>     non-ASCII chars; the recipient can not know which charset the
>     non-ASCII chars had been.
>
> To cover case 4, one needs to run mime at the whatnow prompt manually.

Yes, and I find that annoying. Most of the time I want to send non-MIME
ASCII emails and I do that just fine. Sometimes my English email will
mention money, £42, and if I forget to "mime" at whatnow I send a broken
email. So far, markus' approach of catching this fourth case sounds
good. What's currently happening is wrong AFAICS.

Cheers,
Ralph.
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
> Probably parts of what I said were ``lost in translation'' as English
> is not my native language.
>
> I explain again what I meant:
>
> Using non-ASCII chars in the mail *body* together with Jon's
> attachment system leads to the following cases:
>
> (1) no attachments & only ASCII -> no MIME; everything well
>
> (2) attachments & only ASCII -> MIME; everything well
>
> (3) attachments & non-ASCII -> MIME; everything well
>
> (4) no attachments & non-ASCII -> no MIME; the message contains
>     non-ASCII chars; the recipient can not know which charset the
>     non-ASCII chars had been.
>
> To cover case 4, one needs to run mime at the whatnow prompt manually.
>
> meillo

Actually, I think that I figured that out. Maybe you haven't gotten through
the stack of emails yet. The summary is:

 1. The #4 case is not "legit" in rfc-land. That doesn't mean that it
    wouldn't be a good idea to add convenience functionality.

 2. I suggested an alternate way to do this which I think is simpler,
    doesn't break anything, and works from the LANG environment variable so
    that the charset information wouldn't have to be configured on a
    per-user basis.

Jon
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Probably parts of what I said were ``lost in translation'' as English
is not my native language.

I explain again what I meant:

Using non-ASCII chars in the mail *body* together with Jon's
attachment system leads to the following cases:

(1) no attachments & only ASCII -> no MIME; everything well

(2) attachments & only ASCII -> MIME; everything well

(3) attachments & non-ASCII -> MIME; everything well

(4) no attachments & non-ASCII -> no MIME; the message contains
    non-ASCII chars; the recipient can not know which charset the
    non-ASCII chars had been.

To cover case 4, one needs to run mime at the whatnow prompt manually.

meillo
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
> On Thu, Dec 2, 2010 at 12:42 PM, Jon Steinhart wrote:
> > Correct me if my memory is failing me here, I'm being lazy and not
> > rereading rfcs at the moment because I have other things to do.
> >
> > I recall that in the absence of appropriate headers messages are
> > defined as ASCII. If that's the case, it strikes me that you're
> > "fixing" something that is convenient for you but technically not
> > broken.
>
> In the ideal world, what you have stated is true. However, in the
> real world, in non-English locales, character data occurs in
> headers that is not encoded according to MIME rules.
>
> In my Perl app, I added an option to specify what is the default
> character encoding to assume for non-encoded data. The default
> value is us-ascii (as the rfcs state), but the user can change
> it to something else.
>
> Not sure if nmh needs something similar, but if there are
> users in non-English locales that have messages with non-English
> non-encoded data in header fields, then allowing to specify
> a default encoding to assume may be of use.
>
> --ewh

I believe that that was what I was suggesting, in a don't-break-anything
way.

Jon
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
On Thu, Dec 2, 2010 at 12:42 PM, Jon Steinhart wrote:
> Correct me if my memory is failing me here, I'm being lazy and not
> rereading rfcs at the moment because I have other things to do.
>
> I recall that in the absence of appropriate headers messages are defined
> as ASCII. If that's the case, it strikes me that you're "fixing"
> something that is convenient for you but technically not broken.

In the ideal world, what you have stated is true. However, in the
real world, in non-English locales, character data occurs in
headers that is not encoded according to MIME rules.

In my Perl app, I added an option to specify what is the default
character encoding to assume for non-encoded data. The default
value is us-ascii (as the rfcs state), but the user can change
it to something else.

Not sure if nmh needs something similar, but if there are
users in non-English locales that have messages with non-English
non-encoded data in header fields, then allowing them to specify
a default encoding to assume may be of use.

--ewh
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
> English is probably your native language and the one your emails are
> written in.
>
> This way non-ASCII text does not get handled specially (if no
> attachment is present).
>
> No MIMEification means that the characters are plainly in the mail
> body. The original charset is not available in the mail, which will
> lead to broken or wrong character display on systems with a
> non-compatible native charset.
>
> Also the mail message is 8bit then.
>
> Fixing this problem is part of my patch. @Jon: This actually can be
> considered as some kind of usage bug of your system. If you include
> non-ASCII text but no attachments then you need to run mime at whatnow
> prompt manually, otherwise you must not.
>
> My patch however does break compatibility if one likes to send
> messages that contain non-ASCII chars without MIME.

Correct me if my memory is failing me here, I'm being lazy and not
rereading rfcs at the moment because I have other things to do.

I recall that in the absence of appropriate headers messages are defined as
ASCII. If that's the case, it strikes me that you're "fixing" something
that is convenient for you but technically not broken.

I think that I now understand what you want, which is to be able to do a
simple "comp" and have things work automatically with non-ASCII message
bodies without having to mess around with mhbuild and all.

I'm going to keep harping on the "don't break things" theme, because it
seems like the goal should be to make things more convenient for you
without making things less convenient for others.

Would you consider leaving the -attach stuff as it is and adding a

    -assume_non_ascii_is_in_the_charset_defined_by_the_lang_environment_variable

option instead? Fine with me if you use a shorter name :)

Doing it this way doesn't break existing code, and makes the behavior
optional so that it won't surprise anybody who hasn't explicitly enabled
it.

Jon
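
One way the "charset from LANG" idea could look in C is sketched below. It
is only an illustration under stated assumptions: charset_from_lang() is a
hypothetical helper, not part of nmh, and it assumes the common
language_TERRITORY.codeset@modifier layout of LANG values such as
en_GB.UTF-8. A real implementation might instead call setlocale() and
nl_langinfo(CODESET), which also honor LC_ALL and LC_CTYPE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the codeset part of a LANG-style value
 * (e.g. "UTF-8" from "en_GB.UTF-8") into buf.
 *
 * Returns 1 on success, 0 if lang is NULL, has no codeset part, or the
 * codeset does not fit into buf. On failure buf is set to "".
 */
static int
charset_from_lang(const char *lang, char *buf, size_t buflen)
{
    const char *start, *end;
    size_t n;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';

    if (lang == NULL || (start = strchr(lang, '.')) == NULL)
        return 0;
    start++;                        /* skip the '.' */

    /* An optional "@modifier" suffix is not part of the codeset. */
    end = strchr(start, '@');
    n = (end != NULL) ? (size_t) (end - start) : strlen(start);

    if (n == 0 || n >= buflen)
        return 0;

    memcpy(buf, start, n);
    buf[n] = '\0';
    return 1;
}
```

A caller would presumably pass getenv("LANG") and fall back to us-ascii
when the function returns 0, which keeps the behavior unchanged for users
whose locale names no codeset.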