Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [really non-ASCII message bodies ]
In this discussion people (other than perhaps Jon, though he hasn't said this explicitly) have just been assuming that if the e-mail body of a message contains data that is not ASCII, then it must be some other character set, because after all, all e-mail is text ...

In the days of MIME, that's simply not true, and while it is unlikely that anyone is going to use prompter, or even some other editor, to produce a jpeg file, there's nothing to prevent a script producing a file with a jpeg body, and 822 headers, and handing that to nmh to process. We might prefer such a script to generate all the right headers, but MH really doesn't like it if we attempt to tell it MIME info in the components file (or the draft) - it insists on adding that itself, so not doing all the content type processing before calling nmh processes is understandable.

Now there is nothing at all illegal about this - even ignoring that "illegal" is the wrong word to use in any case, "non-conforming" would be the correct term. The standards don't apply to what the user feeds nmh, that's locally defined "anything goes" territory. What matters is what nmh hands to the MTA (and even more, what the local MTA passes on to its peer). There, if we simply send an 822 (old style, non-MIME) message with arbitrary binary content, we have a non-conforming message.

That's what (as I understand it) Jon's patch handles (I still use the latest released version of nmh, which predates all this stuff ... there hasn't been a new release (since 1.3) for a long time now...) and makes a standards conforming message.
It obviously has no way of knowing what the data that it detects as non-ASCII is (or not without extra information from the sender), so "application/octet-stream" sounds to me as if it is the perfect choice (along with either QP or B64 encoding to handle the body format) to indicate "here is stuff, but I have no idea what it means, work it out for yourself" - which for many users of this kind of procedure would probably be adequate.

I'm certainly not arguing that we should keep this behaviour, and certainly not as the default - I expect that real users of binary message bodies that are not text are so rare that, even if there are any at all, updating them would not be a huge problem (provided the change notes for the next nmh release make it clear this has happened).

However, I don't think we should give up the ability to simply send an e-mail where the body is image/jpeg or whatever - there's no requirement that there be any text in the body of the message at all, even though most MUAs simply assume that, and require a multipart to include anything that is not text. MH should be better than that; being just as good as "most MUAs" is a fairly grievous insult IMO.

And while retaining the # language of mhbuild, or something equivalent, is essential to enable truly general messages to be created, expecting to use that for trivial tasks is, I agree, asking too much - and requiring explicit MIME processing at the whatnow stage should only be necessary when the full mhbuild procedure is to be invoked. (Do recall that when this was added, MIME messages were rare, and lots of users didn't like them - most MUAs had no way to display them, not even as "good" as nmh does now - and so wanting that processing was very unusual. These days, almost every message should comply with at least basic MIME formats.)
My suggestion to handle general bodies is to allow a switch that sets the MIME content-type of the message (defaulting to text/plain) - and then base all the other decisions off that. If (as a result of the default, or by being explicitly set) we get a text/* content-type, then we can attempt to work out the charset involved, and add the proper indicator. On the other hand, if someone really wants to send an application/octet-stream, then let's allow them to do that, or if they want to send image/jpeg or audio/whatever they should be able to do that too (a message that is entirely audio/* could even be handled by "show" by playing it through the local system's speakers, assuming that's possible - implementing voice-email).

I also don't believe that this processing should be keyed off some -attach switch - as a way to simplify adding an attachment to a message (incidentally, if given twice, can we have two attachments, or is there some other way to do that?) it sounds OK, but for charset processing? For text messages, the right thing should be done regardless of whether there's any plan or intent to add attachments, and using a switch "-attach" in the profile to mean "encode my text correctly" is bizarre...

I'm all for backwards compatibility, but only backwards compatible for correct behaviour; keeping all the existing bugs should not be required (though I think there are environments where even that is expected.)

Even for attachments, as I understand it, that's keyed off a pseudo-header added to the components file (and so appears in th
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
>> The current behavior was the best idea that I had
>> at the time and nobody has said anything about it until recently. I don't
>> mind it changing, but I don't want to all of a sudden get complaints from
>> people who were counting on this behavior.
>
>Well, I always considered the current behaviour a bug, but I didn't say
>anything, because the nmh-way of software development seemed pretty
>inefficient and I didn't want to look at the code myself either.

We have a "way" of software development? When did that happen? :-)

In all seriousness ... what, exactly, do you mean? I guess our current way is, "Anyone who's interested, please contribute!". I don't see how that's really much different than other open-source packages.

Also, as long as we're on the subject ... if people want to submit patches to the list, perhaps formatting them with git format-patch (since, hey, I went to the whole trouble to convert everything to git) might be worthwhile. Or perhaps just doing that when everyone agrees on the code would make things easier (because the patch could be processed with git am).

--Ken

___
Nmh-workers mailing list
Nmh-workers@nongnu.org
http://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] SMTP/IMAP/POP Support
>woe... I said, "as much as possible" --- I mean, let's remove things
>that truly *NOBODY* uses.

Fair enough; I guess when you said "as much as possible", I started to get worried. I mean, it is certainly POSSIBLE to remove POP support from inc; if the criterion is only things that people don't use, then I'm fine with that.

>I *WANT* POP support in inc --- but I'm cool if we get "modern" POP
>interface by having inc popen(3) fetchmail, or link against it's
>library, if that's more sane than fixing the code that's there to do
>stuff like POP/SSL.

Well, I can now speak from experience that adding TLS support to our current code is not terrible. Despite what others had thought, it was actually pretty straightforward, even with all of the code in place to do SASL encryption. A quick glance at uip/popsbr.c leads me to believe that adding similar TLS code to the POP library would be relatively simple. Someone just has to do it, and now they even have example code to crib from. I might do it if I get some free time, but certainly someone else can do it probably a lot sooner than I can.

Or if someone wants to modify the pop code to popen() fetchmail, I'm alright with that as well.

--Ken
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
On 10-12-07 3:48 PM, Jon Steinhart wrote:
> This is the first that anybody has spoken up about this as far as I'm aware,
> so I was trying to protect backward compatibility.

A lot of MTAs just accept the stuff, even though it violates the standards. The assumption was 'just treat it as 8859-1'. That sort of worked long ago, but not any more.

Now that the whole email delivery chain has had to start dealing with character set encodings properly, I've noticed a (very) slight increase in the number of sites that are rejecting un- and mis-encoded non-ASCII text.
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
If this is a bug that nobody has bothered with for years then by all means go ahead and fix it. This is the first that anybody has spoken up about this as far as I'm aware, so I was trying to protect backward compatibility. No need to do that for bugs though.

Jon
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
On Tue, 07 Dec 2010 12:03:38 CST, Earl Hood said:
> On Tue, Dec 7, 2010 at 11:10 AM, Jon Steinhart wrote:
> > I understand that my attachment system does not handle non-ASCII message
> > bodies, but again, that's because non-ASCII message bodies are not "legal".
>
> Please cite an RFC that says non-ASCII bodies are not legal.
>
> With MIME, you have the Content-Transfer-Encoding field, which allows
> for 8bit. And then, if you have a Content-Type type that supports
> charset parameter, you can "legally" have a body that is non-ASCII.

A MIME message that has a Mime-Version: and appropriate C-T-E: headers can certainly be non-ASCII. What's illegal is sending non-ASCII *without* such headers (which is what nmh has been doing in the past).
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Peter wrote:
> markus schnalke wrote:
> >The old code generates ...
> >
> >... for ASCII:
> >
> >    Content-Type: text/plain; name="sendKi9x7j"; x-unix-mode="0644";
> >            charset="us-ascii"
> >    Content-ID: <4962.128958967...@argentina.foo>
> >    Content-Description: ASCII text
> >
> >    foo
> >
> >... for non-ASCII (only if at least one attachment is present):
> >
> >    Content-Type: application/octet-stream; name="sendbRaV8T";
> >            x-unix-mode="0644"
> >    Content-ID: <5209.128958999...@argentina.foo>
> >    Content-Description: UTF-8 Unicode text
> >    Content-Transfer-Encoding: base64
> >
> >    d2l0aCBKb24
>
> These are definitely just wrong -- we shouldn't be specifying
> name and x-unix-mode for the body text

Adding -attachformat 1 to the send entry of your .mh_profile will get rid of the name and x-unix-mode. That option can also be added when entering send at the whatnow prompt. The send man page has examples of what it produces.

If there's consensus to make that the default, it would be an easy code and documentation change. (Yes, I'm volunteering to make the changes. But not to push for consensus :-)

> (and base64ing when we could q-p is a bit unfriendly).

Blackberries, and I think Droids, unnecessarily base64 text. But I do agree with you, nmh shouldn't.

David
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Jon Steinhart:
> The current behavior was the best idea that I had
> at the time and nobody has said anything about it until recently. I don't
> mind it changing, but I don't want to all of a sudden get complaints from
> people who were counting on this behavior.

Well, I always considered the current behaviour a bug, but I didn't say anything, because the nmh-way of software development seemed pretty inefficient and I didn't want to look at the code myself either.

So, when I found out that nmh was sending illegal mails (the timestamps of my filesystem tell me it was 25. Jun 2008), I just "fixed" this by adding three lines to the components file:

    Content-Type: text/plain; charset="iso-8859-15"
    MIME-Version: 1.0
    Content-Transfer-Encoding: 8bit

(which of course I have to remove if I actually want to attach a file to a mail)

I guess I'm not the only one silently applying some workaround, but I also guess that many more people unknowingly send illegal mail.

So unless somebody on this mailing list states that he actually needs the current behaviour, I think it is very safe to assume that none of the complaints you are fearing will ever be made.

Given the number of mails wasted on this seemingly obvious question now, I really regret not filing a bug report 2.5 years ago.

BTW, I also use nmh exclusively for my mail, except for the two times/year when I actually need to send signed/encrypted mails.

Personally I think that nmh should rather abort than silently send illegal mails.

Harald
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
markus schnalke wrote:
>[2010-12-07 12:33] Jon Steinhart
>Examples for what gets generated from mail *body text*:

Thanks for doing this and saving me the effort ;-)

>The old code generates ...
>
>... for ASCII:
>
>    Content-Type: text/plain; name="sendKi9x7j"; x-unix-mode="0644";
>            charset="us-ascii"
>    Content-ID: <4962.128958967...@argentina.foo>
>    Content-Description: ASCII text
>
>    foo
>
>... for non-ASCII (only if at least one attachment is present):
>
>    Content-Type: application/octet-stream; name="sendbRaV8T";
>            x-unix-mode="0644"
>    Content-ID: <5209.128958999...@argentina.foo>
>    Content-Description: UTF-8 Unicode text
>    Content-Transfer-Encoding: base64
>
>    d2l0aCBKb24

These are definitely just wrong -- we shouldn't be specifying name and x-unix-mode for the body text (and base64ing when we could q-p is a bit unfriendly).

-- PMM
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Jon, sorry for the harsh mail. I really had been in a rage. :-/

In a recent mail, you said:

> The current behavior was the best idea that I had
> at the time and nobody has said anything about it until recently.

I really don't blame you for what we have; quite the opposite: I am very grateful for what we have.

[2010-12-07 22:45] markus schnalke
>
> I ask other people to take a look and express their opinion.

Thanks to everyone speaking up.

meillo
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
[2010-12-07 12:33] Jon Steinhart
>
> Still trying to understand this. Decided to finally look at the code instead
> of relying on my fading memory.

:-) Thanks.

> The existing code takes a non-ASCII message body and sends it as an attachment
> of type application/octet-stream.
>
> Your patch changes this behavior so that it is sent as type text/plain with
> the appropriately chosen character set.

Both correct.

> In order to do this, you test the message body for non-ASCII characters in
> attach(). If you find any, you write an entry directly into the composition
> file instead of calling make_mime_composition_file_entry().

Correct.

> This is changing existing behavior if I understand it correctly.

Behavior in what gets generated, yes, but that is necessary when fixing things.

Examples for what gets generated from mail *body text*:

The old code generates ...

... for ASCII:

    Content-Type: text/plain; name="sendKi9x7j"; x-unix-mode="0644";
            charset="us-ascii"
    Content-ID: <4962.128958967...@argentina.foo>
    Content-Description: ASCII text

    foo

... for non-ASCII (only if at least one attachment is present):

    Content-Type: application/octet-stream; name="sendbRaV8T";
            x-unix-mode="0644"
    Content-ID: <5209.128958999...@argentina.foo>
    Content-Description: UTF-8 Unicode text
    Content-Transfer-Encoding: base64

    d2l0aCBKb24

With my patch, these MIME parts are generated ...

... for ASCII:

    Content-Type: text/plain; charset="us-ascii"
    Content-ID: <5048.128958978...@argentina.foo>

    foo

... for non-ASCII:

    Content-Type: text/plain; charset="UTF-8"
    Content-ID: <5260.128959006...@argentina.foo>
    Content-Transfer-Encoding: quoted-printable

    Umlauts: =C3=A4 and =C3=B6 and =C3=BC.

The function make_mime_composition_file_entry() gives us nothing but information we don't need/want (temp file names, file permissions) and it definitely does not use the best possible CT and CTE for the body text.

> This is fine
> with me provided that users must explicitly enable the change using an option.

An option to activate a fix???

> Now that I'm actually looking at the code, I would suggest an option
> (choose a better name) of binary-body-content-type. You could change the
> make_mime_composition_file_entry() line
>
>     content_type = binary ? "application/octet-stream" : "text/plain";
>
> to replace the "application/octet-stream" with the option value, or the
> existing value as a default if the option is not specified.
>
> A user wanting this behavior would have a profile entry of
>
>     binary-body-content-type: text/plain
>
> I think that this would be a simpler code change that would accomplish your
> goal.

Sorry, but I really think you don't get the point.

We don't need config options if we already have the facilities to do it right automatically. Why should any user need to tell nmh what content type to use for the text he writes? The mhbuild facility can already find out which is appropriate.

My patch also distinguishes between mail text and attachments, for which different things are relevant. Your suggestion above does not, and would then use binary-body-content-type for any non-ASCII attachment as well.

I read your mails and ask myself if you really read what I write and if you have had a look into the code we are talking about.

I very much value the work you did for nmh, and you can see that this patch builds on what you created, but now may be the point to either take a close look at the problem and the code or step back and let other people talk. I do think you don't understand the situation and the relevant code well enough (presumably due to lack of time).

Of course, I may be wrong. Currently, however, it rather seems to me as if it's not me who is not understanding the whole thing. I really spent much time in the code and doing tests. And I did my best explaining everything.

I ask other people to take a look and express their opinion.

meillo
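The quoted-printable body in the patched example above ("Umlauts: =C3=A4 and =C3=B6 and =C3=BC.") can be reproduced with a minimal sketch like the following. This is not nmh's or mhbuild's encoder; the function name qp_encode() is hypothetical, and the sketch deliberately ignores the 76-column soft line break rule and trailing-whitespace handling that a complete encoder would need.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified quoted-printable encoder (hypothetical helper, not nmh code).
 * Printable ASCII except '=' is copied, as are tab and newline; every
 * other byte becomes "=XX" with uppercase hex digits.
 * Returns the encoded length, or -1 if out is too small.
 */
long
qp_encode(const char *in, char *out, size_t outsz)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t i, o = 0, len;

    assert(in != NULL && out != NULL && outsz > 0);
    len = strlen(in);

    for (i = 0; i < len; i++) {
        /* Work on unsigned values so bytes >= 128 are never negative. */
        unsigned char c = (unsigned char) in[i];
        int literal = (c >= 32 && c <= 126 && c != '=')
                      || c == '\t' || c == '\n';

        /* Always leave room for the terminating NUL. */
        if (o + (literal ? 1 : 3) >= outsz)
            return -1;

        if (literal) {
            out[o++] = (char) c;
        } else {
            out[o++] = '=';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0F];
        }
    }
    out[o] = '\0';
    return (long) o;
}
```

Feeding it the UTF-8 bytes of the umlaut sentence yields the encoded line shown in the example; a literal '=' in the input comes out as "=3D".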
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
chad wrote:
>On Dec 7, 2010, at 12:48 PM, Ken Hornstein wrote:
>> You know I'm all for backwards compatibility and everything, but
>> I'm wondering ... did the previous behavior actually make sense? Can
>> people argue that it was desirable or correct? Or was the previous
>> behavior actually wrong, and this is really fixing a long-standing
>> bug?
>
>It was a bug. The only suggesting that it wasn't a bug is instead saying
>that it was `illegal' (which it wasn't... it just usually was a bad idea).

I agree; we should just change the behaviour here. (In particular the previous behaviour would have differed in how it handled the body depending on whether there was an attachment or not, which suggests to me that it's just not a case anybody has cared about before now.)

(In fact I think we should go ahead and change the behaviour for not-plain-ASCII bodies even if the user didn't pass the -attach switch, but I'm guessing I might get argued down on that.)

I do think the "is the body not plain ASCII?" check is not quite right. I think that the presence of special characters (most notably ESC) ought to also trigger MIMEification. Otherwise we will not do the right thing for Japanese character sets like shift-JIS. (Yes, I do care about this, it's not just idle nitpickery.)

I would suggest

    if (*p != '\t' && (*p >= 127 || *p < 32)) {
        non_ascii = 1;

ie encode unless it's in the printable ASCII range or space or tab.

-- PMM
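A minimal sketch of the check Peter suggests, written as a standalone function. The name body_needs_mime() is hypothetical and not part of nmh. One assumption goes beyond the message: newline and carriage return are also let through, since a whole-buffer scan would otherwise flag every multi-line body (the original test presumably ran in a context where line ends were not seen).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if buf holds any byte that should trigger MIME encoding:
 * control characters (including ESC) and bytes >= 127.
 * Tab, LF and CR are allowed; LF and CR are this sketch's assumption.
 */
int
body_needs_mime(const char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++) {
        /* unsigned char keeps high-bit bytes from comparing as negative */
        unsigned char c = (unsigned char) buf[i];

        if (c == '\t' || c == '\n' || c == '\r')
            continue;
        if (c < 32 || c >= 127)
            return 1;
    }
    return 0;
}
```

With this test, a 7-bit body that contains an ESC-based escape sequence is flagged just like a body containing UTF-8 bytes, which is the point of widening the check beyond "high bit set".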
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
On Dec 7, 2010, at 12:48 PM, Ken Hornstein wrote:
> You know I'm all for backwards compatibility and everything, but
> I'm wondering ... did the previous behavior actually make sense? Can
> people argue that it was desirable or correct? Or was the previous
> behavior actually wrong, and this is really fixing a long-standing
> bug?

It was a bug. The only suggesting that it wasn't a bug is instead saying that it was `illegal' (which it wasn't... it just usually was a bad idea).

nmh was used so infrequently in places that aren't 7-bit clean that none of the people who noticed complained; they just binned it with the host of non-i18n'd software and moved on.

This is generalizing a bit from a small sample, but I'd be even more astonished than if we found more than a dozen people who use mh exclusively for all their email. ;)

*Chad
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
> >The existing code takes a non-ASCII message body and sends it as an
> >attachment of type application/octet-stream.
> >
> >Your patch changes this behavior so that it is sent as type text/plain with
> >the appropriately chosen character set.
>
> You know I'm all for backwards compatibility and everything, but
> I'm wondering ... did the previous behavior actually make sense? Can
> people argue that it was desirable or correct? Or was the previous
> behavior actually wrong, and this is really fixing a long-standing
> bug? Because if we decide that the previous behavior is a bug, then
> I don't think an explicit enable option for this change makes sense; I'd
> prefer that the new behavior be the default.
>
> (I am personally on the fence regarding whether or not the previous
> behavior is a bug).
>
> --Ken

Don't disagree with you. The current behavior was the best idea that I had at the time and nobody has said anything about it until recently. I don't mind it changing, but I don't want to all of a sudden get complaints from people who were counting on this behavior. Maybe that number is 0, but I have no way of knowing. I don't care that much, so if you think compatibility isn't an issue here that's fine with me.

If the default behavior were to change, I would add a "binary" flag to make_mime_composition_file_entry() so that the body didn't have to be scanned for non-ASCII characters twice.

Jon
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
>The existing code takes a non-ASCII message body and sends it as an attachment
>of type application/octet-stream.
>
>Your patch changes this behavior so that it is sent as type text/plain with the
>appropriately chosen character set.

You know I'm all for backwards compatibility and everything, but I'm wondering ... did the previous behavior actually make sense? Can people argue that it was desirable or correct? Or was the previous behavior actually wrong, and this is really fixing a long-standing bug? Because if we decide that the previous behavior is a bug, then I don't think an explicit enable option for this change makes sense; I'd prefer that the new behavior be the default.

(I am personally on the fence regarding whether or not the previous behavior is a bug).

--Ken
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Still trying to understand this. Decided to finally look at the code instead of relying on my fading memory. Sorry if some of my earlier memory-based comments were off-base. Please let me know if my understanding of your proposed patch is correct.

The existing code takes a non-ASCII message body and sends it as an attachment of type application/octet-stream.

Your patch changes this behavior so that it is sent as type text/plain with the appropriately chosen character set.

In order to do this, you test the message body for non-ASCII characters in attach(). If you find any, you write an entry directly into the composition file instead of calling make_mime_composition_file_entry().

This is changing existing behavior if I understand it correctly. This is fine with me provided that users must explicitly enable the change using an option.

Now that I'm actually looking at the code, I would suggest an option (choose a better name) of binary-body-content-type. You could change the make_mime_composition_file_entry() line

    content_type = binary ? "application/octet-stream" : "text/plain";

to replace the "application/octet-stream" with the option value, or the existing value as a default if the option is not specified.

A user wanting this behavior would have a profile entry of

    binary-body-content-type: text/plain

I think that this would be a simpler code change that would accomplish your goal.

Jon
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Hi,

markus schnalke wrote:
> > BTW, I would suggest using isascii() rather than (*p > 127 || *p < 0).
>
> I just kept what you once wrote. ;-P
> But, yes, you are right.

Given it's `char *p' then *p may be unsigned on some systems, e.g. ARM, and a compiler could warn on testing if it's negative so isascii() is much nicer. :-)

Cheers,
Ralph.
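The signedness point can be illustrated with a small sketch that avoids the problem by converting each byte to unsigned char before comparing. The function name first_non_ascii() is hypothetical; isascii() from <ctype.h> would serve the same purpose on systems that provide it, and the cast is used here only to keep the sketch independent of that.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the offset of the first byte outside 7-bit ASCII, or len if
 * every byte is ASCII. The cast makes the result identical whether
 * plain char is signed (bytes >= 128 appear negative) or unsigned
 * (a "< 0" test can never be true).
 */
size_t
first_non_ascii(const char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++) {
        if ((unsigned char) buf[i] > 127)
            return i;
    }
    return len;
}
```

A caller would treat a return value smaller than len as "body contains non-ASCII data".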
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
On Tue, Dec 7, 2010 at 11:10 AM, Jon Steinhart wrote:
> I understand that my attachment system does not handle non-ASCII message
> bodies, but again, that's because non-ASCII message bodies are not "legal".

Please cite an RFC that says non-ASCII bodies are not legal.

With MIME, you have the Content-Transfer-Encoding field, which allows for 8bit. And then, if you have a Content-Type that supports the charset parameter, you can "legally" have a body that is non-ASCII.

Note, way back when MIME was first defined, it was probably not good practice to use 8bit CTE since MTAs were not friendly with 8bit data, but today, that is less likely.

--ewh
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
[2010-12-07 09:10] Jon Steinhart
> > [2010-12-07 08:27] Jon Steinhart
> > > Sounds good. Is the patch that you sent out complete? I don't see an
> > > option that enables/disables this behavior and I think that there
> > > should be one.
> >
> > I believe it's correct that there is no switch.
> >
> > If one wants to deactivate it, do not specify -attach.
> >
> > If -attach is given, I believe the changes are fixes for broken
> > behavior. Your attachment system lacks some awareness for non-ASCII
> > text, which you probably don't deal with much. This is improved with
> > my patch.
>
> OK. I think that there should be a switch. I guess it bugs me to see the
> character-by-character examination of the message body on by default.

The char-by-char examination is ugly, yes.

> I understand that my attachment system does not handle non-ASCII message
> bodies, but again, that's because non-ASCII message bodies are not "legal".
> I think that you have justified extending nmh to handle "illegal" message
> bodies. I'm just nitpicking on the implementation details.

With MIMEification, they are legal. I want nmh to convert illegal draft messages to legal messages.

Currently nmh sends illegal messages if the user composes such ones. With my patch nmh takes care to send only legal messages. Programs should support humans if possible.

> Could you please explain again how you get the character set information
> for non-ASCII message bodies? Sorry that I didn't save your original
> message on this. I seem to recall that you got it from the profile; I
> would rather see you get this from the LANG environment variable.

I just leave it up to buildmimeproc to find out. :-) We don't need to do it in several places. I only say it's text/plain but nothing about the encoding.

The man page of mhbuild(1) says:

    If a text content contains any 8-bit characters (characters
    with the high bit set) and the character set is not specified
    as above, then mhbuild will assume the character set is of the
    type given by the environment variable MM_CHARSET. If this
    environment variable is not set, then the character set will
    be labeled as “x-unknown”.

    If a text content contains only 7-bit characters and the
    character set is not specified as above, then the character
    set will be labeled as “us-ascii”.

This information is probably outdated, but in general it makes the point; the code is probably already better (with respect to MM_CHARSET).

> BTW, I would suggest using isascii() rather than (*p > 127 || *p < 0).

I just kept what you once wrote. ;-P
But, yes, you are right.

meillo
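The rule quoted from the mhbuild(1) man page can be sketched as a small function. This is an illustration of the documented behaviour only, not mhbuild's actual code; the name pick_charset() is hypothetical, and the caller is assumed to pass in the value of the MM_CHARSET environment variable (for example from getenv("MM_CHARSET")), with NULL meaning "not set".

```c
#include <assert.h>
#include <stddef.h>

/*
 * Choose a charset label for text content whose charset was not given
 * explicitly, following the man page text quoted above:
 *   - only 7-bit bytes          -> "us-ascii"
 *   - 8-bit bytes, MM_CHARSET   -> value of MM_CHARSET
 *   - 8-bit bytes, no variable  -> "x-unknown"
 */
const char *
pick_charset(const char *buf, size_t len, const char *mm_charset)
{
    size_t i;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++) {
        if ((unsigned char) buf[i] & 0x80) {
            /* high bit set: this is 8-bit content */
            return mm_charset != NULL ? mm_charset : "x-unknown";
        }
    }
    return "us-ascii";
}
```

Note that a pure 7-bit body is labelled us-ascii even when MM_CHARSET is set, which matches the second paragraph of the man page excerpt.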
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
> [2010-12-07 08:27] Jon Steinhart
> > Sounds good. Is the patch that you sent out complete? I don't see an
> > option that enables/disables this behavior and I think that there
> > should be one.
>
> I believe it's correct that there is no switch.
>
> If one wants to deactivate it, do not specify -attach.
>
> If -attach is given, I believe the changes are fixes for broken
> behavior. Your attachment system lacks some awareness for non-ASCII
> text, which you probably don't deal with much. This is improved with
> my patch.
>
> meillo

OK. I think that there should be a switch. I guess it bugs me to see the character-by-character examination of the message body on by default.

I understand that my attachment system does not handle non-ASCII message bodies, but again, that's because non-ASCII message bodies are not "legal". I think that you have justified extending nmh to handle "illegal" message bodies. I'm just nitpicking on the implementation details.

Could you please explain again how you get the character set information for non-ASCII message bodies? Sorry that I didn't save your original message on this. I seem to recall that you got it from the profile; I would rather see you get this from the LANG environment variable.

BTW, I would suggest using isascii() rather than (*p > 127 || *p < 0).

Thanks,
Jon
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
[2010-12-07 08:27] Jon Steinhart
> Sounds good. Is the patch that you sent out complete? I don't see an option
> that enables/disables this behavior and I think that there should be one.

I believe it's correct that there is no switch.

If one wants to deactivate it, do not specify -attach.

If -attach is given, I believe the changes are fixes for broken behavior. Your attachment system lacks some awareness for non-ASCII text, which you probably don't deal with much. This is improved with my patch.

meillo
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Sounds good. Is the patch that you sent out complete? I don't see an option that enables/disables this behavior and I think that there should be one.

Jon
Re: [Nmh-workers] Understanding nmh (aka. What's the goal) [ really non-ASCII message bodies ]
Hoi,

discussing these things hasn't always been easy, but the points and arguments became clear and we have now reached some kind of consensus. For me, the discussion has been worthwhile.

[2010-12-03 11:33] Jon Steinhart
>
> o Nobody objects to markus addressing this issue. The objections are that
>   his implementation breaks things, and handling illegal body content is
>   not a compelling enough reason for breaking things.
>
> So, I think that enough has been said on this topic. markus, can you outline
> for us an implementation that doesn't break things? I think that everyone
> will bless your changes if you do.

I agree with you here. Hence I created a new patch that concentrates on the fourth case, explained in the other mail. AFAIS it does not break anything. Let me explain:

As it only modifies your attachment system now, everything is the same if -attach is not specified. If -attach is specified, the situation is this:

    no attachment hdr + body contains only ASCII  ->  sent as is
    attachment hdr    + body contains only ASCII  ->  MIMEified
    attachment hdr    + body contains non-ASCII   ->  MIMEified
    no attachment hdr + body contains non-ASCII   ->  MIMEified

Only the fourth case changes. Additionally, the body text will be sent with a correct mime-type in any case. Until now, it was sent as application/octet-stream in the third case.

The relation to `mime' at the whatnow prompt:

One surely wants to unset automimeproc when using -attach. Running `mime' at the whatnow prompt is usually not needed, as Jon's attachment system handles it automatically. Collisions only occur if an attachment header is present in the mail and one runs `mime' at the whatnow prompt. Whether the body text contains non-ASCII chars is irrelevant; it works as expected in both cases.

As long as one does not add attachment headers to a specific draft, one is able to use any mhbuild directives (/^#/) when running `mime' at the whatnow prompt afterwards.
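The four cases above can be sketched in C as follows. This is only an illustration of the decision, not the code from the patch: struct body_info, scan_body() and needs_mime() are hypothetical names, and the sketch assumes the whole body is available as one NUL-terminated string, whereas the patch reads it line by line with get_line(). It also treats newlines as blank, which the patch does not need to do.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a draft body (names are not from nmh). */
struct body_info {
    bool has_body;   /* at least one non-blank character seen */
    bool non_ascii;  /* at least one byte outside 7-bit ASCII seen */
};

/* Scan the body once, stopping at the first non-ASCII byte. */
static struct body_info
scan_body(const char *body)
{
    struct body_info info = { false, false };
    const unsigned char *p;

    for (p = (const unsigned char *)body; *p != '\0'; p++) {
        if (*p != ' ' && *p != '\t' && *p != '\n')
            info.has_body = true;
        if (*p > 0x7F) {
            info.non_ascii = true;
            break;
        }
    }
    return info;
}

/* Only "no attachment header and ASCII-only body" is sent as is;
 * the other three cases are MIMEified. */
static bool
needs_mime(bool has_attachment_hdr, struct body_info info)
{
    return has_attachment_hdr || info.non_ascii;
}
```

For example, needs_mime(false, scan_body("hello\n")) is false (sent as is), while needs_mime(false, scan_body("caf\xc3\xa9\n")) is true, which is the fourth case.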
Further work:

The documentation currently does not cover my changes. There is not much to change, and I would like to do that if the proposed changes are accepted.

More complex MIME structures than ``text followed by attachments'' are not possible with Jon's attachment system. (Just as they are not with most MUAs.) One needs to create them with mhbuild directives and run `mime' manually. (For forwarding messages, see below.)

Jon's attachment system still needs mhshow-suffix- entries or it will be really dumb. This is something that should be covered separately, maybe by a conceptual redesign (automatic detection, mailcap, ...).

Forwarding messages in MIME format could be added to Jon's system in a way similar to what I proposed initially. I believe this would be possible without breaking stuff. We would need to add -attach to forw(1).

meillo

> P.S. I'm trying to honor the way that your name appears in your mail header.
> Do you really want it to be "markus" or should it be "Markus"?

Usually, I prefer ``meillo'' because that's a nearly unique identifier. If you want to use my real name, I don't care if you spell it in lower-case or with a capital `M'. More important is honoring my work by mentioning my name in the ChangeLog or commit messages.
;-)

diff --git a/uip/sendsbr.c b/uip/sendsbr.c
index 57ef007..8f5f2e1 100644
--- a/uip/sendsbr.c
+++ b/uip/sendsbr.c
@@ -196,6 +196,7 @@ attach(char *attachment_header_field_name, char *draft_file_name,
     int     c;                  /* current character for body copy */
     int     has_attachment;     /* draft has at least one attachment */
     int     has_body;           /* draft has a message body */
+    int     non_ascii;          /* msg body contains non-ASCII chars */
     int     length;             /* length of attachment header field name */
     char    *p;                 /* miscellaneous string pointer */
 
@@ -228,29 +229,36 @@ attach(char *attachment_header_field_name, char *draft_file_name,
         if (strncasecmp(field, attachment_header_field_name, length) == 0 && field[length] == ':')
             has_attachment = 1;
 
-    if (has_attachment == 0)
-        return (DONE);
-
     /*
-     * We have at least one attachment. Look for at least one non-blank line
-     * in the body of the message which indicates content in the body.
+     * Check if body contains at least one non-blank char (= not empty)
+     * and if it contains non-ASCII chars (= need MIME).
+     * We MIMEify the message also if the body contains non-ASCII text.
      */
 
     has_body = 0;
+    non_ascii = 0;
 
     while (get_line() != EOF) {
         for (p = field; *p != '\0'; p++) {
-            if (*p != ' ' && *p != '\t') {
+            if (*p != ' ' && *p != '\t')
                 has_body = 1;
+            if (*p > 127 || *p < 0) {
+                non_ascii = 1;
                 break;
             }
         }
-
-        if (has_body)
+        if (non_ascii)
             break;
     }
 
     /*
+     * Bail out if there are no attachments and only ASCII text.
+     * This means we don't need to convert it to MIME.
+     */
+    if (!has_attachment && non_ascii == 0)
+        return (DONE);