Re: Unsupported nroff macros on MacOS X
>I am ... concerned about depending on pandoc, because of this:
>
> Pandoc is available in lxplus, aiadm and most RPM repositories. It's
> written in Haskell, which means it relies on hundreds of megabytes of
> library dependencies.

That's certainly fair, but wouldn't it need to be used only once, after
which the documentation could be maintained in markdown format? I suppose
that would require a tool to go from markdown to man, but at least it's a
thought.

>I have no objection to Markdown but I'm not sure what it would gain us
>exactly, other than maybe someone younger than 35 could edit the
>documentation.

That may be the point -- or not, I suppose, depending on one's point of
view. (I'm far past the point of being under 35 myself, for what that's
worth.)

     - Steven
-- 
Steven Winikoff      | "The cure for boredom is curiosity.
Montreal, QC, Canada |  There is no cure for curiosity."
s...@smwonline.ca    |
http://smwonline.ca  |                       - Dorothy Parker
Re: Unsupported nroff macros on MacOS X
>In a more practical sense, I am not sure there is anyone with the free >cycles to convert the current man pages into some other markup language. This seems like the sort of thing that should be possible to automate, and that question has been raised before. A quick search turned up the following, among others: https://stackoverflow.com/questions/13433903/convert-all-linux-man-pages-to-text-html-or-markdown https://jeromebelleman.gitlab.io/posts/publishing/manpages/ - Steven -- ___ Steven Winikoff | "Science is built upon facts, as a house is Montreal, QC, Canada | built of stones; but an accumulation of s...@smwonline.ca | facts is no more a science than a heap of http://smwonline.ca | stones is a house." | - Henri Poincaré
Re: new release
on Manjaro (21.2.6, "Qonos"): === All 118 tests passed === - Steven -- _______ Steven Winikoff | Montreal, QC, Canada | "It's amazing how much 'mature wisdom' s...@smwonline.ca | resembles being too tired." http://smwonline.ca | | - Robert Heinlein
Re: mhfixmsg character set conversion
>This should fix it for you, Steven: It does. Thanks again! I really appreciate your help with this. - Steven -- ___ Steven Winikoff | Sometimes you will never know the value Montreal, QC, Canada | of a moment until it becomes a memory. s...@smwonline.ca | http://smwonline.ca | - Dr. Seuss
Re: mhfixmsg character set conversion
>> So having searched and found it, don't send it on. :-) > >Very good advice. Another good reason to retain the unmodified message. ...which I always do. Sending it on isn't the point anyway (for me, I mean), and is something I rarely do -- and when I do it, I usually quote selectively rather than forward an entire message. For the few exceptions, an unmodified copy of the original message is available as a backup. - Steven -- ___ Steven Winikoff | Montreal, QC, Canada | "What I want is all of the power and none s...@smwonline.ca | of the responsibility." http://smwonline.ca | | - fortune(6)
Re: mhfixmsg character set conversion
>To be fair ... that's completely permitted according to the spec! That doesn't make it a good idea. :-) >The decoder used by the format engine can deal with that (but as >David mentioned, it's only designed to convert stuff to the native >character set). The problem here isn't the character set conversion, but decoding from quoted-printable (in this case, or whatever encoding is used in general). The goal is, and has always been, to save the message in a format that can be searched easily at a later date. Anything that interferes with that unnecessarily is perverse, at least in my opinion. :-/ - Steven -- ___ Steven Winikoff | "The man who has ceased to learn ought Montreal, QC, Canada | not to be allowed to wander around s...@smwonline.ca | loose in these dangerous days." http://smwonline.ca | | - M. M. Coady
Re: mhfixmsg character set conversion
>>Subject: Re: [KCBExec] >> =?utf-8?q?Fwd=3A_r=C3=A9pertoire_des_ensembles_musicau?= >> =?utf-8?q?x?= > >Well, mhfixmsg doesn't expect a mix of unencoded and encoded text. And it really shouldn't have to. But some senders are really, really perverse. :-( >I'll look into it. Thank you! - Steven -- _______ Steven Winikoff | Sometimes you will never know the value Montreal, QC, Canada | of a moment until it becomes a memory. s...@smwonline.ca | http://smwonline.ca | - Dr. Seuss
Re: mhfixmsg character set conversion
>Thank you for reporting the issue you observed and working to improve >mhfixmsg! I'm very happy to do what I can. ...but I'm less happy to have to report that I just ran into a new problem. In particular, I just received a message with this header: Subject: Re: [KCBExec] =?utf-8?q?Fwd=3A_r=C3=A9pertoire_des_ensembles_musicau?= =?utf-8?q?x?= I ran that through mhfixmsg -decodetext 8bit -decodetypes text -textcharset utf-8 \ -reformat -fixcte -fixboundary -noreplacetextplain\ -fixtype application/octet-stream -noverbose \ -decodeheaderfieldbodies utf-8\ -file "${source}" -outfile "${tf}.fixed" ...but the Subject header came through unchanged. What am I missing? - Steven -- _______ Steven Winikoff | "Science is built upon facts, as a house is Montreal, QC, Canada | built of stones; but an accumulation of s...@smwonline.ca | facts is no more a science than a heap of http://smwonline.ca | stones is a house." | - Henri Poincaré
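For illustration only: the Subject header quoted above mixes unencoded text with two RFC 2047 encoded words. The sketch below shows what a fully decoded version of that field would look like, using the Python standard library's email.header module. It is not how mhfixmsg (or the decode_headers program mentioned elsewhere in this thread) is implemented, and the function name decode_field is made up for the example.

```python
from email.header import decode_header, make_header


def decode_field(raw: str) -> str:
    """Decode any RFC 2047 encoded words found in a header field body."""
    # decode_header() splits the value into (data, charset) chunks and drops
    # the whitespace that separates two adjacent encoded words;
    # make_header() joins the chunks back into one Unicode string.
    return str(make_header(decode_header(raw)))


# The header from the message above, written on one line with a single
# space between the two encoded words.
subject = ("Re: [KCBExec] "
           "=?utf-8?q?Fwd=3A_r=C3=A9pertoire_des_ensembles_musicau?= "
           "=?utf-8?q?x?=")

if __name__ == "__main__":
    print(decode_field(subject))
```

The same function also handles a header made up solely of US-ASCII encoded words, which is the other case discussed in this thread.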
Re: mhfixmsg character set conversion
>Commit a73f7f08a07e09200f320a734233ab0293e8f428. Steven, this >should decode your ASCII-encoded header field bodies. I just tested it, and I can confirm that it does. >Of course it didn't end up being that simple. Nothing ever does. :-/ >A possible future enhancement would be to convert to any specified >charset. And maybe repurpose the argument of the mhfixmsg >-decodeheaderfieldbodies switch to specify the destination charset. Those sound like good ideas in general, but not ones I'd personally expect to use. ...but I'm very happy with how mhfixmsg behaves right now. :-) Thank you! - Steven -- _______ Steven Winikoff | Montreal, QC, Canada | "If at first you don't succeed, transform s...@smwonline.ca | your data set." http://smwonline.ca | | - fortune(6)
Re: mhfixmsg character set conversion
>>[ regarding decoding of encoded ASCII in headers] > >Ok, I'll add support for it to mhfixmsg -decodeheaderfieldbodies utf8. Thank you! >When I look at the message in the lists.nongnu.org archive [1], the >line isn't too long. But it's not folded, either. The continuation >is on separate line with no leading whitespace. Something got lost in translation. In the original message (as saved by procmail before being munged in any way), it was one long line, with exactly one space between the end of the first encoded portion and the beginning of the second one. The relevant excerpt (with parts elided to keep the whole short enough here for purposes of illustration) is Subject: =?US-ASCII?Q?Using_[...]_to_mak?= =?US-ASCII?Q?e_text_[...]?= - Steven -- _______ Steven Winikoff | Montreal, QC, Canada | "/Earth is 98%% full. Please delete anyone s...@smwonline.ca | you can." http://smwonline.ca | | - fortune(6)
Re: mhfixmsg character set conversion
>That's because -decodeheaderfieldbodies utf8 only decodes UTF-8 text.

That makes sense. I'd forgotten that utf-8 is a mandatory argument for
"-decodeheaderfieldbodies".

>There was a reason for only allowing decoding of UTF-8 header field
>bodies. If any character set could be decoded, it would be possible
>to produce header field bodies with embedded nulls,

I didn't know that.

>we could decode ASCII because 1) we've seen it in the wild, 2) it seems as
>harmless as it is pointless to encode ASCII as ASCII, assuming no NULs,
>and 3) it's a proper subset of UTF-8 so it doesn't interfere with the
>semantics of the "-decodeheaderfieldbodies utf8" switch.

That also makes sense.

>Any other suggestions?

No, but then I've never noticed any encoded headers that weren't utf-8 or
ASCII. And I agree that encoding ASCII seems pointless, but that doesn't
stop my Android mail app from doing it anyway. :-/

>So I'm curious, why is the ASCII encoded as ASCII? Why not just fold
>the header as usual?

I have absolutely no idea. This isn't a configurable choice as far as I'm
aware; it's just something that the app does. If you're curious, it's
called "K-9 Mail":

   https://play.google.com/store/apps/details?id=com.fsck.k9=en_CA=US
   https://k9mail.app/
   https://github.com/k9mail/k-9

>This line is too long, I'm not sure if that is related or if it's a
>separate issue:

It's probably related. I can't prove that, but in general, shorter subject
lines appear to be passed through without encoding.

Regardless, this kind of thing is exactly what I'm trying to eliminate in
my saved messages. I just realized that my decode_headers program doesn't
detect the second encoded string in the same header, but I'm about to go
fix that. :-)

     - Steven
-- 
Steven Winikoff      |
Montreal, QC, Canada | "Any teacher who _can_ be replaced by a
s...@smwonline.ca    |  machine, _should_ be."
http://smwonline.ca  |
                     |                       - Arthur C. Clarke
Re: mhfixmsg character set conversion
>> [ re -decodeheaderfieldbodies ] > >Ok, couple of issues, both due to very limited support of >encoded formats by -decodeheaderfieldbodies. I'll work on >them. Thank you. >Note that the only encoded headers in your message are >us-ascii, that seems pointless. In the case of that particular message, the encoded headers are ones that I'd almost never want to search for anyway. But today I sent myself a message using an IMAP-based app on my phone, resulting in the appended, and I'd definitely want to decode the Subject: header. Unfortunately, running it through mhfixmsg results in the message coming back unchanged. Is that specifically about -decodeheaderfieldbodies, or is mhfixmsg doing nothing because the message body is already unencoded text/plain? - Steven 8<- cut here >8 >From s...@smwonline.ca Sun Feb 13 15:03:01 2022 Return-Path: Received: from server03.4goodhosting.com (198.178.116.238:993) by mort.smwonline.ca with IMAP4-SSL; 13 Feb 2022 20:03:01 - Delivered-To: s...@smwonline.ca Received: from server03.4goodhosting.com by server03.4goodhosting.com with LMTP id qHlDCdljCWKQfgAA2eRUeQ (envelope-from ) for ; Sun, 13 Feb 2022 15:02:33 -0500 Envelope-to: s...@smwonline.ca Delivery-date: Sun, 13 Feb 2022 15:02:33 -0500 Received: from mort.smwonline.ca ([206.248.137.116]:59412 helo=[127.0.0.1]) by server03.4goodhosting.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1nJL4r-Oz-0y for s...@smwonline.ca; Sun, 13 Feb 2022 15:02:33 -0500 Date: Sun, 13 Feb 2022 15:02:32 -0500 From: Steven Winikoff To: s...@smwonline.ca Subject: =?US-ASCII?Q?Using_the_Linux_fold_command_to_mak?= =?US-ASCII?Q?e_text_more_readable_=7C_Network_World?= User-Agent: K-9 Mail for Android Message-ID: <43c6f911-0ded-4953-897b-3a5cffaf9...@smwonline.ca> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-getmail-retrieved-from-mailbox: INBOX 
https://www.networkworld.com/article/3646748/using-the-linux-fold-command-to-make-text-more-readable.amp.html 8<- cut here >8 -- _______ Steven Winikoff | Montreal, QC, Canada | "Believe that life is worth living, and s...@smwonline.ca | your belief will help create the fact." http://smwonline.ca | |- William James
Re: mhfixmsg character set conversion
>>1) What's the best replacement for elinks?
>
>mhn.defaults.sh looks for text/html helpers in this order:
>1. w3m
>2. lynx
>3. elinks
>
>I don't know if one is necessarily "better" than another.

I've tried all of w3m, elinks, links and lynx at different times, and I'd
settled on elinks as the one that reproduced the HTML most accurately
(subject to the obvious limitations of a text terminal, but you know what
I mean). I no longer remember what the differences were, but they probably
had to do with HTML constructions.

Anyhow, accuracy/faithfulness to the original is what I meant by "best".
I've settled on w3m for now, subject to further testing.

>If you have suggestions on how to improve the arguments that mhn.defaults.sh
>uses for elinks, please let us know.

If I can make elinks do what I need, I certainly will; however, at least
for now it looks like I'm unable to accomplish that, which is why I've
switched to w3m.

For the record, my w3m invocation looks like this:

   w3m -I ${cset} -T text/html -dump -s -o display_link_number=1 \
       -o color=1 -graph ${html} | sed 's/^ //;s/[ ]*$//'

...where ${cset} is the character set assigned in .mh_profile, and ${html}
contains the HTML code to be rendered.

>>2) Should I replace my 1.7.1 installation by the version I just built?
>>   Basically I'm asking what benefits the current snapshot has over
>>   1.7.1,
>
>See docs/pending-release-notes.

Thanks, I will.

>>   and how far away the next numbered release might be.
>
>Unknown. Ken appears to be busy. One of us here could push it out. It's
>been almost 4 years so I think that would be a good idea. Perhaps after
>things here settle down a bit.

Please let me clarify that I wasn't trying to rush anything or put pressure
on anyone; I was just asking for an estimate, because that would help me
decide whether or not to wait for it.
>>3) How can I guarantee that messages will be saved with quoted-printable >> or base64 parts decoded, without patching mhfixmsg to deal with >> messages in which the decoded text would be more than 998 characters >> long? > >I don't know your reason for patching mhfixmsg. At the time I didn't understand how or why to use -decodetext binary, so the patch was the only way I could find to guarantee that text/html parts would be decoded, no matter how badly formatted the HTML is (and by then I'd already discovered just how bad that can be :-/). >IIRC, you were using -decodetext 8bit; binary instead of 8bit might help. Yes, I understand that now, though I still have the question you answered below about the practical difference between binary and 8bit. >> - Why wasn't the text/html part converted to utf-8? > >mhfixmsg only converts the character set of text/plain. That was a >design decision. Other subtypes can be extracted with mhstore and run >through iconv. If there's a use for converting them in place in >mhfixmsg, it wouldn't be difficult but I'm not sure how useful it >would be. It would be useful for me, because some messages don't have a text/plain part, and my main motivation for storing the decoded text is the ability to search it with grep and mairix. ...but I can modify my shell script to run mhstore and iconv as you suggest, so for me having a modified mhfixmsg would be nice but not actually necessary. >> - Regardless of the answer to the previous question, after a >>message has been refiled (and assuming I'm not planning to >>resend it to anyone), is there a practical difference between >>binary and 8bit encoding? > >"Note that -decodetext binary can produce messages that are not compliant >with RFC 5322, §2.1.1." 
Understood (you made it clear when I first asked about the 998-character limit that my patch has the same effect), but I don't care; I'm storing messages in case I need to reread them later, and if I ever need to resend something that wouldn't be compliant (and so far I can't remember that ever happening), I'd be sending the converted plain text anyway. >Is it a proper MIME message (does mhfixmsg return with a non-zero exit >status)? If so, can you send it to me off-line? It's the same message I already sent to you, that I've been using as a test case all through this discussion. I just checked, and mhfixmsg returns a zero exit status for it. - Steven -- ___ Steven Winikoff | Montreal, QC, Canada | "It is never too late to be what you might s...@smwonline.ca | have been." http://smwonline.ca | - George Eliot
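As a side note on the 998-character limit discussed above: RFC 5322, section 2.1.1 requires that no line of a message exceed 998 characters, excluding the CRLF. The following is a minimal sketch, not part of nmh, of how one might check whether a stored message contains such lines. The function name overlong_lines is invented for the example, and it counts octets rather than characters.

```python
MAX_LINE = 998  # RFC 5322 section 2.1.1: per-line hard limit, excluding CRLF


def overlong_lines(message: bytes, limit: int = MAX_LINE):
    """Return a list of (line_number, length) for every line over the limit."""
    found = []
    for number, line in enumerate(message.split(b"\n"), start=1):
        # Do not count the CR of a CRLF line ending.
        length = len(line.rstrip(b"\r"))
        if length > limit:
            found.append((number, length))
    return found


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_lines.py < message_file
    for number, length in overlong_lines(sys.stdin.buffer.read()):
        print(f"line {number}: {length} octets")
```

An empty result means the message stays within the limit; a non-empty one is what -decodetext binary can produce when badly formatted HTML is decoded.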
Re: mhfixmsg character set conversion
>Thanks again, that has > >$ g 'seems not' *patch >+L"Warning: Locale seems not configured\n"); >$ > >Note the ‘L’; it's wide. > >$ tr -d \\000 > LC_ALL=C egrep -oa '[ -~]*seems not configured' >Warning: Locale seems not configured >$ >$ sed -n l paraur | egrep -A7 'W[\\0]+a[\\0]+r[\\0]+n' >\000\000\000\000\000\000\000W\000\000\000a\000\000\000r\000\000\000n\ >\000\000\000i\000\000\000n\000\000\000g\000\000\000:\000\000\000 \000\ >\000\000L\000\000\000o\000\000\000c\000\000\000a\000\000\000l\000\000\ >\000e\000\000\000 \000\000\000s\000\000\000e\000\000\000e\000\000\000\ >m\000\000\000s\000\000\000 \000\000\000n\000\000\000o\000\000\000t\ >\000\000\000 \000\000\000c\000\000\000o\000\000\000n\000\000\000f\000\ >\000\000i\000\000\000g\000\000\000u\000\000\000r\000\000\000e\000\000\ >\000d\000\000\000$ >$ I think that clears up the last mystery remaining from the questions I raised. Thank you for your help throughout! - Steven -- ___ Steven Winikoff | "To do each day two things one dislikes is Montreal, QC, Canada | a precept I have followed scrupulously; s...@smwonline.ca | every day I have got up and I have gone http://smwonline.ca | to bed." | - W. Somerset Maugham
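The reason the earlier egrep found nothing is visible in the sed output above: the string was compiled as a wide (L"...") literal, so each character occupies four bytes in the binary. A small sketch of the same check in Python follows. It assumes a 4-byte little-endian wchar_t, which matches the W\000\000\000 pattern shown above, and the function name find_wide is made up for the example.

```python
def find_wide(blob: bytes, text: str):
    """Return (narrow_offset, wide_offset) of text in blob, -1 if absent.

    narrow: one byte per character, as an ordinary C string literal.
    wide:   four bytes per character, little-endian (assumed wchar_t layout).
    """
    narrow = blob.find(text.encode("ascii"))
    wide = blob.find(text.encode("utf-32-le"))
    return narrow, wide


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python find_wide.py /usr/bin/paraur
    with open(sys.argv[1], "rb") as handle:
        print(find_wide(handle.read(), "seems not configured"))
```

Deleting the NUL bytes first, as the tr command above does, is the shell equivalent of searching for the wide form.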
Re: mhfixmsg character set conversion
>The file has UTF-8 and later ISO 8859-1. Another point that should have been obvious to me, and is in hindsight, is that I can't expect vim to detect the character set properly for something like this. :-/ >There's no BOM so ucs-bom fails. The ISO 8859-1 bytes don't happen to >be valid UTF-8. ‘default’ means use your environment, which is probably >UTF-8 again; fails. Which means we arrive at ‘latin1’, AKA ISO 8859-1, >which is happy. Happy, and just as half-correct as utf-8 would have been. Meanwhile, I did a web search based on what you wrote here, and discovered https://vim.fandom.com/wiki/Working_with_Unicode ...which confirms everything you wrote, but also https://stackoverflow.com/questions/25115752/vim-encodings-latin1-and-utf-8 ...which suggests using this command in vim to force it to reload the file in utf-8 encoding: :e ++enc=utf-8 path_to_file Of course this can also be done directly from the command line as vim -c "e ++enc=utf-8" path_to_file More interestingly, when vim reopens the file (or just opens it, in the latter case) in utf-8, it emits this status line message: "/tmp/nmh_testing/bad" [ILLEGAL BYTE in line 289] 336 lines, 49366 bytes ...and of course the line in question contains accented characters encoded in ISO 8859-1, so everything is consistent. >> ...but in bash, although the line gets pasted, the newline at the end >> of it somehow doesn't. > >Another difference is the pasted text is normally highlighted in some >way, e.g. inverse video, until it's committed with Enter. In my experience with tcsh, the inverse video highlighting stays in place even after the paste is committed, and remains so until something else is highlighted. This appears to be the case for me in bash (invoked as sh just now), at least with enable-bracketed-paste turned off. 
- Steven -- ___ Steven Winikoff | "Science is built upon facts, as a house is Montreal, QC, Canada | built of stones; but an accumulation of s...@smwonline.ca | facts is no more a science than a heap of http://smwonline.ca | stones is a house." | - Henri Poincaré
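The "ILLEGAL BYTE in line 289" message above can be cross-checked outside vim. This is a minimal sketch, assuming the file is meant to be UTF-8 throughout; the function name first_non_utf8_line is invented for the example and this is not how vim performs its detection.

```python
def first_non_utf8_line(data: bytes):
    """Return the 1-based number of the first line that is not valid UTF-8.

    Returns None when every line decodes cleanly.
    """
    # Splitting on the newline byte is safe: 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            return number
    return None


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python first_bad_line.py /tmp/nmh_testing/bad
    with open(sys.argv[1], "rb") as handle:
        print(first_non_utf8_line(handle.read()))
```

For a file that is UTF-8 in its first part and ISO 8859-1 later, the reported line is the first one containing an accented ISO 8859-1 character.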
Re: mhfixmsg character set conversion
= -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/usr/lib/gconv/gconv-modules", O_RDONLY|O_CLOEXEC) = 5 openat(AT_FDCWD, "/usr/lib/gconv/ISO8859-1.so", O_RDONLY|O_CLOEXEC) = 5 openat(AT_FDCWD, "/home/smw/Mail/mhfixmsguEmroo", O_RDWR|O_CREAT|O_EXCL, 0600) = 5 openat(AT_FDCWD, "/home/smw/Mail/mhfixmsgkbIBCl", O_RDONLY) = 6 mhfixmsg: /home/smw/Mail/mhfixmsgsLWrjg part 2, convert UTF-8 to UTF-8 openat(AT_FDCWD, "/home/smw/Mail/mhfixmsguEmroo", O_RDONLY) = 5 openat(AT_FDCWD, "/home/smw/Mail/mhfixmsgnhCjdt", O_RDONLY) = 5 +++ exited with 0 +++ 8<- cut here >8 -- ___ Steven Winikoff | "When things are not as they appear to be, Montreal, QC, Canada | it's because they're actually simpler s...@smwonline.ca | than you think them to be." http://smwonline.ca | - Robert Rankin, in The Hollow | Chocolate Bunnies of the Apocalypse
Re: mhfixmsg character set conversion
>> > Can you try searching par again, this time with
>> >
>> > file /usr/bin/par
>> > env LC_ALL=C egrep -boa 'seems not configured' /usr/bin/par
>>
>> Done, but this still produces no output.
>
>I'd have at least expected some output from file(1). ;-)

Details. :-)

   $ file /usr/bin/paraur
   /usr/bin/paraur: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=e91dc39316ca8e55a7e0037200805128b4e8b5a6, for GNU/Linux 3.2.0, stripped

That's par from AUR, renamed after installing 1.53.0 from source.

>If you still have the par from AUR which you've been using all this
>time, could you make it available to me and I'll have a look. An email
>off list, or a URL, or... I don't mind.

I'll send it to you privately, but we can do better than that. I just
thought to check archive.org for the patch, and found it at

   https://web.archive.org/web/20211124173449if_/http://sysmic.org/dl/par/par-1.52-i18n.4.patch

...so you can examine the source code directly.

     - Steven
-- 
Steven Winikoff      |
Montreal, QC, Canada | Life is uncertain.
s...@smwonline.ca    | Eat dessert first!
http://smwonline.ca  |
Re: mhfixmsg character set conversion
>I would do this if you haven't already: >1. download nmh HEAD, build, and install somewhere >2. move your $(mhpath +)/mhn.defaults >3. move your profile and create one with just a Path: entry >4. run the "mhfixmsg -file original_copy -out -" from 1. and see if the > output looks good or bad I just tried this, and a couple of other things, but only after installing par 1.53.0 from source and using that to replace the AUR binary. Here's what I learned: 1) Replacing par does indeed fix one of the three failed tests. I can send you the details, but I seem to recall that you already have them from Valdis Klētnieks; please let me know if I should forward them anyway. 2) After running make install, the newly built mhfixmsg produces correct output. But so does nmh-1.7.1 mhfixmsg when compiled without my patch. 3) Step (3) above was the key, and it turned out that I was being misled by this .mh_profile entry: mhshow-show-text/html: html_to_text %F | cat - ...where html_to_text is a shell script that basically just runs this command: elinks -force-html -dump -dump-charset utf-8 ${html} Removing this profile entry causes the message to be displayed correctly -- both the original, unmodified version, and the one that was saved after being converted by my patched version of nmh-1.7.1 mhfixmsg. That's pretty conclusive evidence that I'd been looking in the wrong place all along. :-( The man page for elinks describes -dump-charset as follows: -dump-charset (alias for document.dump.codepage) Codepage used when formatting dump output. Interestingly, when I restored the mhshow-show-text/html .mh_profile entry and modified my shell script to run elinks without this option, I still saw the same doubly encoded output. 
So next I tried passing the character set to my script as follows: mhshow-show-text/html: html_to_text %{charset} %F ...and changed the script to use the provided character set rather than forcing utf-8: elinks -force-html -dump -dump-charset $1 ${html} This failed differently. Instead of rendering the message with '�' marking undisplayable characters, it used '*' instead. Somehow, I don't consider that to be much of an improvement. :-/ ...so clearly I need to replace elinks in my html_to_text script, and doing that will solve the problem that prompted this discussion, leaving the following questions: 1) What's the best replacement for elinks? 2) Should I replace my 1.7.1 installation by the version I just built? Basically I'm asking what benefits the current snapshot has over 1.7.1, and how far away the next numbered release might be. 3) How can I guarantee that messages will be saved with quoted-printable or base64 parts decoded, without patching mhfixmsg to deal with messages in which the decoded text would be more than 998 characters long? I used the current mhfixmsg with the test message I've been using throughout this discussion, with this command line: /tmp/nmh/root/bin/mhfixmsg \ -decodeheaderfieldbodies utf-8 -decodetext binary \ -decodetypes text -textcharset UTF-8 -reformat \ -fixcte -fixboundary -noreplacetextplain \ -fixtype application/octet-stream \ -verbose -file $source -outfile $destination ...and that resulted in these headers after decoding: - for the text/plain part: Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="UTF-8" - for the text/html part: Content-Transfer-Encoding: binary Content-Type: text/html; charset=iso-8859-1 That raises some further questions: - Why wasn't the text/html part converted to utf-8? - Regardless of the answer to the previous question, after a message has been refiled (and assuming I'm not planning to resend it to anyone), is there a practical difference between binary and 8bit encoding? 
- Why are the headers of the decoded message identical to those of the input, despite the use of -decodeheaderfieldbodies? (...and yes, the unmodified version of the message does contain some encoded headers that my decode_headers program found and decoded; mhfixmsg appears not to have done so). Thanks, - Steven -- _______ Steven Winikoff | "'Somebody, SOMEBODY Montreal, QC, Canada | Has to, you see.' s...@smwonline.ca | Then she picked out two Somebodies. http://smwonline.ca | Sally and me." |- Dr. Seuss
Re: mhfixmsg character set conversion
> | in bash, although the line gets pasted, the newline at the end of it > | somehow doesn't. When > >This is a recent bash (IMO mis-)feature - I absolutely agree with your opinion on this topic. >I believe there's an option (run time) to return to sanity, as I think I >set it ... but just now I cannot find it! A quick web search for "bash xterm paste newline" turns up https://unix.stackexchange.com/questions/633370/bash-needs-another-newline-to-execute-pasted-lines The summary is to add the following entry to ~/.inputrc set enable-bracketed-paste off I just tried this, and it works. Thank you for pointing that out! - Steven -- ___ Steven Winikoff | "Science is built upon facts, as a house is Montreal, QC, Canada | built of stones; but an accumulation of s...@smwonline.ca | facts is no more a science than a heap of http://smwonline.ca | stones is a house." | - Henri Poincaré
Re: mhfixmsg character set conversion
>>- run ~smw/bin/decode_headers using $source as stdin (this explicitly >> decodes headers which are RFC 2047-encoded, and passes the body >> through unchanged) > >This sounds like the kind of thing which might insert bytes which alter >vim's idea of the ‘fileencoding’. Given > >To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= > >as taken from RFC 2047, is it going to put in a byte 0xf8 for ISO 8859-1 >encoding, or 0xc3 0xb8 for UTF-8? I didn't know, so I just tried it. Here's what happens: # decode_headers < rfc2407_test_header > converted_rfc2407_header # cat converted_rfc2407_header To: Keld Jørn Simonsen # hexdump -C converted_rfc2407_header 54 6f 3a 20 4b 65 6c 64 20 4a c3 b8 72 6e 20 53 |To: Keld J..rn S| 0010 69 6d 6f 6e 73 65 6e 20 3c 6b 65 6c 64 40 64 6b |imonsen .| 0028 ...so it writes 0xc3 0xb8, which I believe is what it should be doing. - Steven -- ___ Steven Winikoff | "The most exciting phrase to hear in Montreal, QC, Canada | science, the one that heralds new s...@smwonline.ca | discoveries, is not 'Eureka!' (I found http://smwonline.ca | it!), but 'That's funny...'" | - Isaac Asimov
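The 0xf8 versus 0xc3 0xb8 question can be reproduced in a few lines. The sketch below uses Python's standard email.header module, not the decode_headers program itself, and works on the encoded word alone; the function name reencode_as_utf8 is made up for the example.

```python
from email.header import decode_header


def reencode_as_utf8(encoded_word: str):
    """Return (original_bytes, utf8_bytes) for one RFC 2047 encoded word."""
    # A single encoded word yields exactly one (bytes, charset) chunk.
    (payload, charset), = decode_header(encoded_word)
    text = payload.decode(charset)      # interpret the bytes in the stated charset
    return payload, text.encode("utf-8")


if __name__ == "__main__":
    original, utf8 = reencode_as_utf8("=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?=")
    print(original.hex(" "))  # bytes as transmitted, in ISO 8859-1
    print(utf8.hex(" "))      # the same text re-encoded as UTF-8
```

The first line of output contains the single byte f8 for the letter "ø", and the second contains the pair c3 b8, consistent with the hexdump above.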
Re: mhfixmsg character set conversion
>I assume vim(1) will read up to a certain amount until it either makes up >its mind or assumes the default. That makes sense. >Try this to remove the boring ASCII bytes and see what's left. > >tr -d ' -~' https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character >describes ‘�’ and it's being seen above because cut(1) is cutting bytes >and the ‘108:’ at the start of the line has shifted the 68/69 cut-off >point to part-way through the UTF-8 for a single code point AKA rune. For me, this falls into the category of "things that are perfectly obvious, but only after they've been explained". Thank you for explaining it. >Try > >sh >LC_ALL=C; export LC_ALL >locale >perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet Done, and I just learned something interesting. First, the output looks like this: sh-5.1$ LC_ALL=C; export LC_ALL sh-5.1$ locale LANG=en_CA.UTF-8 LC_CTYPE="C" LC_NUMERIC="C" LC_TIME="C" LC_COLLATE="C" LC_MONETARY="C" LC_MESSAGES="C" LC_PAPER="C" LC_NAME="C" LC_ADDRESS="C" LC_TELEPHONE="C" LC_MEASUREMENT="C" LC_IDENTIFICATION="C" LC_ALL=C sh-5.1$ perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet Veuillez ne pas rpondre au prsent courriel. Il a t gnr Second, the problem with the original command appearing to hang turns out to be an interaction between bash and xterm's pasting mechanism(!). I'm accustomed to pasting a command line by triple-clicking to select the whole line, then middle-clicking to paste it. That's how xterm has worked since I first started using it years ago. ...and it still works exactly this way, and the line gets pasted just as I expect, in tcsh. ...but in bash, although the line gets pasted, the newline at the end of it somehow doesn't. When LC_ALL=C perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet originally seemed to hang, in fact it was just waiting for me to press the Enter key! 
I still don't know why this is happening, but at least I'm comforted by
the fact that my bash binary isn't totally broken. :-/

>Beware that invoking bash(1) as ‘sh’ is not the same as running ‘bash’.

I did know that, but thank you for mentioning it just in case.

>Might not make a difference in this case, but in general it's better to
>run whichever is desired.

Right, but in this case sh was what was desired. As I understand it, when
invoked that way bash behaves closer to a real Bourne shell than when
invoked as bash.

>> I propose to forget this particular clupea harengus of the crimson
>> variety unless you find it interesting in and of itself.
>
>It is odd. And odd might affect other things, including to do with nmh.
>:-)

Odd indeed, but apparently only when used interactively with xterm, so nmh
is unlikely to be affected.

     - Steven
-- 
Steven Winikoff      |
Montreal, QC, Canada | "The reward of a thing well
s...@smwonline.ca    |  done is to have done it."
http://smwonline.ca  |
                     |                       - Emerson

[Attachment: tr_output.pdf]
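For readers without perl to hand, the one-liner used in this message (replace every byte outside the printable ASCII range with its value in angle brackets) can be sketched in Python as follows. This is an equivalent written for illustration, not something taken from the thread, and the function name show_non_ascii is invented.

```python
import re


def show_non_ascii(line: bytes) -> bytes:
    """Replace each byte outside ' '..'~' with <xx>, xx being its hex value."""
    # Same character class as the perl version: anything that is not a
    # printable ASCII byte. Working on bytes mirrors running under LC_ALL=C.
    return re.sub(rb"[^ -~]", lambda m: b"<%02x>" % m.group()[0], line)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python show_bytes.py < good_snippet
    for raw in sys.stdin.buffer:
        sys.stdout.buffer.write(show_non_ascii(raw.rstrip(b"\n")) + b"\n")
```

A UTF-8 "é" shows up as two bracketed bytes and an ISO 8859-1 "é" as one, which makes it easy to tell the two encodings apart in a mixed file.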
Re: mhfixmsg character set conversion
>Three tests failed. You can run "make install" in the nmh directory >to install. Thanks. I'll try that later tonight. >If things are still broken with that nmh, I would remove your par[1] >executable and rebuild mhn.defaults. That might at least allow >make check to pass. This seems like a worthwhile thing to test at the same time even if everything else works, so I'll try that also. >And while we seem to have eliminated par and lynx, the symptoms are >consistent with one or both of them being used by mhfixmsg. ...except that my strace of mhfixmsg shows no external programs being run. What am I missing? - Steven -- _______ Steven Winikoff | Montreal, QC, Canada | "He who has imagination without learning s...@smwonline.ca | has wings but no feet." http://smwonline.ca | | - fortune(6)
Re: mhfixmsg character set conversion
>That ‘i18n’ smells given the nature of the other patch I found earlier.

I now understand what you're referring to, and unsurprisingly you're
right.

I didn't realize that par isn't a Manjaro package at all, but in fact
something I installed directly from the Arch User Repository. It's clear
that you already found the AUR page for par, but for the record I'll quote
this very interesting comment from the package maintainer:

   @ifreund notified me back in march via the out-of-date mechanism that
   a new version (1.53) was available upstream¹ (yup, after 19 years
   since the 1.52 release) \o/.

   Unfortunately, the i18n patch that we are applying to the 1.52, and
   which confers par the ability to deal with UTF-8, does not apply
   cleanly to 1.53. The par author has introduced some fixes to the
   locale handling for (single byte) charsets other than US ASCII, but
   no support for multibyte encodings² yet.

   Until the i18n patch gets updated to apply to 1.53 (any uptakers?),
   I would say that we are better off as we are.

I would have guessed that that patch would be a good thing, but apparently
the author of par agrees with you that it isn't, given that the patch was
offered and not accepted.

I'll build 1.53 from source myself before continuing with my testing.

>Assuming Manjaro is just picking this up from Arch Linux,

That's how things work for packages which are included in the Manjaro
repositories, which are separate from those of Arch; however, the AUR is a
community effort which maintains packages not included in the repositories
for Arch (and thus, not in Manjaro or other Arch-derived distributions).

AUR packages are typically downloaded and built from source, although a
few also offer binary downloads -- but par isn't one of those, and since
the patch is no longer available online, the AUR package for it won't even
build anymore. Clearly the patch was available at the time I installed par
as part of the OS installation on this machine, back in December 2019.
>I think this is the shell script which builds the package. >https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=par It is. >Can you try searching par again, this time with > >file /usr/bin/par >env LC_ALL=C egrep -boa 'seems not configured' /usr/bin/par Done, but this still produces no output. - Steven -- ___ Steven Winikoff | "I don't want to run the world; I merely Montreal, QC, Canada | want to own a substantial portion of the s...@smwonline.ca | the preferred stock." http://smwonline.ca | |- Alan Dean Foster, Cat-A-Lyst
Re: mhfixmsg character set conversion
>> >I would look at output from mime_helper and see if it's UTF-8.
>>
>> Please forgive me for having to ask this, but how is mime_helper even
>> involved? Isn't that used only when I read the message? It isn't in
>> the procmail chain that saves the original copy, and it's the original
>> copy that we've been looking at.
>
>I don't know how mime_helper might fit in. The lynx invocation is still
>my pick for the root cause but you said you're not clear on how it is
>involved.

I understand how it's involved for reading a message; the part I don't
understand is how it's involved in the sequence of steps that occurs when
a new message is received. Specifically, to the best of my knowledge:

   1) sendmail hands the message off to procmail

   2) this procmail recipe is activated:

         :0 HBfw
         * ^Content-Type:.*text/
         | /home/smw/bin/email_decoder

      I'll append a copy of email_decoder, but the gist of it is:

         - explicitly unset LC_ALL and set LANG to en_CA.UTF-8

         - save the incoming standard input in $source (a file in /tmp)

         - run ~smw/bin/decode_headers using $source as stdin (this
           explicitly decodes headers which are RFC 2047-encoded, and
           passes the body through unchanged)

         - feed stdout from decode_headers into the same mhfixmsg command
           I've already quoted a few times; I'll quote it again here to
           keep everything in one place:

              mhfixmsg -decodetext 8bit -decodetypes text -textcharset UTF-8 \
                       -reformat -fixcte -fixboundary -noreplacetextplain \
                       -fixtype application/octet-stream -noverbose -file - \
                       -outfile "${tf}.fixed"

           ...where ${tf}.fixed is another, newly created file in /tmp

         - use cmp to compare $source and ${tf}.fixed; if they differ,
           save $source as a new message in +reformatted

The file which started this discussion is the one from +reformatted, and I
still can't see how lynx would have been involved in its creation.

>I would do this if you haven't already:
>1. download nmh HEAD, build, and install somewhere

I got this far, but I've been unable to proceed since the build failed as
described previously. (To be fair, I also haven't had time to try to get
farther as yet.)

>2. move your $(mhpath +)/mhn.defaults
>3. move your profile and create one with just a Path: entry
>4. run the "mhfixmsg -file original_copy -out -" from 1. and see if the
>   output looks good or bad
>
>If it's good, then start adding things back in one at a time in reverse
>order (starting with mhfixmsg switches) until it's bad.

This sounds like an excellent plan, and I intend to follow through with it
on Friday; unfortunately I'll be busy with other things until then.

...although I may need help getting past the build problem.

- Steven

8<- cut here ---->8

#!/bin/sh
#
# email_decoder -- rewrite quoted-printable and base64 text in a message
#
# Steven Winikoff
# 2008/09/11
# 2010/01/22 -- use mhshow to decode
# 2014/05/19 -- always exit with status 0 (see note below)
# 2018/01/22 -- rewrite using mhfixmsg to do the heavy lifting
# 2019/10/17 -- ...and use ~smw/bin/decode_headers to decode RFC 2047
#               headers (for use with procmail, grep and mairix)
#
# Given an email message on standard input with at least one portion
# containing text encoded in base64 or quoted-printable format, the
# object of the game is to send the same message back to stdout with
# the text part(s) decoded.
#
# A copy of the original message will also be saved in +reformatted
# (AKA ~smw/Mail/reformatted/) unless the -t (test mode) option is
# specified.
#
# This is intended to be invoked in a procmail filter recipe.
#
# Note that this is the reason why we always exit with status 0, even
# when something goes wrong; this prevents procmail from cluttering its
# log with messages similar to these:
#
#    procmail: Program failure (3) of "/home/smw/bin/email_decoder"
#    procmail: Rescue of unfiltered data succeeded
#
# usage: email_decoder [-t]
#
#--
# setup:

PATH="/local/paths:/bin:/usr/bin:$PATH"
export PATH

unset LC_ALL; LANG="en_CA.UTF-8"; export LC_ALL LANG

tf="/tmp/decoder.`date +%Y%m%d.%H%M%S.$$`"
trap 'rm -rf ${tf}* >/dev/null 2>&1' 1 2 3 15

save_folder="+reformatted"
test_mode=0

#--
# are we operating in test mode?

if [ ! -z "${1}" ]
then
   # officially test mode is indicated by the -t option, but in
   # practice we'll accept any argument at all to mean test mode
   test_mode=1
fi

#--
# save a cop
Re: mhfixmsg character set conversion
>Typically for me (at least) bad encoded files have been processed to find >'thing' and converted to the Microsoft belief you meant to use the real >pair of quote marks they prefer. Thank you. That helps. >processed by super-smart software. the worst kind. "I was only trying to >help" software. Of course. :-/ You might enjoy this utility, if you haven't already seen it: https://www.fourmilab.ch/webtools/demoroniser/ - Steven -- _______ Steven Winikoff | "There are millions of chords. There are Montreal, QC, Canada | millions of numbers. And everyone forgets s...@smwonline.ca | the one that is a zero. But without the http://smwonline.ca | zero, numbers are just arithmetic. Without | the empty chord, music is just noise." | - Terry Pratchett (Soul Music)
Re: mhfixmsg character set conversion
>> I think Steven says he's running Manjaro which is an Arch Linux spin off, and >> Archers prefer to pass on upstream code unaltered where possible. > >Except that par has been altered? Not by me, at any rate. >I use this version, unaltered: >$ par version >par 1.53.0 $ par version 1.52-i18n.4 $ pacman -Qi par Name: par Version : 1.52-8 Description : Paragraph reformatter Architecture: x86_64 URL : http://www.nicemice.net/par/ Licenses: custom Groups : None Provides: None Depends On : None Optional Deps : None Required By : None Optional For: None Conflicts With : None Replaces: None Installed Size : 98.90 KiB Packager: Unknown Packager Build Date : Mon 06 Jan 2020 12:53:58 AM Install Date: Mon 06 Jan 2020 12:54:19 AM Install Reason : Explicitly installed Install Script : No Validated By: None >> > Do you have any idea where the following warning comes from? >> >> My money's on par(1) given >> >> >> https://inbox.vuxu.org/voidlinux-github/20191027084150.NZqC6wHlZkyQJ7AkACI7juvuCp0AD_u_IIwftMlDmKs@z/T/ > >That sure looks like it. Perhaps, but it isn't. >> Steven, to confirm, try >> >> egrep -l 'seems not configured' /usr/bin/par $ egrep -l 'seems not configured' /usr/bin/par $ echo $? 1 >Steven, I would try removing par from the end of your mhbuild-convert-text/html >entry. The problem with that is that it's not there in the first place: $ grep par ~/.mh_profile $ echo $? 1 In fact, $ grep mhbuild ~/.mh_profile mhbuild:-maxunencoded 500 $ grep html ~/.mh_profile #: mhshow-show-text/html: %pmime_helper %F %s %{name} mhshow-show-text/html: html_to_text %F | cat -s mhshow_in_browser-show-text/html: %pmime_helper %F %s "%{name}" mhfixmsg-format-text/html: html_to_text < '%F' $ grep -w par ~/bin/html_to_text $ echo $? 1 I'll append the full text of the script in case you'd like to see it, but I'm pretty sure it's not implicated here. 
In fact there are no invocations of par anywhere in my ~/bin directory; the only occurrences of the word are in some old data files: $ grep -lrisw par ~/bin /home/smw/bin/mars/reports/data/FMARS/jrn/text/20070729 /home/smw/bin/mars/reports/data/FMARS/jrn/text/20070718 /home/smw/bin/mars/reports/data/FMARS/jrn/text/20070719 /home/smw/bin/mars/reports/data/FMARS/jrn/raw/20070719 ...and these files have nothing to do with nmh in any way. I'm reminded of an old Jackie Mason routine, in which he describes a visit to a psychiatrist. After a fair bit of dialog which I won't repeat here, this snippet occurs: psychiatrist: I see your problem. You hate your sister. Jackie Mason: I haven't got a sister. psychiatrist: I can't help you if you won't cooperate. ...so I feel a need to apologize for being uncooperative :-/, but I'm at a loss here. - Steven 8<- cut here >8 #!/bin/sh # # html_to_text -- convert HTML to plain text # # Steven Winikoff # 2010/04/28 # # note: this script uses links # [ http://atrey.karlin.mff.cuni.cz/~clock/twibright/links ] # because it seems to be the only program available which # renders tables reasonably # # alternatives (lynx and vilistextum) both show tables one # column at a time instead of row by row! # # # UPDATE, 2018/08/22: # # switched from links to elinks, because links fails when invoked # via procmail if the source HTML code contains invalid characters # (as in a file in Windows character encoding which isn't labelled # as such) -- the symptom is that a properly structured message # will be converted into one which has an empty HTML part, which # is a problem if (and only if :-) the HTML part needs to be viewed # in a graphical browser (see ~smw/bin/view_html_message, as called # from ~smw/bin/mhread) # #-- if [ ! 
-z "${1}" ] then html="${1}" else # links (as of April 2010, at least) refuses to read standard # input with -dump html="/tmp/html_to_text.`date +%Y%m%d.%H%M%S`.$$" trap "rm -f ${html} >/dev/null 2>&1; exit 1" 1 2 3 15 cat > ${html} fi elinks -force-html -dump -dump-charset utf-8 ${html} | sed 's/^ //;s/[ ]*$//' ## | cat -s # # w3m -I utf8 -T text/html -dump -s -o display_link_number=1 \ # -o color=1 -graph ${html} | sed 's/^ //;s/[
Re: mhfixmsg character set conversion
>What platform are you on (uname -a and relevant excerpt from /etc/*-release)?

$ uname -a
Linux mort 5.15.6-2-MANJARO #1 SMP PREEMPT Sat Dec 4 11:11:58 UTC 2021 x86_64 GNU/Linux

$ ls -l /etc/*-release
lrwxrwxrwx 1 root root  15 Dec 18 10:21 /etc/arch-release -> manjaro-release
-rw-r--r-- 1 root root 106 Feb  5 02:23 /etc/lsb-release
-rw-r--r-- 1 root root  14 Dec 18 10:21 /etc/manjaro-release
lrwxrwxrwx 1 root root  21 Sep 13  2019 /etc/os-release -> ../usr/lib/os-release

$ cat /etc/lsb-release
DISTRIB_ID=ManjaroLinux
DISTRIB_RELEASE=21.2.3
DISTRIB_CODENAME=Qonos
DISTRIB_DESCRIPTION="Manjaro Linux"

$ cat /etc/manjaro-release
Manjaro Linux

The last one isn't very interesting :-/, but I hope that's enough to give
you the idea. :-)

>What output do you see from these two mhparam commands?
>
>$ mhparam mimetypeproc
>file --brief --dereference --mime-type
>$ mhparam mimeencodingproc
>file --brief --dereference --mime-encoding

Exactly the same thing that you do.

>What are the jpg entries in your profile and mhn.defaults

None; I see no output from

$ grep -i jpg ~/.mh_profile ~/Mail/mhn.defaults

However, I do have this entry in .mh_profile:

mhshow-show-image: %pmime_helper %F %s "%{name}"

That's the only entry which has anything to do with any type of image
format. The same thing is repeated, apparently redundantly, in
~/Mail/mhn.defaults:

$ cat ~/Mail/mhn.defaults
mhshow-show-application/pdf: %pmime_helper %F %s "%{name}"
mhshow-show-application: %pmime_helper %F %s "%{name}"
mhshow-show-audio: %pmime_helper %F %s "%{name}"
mhshow-show-video: %pmime_helper %F %s "%{name}"
mhshow-show-image: %pmime_helper %F %s "%{name}"
mhshow-show-text/richtext: %pmime_helper %F %s "%{name}"

>Do you have any idea where the following warning comes from? I don't
>find it using:
>find /bin/ /usr/ /etc/ $HOME -type f -print0 | xargs -0 egrep -l
>'seems not configured'

The same command also returns for me without finding anything. I've read
Ralph's followup suggesting it might be /usr/bin/par, but apparently that's
not the case.

- Steven
-- 
___
Steven Winikoff      |
Montreal, QC, Canada | "Yoda was wrong when it comes to
s...@smwonline.ca    |  programming. Do or undo. There
http://smwonline.ca  |  is always try."
                     |               - Ron Jeffries
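
For anyone repeating the search quoted above, the following is a minimal sketch of the same idea wrapped in a function. The name find_string is hypothetical (it is not part of nmh or of any script in this thread), and the directory list in the usage line is simply the one from the quoted command. It uses find's own batching instead of xargs, which sidesteps the -print0 / -0 pairing altogether.

```shell
#!/bin/sh
# find_string PATTERN DIR...   (hypothetical helper, for illustration only)
#
# List every regular file under the given directories that contains the
# fixed string PATTERN.  grep -l prints only file names, so it can be
# pointed at binaries such as /usr/bin/par as well as at text files.
find_string() {
    pattern=$1
    shift
    # "-exec ... {} +" passes many file names to each grep invocation and
    # copes with spaces or newlines in paths; errors about unreadable
    # files are discarded.
    find "$@" -type f -exec grep -lF -- "$pattern" {} + 2>/dev/null
}
```

Usage would be along the lines of: find_string 'seems not configured' /bin /usr /etc "$HOME". An empty result only shows that the string is not stored verbatim in any readable file; it says nothing about text that is assembled at run time.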
Re: mhfixmsg character set conversion
>I'm referring to your first email on this topic.
>
>Message-ID: <4155787-1643946141.609...@w322.mcfw.6EvO>
>Date: Thu, 03 Feb 2022 22:42:21 -0500
>Subject: mhfixmsg character set conversion
>
>It also has fields which tie it into an earlier email on another topic.
>
>In-reply-to: <202202011803.211i3f1f2458...@darkstar.fourwinds.com>
>X-In-reply-to: Your message of Tue, 01 Feb 2022 13:03:15 -0500

Yes, I see. I have no idea why I did that, and I'd completely forgotten
having done it. I'll try to be more careful next time.

- Steven
-- 
_______
Steven Winikoff      | It's is not, it isn't ain't, and it's it's,
Montreal, QC, Canada | not its, if you mean it is. If you don't,
s...@smwonline.ca    | it's its. Then too, it's hers. It isn't
http://smwonline.ca  | her's. It isn't our's either. It's ours,
                     | and likewise yours and theirs.
                     |               - Oxford University Press
Re: mhfixmsg character set conversion
>> Really. I'm not making this up. :-/
>
>No, I don't think you are. I think that line in both files is correctly
>UTF-8 encoded.

And now that you've explained what's going on, it's clear that you're right.

>vim isn't the vi(1) I grew up with, and probably you too.

Definitely. The first time I used vi was in 1984, on a 68000-based Cadmus
system.

>Try ‘:se fileencoding?’ when vim-ing good and again with bad.

Good point:

   $ vim good
   :set fileencoding
     fileencoding=utf-8

   $ vim bad
   :set fileencoding
     fileencoding=latin1

>I expect the bad file has something earlier on which fixes vim's idea of
>the encoding to ISO 8859-1

That does seem to be the case. Do you have any idea what kind of thing that
might be? (I know you can't diagnose a file you haven't seen, but in
general, what sorts of things should I look for?)

>> But wait. It gets worse:
>>
>>$ grep -n ^Veuillez good | cut -c1-68
>>108:Veuillez ne pas répondre au présent courriel. Il a été gén�
>>
>>$ grep -n ^Veuillez bad | cut -c1-68
>>108:Veuillez ne pas répondre au présent courriel. Il a été gén�
>
>The worse being it is the very same line 108 you're seeing in vim which
>grep is also showing?

Exactly, because...

>(The ‘�’ at the end is to be expected.)

...this is still more evidence that you know more about character sets and
conversions than I do. As if further evidence was needed at this point. :-/

Until now, I've only ever seen that glyph when a character doesn't exist in
the font being used -- but that can't be the case here because that same
character is shown correctly five times in the same line of output. Why is
it to be expected?

>>$ LC_ALL=C perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet
>> [...]
>
>I don't understand that. The -p sets up a loop to read a line from
>good_snippet, do the substitution on it, and print the result, until
>EOF. The -l strips off the linefeed on input and puts it back on the
>output. The substitution in between changes all bytes, thanks to
>LC_ALL=C, which aren't space to tilde into a ‘<42>’ string representing
>their hex value.

Thank you for explaining that. Just for fun, I tried the following in tcsh:

   $ setenv LC_ALL C
   $ perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet
   Veuillez ne pas r<c3><a9>pondre au pr<c3><a9>sent courriel. Il a <c3><a9>t<c3><a9> g<c3><a9>n<c3><a9>r<c3><a9>

As expected, this returned pretty much instantly. Then I tried this:

   $ sh
   $ LC_ALL=C
   $ echo $LC_ALL
   C
   $ perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet

...and that also hung. Which in a way is good, because at least it means
bash is behaving consistently. But also not good, because it's behaving
badly. :-/

On my system, /bin/sh is a symlink to /bin/bash, which is version 5.1.016-2
as packaged by Manjaro.

...but troubleshooting bash is far outside the scope of this discussion, so
I propose to forget this particular clupea harengus of the crimson variety
unless you find it interesting in and of itself.

>Nothing wrong with od(1). If you have hexdump(1) installed then it with
>-C gives quite nice output.

Yes, I see (or -C? :-). Thanks for that tip; I hadn't known that hexdump
existed.

>> ...and both snippets are identical!
>
>Well, those lines were identical to start with before snipping.
>You could confirm this with
>
>cmp <(sed -n 108p good) <(sed -n 108p bad)

As written, this also hangs in bash (and is invalid syntax in tcsh). But
it's effectively equivalent to

   $ sed -n 108p good > good.sed
   $ sed -n 108p bad > bad.sed
   $ cmp good.sed bad.sed
   $ echo $?
   0

...which behaves as expected.

>> Strangely, both snippet files look fine in vim.
>
>Because you have chopped off the non-UTF-8 which occurs earlier in bad
>which fixes vim's idea of the file's encoding.

In retrospect this should have been obvious. :-/

>> ...but for the bad file, that becomes
>>
>>"bad" [converted] 336 lines, 49471 bytes 1,1 Top
>
>Ta-da!

Indeed. :-) Thank you.

- Steven
-- 
___
Steven Winikoff      |
Montreal, QC, Canada | Eschew obfuscation.
s...@smwonline.ca    |
http://smwonline.ca  |
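
On the question of why the ‘�’ is to be expected: the od dump elsewhere in this thread shows that byte 64 of line 108 is the first half (c3) of a two-byte é, and the "108:" prefix from grep -n adds four more bytes, so a cut that counts 68 bytes rather than 68 characters stops in the middle of that character. The sketch below reproduces this under stated assumptions: is_utf8 is a hypothetical helper name, cut -b is used so that the byte counting is explicit (whether cut -c counts bytes or characters depends on the implementation), and iconv is assumed to exit non-zero on an invalid sequence, which is the usual behaviour.

```shell
#!/bin/sh
# is_utf8 FILE -- hypothetical helper: succeed only if FILE is valid UTF-8.
# iconv reports an error (non-zero exit) when it meets a byte sequence
# that is not valid in the declared source encoding.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 < "$1" > /dev/null 2>&1
}

# Line 108 as it appears in the od dump, written with octal escapes:
# \303\251 is the two-byte UTF-8 encoding of e-acute.
sample='Veuillez ne pas r\303\251pondre au pr\303\251sent courriel. Il a \303\251t\303\251 g\303\251n\303\251r\303\251'

tmp=${TMPDIR:-/tmp}/utf8demo.$$
printf "$sample\n" > "$tmp.whole"

# cut -b counts bytes.  Byte 64 is the \303 that starts the next e-acute,
# so its continuation byte \251 is cut off and a lone \303 is left behind.
cut -b1-64 < "$tmp.whole" > "$tmp.cut"

is_utf8 "$tmp.whole" && echo "whole line: valid UTF-8"
is_utf8 "$tmp.cut"   || echo "cut line: ends in an incomplete sequence"

rm -f "$tmp.whole" "$tmp.cut"
```

A terminal has no glyph for a lone c3 byte, so it shows the replacement character instead; the file itself is untouched. The same is_utf8 check, run over a whole message, is one cheap way to tell whether anything in it is not UTF-8 before opening it in an editor.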
Re: mhfixmsg character set conversion
>No apology is necessary. This uncovered an issue with mhfixmsg that
>we fixed.

Thank you.

>> The key is the message about the line length being too long. Seeing that
>> reminded me that I'd modified the stock 1.7.1 mhfixmsg with this patch:
>
>-decodetext binary instead of 8bit would be safer, I expect. It
>sounds like you might have tried that in the past without success.
>It might help to dig in to that.

That's definitely my plan now that we've gotten this far.

>> ...but when I look at the files with command-line tools such as more or
>> head, *both* versions look correct.
>
>But are they correct? It sounds like not, based on viewing in text
>editors.

I agree, but I'm running out of things to try in order to understand what's
happening (please see my reply to Ralph if you haven't already).

>> In summary, I now know what's happening and (mostly) what to do about it,
>> but I still don't know why.
>
>I would look at output from mime_helper and see if it's UTF-8.

Please forgive me for having to ask this, but how is mime_helper even
involved? Isn't that used only when I read the message? It isn't in the
procmail chain that saves the original copy, and it's the original copy
that we've been looking at.

- Steven
-- 
_______
Steven Winikoff      |
Montreal, QC, Canada | "The cure for boredom is curiousity.
s...@smwonline.ca    |  There is no cure for curiousity."
http://smwonline.ca  |
                     |               - Dorothy Parker
Re: mhfixmsg character set conversion
p/nmh/nmh/test/testdir/test-convert530072.expected. first named test failure: -convertarg with multiple parts and additional text in draft FAIL: test/repl/test-convert PASS: test/repl/test-if-str PASS: test/repl/test-multicomp PASS: test/repl/test-repl PASS: test/repl/test-trailing-newline PASS: test/scan/test-header-parsing PASS: test/scan/test-scan PASS: test/scan/test-scan-file PASS: test/scan/test-scan-multibyte PASS: test/send/test-sendfrom PASS: test/sequences/test-flist PASS: test/sequences/test-mark PASS: test/sequences/test-out-of-range PASS: test/show/test-show PASS: test/slocal/test-slocal PASS: test/whatnow/test-attach-detach PASS: test/whatnow/test-cd PASS: test/whatnow/test-ls PASS: test/whom/test-whom PASS: test/cleanup === 3 of 118 tests failed Please report to nmh-workers@nongnu.org === make[1]: *** [Makefile:4996: check-TESTS] Error 1 make[1]: Leaving directory '/tmp/nmh/nmh' make: *** [Makefile:5261: check-am] Error 2 8<--------- cut here >8 -- ___ Steven Winikoff | "I have learned Montreal, QC, Canada | To spell hors d'oeuvres s...@smwonline.ca | Which still grates on http://smwonline.ca | Some people's n'oeuvres." | - Warren Knox
Re: mhfixmsg character set conversion
060 49 6c 20 61 20 c3 a9 74 c3 a9 20 67 c3 a9 6e c3 I l a 303 251 t 303 251 g 303 251 n 303 100 a9 72 c3 a9 0a 251 r 303 251 \n 105 $ od -t x1c bad_snippet 000 56 65 75 69 6c 6c 65 7a 20 6e 65 20 70 61 73 20 V e u i l l e z n e p a s 020 72 c3 a9 70 6f 6e 64 72 65 20 61 75 20 70 72 c3 r 303 251 p o n d r e a u p r 303 040 a9 73 65 6e 74 20 63 6f 75 72 72 69 65 6c 2e 20 251 s e n t c o u r r i e l . 060 49 6c 20 61 20 c3 a9 74 c3 a9 20 67 c3 a9 6e c3 I l a 303 251 t 303 251 g 303 251 n 303 100 a9 72 c3 a9 0a 251 r 303 251 \n 105 ...and both snippets are identical! Suddenly I understand even less than I did when I started writing this reply. :-( Strangely, both snippet files look fine in vim. But the original bad file still looks bad in vim, and I'm at a loss for how to prove that except by taking a screen shot, so I've done that and attached the result as a 34 Kb PDF file. One additional fact which must be relevant although I don't know enough to say exactly how is that the status bar in vim looks like this when the good file is newly opened: "good" 836 lines, 50844 bytes1,1 Top ...but for the bad file, that becomes "bad" [converted] 336 lines, 49471 bytes 1,1 Top The smaller number of lines is expected (that's the effect of my no-longer-wanted patch to mhfixmsg), but does that also explain the different number of bytes? More importantly, vim explicitly claims that the bad file is "[converted]", so maybe that's the source of the double encoding? The more I try to think about this, the more my head hurts. :-( - Steven -- ___ Steven Winikoff | Montreal, QC, Canada | "Do not meddle in the affairs of dragons, s...@smwonline.ca | for you are crunchy and good with ketchup." http://smwonline.ca | bad.pdf Description: bad.pdf
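
Since "double encoding" comes up here, a small sketch of what it would look like in a hex dump may help when reading the od output above. The helper name hex is hypothetical; the rest uses only printf, od, tr and iconv. A correctly encoded é is the two bytes c3 a9; if those two bytes are misread as ISO 8859-1 and converted to UTF-8 a second time, each byte turns into its own two-byte sequence.

```shell
#!/bin/sh
# hex -- hypothetical helper: print standard input as one run of hex bytes.
hex() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# e-acute correctly encoded as UTF-8: two bytes.
printf '\303\251' | hex
# prints: c3a9

# The same two bytes treated as ISO 8859-1 and converted to UTF-8 again:
# \303 becomes c3 83 and \251 becomes c2 a9.
printf '\303\251' | iconv -f ISO-8859-1 -t UTF-8 | hex
# prints: c383c2a9
```

By that measure, line 108 in both snippets is singly encoded: the dumps show plain c3 a9 pairs and no c3 83 c2 a9 runs. That is consistent with the garbling appearing only when a viewer decides, from something elsewhere in the file, to interpret the bytes as ISO 8859-1.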
Re: mhfixmsg character set conversion
>This is an easy way to download and build the latest, assuming that >you have the prerequisites listed in the MACHINES file (and respond >to the build_nmh questions): > >wget http://git.savannah.gnu.org/cgit/nmh.git/plain/build_nmh >sh build_nmh -v Thanks for that. I just tried it, but unfortunately the build failed at the test step (details appended). I don't know whether it matters that I ran the build script using my regular account rather than with root privileges, with the root directory configured in /tmp (because this is a temporary installation for testing rather than something I'm planning to keep). In case it matters, the configuration answers below are the same as in my real installation of version 1.7.1. >Ah, OK, maybe it wasn't lynx. It wasn't. I just realized what it was, and it turns out I owe you an apology for reasons I'll explain separately in a reply to your message from last night. - Steven 8<- cut here >8 $ mkdir -p /tmp/nmh/root $ cd /tmp/nmh $ wget http://git.savannah.gnu.org/cgit/nmh.git/plain/build_nmh $ sh build_nmh -v Install prefix [/local]: /tmp/nmh/root Locking type (dot|fcntl|flock|lockf) [determined by configure]: fcntl MTS (smtp|sendmail/smtp|sendmail/pipe) [smtp]: SMTP server [localhost]: Cyrus SASL support (y|n) [determined by configure]: no TLS support (y|n) [determined by configure]: yes downloading . . . autoconfiguring . . . configuring . . . building . . . testing . . . 
build failed, build log is in nmh/build_nmh.log $ tail -30 nmh/build_nmh.log /sbin/sed -f man/man.sed man/show.man > man/show.1 /sbin/sed -f man/man.sed man/slocal.man > man/slocal.1 /sbin/sed -f man/man.sed man/sortm.man > man/sortm.1 /sbin/sed -f man/man.sed man/unseen.man > man/unseen.1 /sbin/sed -f man/man.sed man/whatnow.man > man/whatnow.1 /sbin/sed -f man/man.sed man/whom.man > man/whom.1 ./etc/bash_completion_nmh-gen > etc/bash_completion_nmh ./etc/mhn.defaults.sh "/home/smw/bin:/local/paths:/usr/local/sbin:/sbin:/usr/sbin:/bin:/usr/bin:/usr/lib" ./etc/mhn.find.sh > etc/mhn.defaults /sbin/sed -e 's,%mts%,smtp,' \ -e 's,%mailspool%,/var/mail,' \ -e 's,%smtpserver%,localhost,' \ -e 's,%default_locking%,fcntl,' \ -e 's,%supported_locks%,fcntl dot flock lockf,' \ < ./etc/mts.conf.in > etc/mts.conf make[1]: Leaving directory '/tmp/nmh/nmh' make[1]: *** [Makefile:4996: check-TESTS] Error 1 make: *** [Makefile:5261: check-am] Error 2 === FAIL: test/mhbuild/test-attach FAIL: test/mhbuild/test-ext-params FAIL: test/repl/test-convert 3 of 118 tests failed === configure.ac:8: warning: The macro `AC_CONFIG_HEADER' is obsolete. configure.ac:135: warning: The macro `AC_TRY_COMPILE' is obsolete. configure.ac:142: warning: The macro `AC_TRY_COMPILE' is obsolete. configure.ac:188: warning: The macro `AC_TRY_LINK' is obsolete. configure.ac:213: warning: AC_PROG_LEX without either yywrap or noyywrap is obsolete make[1]: *** [Makefile:4996: check-TESTS] Error 1 make: *** [Makefile:5261: check-am] Error 2 8<- cut here >8 -- _______ Steven Winikoff | "Don't make your decisions because they are Montreal, QC, Canada | the easiest, the cheapest, or the most s...@smwonline.ca | popular; make them because you know they http://smwonline.ca | are right." | - Theodore Hesburgh
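
When a log like this one is long, it can help to pull out just the names of the failing tests. The following is a minimal sketch, assuming only what the log above shows, namely that failures are reported on lines beginning with "FAIL: " (lines beginning with "ERROR: " are included as a guess at how harder failures would be reported). The function name failed_tests is hypothetical.

```shell
#!/bin/sh
# failed_tests LOGFILE -- hypothetical helper, for illustration only.
#
# Print the name of each test reported as FAIL or ERROR, once each, even
# if the log repeats the list in a summary section.
failed_tests() {
    sed -n -e 's/^FAIL: //p' -e 's/^ERROR: //p' "$1" | sort -u
}
```

Usage: failed_tests nmh/build_nmh.log. For the build quoted above, the three names it should report are the ones already visible in the summary: test/mhbuild/test-attach, test/mhbuild/test-ext-params and test/repl/test-convert.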
Re: mhfixmsg character set conversion
>I'm unable to replicate your problem here with the original message, >and using your mhfixmsg invocation, mhfixmsg-format-text/html, and >locale. The only piece I think I'm missing is your mime_helper. >I would give that a try if you send it to me. I've attached the script, but (without having looked at it in a while) I suspect it depends too heavily on other parts of my personal setup to be usable for anyone else. It turns out not to be relevant, but perhaps it might be interesting to someone anyway. >With nmh-1.7 mhfixmsg: >mhfixmsg: /home/levine/src/nmh/msg part 2, decode text/plain; >charset=iso-8859-1 >mhfixmsg: /home/levine/src/nmh/msg part 1, will not decode because it >is binary (line length > 998) >mhfixmsg: /home/levine/src/nmh/msg part 2, convert UTF-8 to UTF-8 ...and therein lies the answer. I owe you an apology about this, and I'm sincerely sorry for wasting your time on this question. The key is the message about the line length being too long. Seeing that reminded me that I'd modified the stock 1.7.1 mhfixmsg with this patch: --- uip/mhfixmsg.c.original 2018-03-06 14:05:56.0 -0500 +++ uip/mhfixmsg.c 2019-08-17 19:51:25.723267048 -0400 @@ -2144,13 +2144,13 @@ int last_char_was_cr = 0; for (i = 0, cp = buffer; i < inbytes; ++i, ++cp) { -if (*cp == '\0' || ++line_len > 998 || +if (*cp == '\0' || ++line_len > 8 || (*cp != '\n' && last_char_was_cr)) { encoding = CE_BINARY; if (*cp == '\0') { *reason = "null character"; -} else if (line_len > 998) { -*reason = "line length > 998"; +} else if (line_len > 8) { +*reason = "line length > 8"; } else if (*cp != '\n' && last_char_was_cr) { *reason = "CR not followed by LF"; } else { I remember asking about the 998-character limit on this list, in a thread from January 2018. You explained why the limit exists, and suggested another way to achieve what I was trying to do, which I tried but without success -- I wasn't able to get what I wanted without this change, but I no longer remember the details. 
Obviously I need to revisit this question, because I just compiled a copy of mhfixmsg from 1.7.1 without this patch, and it now behaves as you'd expect: it complains about the line length, and then generates correct output with these headers: Content-Type: multipart/alternative; boundary=0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378 Content-Transfer-Encoding: 8bit --0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 --0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=iso-8859-1 Mime-Version: 1.0 With my patch, I get these headers: Content-Type: multipart/alternative; boundary=0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378 Content-Transfer-Encoding: 8bit --0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 --0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378 Content-Transfer-Encoding: 8bit Content-Type: text/html; charset=iso-8859-1 Mime-Version: 1.0 There's still something going on that I don't understand, however. The way I've evaluated the output from mhfixmsg was by viewing it in vim, and there's no question that the unpatched output looks fine while the patched output is as I've been describing since the beginning of this thread. ...but when I look at the files with command-line tools such as more or head, *both* versions look correct. When I open both files in xed, the unpatched file is fine, but the patched file generates this message: There was a problem opening the file /tmp/nmh_testing/xxx. The file you opened has some invalid characters. If you continue editing this file you could corrupt this document. You can also choose another character encoding and try again. 
...with a menu offering "Automatically Detected", "Current Locale (UTF-8)" and "Western (ISO-8859-15)" as possible character encodings. In summary, I now know what's happening and (mostly) what to do about it, but I still don't know why. - Steven -- ___ Steven Win
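
Because the whole question turned on a part having lines longer than 998 bytes, a quick way to check a file for such lines may be useful. This is only a sketch: long_lines is a hypothetical name, the default of 998 is taken from the mhfixmsg message quoted above (it matches the RFC 5322 limit of 998 characters per line, excluding the CRLF), and it counts bytes per line in the file as stored, which is an approximation of what mhfixmsg examines after locating the part.

```shell
#!/bin/sh
# long_lines FILE [MAX] -- hypothetical helper, for illustration only.
#
# Report each line of FILE that is longer than MAX bytes (default 998).
long_lines() {
    # LC_ALL=C makes awk's length() count bytes rather than characters.
    LC_ALL=C awk -v max="${2:-998}" '
        {
            sub(/\r$/, "")      # do not count a trailing CR, if present
            if (length($0) > max)
                printf "%d: %d bytes\n", NR, length($0)
        }' "$1"
}
```

Usage: long_lines /tmp/nmh_testing/xxx. No output means no line exceeds the limit.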
Re: mhfixmsg character set conversion
>Ah, OK, maybe it wasn't lynx. I don't know enough about your
>environment to say exactly what happened.

Quick question: would it help if I were to run mhfixmsg under control of
strace and send you the output? That's the most definitive way I can think
of to show you exactly what happens in my environment.

(Yes, the problem occurred when procmail invoked my shell script, which then
invoked mhfixmsg, but I see the same behaviour when I run mhfixmsg directly
on the command line with the same input file, so I believe that strace under
those circumstances should capture everything relevant.)

- Steven
-- 
___
Steven Winikoff      | "The man who has ceased to learn ought
Montreal, QC, Canada |  not to be allowed to wander around
s...@smwonline.ca    |  loose in these dangerous days."
http://smwonline.ca  |
                     |               - M. M. Coady
Re: mhfixmsg character set conversion
>Steven wrote:
>
>> That's probably a helpful thing to do, but the question I was wondering
>> about wasn't why the UTF-to-UTF conversion was reported, but rather why
>> the iso-8859-1-to-UTF conversion wasn't reported.
>
>Ok, commit 41ce4490ac5d might fix the problem for you.

Thank you! I'll check it out tomorrow and see what happens. Do you think
the patch will apply to 1.7.1, or will I need to install the latest version
from the repository?

>The cause was a mismatch between the character set of content generated by
>the external program (lynx, in this case) and mhfixmsg -textcharset UTF-8.
>mhfixmsg wasn't capturing the charset of the generated output, assuming it
>was unchanged. It then converted it again.

I'm afraid I have to admit I'm not entirely clear on how lynx is even
involved. I know I have these entries in my system-wide mhn.defaults file:

   $ grep lynx /local/pkg/nmh/root-nmh-1.7.1/etc/mhn.defaults
   mhbuild-convert-text/html: charset="%{charset}"; /sbin/lynx -child -dump -force_html ${charset:+--assume_charset} ${charset:+"$charset"} %F | sed 's/^\(.\)/> \1/; s/^$/>/;' | par 64
   mhfixmsg-format-text/html: charset="%{charset}"; /sbin/lynx -child -dump -force_html ${charset:+--assume_charset} ${charset:+"$charset"} %F | expand | sed -e 's/^ //' -e 's/ *$//'
   mhshow-show-text/html: charset="%{charset}"; %l/sbin/lynx -child -dump -force-html ${charset:+--assume_charset} ${charset:+"$charset"} %F

...and the second of these certainly looks relevant, but:

   - While testing on Friday, I emptied that file out completely and still
     observed the same behaviour.

   - In your message from 12:35 today in the "crufty mhn.default.sh stuff"
     thread, you wrote:

        > There is a way. etcpath() looks for mhn.defaults in this order:
        >    * 3) Next, check in nmh Mail directory.
        >    * 4) Next, check in nmh `etc' directory.
        >
        > So if the user puts an mhn.defaults in their Mail directory, then
        > only it will be read. They'd have to copy any entries that they
        > do want from /etc/nmh/mhn.defaults to their own.

     I do have an mhn.defaults file in my Mail directory, with (only) these
     entries in it:

        mhshow-show-application/pdf: %pmime_helper %F %s "%{name}"
        mhshow-show-application: %pmime_helper %F %s "%{name}"
        mhshow-show-audio: %pmime_helper %F %s "%{name}"
        mhshow-show-video: %pmime_helper %F %s "%{name}"
        mhshow-show-image: %pmime_helper %F %s "%{name}"
        mhshow-show-text/richtext: %pmime_helper %F %s "%{name}"

...so while I believe that lynx is involved, I don't know where that
involvement is coming from.

While I'm replying to you anyway, I realize I forgot to reply to your
question from yesterday morning. You asked:

   > Have you tried the -decodeheaderfieldbodies switch to mhfixmsg?

I haven't, mainly because I didn't know that switch existed. I don't know
what it does (other than what I can infer from the name, of course), and I
can't find any mention of it in the man page for mhfixmsg, or anywhere in
the source code for version 1.7.1. Was this switch added after 1.7.1 was
released?

>The fix is for mhfixmsg to detect the charset of the content,
>using file --brief --mime-encoding if it can. If it can't, it
>falls back to the -textcharset value. If that wasn't used, it
>gets it from the locale and advises the user.

That sounds reasonable to me.

>I'm not completely sure that this will fix your problem because
>it's aimed at added text/plain parts. But with -noreplacetextplain
>I think that's the path to your issue.

Please advise on the easiest way to try it (either applying 41ce4490ac5d
to 1.7.1, or just downloading and building the current version of the
master branch), and I'll do so tomorrow (I'm unable to do it before then
due to a prior commitment).

     - Steven
--
___
Steven Winikoff      | "The thing is, I mean, there's times when
Montreal, QC, Canada |  you look at the universe and you think,
s...@smwonline.ca    |  'What about me?' and you can just hear
http://smwonline.ca  |  the universe replying, 'Well, what about
                     |  you?'"
                     |      - Terry Pratchett (Thief of Time)
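The fallback order described in the quoted text above (ask file(1) first, then use the -textcharset value, then the locale) could be sketched in shell roughly as follows. This is only an illustration, not code from nmh; the function name detect_charset and its two arguments are invented for the example.

```shell
#!/bin/sh
# Hypothetical helper (not part of nmh): guess the charset of a file using
# the fallback order described in the message above.
#   $1 = file to inspect
#   $2 = optional fallback charset (stands in for the -textcharset value)
detect_charset() {
    cs=''
    if [ -r "$1" ] && command -v file >/dev/null 2>&1; then
        cs=$(file --brief --mime-encoding "$1" 2>/dev/null)
    fi
    case $cs in
        ''|binary|unknown-8bit)
            # file(1) was unavailable or could not identify the encoding
            if [ -n "${2:-}" ]; then
                cs=$2
            else
                cs=$(locale charmap)
                echo "detect_charset: assuming $cs from the locale" >&2
            fi
            ;;
    esac
    printf '%s\n' "$cs"
}
```

For example, detect_charset message-part.txt UTF-8 would print whatever file(1) reports for that (hypothetical) file, and UTF-8 only if file(1) has no answer.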
Re: mhfixmsg character set conversion
>I am wondering ... do you maybe have some old configuration in mhn.defaults
>or your .mh_profile that does some iso8859-1 to UTF-8 conversion?

Good question!

The easiest way to answer it for .mh_profile was to empty it temporarily of
everything except

   Path: /home/smw/Mail

This made no difference.

As for mhn.defaults, I have both a personal file in

   ~smw/Mail/mhn.defaults

as well as the system-wide version in

   /local/pkg/nmh/root-nmh-1.7.1/etc/mhn.defaults

The personal file contains only these entries:

   mhshow-show-application/pdf: %pmime_helper %F %s "%{name}"
   mhshow-show-application: %pmime_helper %F %s "%{name}"
   mhshow-show-audio: %pmime_helper %F %s "%{name}"
   mhshow-show-video: %pmime_helper %F %s "%{name}"
   mhshow-show-image: %pmime_helper %F %s "%{name}"
   mhshow-show-text/richtext: %pmime_helper %F %s "%{name}"

...where mime_helper is a shell script which I'll be happy to share if
anyone's interested. In any case these entries seem irrelevant to the
matter at hand, but please let me know if you disagree.

Meanwhile, I have these entries in the system-wide file:

   $ grep mhfixmsg /local/pkg/nmh/root-nmh-1.7.1/etc/mhn.defaults
   mhfixmsg-format-application/ics: mhical -infile %F
   mhfixmsg-format-text/calendar: mhical -infile %F
   mhfixmsg-format-text/html: charset="%{charset}"; /sbin/lynx -child -dump -force_html ${charset:+--assume_charset} ${charset:+"$charset"} %F | expand | sed -e 's/^ //' -e 's/ *$//'

I apologize for the length of that last line, but it's probably easier to
read as is than it would be if I tried to break it up.

In any case, this looks like it might be relevant, so I tried commenting it
out; when that also made no difference, I used a bigger hammer and emptied
out /local/pkg/nmh/root-nmh-1.7.1/etc/mhn.defaults completely, but even
that made no difference that I could detect.

     - Steven
--
___
Steven Winikoff      | "I knew 'Enterprise Computing Systems' were
Montreal, QC, Canada |  evil before I touched an actual computer
s...@smwonline.ca    |  for the first time, because I used to
http://smwonline.ca  |  watch Kirk and Spock fighting for control
                     |  of it."           - Anthony de Boer
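Elsewhere in this thread, etcpath() is quoted as checking the nmh Mail directory before the nmh etc directory, with only the first mhn.defaults found being read. A minimal sketch of that "first match wins" behaviour, assuming just those two steps (the real function has earlier steps that are not shown in the thread), might look like this; find_mhn_defaults is a made-up name.

```shell
#!/bin/sh
# Hypothetical illustration only; the real lookup is nmh's etcpath().
#   $1 = user's Mail directory
#   $2 = nmh etc directory
# Prints the path of the single mhn.defaults that would be used.
find_mhn_defaults() {
    for dir in "$1" "$2"; do
        if [ -r "$dir/mhn.defaults" ]; then
            printf '%s\n' "$dir/mhn.defaults"
            return 0
        fi
    done
    return 1
}
```

With both files present, the call find_mhn_defaults ~/Mail /local/pkg/nmh/root-nmh-1.7.1/etc would print the personal file only, which is why entries in the system-wide file can silently stop applying.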
Re: mhfixmsg character set conversion
>As Robert and Ken pointed out, one explanation could be that the
>content is converted twice, the second time incorrectly.

I saw those replies, but I wasn't sure how to interpret them (as in, the
evidence is compelling, but I have no idea why that would be happening or
what to do about it).

>I don't see at this point how mhfixmsg could do that but this needs more
>investigation. We can continue this way, or if you want to send me a
>sanitized excerpt of the message, I'd be glad to work with it.

I can't think of a reasonable way to sanitize it, but I'm willing to send
it to you privately. Should I use your address for this purpose?

>> $ mhfixmsg -decodetext 8bit -decodetypes text -textcharset UTF-8 -reformat \
>>      -fixcte -fixboundary -noreplacetextplain \
>>      -fixtype application/octet-stream -verbose -file - \
>>      -outfile $destination < $source
>> mhfixmsg: /home/smw/Mail/mhfixmsgnss3pI part 2, decode text/plain;
>> charset=iso-8859-1
>> mhfixmsg: /home/smw/Mail/mhfixmsgnss3pI part 1, decode text/html;
>> charset=iso-8859-1
>> mhfixmsg: /home/smw/Mail/mhfixmsgnss3pI part 2, convert UTF-8 to UTF-8
>>
>> ...which is interesting for more than one reason, including that there's
>> apparently no conversion of iso-8859-1 to UTF-8,
>
>That's strange, unless $source had already been run through mhfixmsg.

It hadn't. In normal use my procmail-invoked shell script does run the
message through a program I wrote myself, which decodes 2047-encoded
headers -- but that only affects the headers, and passes the body through
unmodified; the relevant excerpt for that is:

   [ loop that processes header lines elided ]

         /** an empty input line means the end of the message headers: **/

         if (strlen(input_line) < 1) break;
      }


      /** read and write message body: **/

      /* The getline() arguments were garbled in the archived copy;
         "input_size" is a placeholder name for the size_t variable
         that getline(3) takes as its second argument. */
      while (getline(&input_line, &input_size, infile) >= 0)
      {
         fputs(input_line, outfile);
      }


      /** ...and we're done: **/

      return(0);

   }

The only change this produces in the problematic message is as follows:

   47,57c47,57
   < X-SG-EID: =?us-ascii?Q?CePduXinO1TKWf=2FmbcRcIcb5o7KEfW6Q=2FLxIZrPrRA0dtxQ5evb2UIV0M0r6v6?=
   <  =?us-ascii?Q?DfqG=2FoldGlAr6l6p1riD1OEyVdX0=2F57dKo740dz?=
   <  =?us-ascii?Q?NZIhwlTw5J3KSyIU4H7pjfyfMBv0e9LGxKHVezS?=
   <  =?us-ascii?Q?FeSLaVJyOzyyK3LeB3eGx+QysKjtjkJzuVDXsW4?=
   <  =?us-ascii?Q?ZiePczPvW34XaHeheXAl2m0RGMRgZENpvRzzX2M?=
   <  =?us-ascii?Q?G6=2FuEHfZ5+X57rF1w=3D?=
   < X-SG-ID: =?us-ascii?Q?N2C25iY2uzGMFz6rgvQsb8raWjw0ZPf1VmjsCkspi=2FKHgAsE=2FCUk5eZaRe5Ltr?=
   <  =?us-ascii?Q?cbw5EBe1xYnaBlEvYrWq76guWX6eVcLnBjZLZsv?=
   <  =?us-ascii?Q?fUgud7M9swcG4+O7RGb81dd6HibI6WdUCRYi2bx?=
   <  =?us-ascii?Q?T8y2GlCc1B+71TSgKjD9dEU2IqN30RZ1qRbAGlx?=
   <  =?us-ascii?Q?5EAyl462xuJc+?=
   ---
   > X-SG-EID: CePduXinO1TKWf/mbcRcIcb5o7KEfW6Q/LxIZrPrRA0dtxQ5evb2UIV0M0r6v6
   >  DfqG/oldGlAr6l6p1riD1OEyVdX0/57dKo740dz
   >  NZIhwlTw5J3KSyIU4H7pjfyfMBv0e9LGxKHVezS
   >  FeSLaVJyOzyyK3LeB3eGx+QysKjtjkJzuVDXsW4
   >  ZiePczPvW34XaHeheXAl2m0RGMRgZENpvRzzX2M
   >  G6/uEHfZ5+X57rF1w=
   > X-SG-ID: N2C25iY2uzGMFz6rgvQsb8raWjw0ZPf1VmjsCkspi/KHgAsE/CUk5eZaRe5Ltr
   >  cbw5EBe1xYnaBlEvYrWq76guWX6eVcLnBjZLZsv
   >  fUgud7M9swcG4+O7RGb81dd6HibI6WdUCRYi2bx
   >  T8y2GlCc1B+71TSgKjD9dEU2IqN30RZ1qRbAGlx
   >  5EAyl462xuJc+

...but in my testing last night and just now, I see the same behavior when
I run mhfixmsg directly on the unmodified original file (my script always
saves an unmodified copy when it makes changes, in case something goes
wrong).

>Conversion to the same charset is a no-op, I'll look into removing the
>verbose output in that case.

That's probably a helpful thing to do, but the question I was wondering
about wasn't why the UTF-to-UTF conversion was reported, but rather why
the iso-8859-1-to-UTF conversion wasn't reported.

>> and that in fact it's part 1 rather than part 2 that gets converted
>> improperly
>
>The part numbers are reversed because that's the order used for display.
>Part 2 is the text/plain part, that's the one that got converted.

Thank you. That clears up part of my confusion.

     - Steven
--
___
Steven Winikoff      | "The thing is, I mean, there's times when
Montreal, QC, Canada |  you look at the universe and you think,
s...@smwonline.ca    |  'What about me?' and you can just hear
http://smwonline.ca  |  the universe replying, 'Well, what about
                     |  you?'"
                     |      - Terry Pratchett (Thief of Time)
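The "converted twice" explanation mentioned above can be reproduced outside mhfixmsg with iconv(1). The sketch below is only a demonstration of the effect, not of what mhfixmsg does internally; the helper names are invented, and the output is hex-dumped so that it doesn't depend on the terminal's character set.

```shell
#!/bin/sh
# Convert stdin from ISO-8859-1 to UTF-8 once (the intended result) ...
to_utf8_once() {
    iconv -f ISO-8859-1 -t UTF-8
}

# ... and twice (wrong: the second pass treats UTF-8 bytes as ISO-8859-1).
to_utf8_twice() {
    iconv -f ISO-8859-1 -t UTF-8 | iconv -f ISO-8859-1 -t UTF-8
}

# Print stdin as a continuous string of hex digits.
hex() {
    od -An -tx1 | tr -d ' \n'
}

# \351 is octal for 0xE9, the ISO-8859-1 encoding of e-acute.
printf 'r\351pondre' | to_utf8_once | hex; echo
printf 'r\351pondre' | to_utf8_twice | hex; echo
```

In the first line of output the single byte e9 has become c3a9; in the second, those two bytes have each been converted again, giving c383c2a9, which a UTF-8 terminal displays as two unrelated accented characters instead of one.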
Re: mhfixmsg character set conversion
>I expect that your environment is close enough to:
>
>[details snipped]

Pretty much. Here's what I have:

   $ iconv --version
   iconv (GNU libc) 2.33

   $ locale
   LANG=en_CA.UTF-8
   LC_CTYPE="en_CA.UTF-8"
   LC_NUMERIC=en_CA.UTF-8
   LC_TIME=en_CA.UTF-8
   LC_COLLATE=C
   LC_MONETARY=en_CA.UTF-8
   LC_MESSAGES="en_CA.UTF-8"
   LC_PAPER=en_CA.UTF-8
   LC_NAME=en_CA.UTF-8
   LC_ADDRESS=en_CA.UTF-8
   LC_TELEPHONE=en_CA.UTF-8
   LC_MEASUREMENT=en_CA.UTF-8
   LC_IDENTIFICATION=en_CA.UTF-8
   LC_ALL=

...so the only differences are LC_COLLATE=C, which I set because I prefer
the way it sorts, and LC_ALL, which must be being set by a side effect of
something, because I'm not doing so explicitly.

>With this small example:
>
>[snip]
>I see correct conversion of the quoted-printable E9 to UTF-8 C3A9:

So do I, which suggests that there's something in the content of the
specific message I'm working with.

>Does adding -verbose to your mhfixmsg invocation provide any clues?
>mhfixmsg: /tmp/mhfixmsgUgtVK1 part 2, decode text/plain; charset=iso-8859-1
>mhfixmsg: /tmp/mhfixmsgUgtVK1 part 1, decode text/html; charset=iso-8859-1
>mhfixmsg: /tmp/mhfixmsgUgtVK1 part 2, convert iso-8859-1 to UTF-8

This is the output I receive:

   $ mhfixmsg -decodetext 8bit -decodetypes text -textcharset UTF-8 -reformat \
        -fixcte -fixboundary -noreplacetextplain \
        -fixtype application/octet-stream -verbose -file - \
        -outfile $destination < $source
   mhfixmsg: /home/smw/Mail/mhfixmsgnss3pI part 2, decode text/plain; charset=iso-8859-1
   mhfixmsg: /home/smw/Mail/mhfixmsgnss3pI part 1, decode text/html; charset=iso-8859-1
   mhfixmsg: /home/smw/Mail/mhfixmsgnss3pI part 2, convert UTF-8 to UTF-8

...which is interesting for more than one reason, including that there's
apparently no conversion of iso-8859-1 to UTF-8, and that in fact it's
part 1 rather than part 2 that gets converted improperly; part 2 still has

   Content-Type: text/html; charset=iso-8859-1

     - Steven
--
___
Steven Winikoff      | "Algebra? [...] But that's far too
Montreal, QC, Canada |  difficult for seven-year-olds!"
s...@smwonline.ca    | "Yes, but I didn't tell them that
http://smwonline.ca  |  and so far they haven't found out"
                     |
                     |      - Terry Pratchett (Thief of Time)
mhfixmsg character set conversion
I routinely use mhfixmsg to clean up incoming messages, using this command
in a shell script invoked through procmail:

   mhfixmsg -decodetext 8bit -decodetypes text -textcharset UTF-8 \
            -reformat -fixcte -fixboundary -noreplacetextplain \
            -fixtype application/octet-stream -noverbose -file - \
            -outfile $destination < $source

This usually does what I expect, but the other day I received a message
with these characteristics:

   - mhlist reports the following structure:

        msg part  type/subtype              size description
         72       multipart/alternative      45K
            1     text/html                  42K
            2     text/plain                1501

   - the top level of the incoming message has this header (before
     mhfixmsg):

        Content-Type: multipart/alternative; boundary=01266[...]

   - the alternative parts have these headers:

        Content-Transfer-Encoding: quoted-printable
        Content-Type: text/plain; charset=iso-8859-1

     and

        Content-Transfer-Encoding: quoted-printable
        Content-Type: text/html; charset=iso-8859-1

   - after mhfixmsg, the top-level header is unchanged, as expected;
     the alternative part headers are changed to

        Content-Transfer-Encoding: 8bit
        Content-Type: text/plain; charset="UTF-8"

     and

        Content-Transfer-Encoding: 8bit
        Content-Type: text/html; charset=iso-8859-1

...but after conversion from iso-8859-1 to UTF-8, the output file is
mangled. For reference, here's a section of the quoted-printable encoding
from the original message:

   Veuillez ne pas r=E9pondre au pr=E9sent courriel. Il a =E9t=E9 g=E9n=E9r=E9=
    automatiquement, nous ne pourrons pas y donner suite.

This should decode to the following (represented in UTF-8):

   Veuillez ne pas répondre au présent courriel. Il a été généré
   automatiquement, nous ne pourrons pas y donner suite.

(all in one line, but split here for readability).

...but mhfixmsg turns that into

   Veuillez ne pas répondre au présent courriel. Il a été généré
   automatiquement, nous ne pourrons pas y donner suite.

(also all in one line, but split here for readability).

Not that I care very much about this particular boilerplate sentence :-/,
but the message contained a lot of other text that I do care about, all of
which was mangled in the same way.

My questions are then:

   1) Is this a bug in mhfixmsg, or am I just using it incorrectly?

   2) If the former, is there further information I can supply to help
      track this down, or further tests I can conduct on the message
      in question?

   3) ...or if the latter, what am I doing wrong, and what should I be
      doing instead?

Thanks,

     - Steven
--
_______
Steven Winikoff      |
Montreal, QC, Canada | Aleph-null bottles of beer on the wall,
s...@smwonline.ca    | Aleph-null bottles of beer...
http://smwonline.ca  |
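One further test that can be run on an output file is to look for byte sequences that only tend to appear when ISO-8859-1 text has been converted to UTF-8 more than once. The sketch below is a heuristic under that assumption, not a definitive check; count_double_encoded is an invented name, and the three-byte marker (C3 83 C2) is what the first byte of a two-byte UTF-8 character turns into after a second conversion.

```shell
#!/bin/sh
# Count the lines (on stdin, or in the files named as arguments) that
# contain the bytes C3 83 C2, a likely sign of double UTF-8 conversion.
# Heuristic only: legitimate text could contain the same bytes.
count_double_encoded() {
    marker=$(printf '\303\203\302')
    LC_ALL=C grep -c -- "$marker" "$@"
}
```

For example, count_double_encoded "$destination" prints the number of suspicious lines in mhfixmsg's output file; a count of 0 on the original message and a non-zero count on the output would point to an extra conversion step.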
Re: mhbuild: extraneous information in message
>‘pacman -Qi mailcap’ will query for information on that package and show
>the upstream URL is https://pagure.io/mailcap. Pagure is like a
>SourceForge or GitLab and that installation is Fedora's, despite the
>misleading domain name: https://pagure.io/about/. Fedora took Red Hat's
>source.

Thanks for that!

>I've access to a Manjaro system. After a ‘sudo -i pacman -Syu’ to
>ensure its packages are up to date, I see
>
>$ pacman -Q file
>file 5.40-2
>$ file -i /usr/share/mathjax2/extensions/a11y/invalid_keypress.mp3
>/usr/share/mathjax2/extensions/a11y/invalid_keypress.mp3:
> audio/mpegapplication/octet-stream; charset=binary
>$ b2sum -l32 /usr/share/mathjax2/extensions/a11y/invalid_keypress.mp3
>c7d7c71d /usr/share/mathjax2/extensions/a11y/invalid_keypress.mp3

Right. Last night I reported that Manjaro had version 5.38-3, but that was
based on what I read at https://discover.manjaro.org/packages/file rather
than what's actually on my machine. It turns out that I have the same
version you do.

>So the bug is there. Does it report
>‘audio/mpegapplication/octet-stream’ for lots of your MP3 files?

Yes. As an experiment, I ran file -i on 2243 MP3 files; two were reported
as application/octet-stream, with all of the remaining 2241 reported as
audio/mpegapplication/octet-stream.

>On both machines, ‘pacman -Qi file’ reports that package's upstream is
>https://www.darwinsys.com/file/.

...which links to https://github.com/file/file

I just downloaded and built the master branch, and it works correctly:

   $ /tmp/file/root/bin/file -i /tmp/session2.mp3
   /tmp/session2.mp3: audio/mpeg; charset=binary

So that's definitely the root cause.

Thanks again for all your help on this!

     - Steven
--
_______
Steven Winikoff      |
Montreal, QC, Canada | "Do not meddle in the affairs of dragons,
s...@smwonline.ca    |  for you are crunchy and good with ketchup."
http://smwonline.ca  |
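A survey like the one over 2243 MP3 files described above can be done with find(1), file(1) and uniq(1). The following is a sketch of one way to do it, not necessarily how the numbers in the message were obtained; the function names and the directory argument are made up for the example.

```shell
#!/bin/sh
# Tally identical lines read from stdin, most frequent first.
tally() {
    sort | uniq -c | sort -rn
}

# Summarize how file(1) classifies every .mp3 file under directory $1.
mime_type_summary() {
    find "$1" -type f -name '*.mp3' \
        -exec file --brief --mime-type {} + | tally
}
```

Running mime_type_summary on a music directory prints one line per distinct MIME type with a count in front, so a broken magic database shows up as a large count next to a malformed type.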
Re: mhbuild: extraneous information in message
>Also .deb files just install on Arch, no? (been a long time since I had
>one.)

No, Arch uses its own package manager:

   https://wiki.archlinux.org/title/pacman

Details for the file package in Arch are here:

   https://archlinux.org/packages/core/x86_64/file/

The current version is 5.40-3, and Ralph is right: the current version for
the same package in Manjaro (which sources packages from Arch, but releases
them from its own repositories after testing) is 5.38-3.

It's 6:02 am in this timezone and I definitely need sleep, but the next
thing to try is to grab the 5.40 package from Arch and see what happens.

     - Steven
--
___
Steven Winikoff      |
Montreal, QC, Canada | "If you're not part of the solution,
s...@smwonline.ca    |  you're part of the precipitate."
http://smwonline.ca  |
                     |      - Steven Wright
Re: mhbuild: extraneous information in message
>I think it's a file(1) bug in the executable, probably known and fixed
>upstream.

That would make sense, and I'll follow it up in that direction (after
getting some sleep :-/).

>What system are you running, e.g. Ubuntu, and its version?

Like Arch, Manjaro is a rolling release, so the number doesn't actually
mean very much -- but for what it's worth,

   $ cat /etc/lsb-release
   DISTRIB_ID=ManjaroLinux
   DISTRIB_RELEASE=21.0.4
   DISTRIB_CODENAME=Ornara
   DISTRIB_DESCRIPTION="Manjaro Linux"

I do keep up with updates, and as I write this there are none pending.

     - Steven
--
_______
Steven Winikoff      | "The difference between the right word and
Montreal, QC, Canada |  the nearly right word is the difference
s...@smwonline.ca    |  between the lightning and the lightning
http://smwonline.ca  |  bug."
                     |      - Mark Twain
Re: mhbuild: extraneous information in message
>My debian distro looks for such things in /etc/mailcap. I'd look there first.

Thanks!

...but I suspect that may be a Clupea harengus of the crimson variety :-),
partly because it was last modified more than a year ago:

   $ ls -l /etc/mailcap
   -rw-r--r-- 1 root root 272 May 5 2020 /etc/mailcap

...but mostly because it's almost empty:

   $ cat /etc/mailcap
   ###
   ### Begin Red Hat Mailcap
   ###

   audio/*; /usr/bin/xdg-open %s

   image/*; /usr/bin/xdg-open %s

   application/msword; /usr/bin/xdg-open %s
   application/pdf; /usr/bin/xdg-open %s
   application/postscript ; /usr/bin/xdg-open %s

   text/html; /usr/bin/xdg-open %s ; copiousoutput

I'm not going to speculate on why an Arch-derived distribution has an
/etc/mailcap sourced from Red Hat. :-/

Just for fun I tried

   $ /usr/bin/xdg-open /tmp/session2.mp3

I'm not sure why xdg-open decided that I want to open .mp3 files in
clementine (that's a question for another time), but in fact it did so
with no output other than debug messages from clementine (and why so many
debug messages are emitted is also a question for another time).

     - Steven
--
___
Steven Winikoff      |
Montreal, QC, Canada | For clarity in writing, be careful about
s...@smwonline.ca    | word selection. For example, never
http://smwonline.ca  | utilize 'utilize' when you can use 'use'.
Re: mhbuild: extraneous information in message
>The complaint about ‘/octet-stream’ coupled with the trailing
>‘application’ after ‘audio/mpeg’ looks like two things are being
>combined, e.g. ‘audio/mpeg application/octet-stream’.

That makes sense.

>- How do you attach the MP3 file?

By typing "at /path/to/file.mp3" at the whatnow? prompt.

...and I just checked the man page for whatnow and discovered the -v option:

   What now? at -v /tmp/session2.mp3
   Attaching /tmp/session2.mp3 as a audio/mpegapplication/octet-stream

   What now? s

(I didn't actually send anything just now, but that's what would follow).

The relevant .mh_profile entries (at least, the ones I recognize as being
relevant) are:

   comp: -form .compform
   send: -msgid -messageid random -alias .aliases -port 25
   mhbuild: -maxunencoded 500

>- Can we see a draft before mhbuild gets run?

Sure, here's one:

8<-- cut here >8
To: s...@smwonline.ca
Subject: foo
Fcc: inbox
From: Steven Winikoff
Reply-to: Steven Winikoff
Content-Type: text/plain; charset="UTF-8"
Nmh-Attach: /tmp/session2.mp3

--
_______
Steven Winikoff      |
Montreal, QC, Canada | "The worst misunderstandings are the
s...@smwonline.ca    |  unspoken ones."
http://smwonline.ca  |
                     |      - Spider Robinson
8<-- cut here ->8

But I don't think you'll need it, because...

>- What does ‘file -i’ give on the MP3 file?

...Aha! You nailed it:

   $ file -i /tmp/session2.mp3
   /tmp/session2.mp3: audio/mpegapplication/octet-stream; charset=binary

...and similarly,

   $ file --mime-type /tmp/session2.mp3
   /tmp/session2.mp3: audio/mpegapplication/octet-stream

I hadn't known about the -i option until you suggested it, and I found
--mime-type just now while looking up -i.

With no options, file reports

   $ file /tmp/session2.mp3
   /tmp/session2.mp3: Audio file with ID3 version 2.4.0, contains:MPEG ADTS, layer III, v1, 64 kbps, 48 kHz, Stereo

Running strace on file lists the following openat() calls:

   openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
   openat(AT_FDCWD, "/usr/lib/libmagic.so.1", O_RDONLY|O_CLOEXEC) = 3
   openat(AT_FDCWD, "/usr/lib/libseccomp.so.2", O_RDONLY|O_CLOEXEC) = 3
   openat(AT_FDCWD, "/usr/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
   openat(AT_FDCWD, "/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
   openat(AT_FDCWD, "/usr/lib/liblzma.so.5", O_RDONLY|O_CLOEXEC) = 3
   openat(AT_FDCWD, "/usr/lib/libbz2.so.1.0", O_RDONLY|O_CLOEXEC) = 3
   openat(AT_FDCWD, "/usr/lib/libz.so.1", O_RDONLY|O_CLOEXEC) = 3
   openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
   openat(AT_FDCWD, 0x555892c2d4f0, O_RDONLY) = 3
   openat(AT_FDCWD, 0x153f7cb54848, O_RDONLY) = -1 ENOENT (No such file or directory)
   openat(AT_FDCWD, 0x7ffc2a816a10, O_RDONLY|O_CLOEXEC) = 3
   openat(AT_FDCWD, 0x7ffc2a819209, O_RDONLY|O_NONBLOCK|O_CLOEXEC) = 3

...which includes none of the files I expected to see.

The magic database on this system is /usr/share/file/misc/magic.mgc
(the text version is supported by libmagic, but doesn't exist), and
suspiciously, it was modified by a recent system upgrade:

   $ ls -l /usr/share/file/misc/magic.mgc
   -rw-r--r-- 1 root root 7012776 Apr 12 12:20 /usr/share/file/misc/magic.mgc

I have a backup copy from before the upgrade:

   $ ls -l /path/to/backup/of/misc/magic.mgc
   -rw-r--r-- 6 root root 6652192 Jun 16 2020 /path/to/backup/of/magic.mgc

But:

   $ file -i -k -m /path/to/backup/of/misc/magic.mgc /tmp/session2.mp3
   /tmp/session2.mp3: audio/mpegapplication/octet-stream; charset=binary

...and the atime reported by stat(1) confirms that the backup file was
accessed, so there's still something I'm obviously missing.

>- What's ‘folder -version’ yield?

   $ folder -version
   folder -- nmh-1.7.1 built 2019-12-16 03:09:06 + on mort

     - Steven
--
___
Steven Winikoff      | "The best executive is one who has sense
Montreal, QC, Canada |  enough to pick good people to do what he
s...@smwonline.ca    |  wants done, and self-restraint enough to
http://smwonline.ca  |  keep from meddling with them while they
                     |  do it."
                     |      - Theodore Roosevelt
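A script that attaches files could guard against output like this from file(1) by checking that the reported value is a single type/subtype pair before using it. The sketch below is an assumption-laden illustration, not something nmh does: the function name is invented, and the character class approximates the names allowed for media types in RFC 6838.

```shell
#!/bin/sh
# Succeed only if $1 looks like a single "type/subtype" pair.
# Hypothetical helper; the allowed characters approximate RFC 6838.
is_plausible_mime_type() {
    printf '%s\n' "$1" |
        grep -Eq '^[A-Za-z0-9!#$&^_.+-]+/[A-Za-z0-9!#$&^_.+-]+$'
}

for t in 'audio/mpeg' 'audio/mpegapplication/octet-stream'; do
    if is_plausible_mime_type "$t"; then
        echo "ok:  $t"
    else
        echo "bad: $t"
    fi
done
```

Note that this only catches the two-slash form reported by file(1); a value such as audio/mpegapplication is syntactically fine and would pass.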
mhbuild: extraneous information in message
Recently I've been seeing this message when sending email with an attached
.mp3 file:

   mhbuild: extraneous information in message /home/smw/Mail/drafts/1's Content-Type: field
   (/octet-stream)

I've appended the Fcc copy of a typical message, in which the Content-Type:
field for the attachment is

   Content-Type: audio/mpegapplication; name="session2.mp3"

I'm quite sure this isn't nmh's fault, but rather an error in the MIME
configuration on my Manjaro Linux machine. I'm hoping for a hint about
where to look for the config file responsible for "audio/mpegapplication;".

Thanks,

     - Steven

8<- cut here >8
To: s...@smwonline.ca
Subject: testing testing testing
From: Steven Winikoff
Reply-to: Steven Winikoff
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="- =_aa0"
Content-ID: <408771.1620805880.0@mort>
Date: Wed, 12 May 2021 03:51:21 -0400
Message-ID: <408773-1620805881.053...@ksvt.ghqa.ia-w>

--- =_aa0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <408771.1620805880.1@mort>

     - Steven
--
___
Steven Winikoff      | "There are millions of chords. There are
Montreal, QC, Canada |  millions of numbers. And everyone forgets
s...@smwonline.ca    |  the one that is a zero. But without the
http://smwonline.ca  |  zero, numbers are just arithmetic. Without
                     |  the empty chord, music is just noise."
                     |      - Terry Pratchett

--- =_aa0
Content-Type: audio/mpegapplication; name="session2.mp3"
Content-Description: session2.mp3
Content-Disposition: attachment; filename="session2.mp3"
Content-Transfer-Encoding: base64

SUQzBAAAI1RTU0UPAAADTGF2ZjU4Ljc2LjEwMAAA//tU
[many more lines of base64 encoding deleted to save electrons :-)]

--- =_aa0--
8<- cut here >8
Re: displaying Date using local timezone
>I had an inkling that it might be bad for NMH to try to handle
>DST calculations on its own;

Tom Scott would agree:

   https://www.youtube.com/watch?v=-5wpm-gesOY

This is probably the best explanation I've ever seen of why time zones and
DST calculations induce madness.

     - Steven
--
___
Steven Winikoff      | "Some men see things as they are and ask
Montreal, QC, Canada |  'Why?'. I dream things that never
s...@smwonline.ca    |  were and ask, 'Why not?'."
http://smwonline.ca  |
                     |      - Robert F. Kennedy
Re: coming back to (N)mh after a 15 year hiatus..
>Have you tried orgrow yet? If not, it's a swiss army knife of an
>application and may help you out.

Speaking for myself, this is the first I've heard of it, and I haven't been
able to find any information via web search; the search lists are full of
references to marijuana and fertilizer, which I somehow suspect isn't what
you meant. :-)

Would you be willing to share a link?

Thanks,

     - Steven
--
___
Steven Winikoff      | "The man who has ceased to learn ought
Montreal, QC, Canada |  not to be allowed to wander around
s...@smwonline.ca    |  loose in these dangerous days."
http://smwonline.ca  |
                     |      - M. M. Coady
Re: [nmh-workers] logging outgoing messages
>But for the larger issue of whether or not you should submit email to
>your own SMTP server or your email provider's ... well, obviously my
>OPINION is that you should submit it to your email provider's server
>directly from nmh (see previous emails on why I think this). But plenty
>of people disagree with me on this, and that's fine. If you're the sort
>of person who doesn't have a problem configuring your own SMTP server,
>then fine, you should do that!

Thank you. Between this and other comments, I've decided to revert to
having post communicate directly with my local SMTP server.

>But I think recommending that to people is a mistake; it creates the
>impression that you need to run your own SMTP server to use nmh, and that
>is absolutely not true.

Understood. I'm comfortable enough with sendmail that running it on my
home system isn't a problem, but I'm well aware that many people would
prefer not to have to do that.

     - Steven
--
_______
Steven Winikoff      |
Montreal, QC, Canada | "Stars are facts; constellations are
s...@smwonline.ca    |  theories."
http://smwonline.ca  |      - Michael F. Flynn

--
nmh-workers
https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [nmh-workers] logging outgoing messages
>I agree with that, and even when ifdef's are added, they should be
>positive, not double negative, so
>       #ifndef NOSYSLOG
>is just perverse,

Of course it is. As I mentioned in my previous message...

>       #ifdef USE_SYSLOG
>would work just as well (it does mean the name needs to be explicitly
>defined to get the new code,

...I was just too lazy to do that for a proof of concept. There's no
question that you're right if such a patch were to be added in production
while using #ifdef.

>  | - It is not clear to me that you can state with certainty that the
>  |   250 response code will contain the queue identifier
>
>No, you can't, but these days it almost always does.

That matches my experience.

>Personally, I'd just suggest keeping the local MTA, having post deliver
>to that, and let it do the logging

That's exactly what I've always done, from time immemorial until just about
two weeks ago. Ironically enough I actually prefer to do it this way, but
I was under the impression that this is deprecated in modern
configurations. I'd be happy to be wrong about that.

     - Steven
--
_______
Steven Winikoff      |
Montreal, QC, Canada | Don't use no double negatives.
s...@smwonline.ca    |
http://smwonline.ca  |
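Since the 250 reply usually, but not necessarily, carries a queue identifier, any code that extracts one has to cope with its absence. The sketch below assumes a sendmail-style reply of the form "250 2.0.0 QUEUEID Message accepted for delivery"; that format is an assumption about one MTA, not something the SMTP RFCs promise, and extract_qid is an invented name.

```shell
#!/bin/sh
# Print the queue ID from a sendmail-style "250" reply line, or nothing
# if the line does not have the assumed shape:
#   250 <enhanced status code> <queue id> <free text>
extract_qid() {
    printf '%s\n' "$1" |
        awk '$1 == "250" && $2 ~ /^[0-9]\.[0-9]+\.[0-9]+$/ && NF >= 3 { print $3 }'
}
```

Given the reply "250 2.0.0 x49H6dIA003943 Message accepted for delivery" the function prints x49H6dIA003943, while a bare "250 OK" yields an empty string that the caller has to handle.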
Re: [nmh-workers] logging outgoing messages
>>Is there any interest in adding an improved version of this to the code
>>base?
>
>So ... maybe? But, some thoughts.

Thank you (and everyone else!) for taking the time to reply to this.

Before I say anything else, I never meant to ask for my patch to be
incorporated as-is -- I know there are many ways in which it would need to
be improved for production use. I sent it mostly as a proof of concept
(it's currently just barely good enough to do what I personally need :-/),
and partly in hopes it might help anyone else if something like it isn't
added to nmh itself.

>- We don't, in general, want to have any more #ifdefs in the code unless
>  they are completely unavoidable (e.g., operating system differences or
>  optional third-party libraries like OpenSSL). So this would require
>  some run-time configuration.

Understood, of course. I used those mostly as an easy way to mark the code
I added -- and for those wondering why I chose to write them in the
negative, that was purely out of laziness (so that I didn't have to add
-DSYSLOG to the configure process).

Again, this was never intended for production use, and I apologize if I
didn't make that clear originally.

>- It is not clear to me that you can state with certainty that the
>  250 response code will contain the queue identifier (that is, in
>  fact, not a concept that appears anywhere that I can find in the SMTP
>  RFCs).

That's unfortunate. I've mostly worked with sendmail, and I've never seen
a case where the QID wasn't sent back to the originating MTA, so I wasn't
aware that the RFCs don't require that behaviour.

>  As a practical matter I've never had to give anyone the queue
>  identifier of a message (because it's not normally logged on the
>  client; really, most people are happy with recipients and a time, and
>  they are really happy if you have a message-id).

That doesn't match my experience.

>I think this should be a lot more generic. So ... an alternate proposal.
>
>   [ details snipped for brevity, but the summary is to create a
>     "post hook" and use that instead ]

I'd have no problem with that as long as the post hook provides the same
information gathered in my patch (i.e., sender and recipient addresses,
message ID, relay server and port, and resulting status and QID).

     - Steven
--
___
Steven Winikoff      | "...and every single one of them wanted
Montreal, QC, Canada |  to be involved in the decision-making
s...@smwonline.ca    |  process without necessarily going
http://smwonline.ca  |  through the intelligence-using process
                     |  first."          - Terry Pratchett
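If a post hook along those lines existed, a script receiving the fields listed above could write the same log line as the patch by calling logger(1). The sketch below is purely hypothetical: nmh defines no such hook in this thread, the argument order is invented, and the -t and -p options assume a logger like the util-linux or BSD one.

```shell
#!/bin/sh
# Hypothetical post-hook helper; the interface is invented for illustration.
#   $1 = sender    $2 = recipients   $3 = message ID
#   $4 = relay     $5 = port         $6 = status text (including any QID)
format_log_line() {
    printf 'from=%s, to=%s, msgid=%s, relay=%s, port=%s, stat=%s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6"
}

# Send the formatted line to syslog with the tag and priority that the
# patch uses (nmh_smtp, mail facility, notice level).
log_outgoing() {
    logger -t nmh_smtp -p mail.notice -- "$(format_log_line "$@")"
}
```

Keeping the formatting separate from the call to logger makes it easy to check the line on standard output before anything is written to the system log.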
[nmh-workers] logging outgoing messages
 : "RCPT TO:<%s%s>", FENDNULL(path), mbox, host)) {
@@ -717,6 +776,19 @@
 }

 for (bp = buffer; bp && len > 0; bp++, len--) {
+#ifndef NOSYSLOG
+if (strncmp(bp, "Message-ID: ", 12) == 0)
+{
+ int i;
+
+ (void)strncpy(syslog_msgid, bp + 12, SYSLOG_FIELD_SIZE);
+ for (i=0; i
[...]
+#ifndef NOSYSLOG
+ #include
+#endif
+
 #include
 #ifndef CYRUS_SASL
@@ -1760,6 +1764,15 @@
 }

 fflush (stdout);
+
+#ifndef NOSYSLOG
+openlog("nmh_smtp", LOG_PID, LOG_MAIL);
+syslog(LOG_NOTICE,
+ "from=%s, to=%s, msgid=%s, relay=%s, port=%s, stat=%s",
+ syslog_from, syslog_to, syslog_msgid, syslog_server,
+ syslog_port, syslog_qid);
+closelog();
+#endif
 }
 }
8<- cut here >8

--
___
Steven Winikoff      |
Montreal, QC, Canada | "I'd love to go out with you, but I want
s...@smwonline.ca    |  to spend more time with my blender."
http://smwonline.ca  |
                     |      - fortune(6)
Re: [nmh-workers] Can't forward MIME-encoded message
>>I didn't know forw had a -mime switch. Since this is something I'd find
>>very helpful, I just tried it, but it completely failed to work for me.
>
>forw -mime doesn't have a wonderful interface; what it does is generate
>a mhbuild directive and it puts it in the draft message. You then have
>to run "mime" on the resulting draft for the right thing to happen.

Thank you! I just tried that, and it worked perfectly.

>This is actually all covered in the man page for forw(1); let me know if
>it is unclear.

No, it's clear enough; I just didn't think to read the man page until you
pointed it out just now.

>I'm not defending this practice; it's the way it's always worked and I am
>unable to come up with a better solution at this time. Maybe someday ...

That's okay; the two-step process is still much easier than what I'd been
doing until now.

     - Steven
--
_______
Steven Winikoff      | "Sometimes I think we're alone in the
Montreal, QC, Canada |  universe, and sometimes I think we're
s...@smwonline.ca    |  not. In either case, the idea is quite
http://smwonline.ca  |  staggering."
                     |      - Arthur C. Clarke
Re: [nmh-workers] Can't forward MIME-encoded message
>When you say, "Forward", what _EXACTLY_ do you mean?
>
>I just tried 3 different things with the latest nmh, and they all behaved
>"as designed".
>
>- When I forwarded a message with forw -mime, it generates a new message
>  with a message/rfc822 MIME type; the original message contained
>  in the message/rfc822 has the correct Content-Type header

I didn't know forw had a -mime switch. Since this is something I'd find
very helpful, I just tried it, but it completely failed to work for me.

The message I tried to forward is described as follows:

   # mhlist +wing 3943
    msg part  type/subtype              size description
   3943       multipart/mixed            16M
        1     text/plain                 326
        2     application/pdf          3836K a_night_at_the_ballet_2nd.pdf
        3     application/pdf          1098K light_vibrations_2nd.pdf
        4     application/pdf          2092K sinatra_in_concert_2nd.pdf
        5     application/pdf          3629K a_night_at_the_ballet_3rd.pdf
        6     application/pdf          1100K light_vibrations_3rd.pdf
        7     application/pdf           389K sinatra_in_concert_3rd.pdf

My test was invoked as follows:

   # forw -mime +wing 3943

The resulting message is

8<- cut here >8
From s...@smwonline.ca Thu May 9 13:06:39 2019
Return-Path:
Received: from mort (localhost.localdomain [127.0.0.1])
    by 206-248-137-116.dsl.teksavvy.com (8.15.2/8.15.2/Debian-10) with ESMTP id x49H6dIA003943
    for ; Thu, 9 May 2019 13:06:39 -0400
To: smw
Subject: testing /local/paths/forw -mime
From: Steven Winikoff
Reply-to: Steven Winikoff
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <3941.1557421599.1@mort>
Date: Thu, 09 May 2019 13:06:39 -0400
Message-ID: <3942.1557421599@mort>

#forw [forwarded message] +/home/smw/Mail/wing 3943
8<- cut here >8

I wondered if this might be caused by a profile entry.

   # grep forw ~/.mh_profile
   forw: -filter .forwardfilter -form .forwardform

...but when I tried deleting this entry, the same thing happened:

8<- cut here >8
From s...@206-248-137-116.dsl.teksavvy.com Thu May 9 13:08:15 2019
Return-Path:
Received: from mort (localhost.localdomain [127.0.0.1])
    by 206-248-137-116.dsl.teksavvy.com (8.15.2/8.15.2/Debian-10) with ESMTP id x49H8F46004123
    for ; Thu, 9 May 2019 13:08:15 -0400
From: Steven Winikoff
To: smw
Subject: Re: FW: A Night at the Ballet - 2 (fwd)
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <4121.1557421695.1@mort>
Date: Thu, 09 May 2019 13:08:15 -0400
Message-ID: <4122.1557421695@mort>

#forw [forwarded message] +/home/smw/Mail/wing 3943
8<- cut here >8

Here the only difference is the Subject: header; in my first test the
default was empty, and in the second test I left the default value
unchanged.

Please let me know if there's any additional information I can supply
about this.

     - Steven
--
___
Steven Winikoff      | "I knew 'Enterprise Computing Systems' were
Montreal, QC, Canada |  evil before I touched an actual computer
s...@smwonline.ca    |  for the first time, because I used to
http://smwonline.ca  |  watch Kirk and Spock fighting for control
                     |  of it."           - Anthony de Boer
Re: [Nmh-workers] why does mhfixmsg dislike long text lines?
>Does the full path to mhn.defaults shown by "man mhfixmsg" match >/local/pkg/nmh/root-nmh-1.7/etc/mhn.defaults ? Yes. >If it does, maybe run mhfixmsg under ltrace or something similar to see >exactly what file it's trying to open. I used the strace command Ralph suggested (strace -fe open,openat), and that solved it. The problem was that I had a personal mhn.defaults file, and mhfixmsg was reading that (which I expected) but then not reading the system version (which I didn't expect -- I would have expected the system one to be read first unconditionally, to be supplemented and/or overridden by the personal file). Ironically, the personal mhn.defaults in question isn't needed and shouldn't have been there anyway; it's an artifact of the transition I'm going through right now, from an older, about-to-be-decommissioned server with nmh-1.6 to my desktop machine running 1.7. With the personal mhn.defaults file deleted mhfixmsg works as expected using the system version. >> >> I thought Ken said the RFC 5322 limit was 998. But... >> > >> >Right. He also noted that he's had problems with insertion of '!' in long >> >lines of HTML. >> >> What about the idea of reformatting the text/html part to reduce the line >> width? > >Then -maxunencoded wouldn't be necessary. Though I'm not sure if you're >talking about outgoing or incoming messages here. I'm talking about incoming messages. >> Is there a way to get mhfixmsg to decode the base64 and then run it through >> tidy with a given set of command-line options? > >Yes, via mhfixmsg-format-text/html. See the mhfixmsg and mhshow man pages. I did read those man pages, but perhaps I'm still failing to understand parts of them. I do know how mhfixmsg-format-text/html specifies the command which generates the text/plain part from the text/html part, but I don't see how to do that and also reformat the text/html part. 
- Steven -- ___ Steven Winikoff| "If you have built castles in the air, Concordia University | your work need not be lost; that is Montreal, QC, Canada | where they should be. Now put steven.winik...@concordia.ca | foundations under them." | - Henry David Thoreau -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] why does mhfixmsg dislike long text lines?
>>mhfixmsg: Don't know how to convert /home/smw/Mail/reformatted/17352,
>>          there is no mhfixmsg-format-text/html profile entry
>>
>> ...which makes sense because I don't know what to put in that profile entry.
>
>Is there a mhfixmsg-format-text/html line in your mhn.defaults?

Yes:

   # grep mhfixmsg-format-text/html /local/pkg/nmh/root-nmh-1.7/etc/mhn.defaults
   mhfixmsg-format-text/html: charset=%{charset}; /usr/bin/lynx -child -dump -force_html ${charset:+--assume_charset} ${charset:+"$charset"} %F | expand | sed -e 's/^ //' -e 's/ *$//'

Of course I can just copy this entry into my .mh_profile, and I'll try that
tomorrow when I have some time -- but it sounds like you're suggesting that
the entry in /local/pkg/nmh/root-nmh-1.7/etc/mhn.defaults should be picked
up directly from there, and that isn't happening.

>> I thought Ken said the RFC 5322 limit was 998. But...
>
>Right. He also noted that he's had problems with insertion of '!' in long
>lines of HTML.

What about the idea of reformatting the text/html part to reduce the line
width? I've been playing with tidy (AKA html-tidy), and it's capable of
transforming the HTML message I received last week from a single line of
42187 characters into a version with 1896 lines with a maximum line width
of 138.

Is there a way to get mhfixmsg to decode the base64 and then run it through
tidy with a given set of command-line options?

     - Steven
--
_______
Steven Winikoff              |
Concordia University         | "The end of the world will occur at
Montreal, QC, Canada         |  3:00 p.m., this Friday, with symposium
steven.winik...@concordia.ca |  to follow."
                             |                      - fortune(6)
Re: [Nmh-workers] why does mhfixmsg dislike long text lines?
>Well, "binary" has a specific meaning in the MIME world. Specifically,
>it refers to a MIME Content Transfer Encoding of binary, which has no
>restrictions in terms of line length. So when that message says that
>it can't decode it because the part would have to be binary, THAT is what
>it is referring to.

This helps, but I'm still a bit confused. (That's an exaggeration; I'm
really still very much confused. :-()

I just looked up the Content-Transfer-Encoding header, and found what you
already know (but which I'll repeat here, for the record and for my own
future reference):

   The Content-Transfer-Encoding field is designed to specify an
   invertible mapping between the "native" representation of a type of
   data and a representation that can be readily exchanged using 7 bit
   mail transport protocols, such as those defined by RFC 821 (SMTP).
   This field has not been defined by any previous standard. The
   field's value is a single token specifying the type of encoding, as
   enumerated below. Formally:

   Content-Transfer-Encoding := "BASE64" / "QUOTED-PRINTABLE" /
                                "8BIT" / "7BIT" /
                                "BINARY" / x-token

...so when a message clearly contains

   Content-Transfer-Encoding: base64

shouldn't that mean we don't need to test the decoded content to see if
it's binary or not? You just said in your previous message that there's no
line length restriction in the content after decoding.

>But David points out that if you tell it to, mhfixmsg will happily
>generate such messages (but the documentation does caution you that the
>resulting messages may not be readable with nmh).

That's good to know, but I really have no plans to create out-of-spec
messages; I just want to be able to read the messages I'm receiving, and
you clearly explained that I should be able to do that, because the encoded
form follows the RFC specification and the decoded form doesn't have to.

Or at least that's what I thought you said.

>Our only general-purpose nmh list is nmh-workers; plenty of people on it
>are not coders, so please don't be concerned on that score.

Thanks. I've just subscribed.

     - Steven
--
___
Steven Winikoff              | "Nature is by and large to be found
Concordia University         |  out of doors, a location where, it
Montreal, QC, Canada         |  cannot be argued, there are never
steven.winik...@concordia.ca |  enough comfortable chairs."
                             |                      - Fran Lebowitz
Re: [Nmh-workers] why does mhfixmsg dislike long text lines?
>Are you saying you received via SMTP a RFC5322 message where there
>was 42027 characters between CR-LF pairs?

I think I might have said that :-/, but whether I did or not you're right
that it isn't what I meant.

>That suggests to me that you in fact received a message that had lines no
>greater than 78 characters between CR-LF pairs, and _after you decoded it_
>it might have had a very long line.

Exactly. But that's also the situation with the message I received today
which sparked my original question. That one had only one part, described
with:

   Content-Transfer-Encoding: base64
   Content-Disposition: inline
   Content-Type: text/html; charset="UTF-8"
   MIME-Version: 1.0

Before decoding, the body width was 76 characters (some of the headers were
wider, though even those were all under 200 characters wide) -- but when I
tried to decode it, this happened:

   mhfixmsg: /tmp/msg, will not decode text/html; charset="UTF-8" because it is binary (line length > 998)

...but (line length > 998) refers to the decoded text, which really is more
than 998 characters wide. This is what I was originally asking about (or
trying to :-/, and I apologize for not being clear on that point).

>THAT is completely legal according to the RFCs. For the most part, it
>doesn't matter what it decodes to; what nmh cares about is that the
>message it is reading is valid according to RFC 5322. THAT is where the
>998 byte line length limit comes into play. You could send the entirety
>of "War and Peace" in text/plain part all as one line, and as long as it
>was encoded properly that would be fine.

This suggests to me that removing the 998-character limit in mhfixmsg
(only, and nowhere else) is a reasonable thing to do. The comment in
mhfixmsg which I quoted at the beginning of this thread seems to be saying
that sometimes message components described as text/* are really binary
files, and that the 998-character limit is used in mhfixmsg (only) as a
heuristic to identify this situation.
>>But you're quite right that this code isn't easy to understand. If I were >>to modify uip/mhfixmsg.c without touching sbr/m_getfld.c, am I risking >>anything other than generating messages that nmh won't be able to read? > >Good question! Your use cases seem to be ... well, I don't understand >them. That's because I keep being unclear, which in turn is because I don't know enough to be clearer -- though I'm learning a lot just from this discussion. :-) My use case is simply that people keep sending me messages which decode to HTML with horribly long lines, and I'd prefer to save the decoded text rather than the encoded version[*]. (Digression: I'd also prefer to reformat the long lines at the same time. I'm seriously considering piping the decoded HTML through something like tidy [ http://www.html-tidy.org/ ] before saving it. :-/) As it happens, I have mhbuild: -maxunencoded 900 in my .mh_profile, and have had for a while. This is a coincidence, in that I was unaware of the 998-character limit, until today, but happily I'm under it anyway. :-) ...so if I were to quote text with wider lines than that the right thing would happen -- although in practice if I were to quote text with lines that long, I'd almost certainly run them through fmt first. >And might I suggest that if you're going to keep asking us questions >about nmh, you should join the mailing list? :-) I'd be happy to, as long as it wouldn't be considered as a commitment to work on the code -- not that I'm opposed to that in principle, but I think I've already demonstrated I'm not competent to step in and do anything useful. :-( The only reason I've been writing to nmh-workers is that I'm unaware of anywhere else to turn. Is there a corresponding nmh-users list or something similar? - Steven [*] That's because one of the biggest reasons for using nmh, at least for me, is that it's so useful to be able to manipulate saved email with standard command-line tools. 
For example, I particularly depend on being able to find specific saved messages using grep or mairix[**] -- and if the message body is saved in base64 encoding, both of those programs fail completely. [**] http://www.rpcurnow.force9.co.uk/mairix/ -- ___ Steven Winikoff|"Garfield is, for my money at least, the Concordia University | shining exemplar of that productive Montreal, QC, Canada | laziness that gave us flush plumbing, steven.winik...@concordia.ca | clothes washers, dish washers, electric | lights, and automated guitar string | factories." - Mike Andrews -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
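
The difference between the encoded width (76 characters) and the decoded
width (more than 998) discussed in this message can be checked by hand.
What follows is only a minimal sketch, assuming a base64 utility that
accepts -d (GNU coreutils does; other implementations may spell it
differently) and a POSIX awk. The function name decoded_max_line is made up
for illustration and is not part of nmh.

```shell
# decoded_max_line: read a base64-encoded body on standard input and print
# the length, in bytes, of the longest line found after decoding
# (hypothetical helper, not an nmh command)
decoded_max_line() {
    base64 -d | LC_ALL=C awk '
        { if (length($0) > max) max = length($0) }  # track the widest line
        END { print max + 0 }                       # prints 0 for empty input
    '
}
```

Fed the encoded body of a message like the one described above, this would
be expected to print a number above 998 even though no encoded line exceeds
76 characters, which is the situation mhfixmsg is reporting.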
Re: [Nmh-workers] why does mhfixmsg dislike long text lines?
>To answer your larger question (on the subject line):
>
>- MH/nmh doesn't handle lines greater than 998 characters because such
>  messages are not valid according to RFC 5322, and mhfixmsg isn't going
>  to generate a message that nmh cannot handle. Whether or not nmh SHOULD
>  handle such messages is a different question.

Thank you, that helps. And I won't presume to suggest what nmh should do,
but I will point out that I recently received a message with a text/html
part which was one single line of 42027 characters. Clearly there are at
least some senders who have as much respect for RFC 5322 as Microsoft has
for standards in general. :-/

But I'm confused, because I didn't have any problems reading that message.
Its structure is as follows:

    msg part  type/subtype              size description
      4       multipart/alternative    2213K
        1     multipart/related        2211K
        1.1   text/html                  41K
        1.2   image/jpeg                 28K
        1.3   image/jpeg                 42K
        [...]
        1.33  image/jpeg                 350
        2     text/plain                 808

...and part 1.1 has these headers:

   --Apple-Mail=_7C2BA5CB-FA71-4036-9FAD-C693FF38AF09
   Content-Type: multipart/related;
           type="text/html";
           boundary="Apple-Mail=_B4252506-2E52-4348-A3AD-C92C9A9FBD3D"

   --Apple-Mail=_B4252506-2E52-4348-A3AD-C92C9A9FBD3D
   Content-Transfer-Encoding: quoted-printable
   Content-Type: text/html;
           charset=us-ascii

This part is 670 lines before decoding, and exactly one line afterward.

This arrived before I started using mhfixmsg, but given what I've just
learned I'd certainly expect mhfixmsg to refuse to decode it.

>- The line length limit is imposed by m_getfld(), and that function is ...
>  hairy. I think changing that might have unexpected consequences; it
>  might be fine, but I don't make any guarantees. But the fact you said
>  you could "easily modify" it suggests to me that you have not actually
>  LOOKED at the code in question :-)

What I'd looked at was the content_encoding() function in uip/mhfixmsg.c,
where there are a few instances of literal 998 which really would be easy
to change.
You're right that I hadn't looked at the larger context, mostly because I didn't know there was one. This is the main reason why I asked before doing anything. I just took a quick look at sbr/m_getfld.c. The first thing that struck me was this comment at lines 158-163 (of the 1.7 version): [...] I considered using a Vax "scanc" to locate the end of the field followed by a "memmove" but the routine call overhead on a Vax is too large for this to work on short names. If Berkeley ever makes "inline" part of the C optimiser (so things like "scanc" turn into inline instructions) a change here would be worthwhile. I'm beginning to get a sense of (and becoming impressed by) just how old this code base is. But you're quite right that this code isn't easy to understand. If I were to modify uip/mhfixmsg.c without touching sbr/m_getfld.c, am I risking anything other than generating messages that nmh won't be able to read? - Steven -- ___ Steven Winikoff| Concordia University | Celibacy is hereditary. If your parents Montreal, QC, Canada | didn't have children, chances are you steven.winik...@concordia.ca | won't either. -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
[Nmh-workers] why does mhfixmsg dislike long text lines?
I'm in the middle of integrating mhfixmsg (1.7) into my proxmail setup, and
just discovered this behaviour:

   # mhfixmsg -verbose -textcharset utf8 -fixcte -noreformat -fixboundary -file /tmp/msg -outfile /tmp/501.fm
   mhfixmsg: /tmp/msg, will not decode text/html; charset="UTF-8" because it is binary (line length > 998)

The comments in the code where this happens are as follows (lines 2118-2123):

    /*
     * See if the decoded content is 7bit, 8bit, or binary. It's binary
     * if it has any NUL characters, a CR not followed by a LF, or lines
     * greater than 998 characters in length. If binary, reason is set
     * to a string explaining why.
     */

I can certainly understand this in the general case, but:

   - The case which tripped this for me specifically involved a text part;
     the headers in the message were:

        Content-Transfer-Encoding: base64
        Content-Disposition: inline
        Content-Type: text/html; charset="UTF-8"
        MIME-Version: 1.0

     The message structure was as simple as it gets:

        # mhlist -file /tmp/msg
         msg part  type/subtype              size description
           0       text/html                  20K

     ...so it was clearly marked as text. If a sender packages a binary
     file but describes it as text/html, it's already broken, and I really
     don't care if mhfixmsg "damages" it even further. :-/

   - More and more senders these days are using auto-generated HTML in
     which the entire body is a single line of text. This message wasn't
     even one of those, but the point is that HTML with very long lines
     isn't unusual anymore.

     Instead of leaving it base64-encoded, arguably the right thing to do
     with something like that is to decode it and then run it through an
     HTML pretty-printer, although I acknowledge that that's beyond the
     scope of what mhfixmsg is designed to do. :-)

So I guess what I'm asking is: I can easily modify my copy of nmh to raise
the 998-character limit, but it's not clear to me what I might break by
doing so. Would someone please explain what I'm missing here?
- Steven -- _______ Steven Winikoff| Concordia University | "Peter's Principle of Success: Montreal, QC, Canada | Get up one more time than steven.winik...@concordia.ca | you're knocked down." | - fortune(6) -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
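
The three tests named in the quoted mhfixmsg comment (NUL characters, a CR
not followed by a LF, lines longer than 998 characters) can be approximated
from the shell. This is only a rough sketch for experimenting with a decoded
part saved to a file; it is not the code in uip/mhfixmsg.c, the function
name classify_cte is invented here, and it assumes POSIX tr, wc and awk. A
lone CR as the very last byte of a file with no final newline is not
detected.

```shell
# classify_cte FILE: print "binary", "8bit" or "7bit" for FILE, following
# the rules described in the mhfixmsg comment (approximation only)
classify_cte() {
    f=$1
    total=$(wc -c < "$f")

    # rule 1: any NUL byte makes the content binary
    no_nul=$(LC_ALL=C tr -d '\000' < "$f" | wc -c)
    if [ $((total - no_nul)) -gt 0 ]; then
        echo binary
        return 0
    fi

    # rules 2 and 3: a CR that is not the end of a CRLF pair, or a line
    # longer than 998 bytes (not counting the CR of a CRLF line ending)
    if LC_ALL=C awk '
        { sub(/\r$/, "") }
        length($0) > 998 || index($0, "\r") { bad = 1; exit }
        END { exit !bad }
    ' "$f"; then
        echo binary
        return 0
    fi

    # otherwise any byte above 127 means 8bit, and anything else is 7bit
    high=$(LC_ALL=C tr -d '\000-\177' < "$f" | wc -c)
    if [ $((high)) -gt 0 ]; then
        echo 8bit
    else
        echo 7bit
    fi
}
```

Run against the decoded text/html part from this message, a check like this
would be expected to report "binary" purely because of the line length,
which is the case being questioned.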
Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
>># fixdate -- fix the time zone on a Date: header in an email message
>
>Forgive me if this is a dumb question, but ... why do you care what the
>timezone is in your Date: headers?
>
>If you want them to appear in your local timezone when they are displayed,
>that is trivial to do with mh-format(5)

To begin with, I didn't know that mh-format had a date2local function, so
that would be the main reason why I'm not using it. :-)

But even now that I do know, I still value having the local timezone stored
in the file. That's because it's not uncommon for me to read an entire
message into a file (for example, when the body of an email message is an
explanation of how to do something and I want to save that explanation for
posterity, including the email headers to show its provenance), and it's
nice to have the local timezone in the file without having to convert it
manually.

I might feel differently if receiving messages from other time zones was
something that happened only once in a while, as in fact it was for most of
my career.

...but a few years ago Concordia moved to Exchange for its central email
system, and that stamps every message which passes through it in UTC.

In general I very much favour the principle of storing times in UTC and
converting to local time for display, but this (for me, at least) is an
exception.

     - Steven
--
___
Steven Winikoff              |
Concordia University         | Today is a good day for making decisions.
Montreal, QC, Canada         |
steven.winik...@concordia.ca |                          ...or is it?
Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
>So it seems as if your method of storing David's patch, and of quoting
>his email to reply to it, have both turned «'\''» into «'''».

That's right.

>Hopefully, this is some home-brew script rather than core nmh, but if
>it's the latter then we'd like to know. :-)

It's nothing to do with nmh, and coincidentally I discovered the problem
myself about three days ago during an email discussion of mounting CIFS
shares on Linux.

The culprit was the appended shell script, shown here in its fixed version.
Specifically, the read statements weren't using the -r option.

My signature quotes are chosen at random from a collection, including the
one on this message. But sometimes the random choice throws up something
appropriate. :-)

     - Steven

8<- cut here >8
#!/bin/sh
#
# fixdate -- fix the time zone on a Date: header in an email message
#
# Steven Winikoff 2012/12/12
#
# it's annoying to view Date: headers marked in a different time zone;
# that annoyance wasn't important in a world where invalid time zones were
# infrequent, but EVERY SINGLE MESSAGE from Concordia's new Exchange
# servers is stamped in UTC :-(
#
# usage: fixdate < message
#
#    where standard input is the mail message to be fixed; the (possibly
#    modified) message will be echoed to standard output
#
# exit status: 0 if the date was modified, or 1 otherwise
#
#--
# helper function: when are two date strings equal? :-)
#
# normally this would be just a simple text comparison, but some
# systems present date stamps such as this one:
#
#    Mon, 4 Jun 2012 14:24:06 -0400
#
# this gets canonicalized by /bin/date as follows:
#
#    Mon, 04 Jun 2012 14:24:06 -0400
#
# ...and we don't want to bother rewriting the date in this case, so
# detect and eliminate it:

# (defined in the portable name() form, which every /bin/sh accepts)
samedate()
{
   #-- dispose of the simplest case first :-)
   [ "${1}" = "${2}" ] && return 0

   #-- the next simplest case occurs when the original date has a suffix;
   #   for example, "Tue, 26 Jun 2012 01:13:26 -0400 (EDT)", which should
   #   be treated as equal to "Tue, 26 Jun 2012 01:13:26 -0400"
   truncated="`echo \"${1}\" | cut -c1-31`"
   [ "${truncated}" = "${2}" ] && return 0

   #-- if the day number has no leading zero, these dates are definitely
   #   different:
   possible_zero="`echo \"${2}\" | cut -c6`"
   [ "${possible_zero}" = "0" ] || return 1

   #-- if we're still here, these dates may be identical except for the
   #   leading zero:
   rest="`echo \"${2}\" | cut -c1-5,7-`"
   test "${1}" = "${rest}"
}

#--
# process message headers, one line at a time:

# split on newlines only, so that read keeps leading whitespace
IFS='
'

while read -r line
do
   #-- have we reached the end of the headers yet?
   if [ -z "${line}" ]
   then
      echo
      break
   fi

   #-- if we're here, this line is a header:
   start="`echo \"${line}\" | cut -c1-6`"
   if [ "${start}" != "Date: " ]
   then
      #-- not a Date: header, so just blat to standard output:
      echo "${line}"
   else
      #-- convert to our time zone:
      old="`echo \"${line}\" | unqp | sed 's/^Date: //'`"
      new="`date -d \"${old}\" -R`"

      if samedate "${old}" "${new}"
      then
         #-- already correct:
         echo "${line}"
      else
         #-- use the new date, but keep the old one also:
         echo "Date: ${new}"
         echo "X-Original-Date: ${old}"
      fi
   fi
done

#--
# now read and emit the body:

while read -r line
do
   echo "${line}"
done

exit 0
8<- cut here >8

--
___
Steven Winikoff              | "I really hate this dumb machine; I wish
Concordia University         |  that they would sell it. It never does
Montreal, QC, Canada         |  quite what I mean, but only what I tell
steven.winik...@concordia.ca |  it!"
                             |                      - fortune(6)
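
The effect of the missing -r option on «'\''» is easy to reproduce in
isolation. The sketch below is illustrative only: the two function names are
invented here, and each copies a single line from standard input to standard
output, once without -r and once with it.

```shell
# copy one line with a plain read: read treats the backslash as an escape
# character and drops it, which is what damaged the quoted patch
copy_line_cooked() {
    IFS= read line
    printf '%s\n' "$line"
}

# copy one line with read -r: backslashes pass through unchanged
copy_line_raw() {
    IFS= read -r line
    printf '%s\n' "$line"
}
```

Piping the four characters '\'' through copy_line_cooked yields ''', the
same corruption seen in the quoted patch, while copy_line_raw returns the
input untouched.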
Re: [Nmh-workers] rcvdist with non-default port
>I'm a little surprised, I thought that mhstore would store the HTML without
>any modification because it just copies the bytes.

It does.

>the original message to verify? I just tried it on a message here and the
>HTML content was preserved, even an img tag.

Your timing is good: I just discovered my mistake about a minute before you
sent this. :-)

It turns out that the extracted HTML does contain an <img> tag -- it's just
that I'd missed it because I was searching for ".png" in the source, and in
fact the tag looks like this:

Sure enough, the attachment containing the image has these headers:

   Content-Type: image/png
   Content-Transfer-Encoding: base64
   Content-ID:

Needless to say, until now I'd never seen "src=cid:" in an <img> tag, and
hadn't known that was possible.

So I do have everything I need, I just need to put it together.

     - Steven
--
___
Steven Winikoff              |
Concordia University         | "The cure for boredom is curiosity.
Montreal, QC, Canada         |  There is no cure for curiosity."
steven.winik...@concordia.ca |
                             |                      - Dorothy Parker
Re: [Nmh-workers] rcvdist with non-default port
>Did you know you can run mhstore(1) under another name, e.g. with a
>symlink, and it uses that to look up the .mh_profile entries. You could
>have a second ...-show-text/html definition in the normal $MH.

No, I didn't know that. Thank you for pointing it out!

I'm running into another roadblock with the HTML+images viewer that I'm
trying to put together. My sample HTML message for this has a single image,
and when I read it in an IMAP client such as K-9 Mail (an Android app which
I occasionally use on my phone), the page comes up with the image in place.

When I extract the HTML portion and attachment from the message with
mhstore, I do get both components, but the HTML portion no longer contains
an <img> tag to load the image.

Do you have any advice on how to deal with this?[*]

Thanks,

     - Steven

[*] My test message is an example of the main reason why I care about
    viewing HTML messages with images intact. Specifically, the message is
    a notice from Canada Post that a package is available to be picked up
    at my local post office, and the image is a barcode of the tracking
    number.

    The message text also contains the tracking number, and it's possible
    to collect the package without the barcode, but in that case the
    counter clerk has to key in the tracking number manually. I'm trying to
    be kind to them by printing the message with the barcode intact, and
    I'd prefer not to have to do that by opening the message in some other
    mail client.

--
_______
Steven Winikoff              |
Concordia University         | "In theory, there is no difference
Montreal, QC, Canada         |  between theory and practice. In
steven.winik...@concordia.ca |  practice, there is."
                             |                      - Chuck Reid
Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
>would a new .mh_profile entry that gave arguments for /usr/bin/tr do the
>trick? that is, could you as a user do the #,! -> _ or even "-d #!" so
>that the default would be no editing

Yes, that would work for me. I'm not sure what the profile entry would look
like, though; would the right hand side be an actual tr command? If not,
how would nmh parse the entry?

     - Steven
--
_______
Steven Winikoff              | "I really hate this dumb machine; I wish
Concordia University         |  that they would sell it. It never does
Montreal, QC, Canada         |  quite what I mean, but only what I tell
steven.winik...@concordia.ca |  it!"
                             |                      - fortune(6)
Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
>The SECOND thing is we now have the ability to place MIME parameters into >some of those command strings, which are from email messages, which is >where things are "interesting". We don't normally do that in anything we >distribute, I think, but here we have a user that did. I think this is the key observation, and it stems from the fact that the original MH predated MIME. I don't know how most nmh users handle incoming attachments, and part of my problem is that this isn't really documented anywhere. MIME handling improved significantly in 1.6 and even more so in 1.7, but almost all the online documentation I can find is for 1.4 or older (and in most cases, *much* older, as in MH 6.8!). What I'm trying to accomplish is what IMAP provides by default, namely the ability to see the same messages with the same attachments from more than one place. If nmh could adapt to using maildir format all my problems would just disappear, since there are IMAP servers which also understand that format -- but that's an entirely different can of particularly ugly worms, and I'm no more inclined to try to open it than I imagine you are. But that leaves me wanting to be able to open attachments in MH-formatted messages from multiple systems, and as of this minute I have something that already does about 98% of what I'm looking for (and the last 2% is irrelevant to this discussion, so I won't go into that here). It's just that what I'm doing works better if I can extract the original filename for a given attachment, and as you point out that's exactly where the fun starts. >My proposal is to simply edit out shell metacharacters (add # and ! like >David suggested) in those strings. That seems simple and reasonable to >me. Well, maybe replace them with an _ or something. For what it's worth I'd prefer the "replace them with _" option, but even without it this would do what I'm looking for. - Steven -- _______ Steven Winikoff| Concordia University | Don't use no double negatives. 
Montreal, QC, Canada | steven.winik...@concordia.ca | -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
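
The "replace them with _" idea can be tried outside nmh on an extracted
attachment name. This is a sketch under stated assumptions, not the change
proposed for nmh: the function name sanitize_name is invented, and instead
of listing individual shell metacharacters it takes the stricter approach of
keeping only letters, digits, '.', '_' and '-' and turning every other byte
into an underscore.

```shell
# sanitize_name NAME: print NAME with every byte outside A-Za-z0-9._-
# replaced by an underscore (hypothetical helper, not part of nmh)
sanitize_name() {
    # printf '%s' avoids adding a newline that tr would also replace
    printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9._-' '_'
}
```

For example, sanitize_name 'a (1)#!.pdf' prints a__1___.pdf. An allow-list
like this also rewrites spaces and non-ASCII characters, which may be more
than is wanted when the goal is to keep the attachment's real name.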
Re: [Nmh-workers] rcvdist with non-default port
>Steven, if you haven't been building from git and want to give it a try, >please let us know. Thanks again for reporting it. I haven't been building from git, but an hour ago I backported your patch into 1.7's rcvdist and tested that. It seems to be working perfectly. - Steven -- ___ Steven Winikoff| Concordia University | "If you're not part of the solution, Montreal, QC, Canada | you're part of the precipitate." steven.winik...@concordia.ca | | - Steven Wright -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] rcvdist with non-default port
>All of these things are DOABLE, it's just more complicated than it seems >at first glance, and working it all out requires some careful thought. >Welcome to programming nmh! :-/ I guess this is why they say that confidence is the feeling you have before you understand the problem. :-/ Thank you for taking the time to explain all of that. - Steven -- _______ Steven Winikoff| "43rd Law of Computing: Concordia University | Anything that can go wr Montreal, QC, Canada | Segmentation violation -- Core dumped" steven.winik...@concordia.ca | | - fortune(6) -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] rcvdist with non-default port
>post(8) not reading the profile was a long-standing deliberate design
>decision, and the way the code is implemented it's not possible to
>distinguish between "switches from the profile" and "switches from the
>command line".

This is where it shows that you know the code base and I don't. :-)

Although I don't fully understand this; post already accepts -port on the
command line, the problem is (just) that I can't take advantage of that
because I'm not running post directly.

>Putting things in mts.conf is also a pain; we never really figured out
>a reasonable syntax for port number specification

I don't understand why the syntax would be difficult, though that's
probably only because I'm not familiar with the issues involved.

Still, from the perspective of an outsider who may be unaware of an obvious
reason why this would be a bad idea, what I'd propose is:

   - the (new, optional) mts.conf entry would be specified as

        port: NNN

     for some integer NNN

   - the entry would be ignored unless mts has a value for which a port
     number is appropriate

   - the specified port number would replace 587 as the default value for
     post, to be overridden if -port NNN is supplied on the command line

>Since rcvdist is deficient here, seems to me the right answer is to fix
>that.

I agree. My point was just that there seems to be some disagreement about
how best to do that, and that it might be nice to be able to take the time
to discuss it thoroughly enough to reach a consensus.

But yes, fixing rcvdist properly (for an agreed-upon value of 'properly' :-)
in time for 1.7.1 would be preferable.

     - Steven
--
___
Steven Winikoff              |
Concordia University         | "Any teacher who _can_ be replaced by a
Montreal, QC, Canada         |  machine, _should_ be."
steven.winik...@concordia.ca |
                             |                      - Arthur C. Clarke
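
The precedence proposed in this message (a -port value from the command
line first, then a port entry in mts.conf, then 587) can be sketched in a
few lines of shell. Everything here is hypothetical: nmh's mts.conf has no
port entry, the function name smtp_port is invented, and the check that mts
has a suitable value is left out for brevity.

```shell
# smtp_port CONF_FILE [CLI_PORT]: print the port that would be used under
# the proposal (hypothetical; "port:" is not a real mts.conf entry)
smtp_port() {
    conf=$1
    cli=${2:-}

    # a port given on the command line always wins
    if [ -n "$cli" ]; then
        echo "$cli"
        return 0
    fi

    # otherwise use the last "port: NNN" line in the file, if there is one
    from_conf=$(sed -n 's/^port:[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1/p' \
        "$conf" 2>/dev/null | tail -n 1)

    # fall back to 587 when the file has no such entry
    echo "${from_conf:-587}"
}
```

With a file containing "port: 2525", smtp_port prints 2525 when called with
only the file name and 26 when called with 26 as its second argument.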
Re: [Nmh-workers] rcvdist with non-default port
>> My immediate problem could be solved by having post check for a -port >> switch (with value :-) in .mh_profile; > >post doesn't read the user's profile, I know that (though only because you mentioned it in a previous message yesterday), but... >so that wouldn't work. ...the intent of my question was about how difficult it would be to change that and have post read the profile, if only for that one entry. Or might it be easier to add a port entry to mts.conf, to complement the mts and servers entries? This is in the spirit of a workaround, even if the only reason for doing it would be to delay having to fix rcvdist until after 1.7.1. - Steven -- _______ Steven Winikoff| Concordia University | Boren's Laws: Montreal, QC, Canada | (1) When in charge, ponder. steven.winik...@concordia.ca | (2) When in trouble, delegate. | (3) When in doubt, mumble. -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] rcvdist with non-default port
>But if the user wants to only pass a switch argument to post that does NOT >take an argument, it's not possible from them to get it right. Just a thought, but... My immediate problem could be solved by having post check for a -port switch (with value :-) in .mh_profile; if doing that wouldn't be too difficult, would it reduce the urgency of fixing rcvdist and therefore allow time to decide how to do that in the best possible way? - Steven -- ___ Steven Winikoff| "You can leave in a taxi. If you can't Concordia University | get a taxi, you can leave in a huff. Montreal, QC, Canada | If that's too soon, you can leave in a steven.winik...@concordia.ca | minute and a huff." | - Groucho Marx -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
>I'm wondering if this is the correct approach. > >It seems kind of fragile to me to try quoting these characters, assuming >we are passing the entire line for mhshow entries to /bin/sh -c, since >we don't have any idea what that command line looks like I'm not up to speed on the code in nmh (other than having looked at just enough of mhshowsbr.c to have proposed the parentheses patch in the first place). ...but my experience working with /bin/sh in other matters over the years suggests that the safest thing to do is always to quote shell metacharacters you aren't deliberately intending to interpret. >(although ... I don't think I really understand why Steven is using >%{name}, I have a script which, given the message part corresponding to an attachment, copies that attachment to a known directory on the machine whose console I'm sitting at (which may or may not be the same machine where nmh is running). It's certainly possible to use the basename constructed by mhshow, but I find it more useful to save the attachment under its real name if/when that name can be determined. That's why I want the value of %{name} here. For years I was using single quotes around the values of all of the mhshow escapes in my .mh_profile, but I recently learned that's not supposed to be necessary. ...but whether I used single quotes or not, some filenames were causing problems for me. That's where this whole discussion began, since the problem presented itself as an error interpreting ( and ) in the filename. David convinced me that double-quoting %{name} accomplishes the same goal as my proposed patch, which is therefore unnecessary. However, there's certainly still some unexpected behaviour going on; when I run "%{name}" through an RFC-2047 decoder (using David's suggested usage of fmttest, or with a standalone Python script I tried earlier today), the entire string passed into my script is single-quoted even though the quote marks aren't part of the decoded filename. 
Whether this (or anything else I may not have run into :-) is actually a problem which needs to be solved is something I'll leave to you and others who know the code better than I do. >I really think to be safe we should simply replace any shell >metacharacters for those things, because I can imagine some nasty >security holes that we might encounter. That's a stronger version of what I was trying to say above. :-) - Steven -- ___ Steven Winikoff| Concordia University | "This sentence contradicts itself; Montreal, QC, Canada | well, no, actually it doesn't." steven.winik...@concordia.ca | |- Douglas Hofstadter -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
>That's not right, it should be: > >while ((pp = strchr (pp, ''')) && buflen > 3) { That's what I thought based on your patch. But it was only after I sent my last message that I noticed the first line of your patch file: diff --git a/uip/mhshowsbr.c b/uip/mhshowsbr.c ...which suggests you really are looking at a newer copy of the source than I am. >Seems to me we had another problem with botched patches >recently. At this point, I'd say let's not bother with >it. No problem. Thanks for all your help! - Steven -- _______ Steven Winikoff| "The reasonable man adapts himself to the Concordia University | world; the unreasonable one persists in Montreal, QC, Canada | trying to adapt the world to himself. steven.winik...@concordia.ca | Therefore all progress depends on the | unreasonable man." | - George Bernard Shaw -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
>It's attached to this message. I got it, but I'm not sure I know what to do with it. What I did was this: % cd /local/pkg/nmh/nmh-1.7 % patch -p1 < /tmp/qpatch ...but this is what happened: patching file uip/mhshowsbr.c Hunk #1 FAILED at 980. Hunk #2 FAILED at 987. Hunk #3 succeeded at 989 (offset -1 lines). 2 out of 3 hunks FAILED -- saving rejects to file uip/mhshowsbr.c.rej I see that the first hunk is trying to match on while ((pp = strchr (pp, ''')) && buflen > 3) { ...but the corresponding line (line 979, not line 980) in my copy of uip/mhshowsbr.c is while ((pp = strchr (pp, '\'')) && buflen > 3) { Is it possible that you're starting with a newer version of the source than I am? - Steven -- ___ Steven Winikoff| Concordia University | "My interest is in the future because I Montreal, QC, Canada | am going to spend the rest of my life steven.winik...@concordia.ca | there." | - Charles F. Kettering -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] rcvdist with non-default port
>I guess no one has cared up until now. I find that rcvdist isn't a program I use often, but there are times when it's exactly what I need. >I'm not sure if I should thank Steve or curse him for pointing >this bug out :-) You're welcome to do both. I can take a curse or two if it means getting this fixed. :-) - Steven -- _______ Steven Winikoff|"If I traveled to the end of the rainbow Concordia University | As Dame Fortune did intend, Montreal, QC, Canada | Murphy would be there to tell me steven.winik...@concordia.ca | The pot's at the other end." | - Bert Whitney -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
>If you want to try the attached patch to mhshowsbr.c, I'd be happy to, but (ironically in this context :-) I'm not seeing the attachment. - Steven -- ___ Steven Winikoff| "Good managers learn to share decisions Concordia University | with others even though they alone must Montreal, QC, Canada | accept responsibility for the results." steven.winik...@concordia.ca | |- fortune(6) -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] rcvdist with non-default port
>Wait, 1.6 rcvdist works for you? It behaves the same as the 1.7 rcvdist >for me, not passing switch arguments. Right, but 1.6 post defaults to port 25, so I don't need to pass the port. >>- If called with -b, it extracts the HTML part to a file and opens that >> with a browser. (Currently I'm doing this by creating a second profile >> file with a different value for mhshow-show-text/html, and selecting >> that by changing the value of $MH; I consider this to be ugly, but it >> works, and it's the only thing I could think of which does.) > >Are you using mhshow to store the HTML part? mhstore should be more direct. > >[...] > >mhstore -type text/html -type image or something like that? Thanks! That's exactly what I'm working on right now. :-) - Steven -- _______ Steven Winikoff| Concordia University | "It is never too late to be what you Montreal, QC, Canada | might have been." steven.winik...@concordia.ca | - George Eliot -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
>That's just where the shell ran into trouble. > >The decoded text is: > >'SEAO - Résultats d''ouverture (002).pdf Yes, I see how that explains all the symptoms. Thank you for being patient enough to explain that! >I'm not sure there's a good way to fix this. Maybe this? > > 'SEAO - Résultats d'"'"'ouverture (002).pdf' I believe that's the right thing to do, and the only real alternative to removing the ' characters altogether. Meanwhile, I just ran into an entirely unrelated problem that I hope you won't mind advising me on. I don't have access to an SMTP server with an open submission port, so in 1.7 I had to add the '-port 25' option to this .mh_profile entry: send: -alias .aliases -msgid -port 25 This works perfectly. But last night I tried to use 1.7's rcvdist for the first time, and ran into this: % rcvdist smw < ~/Mail/inbox/18 post: problem initializing server; [RPLY] 530 5.7.0 Authentication required /local/pkg/nmh/root-nmh-1.7/bin/post: exit 1 So of course I added rcvdist: -port 25 But that doesn't help, or at least it doesn't help enough: % rcvdist smw < ~/Mail/inbox/18 post: missing argument to -port /local/pkg/nmh/root-nmh-1.7/bin/post: exit 1 I tried adding post: -port 25 in addition, but that resulted in exactly the same error message. - Steven -- ___ Steven Winikoff| Concordia University | "It is easier to love humanity than to Montreal, QC, Canada | love one's neighbor." steven.winik...@concordia.ca | | - Eric Hoffer -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
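The `'"'"'` form suggested in the message above is the usual way to carry a single quote through a single-quoted /bin/sh word: close the quote, emit a double-quoted `'`, and reopen. As a sketch of the same technique outside nmh, Python's standard `shlex.quote` produces that form, and `shlex.split` can be used to check that a POSIX-style parse gives the original name back. The variable names are only for illustration.

```python
import shlex

# The decoded attachment name from this thread.
name = "SEAO - Résultats d'ouverture (002).pdf"

# shlex.quote wraps the string in single quotes and rewrites each
# embedded ' as '"'"' (close quote, double-quoted ', reopen quote).
quoted = shlex.quote(name)
print(quoted)

# Parsing the quoted word with POSIX shell rules recovers the original,
# parentheses and apostrophe included.
roundtrip = shlex.split(quoted)
print(roundtrip)
```

Because the parentheses end up inside single quotes, they need no separate escaping, which is consistent with the observation in this thread that proper quoting makes the parentheses patch unnecessary.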
Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
>> The problem was the embedded parentheses, specifically the (002) part of >> the filename: >> >> =?iso-8859-1?Q?SEAO_-_R=E9sultats_d'ouverture_(002).pdf?= > >I still think it's due to the single quote, which confuses the quoting added >by mhshow. I understand why that seems likely, especially given the comments in that part of the code. But the error message I received specifically mentioned the parentheses, and while I concede that my patch isn't necessary when putting double quotes around %{name}, nevertheless that patch did work without the double quotes, and it did so without touching the actual quote code. Of course that doesn't necessarily mean I'm right; it only explains why I think so. >> This mostly works, but I'm running into quote-handling weirdness. > >Maybe the (ab)use of fmttest in the profile was just a bit too fancy. But fmttest does the right thing outside .mh_profile... It's not a problem (I just cleaned up the extraneous ' and \ characters in my mime_handler script). >> I don't know why I thought these entries are case-sensitive; are they not? > >They aren't, we should note that in the map page. The comparison is done >with strcasecmp(). Thank you for confirming that. - Steven -- ___ Steven Winikoff| Concordia University | "It is easier to fight for principles Montreal, QC, Canada | than to live up to them." steven.winik...@concordia.ca | | - Alfred Adler -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
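To illustrate the case-insensitive matching confirmed above, here is a small Python sketch. It is not how nmh stores its profile; `lookup` and the dictionary are hypothetical, and `str.lower()` only approximates `strcasecmp()` for the ASCII names used in profile entries.

```python
from typing import Dict, Optional


def lookup(profile: Dict[str, str], key: str) -> Optional[str]:
    """Return the value of the first entry whose name matches key, ignoring case."""
    wanted = key.lower()
    for name, value in profile.items():
        if name.lower() == wanted:
            return value
    return None


# Entry spelled as in the .mh_profile quoted earlier in the thread.
profile = {"mhshow-suffix-application/PDF": ".pdf"}

# The lower-case spelling used in mhn.defaults finds the same entry.
print(lookup(profile, "mhshow-suffix-application/pdf"))
```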
Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
>Would you consider this? > >mhshow-show-application/pdf: %pmime_helper %F %s "%{name}" Yes, I would. I just tested it against the unpatched 1.7, and it works. >That handles the embedded quote, which I think is why mhshow >doesn't quote the argument correctly. The problem was the embedded parentheses, specifically the (002) part of the filename: =?iso-8859-1?Q?SEAO_-_R=E9sultats_d'ouverture_(002).pdf?= But yes, "%{name}" does the right thing with that. >And it would be nice if nmh could decode the filename, so your >mime_helper doesn't have to (if it does). It certainly would. :-) >This works, though hopefully there's a better way: > >mhshow-show-application/pdf: %pmime_helper %F %s `fmttest -raw -format > '%(decode{text})' "%{name}"` I had to revise it just slightly: mhshow-show-application/pdf: %pmime_helper %F %s "`fmttest -raw -format '%(decode{text})' \"%{name}\"`" This mostly works, but I'm running into quote-handling weirdness. Specifically, if I run the fmttest command directly, I get this: % fmttest -raw -format '%(decode{text})' "=?iso-8859-1?Q?SEAO_-_R=E9sultats_d'ouverture_(002).pdf?=" SEAO - Résultats d'ouverture (002).pdf ...but in .mh_profile, the same thing results in mime_helper receiving 'SEAO - Résultats d'\'ouverture (002).pdf as its third argument. >(If you want to save a line in your profile, that >mhshow-suffix-application/PDF line is in mhn.defaults.) It's there, but as mhshow-suffix-application/pdf Likewise, so is mhshow-suffix-application/postscript but not mhshow-suffix-application/PostScript I don't know why I thought these entries are case-sensitive; are they not? - Steven -- ___ Steven Winikoff| Concordia University | Cheop's Law: Montreal, QC, Canada | steven.winik...@concordia.ca |Nothing *ever* gets built on schedule |or within budget. -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
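For comparison with the fmttest invocation above, the same RFC 2047 decoding can be sketched in Python using only the standard `email.header` module. This is an illustration, not the script mentioned elsewhere in the thread; `decode_rfc2047` is a hypothetical helper name.

```python
from email.header import decode_header, make_header


def decode_rfc2047(value: str) -> str:
    """Decode RFC 2047 encoded-words in a header value to a plain string."""
    return str(make_header(decode_header(value)))


# The encoded attachment name from this thread.
encoded = "=?iso-8859-1?Q?SEAO_-_R=E9sultats_d'ouverture_(002).pdf?="
print(decode_rfc2047(encoded))
```

Decoding inside the helper script rather than in the profile entry avoids a second layer of shell quoting around the decoded name.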
[Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7
Yesterday I happened to receive an email message with an attachment described by these headers:

    Content-Type: application/pdf;
        name="=?iso-8859-1?Q?SEAO_-_R=E9sultats_d'ouverture_(002).pdf?="
    Content-Description: =?iso-8859-1?Q?SEAO_-_R=E9sultats_d'ouverture_(002).pdf?=
    Content-Disposition: attachment;
        filename="=?iso-8859-1?Q?SEAO_-_R=E9sultats_d'ouverture_(002).pdf?=";
        size=503419; creation-date="Fri, 12 Jan 2018 12:44:41 GMT";
        modification-date="Fri, 12 Jan 2018 12:50:33 GMT"
    Content-Transfer-Encoding: base64

My .mh_profile has these relevant entries:

    mhshow-suffix-application/PDF: .pdf
    mhshow-show-application/pdf: %pmime_helper %F %s %{name}

...where mime_helper is a shell script which opens attachments with the relevant application when run locally, or copies attachments to a remote desktop machine and opens them there via ssh. I'm happy to share it if anyone's interested, but it's not the point right now.

The point is that the attachment failed to open, with these messages:

    [ part 2 - application/pdf - =?iso-8859-1?Q?SEAO_-_R=E9sultats_d'ouverture_(002).pdf?= 503.5KB ]
    /bin/sh: -c: line 0: syntax error near unexpected token `('
    /bin/sh: -c: line 0: `mime_helper '/home/smw/Mail/mhshowdVgoi7.pdf' 'pdf' '=?iso-8859-1?Q?SEAO_-_R=E9sultats_d'\'ouverture_(002).pdf?= "$@"'

The right fix is probably to educate people not to use such abominable filenames :-), but meanwhile I worked around it as follows:

8<- cut here >8
--- mhshowsbr.c.original 2017-11-17 10:01:46.0 -0500
+++ mhshowsbr.c 2018-01-13 16:12:53.270723183 -0500
@@ -803,7 +803,7 @@
                       char *file, char *buffer, size_t buflen, int multipart)
 {
     int len, quoted = 0;
-    char *bp = buffer, *pp;
+    char *bp = buffer, *pp, *sp;
     CI ci = &ct->c_ctinfo;
 
     bp[0] = bp[buflen] = '\0';
@@ -975,6 +975,18 @@
                     bp++;
                     quoted = 1;
                 }
+                /* Escape existing parentheses */
+                sp = pp;
+                while (*sp) {
+                    if (buflen && ((*sp == '(') || (*sp == ')'))) {
+                        len = strlen (sp);
+                        memmove (sp + 1, sp, len+1);
+                        *sp++ = '\\';
+                        buflen--;
+                        bp++;
+                    }
+                    sp++;
+                }
                 /* Escape existing quotes */
                 while ((pp = strchr (pp, '\'')) && buflen > 3) {
                     len = strlen (pp++);
8<- cut here >8

I'm passing this on in case this might be considered worth adopting.

I'm not subscribed to this list, so I'd appreciate replies to my personal address of steven.winik...@concordia.ca

Thanks,

- Steven
--
Steven Winikoff              | "Writing is easy; all you do is sit
Concordia University         |  staring at a blank sheet of paper
Montreal, QC, Canada         |  until the drops of blood form on
steven.winik...@concordia.ca |  your forehead."
                             |              - Gene Fowler

--
Nmh-workers
https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] possible problem with mhfixmsg in nmh-1.7
>> As a non-contributor I don't (and shouldn't :-) get a vote here. > >Users get votes here. :-) I *like* this project. :-) Seriously, I really do appreciate what you're doing here. I can't imagine having to use any other email client. - Steven -- ___ Steven Winikoff| Concordia University | "A life spent making mistakes is not Montreal, QC, Canada | only more honorable but more useful steven.winik...@concordia.ca | than a life spent doing nothing." | - George Bernard Shaw -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] possible problem with mhfixmsg in nmh-1.7
>So in Steven's case something should be done, because test failures are >alarming and would probably stop installation, and perhaps cause >abandonment. Him not having to work around /nmh suffixes is 1.8. As a non-contributor I don't (and shouldn't :-) get a vote here. But for what it's worth I agree with this completely. - Steven -- _______ Steven Winikoff| "The reasonable man adapts himself to the Concordia University | world; the unreasonable one persists in Montreal, QC, Canada | trying to adapt the world to himself. steven.winik...@concordia.ca | Therefore all progress depends on the | unreasonable man." | - George Bernard Shaw -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] possible problem with mhfixmsg in nmh-1.7
>Steven wrote: > >> I've run into a few issues, most of which are trivial (and which I'll >> describe below for the sake of completeness), > >Thank you for for detail description. It looks like Ralph has addressed >the substantive issues. > >> The other issues I mentioned are: >> >> - It would be really nice if configure and Makefile.in didn't force a >> trailing /nmh on the pathnames I supply for libexecdir and sysconfdir. > >Forcing them seems to be a common convention. It does help avoid namespace >collisions. Understood, and when nmh is installed under /usr this makes perfect sense. In my case I was eager to get my hands on 1.7 rather than wait for it to be packaged by the OS maintainers, and I purposely keep all my locally installed packages out of any directory at risk of being overwritten in a future OS upgrade. In this case that resulted in the /big/local/pkg/nmh directory referred to in a previous message in this thread. For completeness, my installation unpacks the source archive into /big/local/pkg/nmh/nmh-1.7 ...and all of the installed files go into subdirectories of /big/local/pkg/nmh/root-nmh-1.7 This also has the (intentional :-) side effect of making it easy for multiple versions to coexist. (On one server I still have 1.4 installed because at least one user preferred to keep using it rather than learn how things changed in 1.6 :-/). Under these circumstances the extra nmh subdirectory isn't helpful, which is why I wanted to avoid using it. I know I'm in a minority here, which is why I requested a configure option rather than an outright change. Thanks, - Steven -- _______ Steven Winikoff| Concordia University | "My interest is in the future because I Montreal, QC, Canada | am going to spend the rest of my life steven.winik...@concordia.ca | there." | - Charles F. Kettering -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] possible problem with mhfixmsg in nmh-1.7
>I think it was something obvious. Can you add `-P' to the pwd below in >your copy and see if it passes? Done, and yes, it did. >I suspect your /big is a symlink. :-) Close. :-) It isn't, but the level below it is. On some of my systems, rather than try to figure out how much space to allocate to /home and to locally installed software (which I put in /local to protect it from future OS upgrades), I create a single partition spanning everything not used by the OS, with individual directories for /home, /local and whatever else happens to fit there. Then I have /local -> /big/local, /home -> /big/home, etc. (Yes, I know I could just use LVM2, but even that would require some kind of guess at the initial sizing.) - Steven -- _______ Steven Winikoff| Concordia University | "Quidquid latine dictum sit, altum Montreal, QC, Canada | viditur. (Whatever is said in Latin steven.winik...@concordia.ca | sounds profound.)" |- fortune(6) -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
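The difference that `pwd -P` makes in a layout like the one described above (a top-level directory that is a symlink into /big) can be reproduced with a short Python sketch. The directory names are created under a temporary directory purely for illustration, and `demo` is a hypothetical helper; `os.path.realpath` plays the role of `pwd -P` by resolving symlinks.

```python
import os
import tempfile
from typing import Tuple


def demo() -> Tuple[str, str]:
    """Build <tmp>/local -> <tmp>/big/local and return (logical, physical) paths."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.realpath(tmp)               # avoid symlinks in tmp itself
        real = os.path.join(base, "big", "local")  # where the files really live
        os.makedirs(real)
        link = os.path.join(base, "local")         # the convenience symlink
        os.symlink(real, link)
        # The logical path is what a plain "pwd" would report after "cd link";
        # the physical path is what "pwd -P" would report.
        return link, os.path.realpath(link)


if __name__ == "__main__":
    logical, physical = demo()
    print(logical)
    print(physical)
```

A test that compares a path built from the logical name against one reported with symlinks resolved will see two different strings for the same file, which matches the failure that adding `-P` fixed.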
Re: [Nmh-workers] possible problem with mhfixmsg in nmh-1.7
>> - I've been using mh for decades (literally!), so I no longer remember >> why I originally chose to configure using --with-hash-backup. > >This is now fixed on the master branch, if Ken or David are happy then >it can get cherry-picked across to branch 1.7-release. >http://git.savannah.nongnu.org/cgit/nmh.git/commit/?id=47b86722957cca6057bf5fcd07c9d1f01b4516f8 That was fast. :-) It turns out there are two more such failures in test-mhfixmsg, which I didn't see yesterday because I wasn't getting that far. (Yes, this is intended to imply that your suggested fix of pwd -P worked perfectly.): diff: /big/local/pkg/nmh/nmh-1.7/test/testdir/Mail/inbox/,11: No such file or directory ./test/mhfixmsg/test-mhfixmsg: test failed, outputs are in /big/local/pkg/nmh/nmh-1.7/test/testdir/Mail/inbox/,11 and /big/local/pkg/nmh/nmh-1.7/test/testdir/Mail/inbox/11.original. first named test failure: with no options: checks backup FAIL: test/mhfixmsg/test-mhfixmsg and diff: /big/local/pkg/nmh/nmh-1.7/test/testdir/Mail/inbox/,21: No such file or directory ./test/mhfixmsg/test-mhfixmsg: test failed, outputs are in /big/local/pkg/nmh/nmh-1.7/test/testdir/Mail/inbox/22 and /big/local/pkg/nmh/nmh-1.7/test/testdir/Mail/inbox/,21. first named test failure: -normmproc FAIL: test/mhfixmsg/test-mhfixmsg You may have fixed these already, but I figured I should mention them just in case. >>./test/mhfixmsg/test-mhfixmsg: test failed, outputs are in >>/big/local/pkg/nmh/nmh-1.7/test/testdir/Mail/inbox/31 and >>/big/local/pkg/nmh/nmh-1.7/test/testdir/test-mhfixmsg2494.actual. >>first named test failure: pass through message with relative folder >>path with parse error >>FAIL: test/mhfixmsg/test-mhfixmsg >> >> Is this something I can safely ignore? > >Well, if you don't use mhfixmsg, then probably. :-) Excellent point. :-) This is where I admit that until yesterday I hadn't known that mhfixmsg existed. 
Now I see it was also included in 1.6, which I've been using for over three years, but I must not have read the release notes for it carefully enough. Now that I know, I'll probably start using it in future. - Steven -- ___ Steven Winikoff| Concordia University | "The Universe is not only stranger than Montreal, QC, Canada | we imagine; it is stranger than we can steven.winik...@concordia.ca | imagine." |- J.B.S. Haldane -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
[Nmh-workers] possible problem with mhfixmsg in nmh-1.7
I'm not an nmh contributor, but I'm currently working on installing nmh-1.7 on one of my servers. I've run into a few issues, most of which are trivial (and which I'll describe below for the sake of completeness), but the one which may not be trivial is this: # make check [...passed tests and benign failures elided...] *** /big/local/pkg/nmh/nmh-1.7/test/testdir/Mail/inbox/31 2017-11-25 21:06:40.117262850 -0500 --- /big/local/pkg/nmh/nmh-1.7/test/testdir/test-mhfixmsg2494.actual 2017-11-25 21:06:40.125262997 -0500 *** *** 1,15 - To: recipi...@example.com - From: sen...@example.com - Subject: mhfixmsg pass through on parse error - MIME-Version: 1.0 - Content-Type: multipart/mixed; boundary="- =_aa0" - - --- =_aa0 - Content-Type: text/plain; charset="iso-8859-1 - Content-Disposition: attachment; filename="test1.txt" - Content-Transfer-Encoding: quoted-printable - - This is the= - text/plain part. - - --- =_aa0-- --- 0 ./test/mhfixmsg/test-mhfixmsg: test failed, outputs are in /big/local/pkg/nmh/nmh-1.7/test/testdir/Mail/inbox/31 and /big/local/pkg/nmh/nmh-1.7/test/testdir/test-mhfixmsg2494.actual. first named test failure: pass through message with relative folder path with parse error FAIL: test/mhfixmsg/test-mhfixmsg Is this something I can safely ignore? The other issues I mentioned are: - It would be really nice if configure and Makefile.in didn't force a trailing /nmh on the pathnames I supply for libexecdir and sysconfdir. This is trivial because it was easy enough to edit both files myself to make the change I wanted, but it would be much nicer if there were a way to do that by supplying an option to configure. - I've been using mh for decades (literally!), so I no longer remember why I originally chose to configure using --with-hash-backup. Nevertheless I did so, and ever since I've been continuing to do so for sake of consistency. 
This causes three other tests to fail, because they hardcode backup filenames using a comma: In this case I know these failures are benign, but in any case I proved that to myself by reconfiguring without --with-hash-backup and running make check again; in that situation, mhfixmsg/test-mhfixmsg is the only test that failed, but it did still fail. I'm not subscribed to this list, so I'd appreciate replies to my personal address of steven.winik...@concordia.ca Thanks, - Steven -- _______ Steven Winikoff| "The reasonable man adapts himself to the Concordia University | world; the unreasonable one persists in Montreal, QC, Canada | trying to adapt the world to himself. steven.winik...@concordia.ca | Therefore all progress depends on the | unreasonable man." | - George Bernard Shaw -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [Nmh-workers] possible problem with mhfixmsg in nmh-1.7
In my previous message, I wrote: This causes three other tests to fail, because they hardcode backup filenames using a comma: I then continued the rest of the message, and forgot to go back and list the three tests in question. I apologize for that oversight, but at least I can rectify it here: rm: cannot remove '/big/local/pkg/nmh/nmh-1.7/test/testdir/,23036.draft.orig': No such file or directory FAIL: test/mhbuild/test-forw [...passed tests and mhfixmsg failure already described elided...] mv: cannot stat '/big/local/pkg/nmh/nmh-1.7/test/testdir/Mail/,draft': No such file or directory first named test failure: smtp server doesn't support SMTPUTF8 FAIL: test/post/test-rfc6531 [...passed tests elided...] ./test/refile/test-refile: refile -nounlink failed FAIL: test/refile/test-refile To recap, these benign failures are all triggered by calling configure with the --with-hash-backup option. - Steven -- ___ Steven Winikoff| Zymurgy's Law of Evolving Systems Concordia University | Dynamics: Montreal, QC, Canada |Once you open a can of worms, steven.winik...@concordia.ca |the only way to recan them is |to use a larger can. -- Nmh-workers https://lists.nongnu.org/mailman/listinfo/nmh-workers
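The failures listed above all come from tests that expect a backup file whose name starts with a comma. A minimal Python sketch of the naming rule follows; `backup_name` is a hypothetical helper rather than nmh's own function, and it assumes that --with-hash-backup switches the prefix from `,` to `#`.

```python
import os


def backup_name(path: str, hash_backup: bool = False) -> str:
    """Return the backup name for path: same directory, prefixed basename."""
    prefix = "#" if hash_backup else ","   # assumption: "#" with --with-hash-backup
    directory, base = os.path.split(path)
    return os.path.join(directory, prefix + base)


if __name__ == "__main__":
    print(backup_name("Mail/inbox/11"))                    # what the tests expect
    print(backup_name("Mail/inbox/11", hash_backup=True))  # what a hash-backup build creates
```

A test that hardcodes `Mail/inbox/,11` therefore cannot find the file on a build configured with the other prefix, even though the backup itself was made.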