Re: mhfixmsg character set conversion

2022-02-08 Thread Steven Winikoff
>I'm unable to replicate your problem here with the original message,
>and using your mhfixmsg invocation, mhfixmsg-format-text/html, and
>locale.  The only piece I think I'm missing is your mime_helper.
>I would give that a try if you send it to me.

I've attached the script, but (without having looked at it in a while) I
suspect it depends too heavily on other parts of my personal setup to be
usable for anyone else.  It turns out not to be relevant, but perhaps it
might be interesting to someone anyway.


>With nmh-1.7 mhfixmsg:
>mhfixmsg: /home/levine/src/nmh/msg part 2, decode text/plain; 
>charset=iso-8859-1
>mhfixmsg: /home/levine/src/nmh/msg part 1, will not decode because it
>is binary (line length > 998)
>mhfixmsg: /home/levine/src/nmh/msg part 2, convert UTF-8 to UTF-8

...and therein lies the answer.

I owe you an apology about this, and I'm sincerely sorry for wasting your
time on this question.

The key is the message about the line length being too long.  Seeing that
reminded me that I'd modified the stock 1.7.1 mhfixmsg with this patch:

   --- uip/mhfixmsg.c.original 2018-03-06 14:05:56.0 -0500
   +++ uip/mhfixmsg.c  2019-08-17 19:51:25.723267048 -0400
   @@ -2144,13 +2144,13 @@
        int last_char_was_cr = 0;

        for (i = 0, cp = buffer; i < inbytes; ++i, ++cp) {
   -        if (*cp == '\0'  ||  ++line_len > 998  ||
   +        if (*cp == '\0'  ||  ++line_len > 8  ||
                (*cp != '\n'  &&  last_char_was_cr)) {
                encoding = CE_BINARY;
                if (*cp == '\0') {
                    *reason = "null character";
   -            } else if (line_len > 998) {
   -                *reason = "line length > 998";
   +            } else if (line_len > 8) {
   +                *reason = "line length > 8";
                } else if (*cp != '\n'  &&  last_char_was_cr) {
                    *reason = "CR not followed by LF";
                } else {

I remember asking about the 998-character limit on this list, in a thread
from January 2018.  You explained why the limit exists, and suggested
another way to achieve what I was trying to do, which I tried but without
success -- I wasn't able to get what I wanted without this change, but I no
longer remember the details.
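
For readers who want to experiment with the check outside of nmh, here is a
minimal Python sketch of the three tests visible in the patch context (NUL
byte, over-long line, bare CR). It is not the mhfixmsg code: the names
`classify` and `MAX_LINE` are invented for illustration, and resetting the
counter at each LF is an assumption, because the else-branch is cut off in
the hunk above.

```python
MAX_LINE = 998  # limit used by the unpatched check in the hunk


def classify(data: bytes, max_line: int = MAX_LINE):
    """Return (is_binary, reason) using the three tests seen in the patch."""
    line_len = 0
    last_was_cr = False
    for byte in data:
        line_len += 1
        if byte == 0:
            return True, "null character"
        if line_len > max_line:
            return True, f"line length > {max_line}"
        if last_was_cr and byte != 0x0A:
            return True, "CR not followed by LF"
        # Assumption: the elided else-branch starts a new count after LF.
        if byte == 0x0A:
            line_len = 0
        last_was_cr = byte == 0x0D
    return False, None


print(classify(b"x" * 2000))       # one very long line, like the HTML part
print(classify(b"short line\r\n"))
```

Raising `max_line` has the same effect as the patch: the long-line part is no
longer classified as binary and therefore becomes eligible for decoding.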

Obviously I need to revisit this question, because I just compiled a copy
of mhfixmsg from 1.7.1 without this patch, and it now behaves as you'd
expect:  it complains about the line length, and then generates correct
output with these headers:

   Content-Type: multipart/alternative;
boundary=0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378
   Content-Transfer-Encoding: 8bit
   
   --0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378
   Content-Transfer-Encoding: 8bit
   Content-Type: text/plain; charset="UTF-8"
   Mime-Version: 1.0
   
   --0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378
   Content-Transfer-Encoding: quoted-printable
   Content-Type: text/html; charset=iso-8859-1
   Mime-Version: 1.0

With my patch, I get these headers:

   Content-Type: multipart/alternative;
  boundary=0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378
   Content-Transfer-Encoding: 8bit

   --0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378
   Content-Transfer-Encoding: 8bit
   Content-Type: text/plain; charset="UTF-8"
   Mime-Version: 1.0

   --0126698af956ff6e1d4da4d88ae8ef4ebfb0beb8c16cc29b787641a31378
   Content-Transfer-Encoding: 8bit
   Content-Type: text/html; charset=iso-8859-1
   Mime-Version: 1.0

There's still something going on that I don't understand, however.  The
way I've evaluated the output from mhfixmsg was by viewing it in vim, and
there's no question that the unpatched output looks fine while the patched
output is as I've been describing since the beginning of this thread.

...but when I look at the files with command-line tools such as more or
head, *both* versions look correct.  When I open both files in xed, the
unpatched file is fine, but the patched file generates this message:

   There was a problem opening the file /tmp/nmh_testing/xxx.

   The file you opened has some invalid characters. If you continue editing
   this file you could corrupt this document.

   You can also choose another character encoding and try again.

...with a menu offering "Automatically Detected", "Current Locale (UTF-8)"
and "Western (ISO-8859-15)" as possible character encodings.

In summary, I now know what's happening and (mostly) what to do about it,
but I still don't know why.

 - Steven
-- 
___
Steven Winikoff  |
Montreal, QC, Canada | "I'd love to go out with you, but I'm
s...@smwonline.ca |  attending the opening of my garage door."
http://smwonline.ca  |
 |   - fortu

Re: mhfixmsg character set conversion

2022-02-08 Thread Steven Winikoff
>This is an easy way to download and build the latest, assuming that
>you have the prerequisites listed in the MACHINES file (and respond
>to the build_nmh questions):
>
>wget http://git.savannah.gnu.org/cgit/nmh.git/plain/build_nmh
>sh build_nmh -v

Thanks for that.  I just tried it, but unfortunately the build failed
at the test step (details appended).

I don't know whether it matters that I ran the build script using my
regular account rather than with root privileges, with the root directory
configured in /tmp (because this is a temporary installation for testing
rather than something I'm planning to keep).

In case it matters, the configuration answers below are the same as in my
real installation of version 1.7.1.


>Ah, OK, maybe it wasn't lynx.

It wasn't.  I just realized what it was, and it turns out I owe you an
apology for reasons I'll explain separately in a reply to your message
from last night.

 - Steven


8<-   cut here   >8
$ mkdir -p /tmp/nmh/root
$ cd /tmp/nmh
$ wget http://git.savannah.gnu.org/cgit/nmh.git/plain/build_nmh
$ sh build_nmh -v
Install prefix [/local]: /tmp/nmh/root
Locking type (dot|fcntl|flock|lockf) [determined by configure]: fcntl
MTS (smtp|sendmail/smtp|sendmail/pipe) [smtp]: 
SMTP server [localhost]: 
Cyrus SASL support (y|n) [determined by configure]: no
TLS support (y|n) [determined by configure]: yes
downloading . . .
autoconfiguring . . .
configuring . . .
building . . .
testing . . .
build failed, build log is in nmh/build_nmh.log

$ tail -30 nmh/build_nmh.log
/sbin/sed -f man/man.sed man/show.man > man/show.1
/sbin/sed -f man/man.sed man/slocal.man > man/slocal.1
/sbin/sed -f man/man.sed man/sortm.man > man/sortm.1
/sbin/sed -f man/man.sed man/unseen.man > man/unseen.1
/sbin/sed -f man/man.sed man/whatnow.man > man/whatnow.1
/sbin/sed -f man/man.sed man/whom.man > man/whom.1
./etc/bash_completion_nmh-gen > etc/bash_completion_nmh
./etc/mhn.defaults.sh 
"/home/smw/bin:/local/paths:/usr/local/sbin:/sbin:/usr/sbin:/bin:/usr/bin:/usr/lib"
 ./etc/mhn.find.sh > etc/mhn.defaults
/sbin/sed -e 's,%mts%,smtp,' \
   -e 's,%mailspool%,/var/mail,' \
   -e 's,%smtpserver%,localhost,' \
   -e 's,%default_locking%,fcntl,' \
   -e 's,%supported_locks%,fcntl dot flock lockf,' \
< ./etc/mts.conf.in > etc/mts.conf
make[1]: Leaving directory '/tmp/nmh/nmh'
make[1]: *** [Makefile:4996: check-TESTS] Error 1
make: *** [Makefile:5261: check-am] Error 2
===
FAIL: test/mhbuild/test-attach
FAIL: test/mhbuild/test-ext-params
FAIL: test/repl/test-convert
3 of 118 tests failed
===
configure.ac:8: warning: The macro `AC_CONFIG_HEADER' is obsolete.
configure.ac:135: warning: The macro `AC_TRY_COMPILE' is obsolete.
configure.ac:142: warning: The macro `AC_TRY_COMPILE' is obsolete.
configure.ac:188: warning: The macro `AC_TRY_LINK' is obsolete.
configure.ac:213: warning: AC_PROG_LEX without either yywrap or noyywrap is 
obsolete
make[1]: *** [Makefile:4996: check-TESTS] Error 1
make: *** [Makefile:5261: check-am] Error 2
8<-   cut here   >8
-- 
___
Steven Winikoff  | "Don't make your decisions because they are
Montreal, QC, Canada |  the easiest, the cheapest, or the most
s...@smwonline.ca |  popular; make them because you know they
http://smwonline.ca  |  are right."
 |  - Theodore Hesburgh



Re: automatic decode mime in repl

2022-02-08 Thread Philipp Takacs
Hi

[2022-02-07 16:54] Ralph Corderoy 
> > Instead of depending only on the program name, the config can be
> > extended to also depend on the calling program. So you can configure
> > different options depending on which nmh program called another nmh
> > program. This could be done by (ab)using argv[0].
>
> Adding yet another style of interface between nmh's parts, which would
> have to be documented and users would have to understand and use, seems
> wrong.  What's your reason against running mhshow as ‘mhreplyfmt’, or
> something?  Is it the need to duplicate all MIME-configuration entries
> from ‘mhshow-*’ to ‘mhreplyfmt-*’?

The duplication is one thing. The other is that I would like to
extend the existing configuration system. Yes, you can run mhshow as
`mhreplyfmt', but there is no logic behind the choice of argv[0]. The
idea was to define such a logic and use it to make the config more
powerful.

As an example, it would be possible to have a different showmimeproc config
for a `replA' and a `replB'. Or what about a different sendmail or even
another path?

Yes, this is a big change and way beyond this patch. It might be as far
in the future as implementing MIME for mhl. Starting with a smaller fix like
setting argv[0] to `mhreplyfmt' is perfectly fine.

> > * mhshow display string can contain '%l' which leads to a listing
> > prior to displaying content. This listing is contained in the draft.
>
> Again, the mhreplyfmt-... could choose whether to use it.  I think
> I would sometimes want it as part of the reply text, e.g. it's a
> one-line summary of a PDF and I want to quote that line for context
> about my following chunk of reply.

I had never thought[0] about solving it this way. Yes, this will
also work. In general I'm not a big fan of the '%l' and would prefer
a single switch instead of '%l', but that is another discussion.

Philipp

[0] Mostly because of small differences between mmh and nmh.



Re: automatic decode mime in repl

2022-02-08 Thread Philipp Takacs
Hi

[2022-02-05 16:53] Philipp Takacs 
> Replying to MIME messages is an old problem. I have an idea to work around
> or fix this. Instead of adding MIME to mhl, repl could first decode
> the MIME before piping the mail to mhl. This can be done by mhshow.
>
> I have attached a patch for this.

Thanks for all the discussion. I have an updated version of this patch.
I believe it's clear that you want a switch for this, so I have implemented
one. It is currently on by default. I also have another patch which implements
Ralph's suggestion.

I'm not sure whether the manpage updates are good. The tests should also be
updated. I'm also looking for a way to autodetect whether -nofixmime should be
set. I think the best place would be in nmh_init().

As a disclosure: I have been working with this patch since 2016. Whether you
adopt it or not will not affect me. I thought this was clear from the beginning,
but it seems to me that it wasn't clear to everyone in the discussion.
Sorry for that. I believe this patch is good and that you will benefit from it.
That is why I sent it to you and discussed why I believe it should
be enabled by default. But in the end you can decide whether and in which way
you want this patch.

Philipp
From a587df65eafb93bea5cc74053823fd07bc13ea8b Mon Sep 17 00:00:00 2001
From: Philipp Takacs 
Date: Sat, 5 Feb 2022 12:21:35 +0100
Subject: [PATCH 1/2] repl: pipe mail through mhshow

When replying to a MIME message, mhshow decodes the message
before mhl gets it to create the reply draft. This makes it more
convenient to reply to a MIME message.

Problems:

* mhshow can be configured to display parts in a graphical program. This
  can lead to unexpected behavior on repl (e.g. starting to play music).

* mhshow display string can contain '%l' which leads to a listing prior
  to displaying content. This listing is contained in the draft.
---
 man/repl.man  |  22 
 uip/repl.c|  11 +++-
 uip/replsbr.c | 137 +-
 uip/replsbr.h |   2 +-
 4 files changed, 134 insertions(+), 38 deletions(-)

diff --git a/man/repl.man b/man/repl.man
index 790e19b5..fc1935fc 100644
--- a/man/repl.man
+++ b/man/repl.man
@@ -51,6 +51,7 @@ all/to/cc/me]
 .RB [ \-build ]
 .RB [ \-file
 .IR msgfile ]
+.RB [ \-fixmime " | " \-nofixmime ]
 .ad
 .SH DESCRIPTION
 .B repl
@@ -71,6 +72,14 @@ format file (see
 .IR mh\-format (5)
 for details).
 .PP
+To decode MIME messages
+.B repl
+pipe the message through
+.IR showmimeproc .
+This can be disabled with the
+.B \-nofixmime
+switch.
+.PP
 If the switch
 .B \-nogroup
 is given (it is on by default), then
@@ -516,6 +525,7 @@ is checked.
 ^Editor:~^To override the default editor
 ^Msg\-Protect:~^To set mode when creating a new message (draft)
 ^fileproc:~^Program to refile the message
+^showmimeproc:~^Program to show non-text (MIME) messages
 ^mhlproc:~^Program to filter message being replied-to
 ^whatnowproc:~^Program to ask the \*(lqWhat now?\*(rq questions
 .fi
@@ -542,6 +552,7 @@ is checked.
 .RB ` \-noquery '
 .RB ` \-noatfile '
 .RB ` "\-width\ 72" '
+.RB ` \-fixmime '
 .fi
 .SH CONTEXT
 If a folder is given, it will become the current folder.  The message
@@ -578,3 +589,14 @@ don't call it
 since
 .B repl
 won't run it.
+.PP
+The
+.RB \-fixmime
+switch causes repl to pipe the message through
+.IR showmimeproc .
+If the
+.IR showmimeproc
+is configured to display listing prior a mime part, these listing will end up in the draft.
+There might also be some unexpected behaviors, when the
+.IR showmimeproc
+is configured to display the content externel (i.e. open a browser to display html).
diff --git a/uip/repl.c b/uip/repl.c
index b1ceb1f6..8ed22709 100644
--- a/uip/repl.c
+++ b/uip/repl.c
@@ -72,6 +72,8 @@
 X("noatfile", 0, NOATFILESW) \
 X("fmtproc program", 0, FMTPROCSW) \
 X("nofmtproc", 0, NFMTPROCSW) \
+X("fixmime", 0, FIXMIMESW) \
+X("nofixmime", 0, NFIXMIMESW) \
 
 #define X(sw, minchars, id) id,
 DEFINE_SWITCH_ENUM(REPL);
@@ -143,6 +145,7 @@ main (int argc, char **argv)
 bool nedit = false;
 bool nwhat = false;
 bool atfile = false;
+bool fixmime = true;
 int fmtproc = -1;
 char *cp, *cwd, *dp, *maildir, *file = NULL;
 char *folder = NULL, *msg = NULL, *dfolder = NULL;
@@ -354,6 +357,12 @@ main (int argc, char **argv)
 		case NFMTPROCSW:
 		fmtproc = 0;
 		continue;
+		case FIXMIMESW:
+		fixmime = true;
+		continue;
+		case NFIXMIMESW:
+		fixmime = false;
+		continue;
 	}
 	}
 	if (*cp == '+' || *cp == '@') {
@@ -469,7 +478,7 @@ try_it_again:
 }
 
 replout (in, msg, drft, mp, outputlinelen, mime, form, filter,
-	 fcc, fmtproc);
+	 fcc, fmtproc, fixmime);
 fclose (in);
 
 {
diff --git a/uip/replsbr.c b/uip/replsbr.c
index ef4925ae..d2bab72f 100644
--- a/uip/replsbr.c
+++ b/uip/replsbr.c
@@ -61,7 +61,7 @@ static char *addrcomps[] = {
  * static prototypes
  */
 static int insert (struct mailname *);
-static void replfilter (FILE *, FILE *, char *, int);

Re: mhfixmsg character set conversion

2022-02-08 Thread Ralph Corderoy
Hi Steven,

> There's still something going on that I don't understand, however.  The
> way I've evaluated the output from mhfixmsg was by viewing it in vim, and
> there's no question that the unpatched output looks fine while the patched
> output is as I've been describing since the beginning of this thread.

Good.  BTW, to begin a thread, please don't reply to an existing message
on the list and change the subject as it doesn't start a new thread and
leads to weird presentation in the archives, threading trees, etc.

> ...but when I look at the files with command-line tools such as more or
> head, *both* versions look correct.

Have you patched more or head?  ;-)

Can you cut-and-paste commands and output from your terminal to show us
the problem.  Otherwise we have to trust your competency, no offence
intended, and imagine what was done and seen which adds to the effort in
dealing with the email.  Here's my go.

How I could be influencing programs.

$ locale
LANG=en_GB.utf8
LC_CTYPE="en_GB.utf8"
LC_NUMERIC="en_GB.utf8"
LC_TIME="en_GB.utf8"
LC_COLLATE="en_GB.utf8"
LC_MONETARY="en_GB.utf8"
LC_MESSAGES="en_GB.utf8"
LC_PAPER="en_GB.utf8"
LC_NAME="en_GB.utf8"
LC_ADDRESS="en_GB.utf8"
LC_TELEPHONE="en_GB.utf8"
LC_MEASUREMENT="en_GB.utf8"
LC_IDENTIFICATION="en_GB.utf8"
LC_ALL=
$

Test inputs.

$ cat good
Veuillez ne pas répondre au présent courriel. Il a été généré
automatiquement, nous ne pourrons pas y donner suite.
$ cat bad
Veuillez ne pas rÃ©pondre au prÃ©sent courriel. Il a Ã©tÃ© gÃ©nÃ©rÃ©
automatiquement, nous ne pourrons pas y donner suite.
$

bad is double-encoded.

$ iconv -f iso-8859-1 -t utf-8 good | cmp - bad
$
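
As a cross-check of the iconv step, the same double encoding can be reproduced
in a few lines of Python. This is only an illustrative sketch; `double_encode`
is a made-up helper name, not anything from nmh.

```python
def double_encode(utf8_bytes: bytes) -> bytes:
    """Misread UTF-8 bytes as ISO-8859-1, then encode the result as UTF-8."""
    return utf8_bytes.decode("iso-8859-1").encode("utf-8")


good = "répondre".encode("utf-8")
bad = double_encode(good)

print(good.hex(" "))   # é is the two bytes c3 a9
print(bad.hex(" "))    # each of those bytes became two: c3 83 c2 a9

# Undoing the extra step recovers the original bytes.
assert bad.decode("utf-8").encode("iso-8859-1") == good
```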

head(1) and more(1) don't disguise that.

$ head bad
Veuillez ne pas rÃ©pondre au prÃ©sent courriel. Il a Ã©tÃ© gÃ©nÃ©rÃ©
automatiquement, nous ne pourrons pas y donner suite.
$ more bad
Veuillez ne pas rÃ©pondre au prÃ©sent courriel. Il a Ã©tÃ© gÃ©nÃ©rÃ©
automatiquement, nous ne pourrons pas y donner suite.
$

Show the hex values of non-ASCII bytes.

$ LC_ALL=C perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good
Veuillez ne pas r<c3><a9>pondre au pr<c3><a9>sent courriel. Il a <c3><a9>t<c3><a9> g<c3><a9>n<c3><a9>r<c3><a9>
automatiquement, nous ne pourrons pas y donner suite.
$
$ LC_ALL=C perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' bad
Veuillez ne pas r<c3><83><c2><a9>pondre au pr<c3><83><c2><a9>sent courriel. Il a <c3><83><c2><a9>t<c3><83><c2><a9> g<c3><83><c2><a9>n<c3><83><c2><a9>r<c3><83><c2><a9>
automatiquement, nous ne pourrons pas y donner suite.
$
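
If Perl is not handy, roughly the same view can be produced with Python. The
sketch below assumes the input is handed over as raw bytes with the trailing
newline already stripped (which is what perl's -l does); `show_non_ascii` is a
name chosen here for illustration.

```python
import re


def show_non_ascii(line: bytes) -> str:
    """Replace every byte outside printable ASCII (space..~) with <xx>."""
    shown = re.sub(rb"[^ -~]", lambda m: b"<%02x>" % m.group()[0], line)
    return shown.decode("ascii")


good = "répondre".encode("utf-8")
bad = good.decode("iso-8859-1").encode("utf-8")   # double-encoded variant
print(show_non_ascii(good))
print(show_non_ascii(bad))
```

Working on bytes rather than on a decoded string sidesteps the locale
entirely, which is the same reason the Perl one-liner is run under LC_ALL=C.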

-- 
Cheers, Ralph.



Re: mhfixmsg character set conversion

2022-02-08 Thread David Levine
Steven Winikoff wrote:

> I owe you an apology about this, and I'm sincerely sorry for wasting your
> time on this question.

No apology is necessary.  This uncovered an issue with mhfixmsg that
we fixed.

> The key is the message about the line length being too long.  Seeing that
> reminded me that I'd modified the stock 1.7.1 mhfixmsg with this patch:

-decodetext binary instead of 8bit would be safer, I expect.  It
sounds like you might have tried that in the past without success.
It might help to dig in to that.

> There's still something going on that I don't understand, however.  The
> way I've evaluated the output from mhfixmsg was by viewing it in vim, and
> there's no question that the unpatched output looks fine while the patched
> output is as I've been describing since the beginning of this thread.
>
> ...but when I look at the files with command-line tools such as more or
> head, *both* versions look correct.

But are they correct?  It sounds like not, based on viewing in text
editors.

> In summary, I now know what's happening and (mostly) what to do about it,
> but I still don't know why.

I would look at output from mime_helper and see if it's UTF-8.
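
One way to check a saved copy of that output is sketched below in Python. It
is a heuristic only and can be fooled by unusual text; the function names are
illustrative and the bytes would come from whatever file the output was
captured in.

```python
def is_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def looks_double_encoded(data: bytes) -> bool:
    """Heuristic: the text is UTF-8, and undoing one ISO-8859-1 to UTF-8
    conversion yields bytes that are again valid, non-ASCII UTF-8."""
    try:
        inner = data.decode("utf-8").encode("iso-8859-1")
    except (UnicodeDecodeError, UnicodeEncodeError):
        return False
    return is_utf8(inner) and any(b >= 0x80 for b in inner)


good = "Il a été généré".encode("utf-8")
bad = good.decode("iso-8859-1").encode("utf-8")
print(is_utf8(good), looks_double_encoded(good))
print(is_utf8(bad), looks_double_encoded(bad))
```

Note that both variants are valid UTF-8, so a plain validity check alone
cannot tell them apart.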

David



Re: mhfixmsg character set conversion

2022-02-08 Thread David Levine
Steven wrote:

> Thanks for that.  I just tried it, but unfortunately the build failed
> at the test step (details appended).

They don't show what test(s) failed, which may be relevant.  You can
run "make check" from the command line to get that information.

> I don't know whether it matters that I ran the build script using my
> regular account rather than with root privileges

Root privileges are not necessary.

David



Re: automatic decode mime in repl

2022-02-08 Thread David Levine
Philipp wrote:

> Thanks for all the discussion. I have an updated version of this patch.

I tried the patch.  It works, but it produces text content of fixed
line lengths.  So it can break URLs and other text that shouldn't be
arbitrarily modified.
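
To illustrate the concern in general terms (this does not reproduce what the
patch or mhshow actually does), here is a small Python sketch using the
standard textwrap module. The width of 40 is arbitrary, and the URL is simply
the one quoted earlier in this thread.

```python
import textwrap

url = "http://git.savannah.gnu.org/cgit/nmh.git/plain/build_nmh"
text = "Fetch " + url + " and run it."

# Default behaviour: words longer than the width are split to fit.
broken = textwrap.wrap(text, width=40)

# Keeping long words intact leaves the URL usable, at the cost of a long line.
kept = textwrap.wrap(text, width=40, break_long_words=False,
                     break_on_hyphens=False)

print("\n".join(broken))
print("---")
print("\n".join(kept))
```

A formatter that refuses to split long tokens gives up the strict width
guarantee, which is the trade-off behind the objection above.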

I haven't deciphered how it works, but I would think the same result
could be obtained using repl's -convertargs.  That relies on the
mhbuild(1) Convert Interface.  That was intended to be flexible, at
the expense of simplicity.  Could it be used instead of adding new
code?

David



Re: mhfixmsg character set conversion

2022-02-08 Thread Steven Winikoff
>BTW, to begin a thread, please don't reply to an existing message
>on the list and change the subject

That makes sense, but (a) I wasn't trying to start a new thread, and
(b) I replied to an existing message without changing the subject.

I'll try to remember that for future reference, but I don't understand
why you mentioned it here and now.


>> ...but when I look at the files with command-line tools such as more or
>> head, *both* versions look correct.
>
>Have you patched more or head?  ;-)

No, but that's a fair question. :-)

They're both unpatched, installed as part of util-linux 2.37.3-2 (for more)
and coreutils 9.0-2 (for head) on Manjaro Linux.


>Can you cut-and-paste commands and output from your terminal to show us
>the problem.

Of course.


>Otherwise we have to trust your competency, no offence intended,

None taken.  It's a perfectly fair request.


>Here's my go.
>
>How I could be influencing programs.
>
>$ locale
>LANG=en_GB.utf8
>LC_CTYPE="en_GB.utf8"
>LC_NUMERIC="en_GB.utf8"
>LC_TIME="en_GB.utf8"
>LC_COLLATE="en_GB.utf8"
>LC_MONETARY="en_GB.utf8"
>LC_MESSAGES="en_GB.utf8"
>LC_PAPER="en_GB.utf8"
>LC_NAME="en_GB.utf8"
>LC_ADDRESS="en_GB.utf8"
>LC_TELEPHONE="en_GB.utf8"
>LC_MEASUREMENT="en_GB.utf8"
>LC_IDENTIFICATION="en_GB.utf8"
>LC_ALL=
>$

Mine's 

   $ locale
   LANG=en_CA.UTF-8
   LC_CTYPE="en_CA.UTF-8"
   LC_NUMERIC="en_CA.UTF-8"
   LC_TIME="en_CA.UTF-8"
   LC_COLLATE=C
   LC_MONETARY="en_CA.UTF-8"
   LC_MESSAGES="en_CA.UTF-8"
   LC_PAPER="en_CA.UTF-8"
   LC_NAME="en_CA.UTF-8"
   LC_ADDRESS="en_CA.UTF-8"
   LC_TELEPHONE="en_CA.UTF-8"
   LC_MEASUREMENT="en_CA.UTF-8"
   LC_IDENTIFICATION="en_CA.UTF-8"
   LC_ALL=


>Test inputs.
>
>$ cat good
>Veuillez ne pas répondre au présent courriel. Il a été généré
>automatiquement, nous ne pourrons pas y donner suite.
>$ cat bad
>Veuillez ne pas rÃ©pondre au prÃ©sent courriel. Il a Ã©tÃ© gÃ©nÃ©rÃ©
>automatiquement, nous ne pourrons pas y donner suite.
>$

In my case I don't have just the one sentence in a file by itself, but
let's try grep (unpatched, and installed from grep 3.7-1 on Manjaro):

   $ grep ^Veuillez good | cut -c1-68
   Veuillez ne pas répondre au présent courriel. Il a été généré

   $ grep ^Veuillez bad | cut -c1-68
   Veuillez ne pas répondre au présent courriel. Il a été généré

Really.  I'm not making this up. :-/

...but if I open the incorrect output file in vim and go to line 108,
I see this (pasted from an xterm in which vim was running):

   Veuillez ne pas répondre au présent courriel. Il a été généré

But wait.  It gets worse:

   $ grep -n ^Veuillez good | cut -c1-68
   108:Veuillez ne pas répondre au présent courriel. Il a été gén�

   $ grep -n ^Veuillez bad | cut -c1-68
   108:Veuillez ne pas répondre au présent courriel. Il a été gén�
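
One possible reading of the truncated character at the end of those lines,
assuming cut(1) counts bytes here rather than characters (GNU cut has
traditionally treated -c like -b): the "108:" prefix pushes the last "é"
across the 68-byte boundary, so only its first byte survives. A minimal Python
sketch of that effect, with `cut_bytes` as an illustrative name:

```python
line = "108:Veuillez ne pas répondre au présent courriel. Il a été généré"


def cut_bytes(text: str, last: int) -> str:
    """Mimic a byte-oriented `cut -c1-<last>` on UTF-8 text."""
    raw = text.encode("utf-8")[:last]
    # An incomplete multi-byte sequence is shown as U+FFFD, as a terminal would.
    return raw.decode("utf-8", errors="replace")


print(cut_bytes(line, 68))        # with the grep -n prefix
print(cut_bytes(line[4:], 68))    # without it, the snippet fits exactly
```

If that reading is right, the stray character is produced by cut and says
nothing about whether the file itself is double-encoded.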

Is my shell somehow getting involved?

   $ echo $SHELL
   /usr/bin/tcsh

That's (also unpatched :-) tcsh 6.23.02-1 from Manjaro's tcsh package.


>bad is double-encoded.
>
>$ iconv -f iso-8859-1 -t utf-8 good | cmp - bad
>$

I understand that, although I don't understand why that's happening.


>head(1) and more(1) don't disguise that.

They certainly shouldn't, but:

   $ head -108 bad | tail -1 | cut -c1-68
   Veuillez ne pas répondre au présent courriel. Il a été généré

If you tell me this shouldn't be happening, I'll agree 100%.  But somehow
it is happening and I have no idea why.


>Show the hex values of non-ASCII bytes.

I can't do that on the whole file, so I did this:

   $ cp -p good good_snippet
   $ cp -p bad bad_snippet
   $ vi good_snippet bad_snippet
   # delete all but the relevant part of line 108

   $ LC_ALL=C perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet

...but nothing appeared to happen, and I killed the command after waiting
about a minute.  (...and yes, I tried that in a bash subshell because I know
that syntax won't work in tcsh).

My perl is a bit rusty, so I'm not sure exactly how this command works.
However, just to muddy the waters even further, I fell back on od:

   $ od -t x1c good_snippet 
   0000000  56  65  75  69  6c  6c  65  7a  20  6e  65  20  70  61  73  20
             V   e   u   i   l   l   e   z       n   e       p   a   s
   0000020  72  c3  a9  70  6f  6e  64  72  65  20  61  75  20  70  72  c3
             r 303 251   p   o   n   d   r   e       a   u       p   r 303
   0000040  a9  73  65  6e  74  20  63  6f  75  72  72  69  65  6c  2e  20
           251   s   e   n   t       c   o   u   r   r   i   e   l   .
   0000060  49  6c  20  61  20  c3  a9  74  c3  a9  20  67  c3  a9  6e  c3
             I   l       a     303 251   t 303 251       g 303 251   n 303
   0000100  a9  72  c3  a9  0a
           251   r 303 251  \n
   0000105

   $ od -t x1c bad_snippet 
   0000000  56  65  75  69  6c  6c  65  7a  20  6e  65  20  70  61  73  20
             V   e   u   i   l   l   e   z       n   e       p   a

Re: mhfixmsg character set conversion

2022-02-08 Thread Steven Winikoff
>They don't show what test(s) failed, which may be relevant.  You can
>run "make check" from the command line to get that information.

Done, with the appended results.


>> I don't know whether it matters that I ran the build script using my
>> regular account rather than with root privileges
>
>Root privileges are not necessary.

I guessed that would be the case, but thank you for confirming it.

 - Steven


8<-   cut here   >8
$ make check
make  test/fakehttp test/fakepop test/fakesmtp test/getcanon test/getcwidth 
test/getfullname test/runpty test/common.sh
make[1]: Entering directory '/tmp/nmh/nmh'
make[1]: 'test/fakepop' is up to date.
make[1]: 'test/fakesmtp' is up to date.
make[1]: 'test/getcanon' is up to date.
make[1]: 'test/getcwidth' is up to date.
make[1]: 'test/getfullname' is up to date.
make[1]: 'test/runpty' is up to date.
make[1]: 'test/common.sh' is up to date.
make[1]: Leaving directory '/tmp/nmh/nmh'
make  check-TESTS
make[1]: Entering directory '/tmp/nmh/nmh'
make[2]: Entering directory '/tmp/nmh/nmh'
make[3]: Entering directory '/tmp/nmh/nmh'
 /sbin/mkdir -p '/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin'
  /bin/install -c uip/ali uip/anno uip/burst uip/comp uip/dist uip/flist 
uip/fmttest uip/folder uip/forw uip/inc uip/install-mh uip/mark uip/mhbuild 
uip/mhfixmsg uip/mhical uip/mhlist uip/mhlogin uip/mhn uip/mhparam uip/mhpath 
uip/mhshow uip/mhstore uip/msgchk uip/new uip/packf uip/pick uip/prompter 
uip/refile uip/repl uip/rmf uip/rmm uip/scan uip/send uip/show uip/sortm 
uip/whatnow uip/whom '/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin'
 /sbin/mkdir -p '/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin'
 /bin/install -c etc/sendfiles uip/mhmail 
'/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin'
 /sbin/mkdir -p '/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/libexec/nmh'
  /bin/install -c uip/ap uip/dp uip/fmtdump uip/mhl uip/mkstemp uip/post 
uip/rcvdist uip/rcvpack uip/rcvstore uip/rcvtty uip/slocal uip/viamail 
'/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/libexec/nmh'
 /sbin/mkdir -p '/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/libexec/nmh'
 /bin/install -c uip/spost 
'/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/libexec/nmh'
make  install-exec-hook
make[4]: Entering directory '/tmp/nmh/nmh'
ln /tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/flist 
/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/flists
ln /tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/folder 
/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/folders
ln /tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/new 
/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/fnext
ln /tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/new 
/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/fprev
ln /tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/new 
/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/unseen
ln /tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/show 
/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/prev
ln /tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/show 
/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/next
if test x != x; then \
chgrp root /tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/inc && \
chmod 2755 /tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/bin/inc; \
fi
make[4]: Leaving directory '/tmp/nmh/nmh'
 /sbin/mkdir -p '/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/share/doc/nmh'
 /bin/install -c -m 644 COPYRIGHT INSTALL NEWS README VERSION 
docs/COMPLETION-TCSH docs/COMPLETION-ZSH docs/DIFFERENCES docs/FAQ 
docs/MAIL.FILTERING docs/MAILING-LISTS docs/README-ATTACHMENTS 
docs/README-HOOKS docs/README-components docs/README.SASL docs/README.about 
docs/README.developers docs/README.manpages docs/TODO 
'/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/share/doc/nmh'
 /sbin/mkdir -p 
'/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/share/doc/nmh/contrib'
 /bin/install -c -m 644 docs/contrib/replaliases 
'/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/share/doc/nmh/contrib'
 /sbin/mkdir -p 
'/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/share/doc/nmh/contrib'
 /bin/install -c docs/contrib/localpostproc docs/contrib/ml 
docs/contrib/replyfilter docs/contrib/vpick 
'/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/share/doc/nmh/contrib'
 /sbin/mkdir -p '/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/etc/nmh'
 /bin/install -c -m 644 etc/MailAliases etc/components etc/digestcomps 
etc/distcomps etc/forwcomps etc/mhical.12hour etc/mhical.24hour etc/mhl.body 
etc/mhl.digest etc/mhl.format etc/mhl.forward etc/mhl.headers etc/mhl.reply 
etc/mhl.replywithoutbody etc/mhshow.marker etc/rcvdistcomps 
etc/rcvdistcomps.outbox etc/replcomps etc/replgroupcomps etc/scan.MMDDYY 
etc/scan.MMDD etc/scan.curses etc/scan.default etc/scan.highlighted 
etc/scan.mailx etc/scan.nomime etc/scan.size etc/scan.time etc/scan.timely 
etc/scan.unseen '/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/etc/nmh'
 /sbin/mkdir -p '/tmp/nmh/nmh/test/testdir/inst/tmp/nmh/root/etc/nmh'
 /bin/install -c etc/rmmproc.messa

Re: mhfixmsg character set conversion

2022-02-08 Thread Steven Winikoff
>No apology is necessary.  This uncovered an issue with mhfixmsg that
>we fixed.

Thank you.


>> The key is the message about the line length being too long.  Seeing that
>> reminded me that I'd modified the stock 1.7.1 mhfixmsg with this patch:
>
>-decodetext binary instead of 8bit would be safer, I expect.  It
>sounds like you might have tried that in the past without success.
>It might help to dig in to that.

That's definitely my plan now that we've gotten this far.


>> ...but when I look at the files with command-line tools such as more or
>> head, *both* versions look correct.
>
>But are they correct?  It sounds like not, based on viewing in text
>editors.

I agree, but I'm running out of things to try in order to understand what's
happening (please see my reply to Ralph if you haven't already).


>> In summary, I now know what's happening and (mostly) what to do about it,
>> but I still don't know why.
>
>I would look at output from mime_helper and see if it's UTF-8.

Please forgive me for having to ask this, but how is mime_helper even
involved?  Isn't that used only when I read the message?  It isn't in
the procmail chain that saves the original copy, and it's the original
copy that we've been looking at.

 - Steven
-- 
___
Steven Winikoff  |
Montreal, QC, Canada | "The cure for boredom is curiosity.
s...@smwonline.ca |  There is no cure for curiosity."
http://smwonline.ca  |
 |  - Dorothy Parker