Re: automatic decode mime in repl

2022-02-10 Thread David Levine
Philipp wrote:

> Hi David
>
> I don't understand why you are trying to convince me of convertargs.

I'm not trying to convince you of anything.  I'm trying to understand
how nmh could benefit from your patch.  And whether your approach could
be improved by tying it to features that are already in repl (and that
you weren't aware of when you wrote the patch).  And whether your patch
interferes with existing repl features.  And what the intended and
unintended uses of the new feature are.

To clear up some things:
* replaliases is not required to use convertargs.  The replaliases
  shell functions can boil down to this:
      repl -filter mhl.replywithoutbody -convertargs text/html ''
  That can be entered on the command line, of course.  And of course,
  it's much easier to put that in an alias or function.

* We could just as easily make that convertargs behavior the default,
  as we could make the behavior of your patch the default.  But as
  noted previously, there are good reasons for not doing that.

* Your nmh installation is broken, as you observed.  I expect that the
  cause was deleting par after nmh was installed.  The test suite did
  the right thing by alerting you to the deficiency.  We could guide
  you through (easy) repair of your installation, but I would instead
  suggest a complete rebuild, if you built from sources, or re-install
  if you obtained a pre-built package.

> How can I use it from things other than a Bourne-compatible shell?

The same way all nmh commands are used from other than a Bourne-
compatible shell.

> As far as I see you
> would need to port contrib/replaliases to all these interfaces

There's no need.  replaliases contains Bourne-shell functions for
convenience, that's all.

Alternatively, a user can add those switches to their profile.  Or,
a user can rely on whatever method they'd like to add arguments to a
command line.  That's one of the beauties of MH/nmh.

> So in general convertargs looks like a stopgap solution. I wanted to
> have a solution, and I have had this solution since 2016.

I've been using convertargs since 2014.

> The question is: Do you want it?

Not with its current handling of URLs.  How do you handle URLs in
messages that you reply to?

David



Re: automatic decode mime in repl

2022-02-10 Thread Philipp Takacs
Hi David

I don't understand why you are trying to convince me of convertargs. As
I said before, I have a working solution. Also, convertargs will not work
in my setup. No, I'm not saying convertargs is bad; I'm saying my approach
is better. It has its own drawbacks, but in comparison they are smaller
and easier to fix or work around. Maybe I'm biased, but that's my view.
But let's look at it a bit deeper:

[2022-02-09 22:34] David Levine 
> Philipp wrote:
>
> > The problem I see is that convertargs looks complex.
>
> If you use a Bourne-compatible shell, would you try this, please:
>
> source $(mhparam docdir)/contrib/replaliases
> rtm [msg]

This is not that easy, because I use mmh, so convertargs is not there.
But I can share the experience I got from the nmh tests. The test
repl/test-convert fails with the following error:

> /bin/sh: 1: par: not found
> charset=; iconv -f ${charset:-us-ascii} -t utf-8
> '/home/satanist/src/nmh/test/testdir/Mail/mhbuildm4m2Df' | sed 's/^\(.\)/> \1/;
> s/^$/>/;' | par 64  >/home/satanist/src/nmh/test/testdir/Mail/mhbuildwpUMcf
> "$@": exited 127
> mhbuild: store of text/plain content failed, continuing...
> /bin/sh: 1: par: not found
> charset=; iconv -f ${charset:-us-ascii} -t utf-8 
> '/home/satanist/src/nmh/test/testdir/Mail/mhbuildgcLmQf' | sed 's/^\(.\)/> 
> \1/; s/^$/>/;' | par 64  
> >/home/satanist/src/nmh/test/testdir/Mail/mhbuildXaOu6d "$@": exited 127
> mhbuild: store of text/plain content failed, continuing...

This is probably a bug in etc/mhn.defaults.sh, and installing par fixed
it. But this is a bad first experience.

> > The problem I see is that convertargs looks complex. The goal of my
> > patch is to have an easy to use solution which is enabled by default.
>
> Easy for developers or easy for users?

First for the user, but also for the developers. For a user I see a few
problems:

What needs to be done so that this is enabled by default? Yes, I keep
poking at this point, because I believe it is important. Whether you like
it or not: MIME messages are common and nmh is not able to reply to one
in the default configuration.

How do I create alternative repl configurations? For example, a
`replwork' which has different default switches and forms. The only
solution I see is to write an alias. But I like the argv0 configuration
method and would miss it.

How can I use it from things other than a Bourne-compatible shell? Like
fish shell, PowerShell, IPython or a Java process. As far as I can see,
you would need to port contrib/replaliases to all these interfaces or use
sh instead of calling repl directly.

As a developer I see also a few problems:

For me both approaches are code changes[0]. convertargs changes two
programs and needs a wrapper which provides an easy interface. My
approach changes only repl and relies on already implemented features
for the rest.

The required wrapper also needs some maintenance: fixing bugs, interface
changes, interfaces for other shells. Yes, the wrapper is small, but it
moves parts of a core feature to a contributed script.

So in general convertargs looks like a stopgap solution. I wanted to
have a solution, and I have had this solution since 2016. Yes, it's not
the best solution, but a good one. The question is: Do you want it?

Philipp

[0] Or at least my approach was in 2016, when I introduced it to mmh.



Re: mhfixmsg character set conversion

2022-02-10 Thread Ralph Corderoy
Hi Steven,

>- explicitly unset LC_ALL and set LANG to en_CA.UTF-8
...
> unset LC_ALL; LANG="en_CA.UTF-8"; export LC_ALL LANG

The export of LC_ALL is a bit misleading.  AFAICT it doesn't do any harm
as LC_ALL isn't set, but it would read better not to be there IMO.

Also, this is assuming either all the other LC_* are unset or that it's
desired they trump LANG.  Precedence, highest first, is

LC_ALL
LC_CTYPE, LC_NUMERIC, etc.
LANG

Anyway, locale(1) is a good way to test you're getting the desired
result.
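
As a rough illustration of that precedence, the sketch below resolves
which variable decides LC_CTYPE.  The function name effective_ctype is
made up for this example and it only mimics the lookup order; locale(1)
remains the real check.

```shell
# Hypothetical helper: report which value would be used for LC_CTYPE,
# following the precedence LC_ALL, then LC_CTYPE, then LANG.
# Empty values are treated like unset ones, as POSIX specifies.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf '%s\n' POSIX
    fi
}

# Run in a subshell so the caller's locale settings are left alone.
# LC_ALL is unset, so the more specific LC_CTYPE wins over LANG.
(
    unset LC_ALL
    LC_CTYPE=C
    LANG=en_CA.UTF-8
    effective_ctype
)
```

So setting only LANG has no effect on a category whose own LC_* variable
is still set.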

>- run ~smw/bin/decode_headers using $source as stdin (this explicitly
>  decodes headers which are RFC 2047-encoded, and passes the body
>  through unchanged)

This sounds like the kind of thing which might insert bytes which alter
vim's idea of the ‘fileencoding’.  Given

To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= 

as taken from RFC 2047, is it going to put in a byte 0xf8 for ISO 8859-1
encoding, or 0xc3 0xb8 for UTF-8?
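
To make the two possibilities concrete, here is a small sketch that
prints the bytes for the decoded character in each encoding.  It assumes
iconv(1) and od(1) are available; hex_bytes is a name invented for this
example and says nothing about how decode_headers itself works.

```shell
# Hypothetical helper: print stdin as a run of hex byte values.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# The Q-encoded "=F8" stands for the single ISO 8859-1 byte 0xf8.
printf '\370' | hex_bytes

# Converting that byte to UTF-8 yields the two bytes 0xc3 0xb8.
printf '\370' | iconv -f ISO-8859-1 -t UTF-8 | hex_bytes
```

A lone 0xf8 is not valid UTF-8, so a header decoded the first way is the
sort of byte that could sway an editor's guess about the file.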

-- 
Cheers, Ralph.



Re: mhfixmsg character set conversion

2022-02-10 Thread Ralph Corderoy
Hi Steven,

> > I expect the bad file has something earlier on which fixes vim's
> > idea of the encoding to ISO 8859-1
>
> That does seem to be the case.  Do you have any idea what kind of
> thing that might be?  (I know you can't diagnose a file you haven't
> seen, but in general, what sorts of things should I look for?)

Non-ASCII bytes from the start of the file.  I assume vim(1) will read
up to a certain amount until it either makes up its mind or assumes the
default.

Try this to remove the boring ASCII bytes and see what's left.

tr -d ' -~'

> > >$ grep -n ^Veuillez good | cut -c1-68
> > >108:Veuillez ne pas répondre au présent courriel. Il a été gén�
...
> > (The ‘�’ at the end is to be expected.)
...
> Until now, I've only ever seen that glyph when a character doesn't
> exist in the font being used

No, it's not related to a Unicode code point not being in the font, or
only historically.
https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character
describes ‘�’ and it's being seen above because cut(1) is cutting bytes
and the ‘108:’ at the start of the line has shifted the 68/69 cut-off
point to part-way through the UTF-8 for a single code point AKA rune.
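
A minimal sketch of the same effect on a short string.  It uses cut -b
to make the byte-wise cutting explicit (GNU cut has treated -c the same
way), and the helper name first_two_bytes is made up for this example.

```shell
# Hypothetical helper: keep the first two bytes of each input line
# and show the result as hex.
first_two_bytes() {
    LC_ALL=C cut -b1-2 | od -An -tx1 | tr -d ' \n'
    echo
}

# "gén" in UTF-8 is the bytes 67 c3 a9 6e.  Cutting after two bytes
# keeps 'g' plus a dangling 0xc3, the first half of 'é'.
printf 'g\303\251n\n' | first_two_bytes
```

The dangling 0xc3 is an incomplete sequence, which is what a UTF-8
terminal shows as the replacement character.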

>$ setenv LC_ALL C
>$ perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet
>Veuillez ne pas r<c3><a9>pondre au pr<c3><a9>sent courriel. Il a
> <c3><a9>t<c3><a9> g<c3><a9>n<c3><a9>r<c3><a9>

Good.

> As expected, this returned pretty much instantly.  Then I tried this:
>
>$ sh
>$ LC_ALL=C
>$ echo $LC_ALL
>C
>$ perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet

That's setting a local shell variable LC_ALL unless LC_ALL already
exists in the environment, and it probably doesn't.  Try

sh
LC_ALL=C; export LC_ALL
locale
perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet
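
The difference between a shell variable and an exported one can be seen
with a throwaway name.  DEMO_VAR is invented for this sketch so that the
real locale settings are left untouched.

```shell
# Start from a clean slate: no variable, no export attribute.
unset DEMO_VAR

# A plain assignment creates a shell variable only; child processes
# do not inherit it.
DEMO_VAR=C
sh -c 'echo "before export: ${DEMO_VAR-unset}"'

# After export it is part of the environment passed to children.
export DEMO_VAR
sh -c 'echo "after export: ${DEMO_VAR-unset}"'
```

perl is a child process in the same sense, so it only sees LC_ALL once
the variable has been exported.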

> Which in a way is good, because at least it means bash is behaving
> consistently.

Beware that invoking bash(1) as ‘sh’ is not the same as running ‘bash’.
Might not make a difference in this case, but in general it's better to
run whichever is desired.

> I propose to forget this particular clupea harengus of the crimson
> variety unless you find it interesting in and of itself.

It is odd.  And odd might affect other things, including to do with nmh.
:-)

-- 
Cheers, Ralph.



Re: mhfixmsg character set conversion

2022-02-10 Thread Ralph Corderoy
Hi Steven,

> > I use this version, unaltered:
> > $ par version
> > par 1.53.0
>
> $ par version
> 1.52-i18n.4

That ‘i18n’ smells given the nature of the other patch I found earlier.

> $ pacman -Qi par
> Name: par
> Version : 1.52-8
> Description : Paragraph reformatter
> Architecture: x86_64
> URL : http://www.nicemice.net/par/

Assuming Manjaro is just picking this up from Arch Linux,
I think this is the shell script which builds the package.
https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=par

In particular,

source=('http://www.nicemice.net/par/Par152.tar.gz'
'http://sysmic.org/dl/par/par-1.52-i18n.4.patch')

Attempting to retrieve the patch times out for me on the French IPv4
address.

I didn't find it anywhere else.  I did find NixOS also pulls it in.
https://github.com/NixOS/nixpkgs/blob/master/pkgs/tools/text/par/default.nix

patches = [
# A patch by Jérôme Pouiller that adds support for multibyte
# charsets (like UTF-8), plus Debian packaging.
(fetchpatch {
  url = "http://sysmic.org/dl/par/par-1.52-i18n.4.patch";
  sha256 = "0alw44lf511jmr38jnh4j0mpp7vclgy0grkxzqf7q158vzdb6g23";
})
];

Can you try searching par again, this time with

file /usr/bin/par
env LC_ALL=C egrep -boa 'seems not configured' /usr/bin/par
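
For reference, a sketch of what that search reports, run against a small
stand-in file instead of the real binary.  It assumes GNU grep, since
-b, -o and -a are not in POSIX; find_string_offset is a made-up name.

```shell
# Hypothetical helper: print "offset:match" for each occurrence of a
# fixed string in a possibly binary file.
#   -a  treat binary data as text
#   -o  print only the matching part
#   -b  prefix it with its byte offset
#   -F  take the pattern literally
find_string_offset() {
    LC_ALL=C grep -boaF -- "$1" "$2"
}

# Stand-in for a binary: two bytes, a NUL, then the message text.
demo=$(mktemp)
printf 'AB\000seems not configured\n' >"$demo"
find_string_offset 'seems not configured' "$demo"
rm -f "$demo"
```

No output and a non-zero exit status would mean the string is not in the
file at all.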

-- 
Cheers, Ralph.