date:20220211

Re: automatic decode mime in repl

2022-02-11 Thread Philipp Takacs

[2022-02-11 02:55] David Levine 
> Philipp wrote:
>
> > Hi David
> >
> > I don't understand why do you try to convince me from convertargs.
>
> I'm not trying to convince you of anything.  I'm trying to understand
> how nmh could benefit from your patch.  And whether your approach could
> be improved by tying it to features that are already in repl (and that
> you weren't aware of when you wrote the patch).  And whether your patch
> interferes with existing repl features.  And what are both intended and
> unintended uses of the new feature.

OK, sorry, then I misunderstood your request.

> To clear up some things:
[ ... ]
>
> * Your nmh installation is broken, as you observed.  I expect that the
>   cause was deleting par after nmh was installed.  The test suite did
>   the right thing by alerting you to the deficiency.  We could guide
>   you through (easy) repair of your installation, but I would instead
>   suggest a complete rebuild, if you built from sources, or re-install
>   if you obtained a pre-built package.

I think you are missing a big point: I don't have nmh. I have _m_mh.

All I did was clone the repo, build it and run ``make check''. The
whole time ``par'' wasn't installed. The build creates etc/mhn.defaults
with the wrong mhbuild-convert-text entry.

> And whether your approach could be improved by tying it to features
> that are already in repl (and that you weren't aware of when you wrote
> the patch).

As a _m_mh user I don't have the -convertargs switch. And I don't see
why I need it. It looks like the -convertargs switch and the convert
interface are only there to allow repl to decode mime.

But to decode mime messages there is a feature in nmh which was there
before convertargs: mhshow. All my patch does is call mhshow and pipe
the mail through it. Nothing more and nothing less. This way you
get mhshow to decode mime and all of its features.

convertargs is a different approach, therefore I don't see a way to
improve my patch by using convertargs.

> And whether your patch interferes with existing repl features.

The -convertargs switch still works, because it is used with the filter
mhl.replywithoutbody. With the -mime switch you won't even hit the
corresponding code path. I don't see any possible interference with other
repl features.

With other workarounds there may be some problems. I would assume most will
just work, because they should also work on non-mime mails.

> And what are both intended and unintended uses of the new feature.

Intended is to make repl -fixmime able to reply to a mime message.

Unintended is that repl may start a video player, if there is a video in
the message (or start some other graphical programs).

> I'm trying to understand how nmh could benefit from your patch.

Have an easy to use, enabled by default and sane way to reply to mime
messages.

By an easy to use and enabled by default solution I mean something where
the user needs at most one switch to use it in a sane way. No aliases
or editing of the profile, no requirement to manually call mhbuild (or
anything else) and no convenience shell file with aliases.

> > The question is: Do you want it?
>
> Not with its current handling of URLs.  How do you handle URLs in
> messages that you reply to?

This patch doesn't handle URLs at all. mhshow will print them as is to
stdout. After that mhl will add the '> ' and linebreaks. As said in the
other mail, this is the default behavior of repl. This has nothing to do
with my patch. There were already two ways mentioned in this thread how
this could be fixed.

Philipp

Re: automatic decode mime in repl

2022-02-11 Thread Laura Creighton

In a message of Fri, 11 Feb 2022 16:52:34 +0100, Philipp Takacs writes:
>[2022-02-11 02:55] David Levine 

>But to decode mime messages there is a feature in nmh which was there
>before convertargs: mhshow. All my patch does is call mhshow and pipe
>the mail through it. Nothing more and nothing less. By this way you
>get mhshow to decode mime and all of its features.

I would really like to have this.  The workaround I have made is really ugly,
and doesn't always work.

>This patch doesn't handle URLs at all. mhshow will print them as is to
>stdout. After that mhl will add the '> ' and linebreaks. As said in the
>other mail, this is the default behavior of repl. This has nothing to do
>with my patch. There were already two ways mentioned in this thread how
>this could be fixed.

and I don't care about urls.

>Philipp

Laura Creighton

Re: automatic decode mime in repl

2022-02-11 Thread David Levine

Philipp wrote:

> All I did was clone the repo, build it and run ``make check''. The
> whole time ``par'' wasn't installed. The build creates etc/mhn.defaults
> with the wrong mhbuild-convert-text entry.

Here is the mhn.defaults.sh code that puts par in mhn.defaults:

    if [ -n `$SEARCHPROG "$SEARCHPATH" par` ]; then
        textfmt=' | par 64'
        replfmt=" | sed 's/^\(.\)/> \1/; s/^$/>/;' | par 64"

SEARCHPROG is etc/mhn.find.sh.  SEARCHPATH is your PATH.  If you
want to help us track down this issue, would you please:
  1. cd to your cloned nmh directory
  2. execute etc/mhn.defaults.sh "$PATH" etc/mhn.find.sh and pipe
     the output through:  grep 'par '
using whatever commands are appropriate for your shell, e.g., for bash:
  cd (your nmh directory)
  etc/mhn.defaults.sh "$PATH" etc/mhn.find.sh | grep 'par '
And let us (or me) know the result.  Those commands will not modify
anything in your nmh directory hierarchy.
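
One thing that may be worth checking while running those commands, offered
as a sketch rather than a diagnosis: the test in the snippet above uses an
unquoted command substitution. If the search program prints nothing for a
missing par, the shell sees `[ -n ]`, which is true, so par would still be
written into mhn.defaults. The find_prog function below is a hypothetical
stand-in for etc/mhn.find.sh, assumed to print nothing when the program is
absent.

```shell
# Hypothetical stand-in for etc/mhn.find.sh: prints nothing, as if the
# program were not found anywhere on the search path.
find_prog() { :; }

# Unquoted: the empty result vanishes, leaving `[ -n ]`. With a single
# argument, test checks that the string "-n" is non-empty, so this is true.
if [ -n `find_prog par` ]; then unquoted=found; else unquoted=missing; fi

# Quoted: the empty result is kept as "", so `[ -n "" ]` is false.
if [ -n "`find_prog par`" ]; then quoted=found; else quoted=missing; fi

echo "unquoted=$unquoted quoted=$quoted"
```

If this is what is happening, quoting the substitution in mhn.defaults.sh
would be the likely remedy, but that remains an assumption until the grep
output is known.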

> As a _m_mh user I don't have the -convertargs switch.

You submitted a patch to nmh.  I'm not sure what mmh has to do with
any of this.  Are you asking for an nmh enhancement to support mmh?

> It looks like the -convertargs switch and the convert
> interface are only there to allow repl decoding mime.

More than just decoding mime content, convertargs provides an
interface to support conversion of any content when replying.

> But to decode mime messages there is a feature in nmh which was there
> before convertargs: mhshow. All my patch does is call mhshow and pipe
> the mail through it. Nothing more and nothing less. By this way you
> get mhshow to decode mime and all of its features.

And all of its drawbacks.

> > I'm trying to understand how nmh could benefit from your patch.
>
> Have an easy to use, enable by default and sane way to reply to mime
> messages.

It doesn't work for me.  There are other issues, likely to do with
other configuration in my setup.  But if I run into issues, I should
expect that other people will, too.

> This patch doesn't handle URLs at all. mhshow will print them as is to
> stdout. After that mhl will add the '> ' and linebreaks. As said in the
> other mail, this is the default behavior of repl. This has nothing to do
> with my patch.

With your patch, URLs break for me.  I can look at fixing that somewhere
else, but then this isn't true for me:

> By an easy to use and enabled by default solution I mean something where
> the user needs at most one switch to use it in a sane way. No aliases
> or editing of the profile, no requirement to manually call mhbuild (or
> anything else) and no convenience shell file with aliases.

. . .

> There where already two ways mentioned in this thread how
> this could be fixed.

I missed them, and just went and searched and didn't find them.  Would
you summarize for me, please?

As far as using convertargs instead of your patch, doing this seems to
get close to the same result for me:

  1. add to my profile:
  mhbuild-convert-text/html: charset="%{charset}"; mhl -nomoreproc -noclear -width 99 %F | mhshow -file - -concat | /bin/w3m -dump ${charset:+-I} ${charset:+"$charset"} -T text/html | sed 's/^\(.\)/> \1/; s/^$/>/;' | par 64
  2. rtm [msg]

I would rather do that than add new code to nmh that doesn't seem
like a complete solution.
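
For reference, the quoting stage at the end of that profile entry can be
tried on its own. This is a minimal sketch that leaves out mhl, mhshow, w3m
and par; quote_reply is a name made up for illustration.

```shell
# Quote text the way the sed stage of the profile entry does. par(1) is
# omitted here because it only re-wraps the quoted result.
quote_reply() {
    # Non-empty lines get "> " prepended; empty lines become a bare ">".
    sed 's/^\(.\)/> \1/; s/^$/>/;'
}

printf 'Hello\n\nWorld\n' | quote_reply
```

Keeping empty lines as a bare ">" avoids trailing whitespace in the quoted
text.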

David

Re: automatic decode mime in repl

2022-02-11 Thread David Levine

Laura wrote:

> I would really like to have this.  The workaround I have made is really
> ugly, and doesn't always work.

Does this produce a result that seems at least as good?

  \repl -filter mhl.replywithoutbody -convertargs text/html '' -convertargs text/plain '' -editor mhbuild [msg]

Where [msg] is the usual optional message argument to repl.

David

Re: automatic decode mime in repl

2022-02-11 Thread Ralph Corderoy

Hi David,

> Does this produce a result that seems at least as good?

It's better than what I normally do.  I need to read the fine manual,
understand what's going on, and then unwind my ancient methods and
adjust to trying this.

-- 
Cheers, Ralph.

Re: automatic decode mime in repl

2022-02-11 Thread Laura Creighton

In a message of Fri, 11 Feb 2022 18:37:28 +, David Levine writes:
>Laura wrote:
>
>> I would really like to have this.  The workaround I have made is really
>> ugly, and doesn't always work.
>
>Does this produce a result that seems at least as good?
>
>  \repl -filter mhl.replywithoutbody -convertargs text/html '' -convertargs text/plain '' -editor mhbuild [msg]
>
>Where [msg] is the usual optional message argument to repl.
>
>David

I will have to find a suitable piece of mail to check it on, but
I already have a working solution for replying to html mail.  It's
the people who send me mail as pdf and mail as encoded base64 that
are hard to reply to.  But it's not a common occurrence these days,
which is why I don't have something handy.  Will look tomorrow after
I have had some sleep.

Laura

Re: automatic decode mime in repl

2022-02-11 Thread David Levine

Ralph wrote:

> It's better than what I normally do.  I need to read the fine manual,
> understand what's going on, and then unwind my ancient methods and
> adjust to trying this.

The most important piece is mhbuild-convert-text/html in your profile
or mhn.defaults.

David

Re: automatic decode mime in repl

2022-02-11 Thread David Levine

Laura wrote:

> It's
> the people who send me mail as pdf and mail as encoded base64 that
> are hard to reply to.  But it's not a common occurrence these days,

I would hope so.  It should be easy to add support for pdf as long as
it originated from text rather than an image.  And it should be easy
to support base64 directly, or try piping it through mhfixmsg.

David

Re: mhfixmsg character set conversion

2022-02-11 Thread Steven Winikoff

>That ‘i18n’ smells given the nature of the other patch I found earlier.

I now understand what you're referring to, and unsurprisingly you're right.

I didn't realize that par isn't a Manjaro package at all, but in fact
something I installed directly from the Arch User Repository.  It's
clear that you already found the AUR page for par, but for the record
I'll quote this very interesting comment from the package maintainer:

   @ifreund notified me back in march via the out-of-date mechanism that a
   new version (1.53) was available upstream¹ (yup, after 19 years since
   the 1.52 release) \o/.

   Unfortunately, the i18n patch that we are applying to the 1.52, and
   which confers par the ability to deal with UTF-8, does not apply cleanly
   to 1.53. The par author has introduced some fixes to the locale handling
   for (single byte) charsets other than US ASCII, but no support for
   multibyte encodings² yet.

   Until the i18n patch gets updated to apply to 1.53 (any uptakers?), I
   would say that we are better off as we are.

I would have guessed that that patch would be a good thing, but apparently
the author of par agrees with you that it isn't, given that the patch was
offered and not accepted.

I'll build 1.53 from source myself before continuing with my testing.


>Assuming Manjaro is just picking this up from Arch Linux,

That's how things work for packages which are included in the Manjaro
repositories, which are separate from those of Arch; however, the AUR is a
community effort which maintains packages not included in the repositories
for Arch (and thus, not in Manjaro or other Arch-derived distributions).

AUR packages are typically downloaded and built from source, although a few
also offer binary downloads -- but par isn't one of those, and since the
patch is no longer available online, the AUR package for it won't even
build anymore.  Clearly the patch was available at the time I installed par
as part of the OS installation on this machine, back in December 2019.


>I think this is the shell script which builds the package.
>https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=par

It is.


>Can you try searching par again, this time with
>
>file /usr/bin/par
>env LC_ALL=C egrep -boa 'seems not configured' /usr/bin/par

Done, but this still produces no output.
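
For anyone repeating this search, here is a small sketch of what those
options do, run against a throwaway file rather than /usr/bin/par. It uses
grep -boa directly, assumes GNU grep, and the sample contents are invented
for the example. No output from the real command simply means the string
does not occur in the binary.

```shell
# Build a small stand-in "binary": a text string surrounded by NUL bytes.
sample=$(mktemp)
printf 'abc\0seems not configured\0xyz\n' > "$sample"

# -a treats the binary file as text, -o prints only the matched part,
# and -b prefixes the match with its byte offset in the file.
hit=$(LC_ALL=C grep -boa 'seems not configured' "$sample")
echo "$hit"

rm -f "$sample"
```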

 - Steven
-- 
___
Steven Winikoff  | "I don't want to run the world; I merely
Montreal, QC, Canada |  want to own a substantial portion of the
s...@smwonline.ca |  the preferred stock."
http://smwonline.ca  |
 |- Alan Dean Foster,  Cat-A-Lyst

Re: mhfixmsg character set conversion

2022-02-11 Thread Steven Winikoff

>Three tests failed.  You can run "make install" in the nmh directory
>to install.

Thanks.  I'll try that later tonight.


>If things are still broken with that nmh, I would remove your par[1]
>executable and rebuild mhn.defaults.  That might at least allow
>make check to pass.

This seems like a worthwhile thing to test at the same time even if
everything else works, so I'll try that also.


>And while we seem to have eliminated par and lynx, the symptoms are
>consistent with one or both of them being used by mhfixmsg.

...except that my strace of mhfixmsg shows no external programs being run.
What am I missing?

 - Steven
-- 
___
Steven Winikoff  |
Montreal, QC, Canada | "He who has imagination without learning
s...@smwonline.ca |  has wings but no feet."
http://smwonline.ca  |
 |   - fortune(6)

Re: mhfixmsg character set conversion

2022-02-11 Thread Steven Winikoff

>I assume vim(1) will read up to a certain amount until it either makes up
>its mind or assumes the default.

That makes sense.


>Try this to remove the boring ASCII bytes and see what's left.
>
>tr -d ' -~'
[ ... ]
>https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character
>describes ‘�’ and it's being seen above because cut(1) is cutting bytes
>and the ‘108:’ at the start of the line has shifted the 68/69 cut-off
>point to part-way through the UTF-8 for a single code point AKA rune.

For me, this falls into the category of "things that are perfectly obvious,
but only after they've been explained".  Thank you for explaining it.
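
The effect can be reproduced in isolation. A minimal sketch, assuming a
cut(1) whose -b option counts bytes (as GNU cut does): cutting after the
first byte of a two-byte UTF-8 character leaves only its lead byte, which
is not valid UTF-8 by itself and so gets displayed as the replacement
character.

```shell
# "é" is the two bytes 0xc3 0xa9 in UTF-8; octal escapes keep this
# script pure ASCII.
e_acute=$(printf '\303\251')

# cut -b1 keeps only the first byte (0xc3) plus the line's newline (0x0a).
# od prints the surviving bytes in hex; tr strips od's spacing.
half=$(printf '%s\n' "$e_acute" | LC_ALL=C cut -b1 | od -An -tx1 | tr -d ' \n')
echo "$half"
```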


>Try
>
>sh
>LC_ALL=C; export LC_ALL
>locale
>perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet

Done, and I just learned something interesting.  First, the output looks
like this:

   sh-5.1$ LC_ALL=C; export LC_ALL
   sh-5.1$ locale
   LANG=en_CA.UTF-8
   LC_CTYPE="C"
   LC_NUMERIC="C"
   LC_TIME="C"
   LC_COLLATE="C"
   LC_MONETARY="C"
   LC_MESSAGES="C"
   LC_PAPER="C"
   LC_NAME="C"
   LC_ADDRESS="C"
   LC_TELEPHONE="C"
   LC_MEASUREMENT="C"
   LC_IDENTIFICATION="C"
   LC_ALL=C
   sh-5.1$ perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet
   Veuillez ne pas rpondre au prsent courriel. Il a 
t gnr

Second, the problem with the original command appearing to hang turns out
to be an interaction between bash and xterm's pasting mechanism(!).

I'm accustomed to pasting a command line by triple-clicking to select the
whole line, then middle-clicking to paste it.  That's how xterm has worked
since I first started using it  years ago.

...and it still works exactly this way, and the line gets pasted just as I
expect, in tcsh.

...but in bash, although the line gets pasted, the newline at the end of it
somehow doesn't.  When 

   LC_ALL=C perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good_snippet

originally seemed to hang, in fact it was just waiting for me to press the
Enter key!  I still don't know why this is happening, but at least I'm
comforted by the fact that my bash binary isn't totally broken. :-/


>Beware that invoking bash(1) as ‘sh’ is not the same as running ‘bash’.

I did know that, but thank you for mentioning it just in case.


>Might not make a difference in this case, but in general it's better to
>run whichever is desired.

Right, but in this case sh was what was desired.  As I understand it,
when invoked that way bash behaves closer to a real Bourne shell than
when invoked as bash.


>> I propose to forget this particular clupea harengus of the crimson
>> variety unless you find it interesting in and of itself.
>
>It is odd.  And odd might affect other things, including to do with nmh.
>:-)

Odd indeed, but apparently only when used interactively with xterm, so nmh
is unlikely to be affected.

 - Steven
-- 
___
Steven Winikoff  |
Montreal, QC, Canada | "The reward of a thing well
s...@smwonline.ca |  done is to have done it."
http://smwonline.ca  |
 |   - Emerson


tr_output.pdf
Description: tr_output.pdf

Re: mhfixmsg character set conversion

2022-02-11 Thread Steven Winikoff

>>- run ~smw/bin/decode_headers using $source as stdin (this explicitly
>>  decodes headers which are RFC 2047-encoded, and passes the body
>>  through unchanged)
>
>This sounds like the kind of thing which might insert bytes which alter
>vim's idea of the ‘fileencoding’.  Given
>
>To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= 
>
>as taken from RFC 2047, is it going to put in a byte 0xf8 for ISO 8859-1
>encoding, or 0xc3 0xb8 for UTF-8?

I didn't know, so I just tried it.  Here's what happens:

   # decode_headers < rfc2407_test_header > converted_rfc2407_header
   # cat converted_rfc2407_header
   To: Keld Jørn Simonsen 

   # hexdump -C converted_rfc2407_header
   00000000  54 6f 3a 20 4b 65 6c 64  20 4a c3 b8 72 6e 20 53  |To: Keld J..rn S|
   00000010  69 6d 6f 6e 73 65 6e 20  3c 6b 65 6c 64 40 64 6b  |imonsen <keld@dk|
   [ ... ]
   00000028

...so it writes 0xc3 0xb8, which I believe is what it should be doing.
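
The two possibilities in the question can be compared with iconv(1). This
sketch does not use decode_headers, which is not shown in this thread; it
only converts the single ISO 8859-1 byte 0xf8 to UTF-8 and prints the
resulting bytes in hex.

```shell
# 0xf8 is "ø" in ISO 8859-1; \370 is the same byte written in octal.
# iconv re-encodes it as UTF-8, od shows the bytes, tr strips od's spacing.
utf8_hex=$(printf '\370' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')
echo "$utf8_hex"
```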

 - Steven
-- 
___
Steven Winikoff  | "The most exciting phrase to hear in
Montreal, QC, Canada |  science, the one that heralds new
s...@smwonline.ca |  discoveries, is not 'Eureka!' (I found
http://smwonline.ca  |  it!), but 'That's funny...'"
 | - Isaac Asimov

Re: mhfixmsg character set conversion

2022-02-11 Thread Robert Elz

Date:Fri, 11 Feb 2022 20:24:11 -0500
From:Steven Winikoff 
Message-ID:  <788346-1644629051.102489@xc6K.Kf2p.OQ3k>

  | in bash, although the line gets pasted, the newline at the end of it
  | somehow doesn't.  When 

This is a recent bash (IMO mis-)feature - I believe there's an
option (run time) to return to sanity, as I think I set it ...
but just now I cannot find it!

kre

Re: mhfixmsg character set conversion

2022-02-11 Thread Steven Winikoff

>  | in bash, although the line gets pasted, the newline at the end of it
>  | somehow doesn't.  When 
>
>This is a recent bash (IMO mis-)feature -

I absolutely agree with your opinion on this topic.


>I believe there's an option (run time) to return to sanity, as I think I
>set it ...  but just now I cannot find it!

A quick web search for "bash xterm paste newline" turns up

   
   https://unix.stackexchange.com/questions/633370/bash-needs-another-newline-to-execute-pasted-lines

The summary is to add the following entry to ~/.inputrc

   set enable-bracketed-paste off

I just tried this, and it works.  Thank you for pointing that out!
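
A small sketch of making that change from the shell.
disable_bracketed_paste is a hypothetical helper, not part of bash or
readline. It appends the line only if the file has no
enable-bracketed-paste setting yet, so an existing "on" line would still
need editing by hand.

```shell
# Append the readline setting to an inputrc file unless the file already
# contains a line that sets enable-bracketed-paste.
disable_bracketed_paste() {
    inputrc=$1
    grep -qs '^set enable-bracketed-paste' "$inputrc" ||
        printf 'set enable-bracketed-paste off\n' >> "$inputrc"
}

# Usage (left commented out so that running this sketch changes nothing):
# disable_bracketed_paste "${INPUTRC:-$HOME/.inputrc}"
```

The setting is read when readline starts, so it takes effect in newly
started shells.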

 - Steven
-- 
___
Steven Winikoff  | "Science is built upon facts, as a house is
Montreal, QC, Canada |  built of stones; but an accumulation of
s...@smwonline.ca |  facts is no more a science than a heap of
http://smwonline.ca  |  stones is a house."
 |   - Henri Poincaré

Re: mhfixmsg character set conversion

2022-02-11 Thread Robert Elz

Date:Sat, 12 Feb 2022 10:07:47 +0700
From:Robert Elz 
Message-ID:  <20709.1644635...@jinx.noi.kre.to>

  | This is a recent bash (IMO mis-)feature - I believe there's an
  | option (run time) to return to sanity, as I think I set it ...
  | but just now I cannot find it!

Found it ... a readline option rather than bash.

In .inputrc add:

set enable-bracketed-paste off

kre

Experimental IMAP branch

2022-02-11 Thread Eric Gillespie

Hi all,

Here's what I've been up to.  Quoting from docs/imap/DESIGN:

The world is now mobile-first, and many of us are mobile-only, or nearly so.
When my smart phone took over my life, two things happened to my nmh usage:

- At work, I was frustrated constantly by messages read under nmh being unread
  on my phone and vice versa.
- In my personal life, that same frustration drove me away from nmh completely.
  Whenever I need to reply to email from my computer, I find myself
  incorporating months of messages.

We can use IMAP to make nmh a first-class citizen in the modern mail world.

* Goals

- Treat the IMAP server as the source of truth.
- Minimize startup penalty to nmh commands.
- Minimize the impact to the nmh code-base.
- Minimize round-trips to the server.
- Enhance =inc= to incorporate mail from a hierarchy of IMAP mailboxes.
- Enhance =inc= to delete mail locally that has been deleted on the server.
- Enhance all commands interpreting the user's =Unseen-Sequence= (i.e. all those
  taking message specifications (=mh-sequence(5)=), e.g. =mark=, =pick=, =scan=,
  =show=) to interpret references to the =Unseen-Sequence= as all messages not
  flagged as =\Seen= on the server.
- Enhance =scan= output to reflect live =\Seen= status when it is asked to
  display whether messages appear in the user's =Unseen-Sequence=.
- Enhance =show= to set the =\Seen= flag on messages shown.
- Enhance =mark= to set or clear the =\Seen= flag for messages removed from or
  added to the user's =Unseen-Sequence=, respectively.
- Enhance =rmm= to delete messages from the server.
- Enhance =refile= to move messages on the server.
- Enhance =comp= and =repl= to allow usage of a server-side draft folder.

** Language

I've programmed in C and C++ for most of my career and I've just about had it
with unsafe languages.  The thought of implementing IMAP synchronization in C
discouraged me from even starting for years.

In my last position at Google, my fellow tech lead and I would often joke
about dividing the team in two, with one team continuing to chase down a
memory corruption error we'd failed for months to find, and the other
rewriting the system in Rust.  It was a joke, but the idea stuck with me.

Since Rust allows transparent calls into C, we can seamlessly embed Rust into
nmh programs.  Why not give it a shot?

I know it wasn't all that long ago I proposed using Perl from the
test suite and that was rejected for the perfectly valid reason
that we want the test suite to keep working without Perl.

So how can I show up and propose Rust, which is not even
supported on many of our supported platforms?

Well, unlike Perl in the test suite, this is fully optional and
fully self-contained.  Users without access to Rust can use
mh just the same way they could for the last 4 decades.

I am already using this as my daily driver, but a lot of work
remains before I'd be comfortable proposing merging this to
master, and taking the "experimental" tag off it is even
farther out.

I have plenty of things to do here in the short term (testing and
handling so many edge and error cases, cleaning up the big mess
I've made in sbr are top of the list), but if others have any
ideas or suggestions, I'd be happy to hear them.  I'll be working
on the branch in public now; no more big code drops.

Also, if anyone wants to try it out but has trouble getting it
working, let me know.  It's quite rough around the edges now.
Aside from daily usage on FreeBSD, I periodically test it on
Oracle Linux.

Here's what I have in .mh_profile:

Managed-Folders: g gmail
#: e...@pretzelnet.org at Gandi
g-manager: /home/epg/work/nmh/.o/clang13/target/debug/imap-folder-manager
g-db: /home/epg/Mail/.epggandi.db
g-socket: /home/epg/Mail/.epggandi.db/socket
g-log-trace-to: /home/epg/Mail/.epggandi.db/log
g-log-level: debug
g-max-fetch-messages-in-flight: 100
g-host: mail.gandi.net
g-port: imaps
g-tls: implicit
g-auth: SASL
g-user: e...@pretzelnet.org
#: GMail IMAP
gmail-manager: /home/epg/work/nmh/.o/clang13/target/debug/imap-folder-manager
gmail-db: /home/epg/Mail/.gmail.db
gmail-socket: /home/epg/Mail/.gmail.db/socket
gmail-log-trace-to: /home/epg/Mail/.gmail.db/log
gmail-log-level: debug
gmail-max-fetch-messages-in-flight: 100
gmail-host: imap.gmail.com
gmail-port: imaps
gmail-tls: implicit
gmail-auth: SASL
gmail-saslmech: xoauth2
gmail-user: eric.gilles...@gmail.com
gmail-authservice: gmail
gmail-mailbox-root: [Gmail]/
gmail-mailbox-exclude: .
gmail-mailbox-include: All Mail

Thanks, and happy hacking!

Re: mhfixmsg character set conversion

2022-02-11 Thread Steven Winikoff

>I would do this if you haven't already:
>1. download nmh HEAD, build, and install somewhere
>2. move your $(mhpath +)/mhn.defaults
>3. move your profile and create one with just a Path: entry
>4. run the "mhfixmsg -file original_copy -out -" from 1. and see if the
>   output looks good or bad

I just tried this, and a couple of other things, but only after installing
par 1.53.0 from source and using that to replace the AUR binary.  Here's
what I learned:

   1) Replacing par does indeed fix one of the three failed tests.  I can
      send you the details, but I seem to recall that you already have them
      from Valdis Klētnieks; please let me know if I should forward them
      anyway.

   2) After running make install, the newly built mhfixmsg produces correct
      output.  But so does nmh-1.7.1 mhfixmsg when compiled without my patch.

   3) Step (3) above was the key, and it turned out that I was being misled
      by this .mh_profile entry:

         mhshow-show-text/html:  html_to_text %F | cat -

      ...where html_to_text is a shell script that basically just runs this
      command:

         elinks -force-html -dump -dump-charset utf-8 ${html}

      Removing this profile entry causes the message to be displayed
      correctly -- both the original, unmodified version, and the one that
      was saved after being converted by my patched version of nmh-1.7.1
      mhfixmsg.  That's pretty conclusive evidence that I'd been looking
      in the wrong place all along. :-(

      The man page for elinks describes -dump-charset as follows:

         -dump-charset (alias for document.dump.codepage)
         Codepage used when formatting dump output.

      Interestingly, when I restored the mhshow-show-text/html .mh_profile
      entry and modified my shell script to run elinks without this option,
      I still saw the same doubly encoded output.

      So next I tried passing the character set to my script as follows:

         mhshow-show-text/html:  html_to_text %{charset} %F

      ...and changed the script to use the provided character set rather
      than forcing utf-8:

         elinks -force-html -dump -dump-charset $1 ${html}

      This failed differently.  Instead of rendering the message with '�'
      marking undisplayable characters, it used '*' instead.  Somehow, I
      don't consider that to be much of an improvement. :-/
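
As an aside on the doubly encoded output: one common way it arises is UTF-8
text being treated as ISO 8859-1 and converted to UTF-8 a second time.
Whether elinks or something else in this pipeline does that is not
established here; the sketch below only shows the byte-level effect on a
single "é", using iconv(1) as a stand-in for whatever performs the second
conversion.

```shell
# "é" encoded as UTF-8 is the two bytes c3 a9 (octal \303 \251).
# Misreading those bytes as ISO 8859-1 and converting them to UTF-8
# encodes each byte again, giving four bytes instead of two.
double_hex=$(printf '\303\251' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')
echo "$double_hex"
```

Comparing a hex dump of the displayed text against the original message is
one way to tell this pattern apart from other kinds of charset damage.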

...so clearly I need to replace elinks in my html_to_text script, and doing
that will solve the problem that prompted this discussion, leaving the
following questions:

   1) What's the best replacement for elinks?

   2) Should I replace my 1.7.1 installation by the version I just built?
      Basically I'm asking what benefits the current snapshot has over
      1.7.1, and how far away the next numbered release might be.

   3) How can I guarantee that messages will be saved with quoted-printable
      or base64 parts decoded, without patching mhfixmsg to deal with
      messages in which the decoded text would be more than 998 characters
      long?

      I used the current mhfixmsg with the test message I've been using
      throughout this discussion, with this command line:

         /tmp/nmh/root/bin/mhfixmsg \
             -decodeheaderfieldbodies utf-8 -decodetext binary \
             -decodetypes text -textcharset UTF-8 -reformat \
             -fixcte -fixboundary -noreplacetextplain  \
             -fixtype application/octet-stream \
             -verbose -file $source -outfile $destination

      ...and that resulted in these headers after decoding:

         - for the text/plain part:

              Content-Transfer-Encoding: 8bit
              Content-Type: text/plain; charset="UTF-8"

         - for the text/html part:

              Content-Transfer-Encoding: binary
              Content-Type: text/html; charset=iso-8859-1

      That raises some further questions:

         - Why wasn't the text/html part converted to utf-8?

         - Regardless of the answer to the previous question, after a
           message has been refiled (and assuming I'm not planning to
           resend it to anyone), is there a practical difference between
           binary and 8bit encoding?

         - Why are the headers of the decoded message identical to those
           of the input, despite the use of -decodeheaderfieldbodies?

           (...and yes, the unmodified version of the message does contain
           some encoded headers that my decode_headers program found and
           decoded; mhfixmsg appears not to have done so).

   Thanks,

 - Steven
-- 
___
Steven Winikoff  | "'Somebody, SOMEBODY
Montreal, QC, Canada | Has to, you see.'
s...@smwonline.ca | Then she picked out two Somebodies.
http://smwonline.ca  | Sally and me."
 |- Dr. Seuss
