Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-24 Thread David Levine
Steven wrote:

> >> Is there a way to get mhfixmsg to decode the base64 and then run it through
> >> tidy with a given set of command-line options?
> >
> >Yes, via mhfixmsg-format-text/html.  See the mhfixmsg and mhshow man pages.
>
> I did read those man pages, but perhaps I'm still failing to understand
> parts of them.  I do know how mhfixmsg-format-text/html specifies the
> command which generates the text/plain part from the text/html part, but
> I don't see how to do that and also reformat the text/html part.

Maybe "reformat" is a misleading name.  It doesn't change a
text/html part, for example, in place.  Instead, it creates a
text/plain version and inserts that in a new (with the default
-noreplacetextplain) text/plain MIME part.  With -replacetextplain,
the content in a corresponding, existing text/plain part is
replaced.

mhfixmsg takes care of the decoding from base64, then feeds the
decoded content to the command specified by
mhfixmsg-format-text/html.  That command can be whatever shell
command, including a pipeline, you'd like.  So you should be able to
do whatever formatting you wish.  The result is placed in the
text/plain part.

David

-- 
Nmh-workers
https://lists.nongnu.org/mailman/listinfo/nmh-workers


Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-24 Thread Steven Winikoff
>Does the full path to mhn.defaults shown by "man mhfixmsg" match
>/local/pkg/nmh/root-nmh-1.7/etc/mhn.defaults ?

Yes.


>If it does, maybe run mhfixmsg under ltrace or something similar to see
>exactly what file it's trying to open.

I used the strace command Ralph suggested (strace -fe open,openat), and
that solved it.

The problem was that I had a personal mhn.defaults file, and mhfixmsg was
reading that (which I expected) but then not reading the system version
(which I didn't expect -- I would have expected the system one to be
read first unconditionally, to be supplemented and/or overridden by the
personal file).
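
The lookup behaviour Steven observed (the personal file shadows the system file instead of supplementing it) can be modelled as a "first existing file wins" search. The sketch below is only an illustration of that observation, not nmh's actual code; the function name `pick_defaults` and the file names are hypothetical.

```python
import os
import tempfile


def pick_defaults(candidates):
    """Return the first candidate path that exists; later ones are never read."""
    for path in candidates:
        if os.path.exists(path):
            return path
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-ins for a personal and a system mhn.defaults.
    personal = os.path.join(tmp, "personal-mhn.defaults")
    system = os.path.join(tmp, "system-mhn.defaults")

    open(system, "w").close()
    print(pick_defaults([personal, system]) == system)    # only the system file exists

    open(personal, "w").close()
    print(pick_defaults([personal, system]) == personal)  # personal file now shadows it
```

Under this model, an entry that exists only in the system file becomes invisible as soon as a personal file is present, which matches the "no mhfixmsg-format-text/html profile entry" error reported earlier in the thread.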

Ironically, the personal mhn.defaults in question isn't needed and
shouldn't have been there anyway; it's an artifact of the transition
I'm going through right now, from an older, about-to-be-decommissioned
server with nmh-1.6 to my desktop machine running 1.7.

With the personal mhn.defaults file deleted mhfixmsg works as expected
using the system version.


>> >> I thought Ken said the RFC 5322 limit was 998.  But...
>> >
>> >Right.  He also noted that he's had problems with insertion of '!' in long
>> >lines of HTML.
>>
>> What about the idea of reformatting the text/html part to reduce the line
>> width?
>
>Then -maxunencoded wouldn't be necessary.  Though I'm not sure if you're
>talking about outgoing or incoming messages here.

I'm talking about incoming messages.


>> Is there a way to get mhfixmsg to decode the base64 and then run it through
>> tidy with a given set of command-line options?
>
>Yes, via mhfixmsg-format-text/html.  See the mhfixmsg and mhshow man pages.

I did read those man pages, but perhaps I'm still failing to understand
parts of them.  I do know how mhfixmsg-format-text/html specifies the
command which generates the text/plain part from the text/html part, but
I don't see how to do that and also reformat the text/html part.

 - Steven
-- 
___
Steven Winikoff                | "If you have built castles in the air,
Concordia University           |  your work need not be lost; that is
Montreal, QC, Canada           |  where they should be.  Now put
steven.winik...@concordia.ca   |  foundations under them."
                               |   - Henry David Thoreau



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-24 Thread Ralph Corderoy
Hi David,

> run mhfixmsg under ltrace or something similar to see exactly what
> file it's trying to open.

strace -fe open,openat mhfixmsg

> Yes, via mhfixmsg-format-text/html.

It seems mhfixmsg is quite close to being able to have me run it by
another argv[0] to provide a separate set of ...-format-image/... and
have those be simple `cats' of prepared, crushed, 1×1 pixel, versions of
those image types.  This would allow `cid:', etc., references to still
be valid, but toss away all those vacuous corporate logos and calls to
click.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy


Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-24 Thread Ralph Corderoy
Hi,

David wrote:
> [Ken] also noted that he's had problems with insertion of '!' in long
> lines of HTML.

Sendmail has been splitting a long line that violates the RFCs at that
point by inserting a `!'.  It happens without caring what the line's
content type or encoding is, or if it's MIME.

One simply can't send off an RFC-violating long-line email and expect it
to arrive unharmed.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-24 Thread David Levine
Steven wrote:

> but it sounds like you're suggesting that
> the entry in /local/pkg/nmh/root-nmh-1.7/etc/mhn.defaults should be picked
> up directly from there, and that isn't happening.

Right.  Does the full path to mhn.defaults shown by "man mhfixmsg" match
/local/pkg/nmh/root-nmh-1.7/etc/mhn.defaults ?  If it does, maybe run
mhfixmsg under ltrace or something similar to see exactly what file it's
trying to open.

> >> I thought Ken said the RFC 5322 limit was 998.  But...
> >
> >Right.  He also noted that he's had problems with insertion of '!' in long
> >lines of HTML.
>
> What about the idea of reformatting the text/html part to reduce the line
> width?

Then -maxunencoded wouldn't be necessary.  Though I'm not sure if you're
talking about outgoing or incoming messages here.

> I've been playing with tidy (AKA html-tidy), and it's capable of
> transforming the HTML message I received last week from a single line of
> 42187 characters into a version with 1896 lines with a maximum line width
> of 138.
>
> Is there a way to get mhfixmsg to decode the base64 and then run it through
> tidy with a given set of command-line options?

Yes, via mhfixmsg-format-text/html.  See the mhfixmsg and mhshow man pages.

David



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-23 Thread Steven Winikoff
>>mhfixmsg: Don't know how to convert /home/smw/Mail/reformatted/17352,
>>  there is no mhfixmsg-format-text/html profile entry
>>
>> ...which makes sense because I don't know what to put in that profile entry.
>
>Is there a mhfixmsg-format-text/html line in your mhn.defaults?

Yes:

   # grep mhfixmsg-format-text/html /local/pkg/nmh/root-nmh-1.7/etc/mhn.defaults
   mhfixmsg-format-text/html: charset=%{charset}; /usr/bin/lynx -child -dump -force_html ${charset:+--assume_charset} ${charset:+"$charset"} %F | expand | sed -e 's/^   //' -e 's/  *$//'

Of course I can just copy this entry into my .mh_profile, and I'll try that
tomorrow when I have some time -- but it sounds like you're suggesting that
the entry in /local/pkg/nmh/root-nmh-1.7/etc/mhn.defaults should be picked
up directly from there, and that isn't happening.


>> I thought Ken said the RFC 5322 limit was 998.  But...
>
>Right.  He also noted that he's had problems with insertion of '!' in long
>lines of HTML.

What about the idea of reformatting the text/html part to reduce the line
width?  I've been playing with tidy (AKA html-tidy), and it's capable of
transforming the HTML message I received last week from a single line of
42187 characters into a version with 1896 lines with a maximum line width
of 138.

Is there a way to get mhfixmsg to decode the base64 and then run it through
tidy with a given set of command-line options?

 - Steven
-- 
___
Steven Winikoff                |
Concordia University           | "The end of the world will occur at
Montreal, QC, Canada           |  3:00 p.m., this Friday, with symposium
steven.winik...@concordia.ca   |  to follow."
                               |   - fortune(6)



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-23 Thread David Levine
Steven wrote:

> The resulting file has these headers:
>
>Content-Transfer-Encoding: binary
>Content-Disposition: inline
>Content-Type: text/html;
>charset="UTF-8"
>MIME-Version: 1.0 
>
> So if I correctly understand what I'm learning from you and Ken, this
> should be compliant.

With RFC 2045, yes, but maybe not with the line length limitations of RFC 5322 
§2.1.1.

> I tried it just now, and the (expected) result was
>
>mhfixmsg: Don't know how to convert /home/smw/Mail/reformatted/17352,
>  there is no mhfixmsg-format-text/html profile entry
>
> ...which makes sense because I don't know what to put in that profile entry.

Is there a mhfixmsg-format-text/html line in your mhn.defaults?  There should 
be, if you installed nmh with "make install" and one of w3m, lynx, or elinks 
was already present and on your PATH.

> >Uh, that's a different issue.  -maxunencoded 900 can cause creation of
> >messages with lines that long, and they wouldn't comply with RFC 5322.
>
> I thought Ken said the RFC 5322 limit was 998.  But...

Right.  He also noted that he's had problems with insertion of '!' in long 
lines of HTML.

David


Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-23 Thread Ralph Corderoy
Hi Steven,

> ...so when a message clearly contains
>
>Content-Transfer-Encoding: base64
>
> shouldn't that mean we don't need to test the decoded content to see
> if it's binary or not?  You just said in your previous message that
> there's no line length restriction in the content after decoding.

mhfixmsg is trying to take a valid email in and write one out.  The
base64 input is decoded and then tested to see if it can be represented
in another form, e.g. binary, in the output email.  Lines that are too
long are one reason it can't, so the conversion doesn't happen.
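
The "decode, then test" step described above can be sketched roughly as follows. This is a minimal Python illustration of the RFC 2045 distinction between 7bit, 8bit and binary data, not mhfixmsg's actual C code; the names `classify_decoded` and `MAX_LINE` are made up for the example, and handling of bare CR characters is left out for brevity.

```python
import base64

MAX_LINE = 998  # RFC 2045 line length limit, not counting the trailing CRLF


def classify_decoded(encoded_body: bytes) -> str:
    """Decode a base64 body and report which RFC 2045 class the result is in."""
    decoded = base64.b64decode(encoded_body)
    lines = decoded.replace(b"\r\n", b"\n").split(b"\n")
    longest = max(len(line) for line in lines)
    if longest > MAX_LINE or b"\0" in decoded:
        return "binary"  # cannot be stored unencoded as 7bit or 8bit
    if any(byte > 127 for byte in decoded):
        return "8bit"
    return "7bit"


short_html = base64.b64encode(b"<p>hello</p>\r\n")
long_html = base64.b64encode(b"<p>" + b"x" * 42000 + b"</p>\r\n")

print(classify_decoded(short_html))  # 7bit
print(classify_decoded(long_html))   # binary
```

In this model, a part that classifies as "binary" is the case where mhfixmsg, with its default -decodetext 8bit, leaves the base64 encoding in place.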

Ken's `no line length' is, I assume, referring to the output file of
mhstore(1) or similar where you want the original file from the sender
back, outside of any email message.

> you clearly explained that I should be able to do that, because the
> encoded form follows the RFC specification and the decoded form
> doesn't have to.

No, both have to for nmh to process them.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-22 Thread Andy Bradford
Thus said David Levine on Mon, 22 Jan 2018 20:43:50 -0500:

> > The  comment  in  mhfixmsg  which  I  quoted  at  the  beginning  of
> > this thread  seems to  be saying  that sometimes  message components
> > described  as  text/*   are  really  binary  files,   and  that  the
> > 998-character limit  is used  in mhfixmsg (only)  as a  heuristic to
> > identify this situation.
>
> I  wouldn't call  it a  heuristic. It's  definitive, according  to RFC
> 2045.

One of the reasons why RFC 2045  has this definition can be found in RFC
5321 (previously 2821, previously 821) where a Text Line is defined:

4.5.3.1.6.  Text Line

   The maximum total length of a text line including the <CRLF> is 1000
   octets (not counting the leading dot duplicated for transparency).
   This number may be increased by the use of SMTP Service Extensions.

So, regardless  of the on-disk  format or how  a message might  meet RFC
5322, if  it wants to be  sent via SMTP, it  will have to be  encoded in
some fashion (enter MIME, uuencode, etc...).

Andy
-- 
TAI64 timestamp: 40005a66d3a7





Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-22 Thread Steven Winikoff
>Well, "binary" has a specific meaning in the MIME world.  Specifically,
>it refers to a MIME Content Transfer Encoding of binary, which has no
>restrictions in terms of line length.  So when that message says that
>it can't decode it because the part would have to be binary, THAT is what
>it is referring to.

This helps, but I'm still a bit confused.  (That's an exaggeration; I'm
really still very much confused. :-()

I just looked up Content-Transfer-Encoding header, and found what you
already know (but which I'll repeat here, for the record and for my
own future reference):

   The Content-Transfer-Encoding field is designed to specify an invertible
   mapping between the "native" representation of a type of data and
   a representation that can be readily exchanged using 7 bit mail
   transport protocols, such as those defined by RFC 821 (SMTP). This field
   has not been defined by any previous standard. The field's value is
   a single token specifying the type of encoding, as enumerated below.
   Formally:

   Content-Transfer-Encoding := "BASE64" / "QUOTED-PRINTABLE" /
                                "8BIT"   / "7BIT" /
                                "BINARY" / x-token
...so when a message clearly contains

   Content-Transfer-Encoding: base64

shouldn't that mean we don't need to test the decoded content to see if
it's binary or not?  You just said in your previous message that there's
no line length restriction in the content after decoding.


>But David points out that if you tell it to, mhfixmsg will happily
>generate such messages (but the documentation does caution you that the
>resulting messages may not be readable with nmh).

That's good to know, but I really have no plans to create out-of-spec
messages; I just want to be able to read the messages I'm receiving, and
you clearly explained that I should be able to do that, because the encoded
form follows the RFC specification and the decoded form doesn't have to.

Or at least that's what I thought you said.


>Our only general-purpose nmh list is nmh-workers; plenty of people on it
>are not coders, so please don't be concerned on that score.

Thanks.  I've just subscribed.

 - Steven
-- 
___
Steven Winikoff                | "Nature is by and large to be found out
Concordia University           |  out of doors, a location where, it
Montreal, QC, Canada           |  cannot be argued, there are never
steven.winik...@concordia.ca   |  enough comfortable chairs."
                               |   - Fran Leibowitz



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-22 Thread David Levine
Ken wrote:

> >> That's a question for David, since mhfixmsg is his baby.  I would suggest
> >> that was probably an oversight and he might fix it at some point (and
> >> by "fix it" I mean, "refuse to decode it", but that's up to him).
> >
> >I thought that mhfixmsg DID refuse to decode it:
> >
> >   mhfixmsg: /tmp/msg, will not decode text/html;  charset="UTF-8" because it is binary (line length > 998)
>
> Well, Steven also said that it worked fine for a super-long text/html
> part that was quoted-printable, that's what I was specifically referring
> to.  But I see that the check in mhfixmsg should occur for either base64
> or quoted-printable, so maybe there was a CR/LF in there that wasn't
> obvious.

I still see nothing broken here.  Counterexamples welcome.

David



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-22 Thread David Levine
Steven wrote:

> This suggests to me that removing the 998-character limit in mhfixmsg
> (only, and nowhere else) is a reasonable thing to do.

I think that -decodetext binary would be a better approach, but note
that warning about producing non-compliant messages.  But maybe none of
this is necessary, see below re. -reformat.

> The comment in mhfixmsg which I quoted at the beginning of this thread
> seems to be saying that sometimes message components described as text/*
> are really binary files, and that the 998-character limit is used in
> mhfixmsg (only) as a heuristic to identify this situation.

I wouldn't call it a heuristic.  It's definitive, according to RFC 2045.

> (Digression:  I'd also prefer to reformat the long lines at the same time.
> I'm seriously considering piping the decoded HTML through something like
> tidy [ http://www.html-tidy.org/ ] before saving it. :-/)

mhfixmsg -reformat does that.  That's the default, but you overrode it
with -noreformat.

> As it happens, I have
>
>mhbuild:  -maxunencoded 900
>
> in my .mh_profile, and have had for a while.
>
> This is a coincidence, in that I was unaware of the 998-character limit,
> until today, but happily I'm under it anyway. :-)

Uh, that's a different issue.  -maxunencoded 900 can cause creation of
messages with lines that long, and they wouldn't comply with RFC 5322.
Going a little over 78 might not be much of a problem in practice, but . . .

> ...so if I were to quote text with wider lines than that the right thing
> would happen

No guarantee of that.  I wouldn't consider 900 to be "a little over" 78.

> The only reason I've been writing to nmh-workers is that I'm unaware
> of anywhere else to turn.  Is there a corresponding nmh-users list or
> something similar?

No, nmh-workers is the place for all things nmh.

> For example, I particularly depend on being able to find specific saved
> messages using grep or mairix[**] -- and if the message body is saved in
> base64 encoding, both of those programs fail completely.

That was the main motivation for creating mhfixmsg.
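
The search problem that motivated mhfixmsg is easy to reproduce outside nmh. The following Python sketch uses a made-up message body and a regular expression in place of grep; it only illustrates why a plain-text search misses base64-encoded content.

```python
import base64
import re

# A made-up message body, stored the way a base64-encoded part would be.
body = b"Your invoice for January is attached.\n"
stored_encoded = base64.encodebytes(body)

pattern = re.compile(rb"invoice")

# Searching the stored (encoded) form finds nothing ...
print(bool(pattern.search(stored_encoded)))                      # False
# ... while the same search on the decoded text succeeds.
print(bool(pattern.search(base64.decodebytes(stored_encoded))))  # True
```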

David



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-22 Thread Ken Hornstein
>The comment in mhfixmsg which I quoted at the beginning of this thread
>seems to be saying that sometimes message components described as text/*
>are really binary files, and that the 998-character limit is used in
>mhfixmsg (only) as a heuristic to identify this situation.

Well, "binary" has a specific meaning in the MIME world.  Specifically,
it refers to a MIME Content Transfer Encoding of binary, which has no
restrictions in terms of line length.  So when that message says that
it can't decode it because the part would have to be binary, THAT is what
it is referring to.

But David points out that if you tell it to, mhfixmsg will happily
generate such messages (but the documentation does caution you that the
resulting messages may not be readable with nmh).

One of my medium-term plans is to redo the mail parser with more modern
tools so nmh doesn't have such limits.  Don't ask me when that will
happen, though.

>The only reason I've been writing to nmh-workers is that I'm unaware
>of anywhere else to turn.  Is there a corresponding nmh-users list or
>something similar?

Our only general-purpose nmh list is nmh-workers; plenty of people on it
are not coders, so please don't be concerned on that score.

--Ken



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-22 Thread Ken Hornstein
>> That's a question for David, since mhfixmsg is his baby.  I would suggest
>> that was probably an oversight and he might fix it at some point (and
>> by "fix it" I mean, "refuse to decode it", but that's up to him).
>
>I thought that mhfixmsg DID refuse to decode it:
>
>   mhfixmsg: /tmp/msg, will not decode text/html;  charset="UTF-8" because it is binary (line length > 998)

Well, Steven also said that it worked fine for a super-long text/html
part that was quoted-printable, that's what I was specifically referring
to.  But I see that the check in mhfixmsg should occur for either base64
or quoted-printable, so maybe there was a CR/LF in there that wasn't
obvious.

--Ken



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-22 Thread David Levine
Ken wrote:

> [Steven wrote:]
> >This part is 670 lines before decoding, and exactly one line afterward.
> >This arrived before I started using mhfixmsg, but given what I've just
> >learned I'd certainly expect mhfixmsg to refuse to decode it.
>
> That's a question for David, since mhfixmsg is his baby.  I would suggest
> that was probably an oversight and he might fix it at some point (and
> by "fix it" I mean, "refuse to decode it", but that's up to him).

I thought that mhfixmsg DID refuse to decode it:

   mhfixmsg: /tmp/msg, will not decode text/html;  charset="UTF-8" because it is binary (line length > 998)

I'll assume that Steven wasn't asking a question above, but just noting
that his expectations have been corrected.  If that's not the case, please
let me know what's broken.

> > ...so it was clearly marked as text.  If a sender packages a binary
> > file but describes it as text/html, it's already broken, and I really
> > don't care if mhfixmsg "damages" it even further. :-/

It's not necessarily broken.  Binary text/html can certainly be encoded as
base64, for example.  See Ken's description that includes the "War and Peace"
example.

Now in regards to mhfixmsg decoding text that RFC 2045 defines as binary,
please see the description of the -decodetext switch in the man page
(8bit is the default):

Similarly, with 8-bit, if the decoded text would be binary, then the
part is not decoded (and a message will be displayed if -verbose is
enabled).  Note that -decodetext binary can produce messages that
are not RFC 2045 compliant.

So, it sounds like -decodetext binary will get mhfixmsg to do what you
want(ed).  But, you might be left with messages that nmh programs can't
ingest.

David



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-22 Thread Steven Winikoff
>Are you saying you received via SMTP a RFC5322 message where there
>were 42027 characters between CR-LF pairs?

I think I might have said that :-/, but whether I did or not you're right
that it isn't what I meant.


>That suggests to me that you in fact received a message that had lines no
>greater than 78 characters between CR-LF pairs, and _after you decoded it_
>it might have had a very long line.

Exactly.

But that's also the situation with the message I received today which
sparked my original question.  That one had only one part, described
with:

   Content-Transfer-Encoding: base64
   Content-Disposition: inline
   Content-Type: text/html;
       charset="UTF-8"
   MIME-Version: 1.0

Before decoding, the body width was 76 characters (some of the headers were
wider, though even those were all under 200 characters wide) -- but when I tried to
decode it, this happened:

mhfixmsg: /tmp/msg, will not decode text/html;  charset="UTF-8" because it is binary (line length > 998)

...but (line length > 998) refers to the decoded text, which really is more
than 998 characters wide.  This is what I was originally asking about (or
trying to :-/, and I apologize for not being clear on that point).


>THAT is completely legal according to the RFCs.  For the most part, it
>doesn't matter what it decodes to; what nmh cares about is that the
>message it is reading is valid according to RFC 5322.  THAT is where the
>998 byte line length limit comes into play.  You could send the entirety
>of "War and Peace" in text/plain part all as one line, and as long as it
>was encoded properly that would be fine.

This suggests to me that removing the 998-character limit in mhfixmsg
(only, and nowhere else) is a reasonable thing to do.

The comment in mhfixmsg which I quoted at the beginning of this thread
seems to be saying that sometimes message components described as text/*
are really binary files, and that the 998-character limit is used in
mhfixmsg (only) as a heuristic to identify this situation.


>>But you're quite right that this code isn't easy to understand.  If I were
>>to modify uip/mhfixmsg.c without touching sbr/m_getfld.c, am I risking
>>anything other than generating messages that nmh won't be able to read?
>
>Good question!  Your use cases seem to be ... well, I don't understand
>them.

That's because I keep being unclear, which in turn is because I don't
know enough to be clearer -- though I'm learning a lot just from this
discussion. :-)

My use case is simply that people keep sending me messages which decode
to HTML with horribly long lines, and I'd prefer to save the decoded text
rather than the encoded version[*].

(Digression:  I'd also prefer to reformat the long lines at the same time.
I'm seriously considering piping the decoded HTML through something like
tidy [ http://www.html-tidy.org/ ] before saving it. :-/)

As it happens, I have 

   mhbuild:  -maxunencoded 900

in my .mh_profile, and have had for a while.

This is a coincidence, in that I was unaware of the 998-character limit,
until today, but happily I'm under it anyway. :-)

...so if I were to quote text with wider lines than that the right thing
would happen -- although in practice if I were to quote text with lines that
long, I'd almost certainly run them through fmt first.


>And might I suggest that if you're going to keep asking us questions
>about nmh, you should join the mailing list? :-)

I'd be happy to, as long as it wouldn't be considered as a commitment to
work on the code -- not that I'm opposed to that in principle, but I think
I've already demonstrated I'm not competent to step in and do anything
useful. :-(

The only reason I've been writing to nmh-workers is that I'm unaware
of anywhere else to turn.  Is there a corresponding nmh-users list or
something similar?

 - Steven


[*]
That's because one of the biggest reasons for using nmh, at least for
me, is that it's so useful to be able to manipulate saved email with
standard command-line tools.

For example, I particularly depend on being able to find specific saved
messages using grep or mairix[**] -- and if the message body is saved in
base64 encoding, both of those programs fail completely.


[**]
http://www.rpcurnow.force9.co.uk/mairix/

-- 
___
Steven Winikoff                |"Garfield is, for my money at least, the
Concordia University           | shining exemplar of that productive
Montreal, QC, Canada           | laziness that gave us flush plumbing,
steven.winik...@concordia.ca   | clothes washers, dish washers, electric
                               | lights, and automated guitar string
                               | factories." - Mike Andrews



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-22 Thread Ken Hornstein
>And I won't presume to suggest what nmh should do, but I will point out
>that I recently received a message with a text/html part which was one
>single line of 42027 characters.

Let me ask you a more precise question.

Are you saying you received via SMTP a RFC5322 message where there
were 42027 characters between CR-LF pairs?  Because while I believe
that might happen, I am actually skeptical that's what you received.
Specifically because of this:

>   Content-Transfer-Encoding: quoted-printable

That suggests to me that you in fact received a message that had lines no
greater than 78 characters between CR-LF pairs, and _after you decoded it_
it might have had a very long line.

THAT is completely legal according to the RFCs.  For the most part, it
doesn't matter what it decodes to; what nmh cares about is that the
message it is reading is valid according to RFC 5322.  THAT is where the
998 byte line length limit comes into play.  You could send the entirety
of "War and Peace" in text/plain part all as one line, and as long as it
was encoded properly that would be fine.
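
The point that one enormous line is legal once it has been encoded can be checked with a few lines of Python. This sketch uses the standard `base64` module rather than anything from nmh, and the 42027-character line is synthetic data chosen to match the size mentioned in this thread.

```python
import base64

# One synthetic line with no line breaks at all.
one_long_line = b"x" * 42027

# encodebytes() wraps its output into short lines of at most 76 characters.
encoded = base64.encodebytes(one_long_line)
encoded_lines = encoded.splitlines()

print(max(len(line) for line in encoded_lines))      # longest encoded line: 76
print(base64.decodebytes(encoded) == one_long_line)  # decoding restores the long line: True
```

The encoded form stays far below the 998-character limit, while the decoded form is a single line that exceeds it many times over.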

>This part is 670 lines before decoding, and exactly one line afterward.
>This arrived before I started using mhfixmsg, but given what I've just
>learned I'd certainly expect mhfixmsg to refuse to decode it.

That's a question for David, since mhfixmsg is his baby.  I would suggest
that was probably an oversight and he might fix it at some point (and
by "fix it" I mean, "refuse to decode it", but that's up to him).  As far
as I know the goal of mhfixmsg is to transform RFC 5322-format messages
from one set of encodings to another, but still produce a valid RFC 5322
formatted message.

>But you're quite right that this code isn't easy to understand.  If I were
>to modify uip/mhfixmsg.c without touching sbr/m_getfld.c, am I risking
>anything other than generating messages that nmh won't be able to read?

Good question!  Your use cases seem to be ... well, I don't understand
them.  Using nmh to produce a message that nmh cannot handle seems
suboptimal at best.  I don't think we will make any guarantees in the
code that we will be able to handle such messages.  Emailing such a
message without encoding it also is not guaranteed to work properly,
and that is outside of nmh's control; I can tell you from personal
experience that some MTAs will insert a '!' at the end of unencoded long
lines like that and force a line break, and that plays hell with HTML.

And might I suggest that if you're going to keep asking us questions
about nmh, you should join the mailing list? :-)

--Ken



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-22 Thread Steven Winikoff
>To answer your larger question (on the subject line):
>
>- MH/nmh doesn't handle lines greater than 998 characters because such
>  messages are not valid according to RFC 5322, and mhfixmsg isn't going
>  to generate a message that nmh cannot handle.  Whether or not nmh SHOULD
>  handle such messages is a different question.

Thank you, that helps.

And I won't presume to suggest what nmh should do, but I will point out
that I recently received a message with a text/html part which was one
single line of 42027 characters.  Clearly there are at least some senders
who have as much respect for RFC 5322 as Microsoft has for standards in
general. :-/

But I'm confused, because I didn't have any problems reading that message.
The structure on it is as follows:

 msg part  type/subtype              size description
   4       multipart/alternative    2213K
     1     multipart/related        2211K
     1.1   text/html                  41K
     1.2   image/jpeg                 28K
     1.3   image/jpeg                 42K
     [...]
     1.33  image/jpeg                 350
     2     text/plain                 808

...and part 1.1 has these headers:

   --Apple-Mail=_7C2BA5CB-FA71-4036-9FAD-C693FF38AF09
   Content-Type: multipart/related;
       type="text/html";
       boundary="Apple-Mail=_B4252506-2E52-4348-A3AD-C92C9A9FBD3D"

   --Apple-Mail=_B4252506-2E52-4348-A3AD-C92C9A9FBD3D
   Content-Transfer-Encoding: quoted-printable
   Content-Type: text/html;
       charset=us-ascii

This part is 670 lines before decoding, and exactly one line afterward.
This arrived before I started using mhfixmsg, but given what I've just
learned I'd certainly expect mhfixmsg to refuse to decode it.


>- The line length limit is imposed by m_getfld(), and that function is ...
>  hairy.  I think changing that might have unexpected consequences; it
>  might be fine, but I don't make any guarantees.  But the fact you said
>  you could "easily modify" it suggests to me that you have not actually
>  LOOKED at the code in question :-)

What I'd looked at was the content_encoding() function in uip/mhfixmsg.c,
where there are a few instances of literal 998 which really would be easy
to change.

You're right that I hadn't looked at the larger context, mostly because
I didn't know there was one.  This is the main reason why I asked before
doing anything.

I just took a quick look at sbr/m_getfld.c.  The first thing that struck me
was this comment at lines 158-163 (of the 1.7 version):

   [...] I considered
   using a Vax "scanc" to locate the end of the field followed by a
   "memmove" but the routine call overhead on a Vax is too large for this
   to work on short names.  If Berkeley ever makes "inline" part of the
   C optimiser (so things like "scanc" turn into inline instructions) a
   change here would be worthwhile.

I'm beginning to get a sense of (and becoming impressed by) just how old
this code base is.

But you're quite right that this code isn't easy to understand.  If I were
to modify uip/mhfixmsg.c without touching sbr/m_getfld.c, am I risking
anything other than generating messages that nmh won't be able to read?

 - Steven
-- 
___
Steven Winikoff|
Concordia University   | Celibacy is hereditary.  If your parents
Montreal, QC, Canada   | didn't have children, chances are you
steven.winik...@concordia.ca   | won't either.



Re: [Nmh-workers] why does mhfixmsg dislike long text lines?

2018-01-22 Thread Ken Hornstein
>So I guess what I'm asking is:  I can easily modify my copy of nmh to raise
>the 998-character limit, but it's not clear to me what I might break by
>doing so.  Would someone please explain what I'm missing here?

To answer your larger question (on the subject line):

- MH/nmh doesn't handle lines greater than 998 characters because such
  messages are not valid according to RFC 5322, and mhfixmsg isn't going
  to generate a message that nmh cannot handle.  Whether or not nmh SHOULD
  handle such messages is a different question.

- The line length limit is imposed by m_getfld(), and that function is ...
  hairy.  I think changing that might have unexpected consequences; it
  might be fine, but I don't make any guarantees.  But the fact you said
  you could "easily modify" it suggests to me that you have not actually
  LOOKED at the code in question :-)
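
For reference, the RFC 5322 rule is that each line MUST be no more than 998
characters and SHOULD be no more than 78, not counting the CRLF.  A rough
way to find offending lines in a message, sketched in Python; this is only
an illustration of the rule, not how m_getfld() or mhfixmsg does it, and
the name overlong_lines and the sample message are invented:

```python
HARD_LIMIT = 998  # RFC 5322 "MUST" limit, excluding the CRLF
SOFT_LIMIT = 78   # RFC 5322 "SHOULD" limit, excluding the CRLF

def overlong_lines(raw: bytes, limit: int = HARD_LIMIT):
    """Return (line number, length) for every line longer than limit."""
    result = []
    for number, line in enumerate(raw.split(b"\n"), start=1):
        length = len(line.rstrip(b"\r"))  # do not count the line terminator
        if length > limit:
            result.append((number, length))
    return result

# Hypothetical message: one body line exactly at the limit, one just over it.
sample = (b"Subject: test\r\n\r\n"
          + b"a" * 998 + b"\r\n"
          + b"b" * 999 + b"\r\n")

print(overlong_lines(sample))              # lines over the hard limit
print(overlong_lines(sample, SOFT_LIMIT))  # lines over the soft limit
```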

--Ken



Re: [Nmh-workers] Why Attempt fork(2) up to Five Times?

2017-09-09 Thread Ralph Corderoy
Hi kre,

> Resources were in limited supply back then, and simply retrying (often
> after a short sleep, though not always required) would often end up
> succeeding

Yes, this was sleep(5) after each failure.
Thanks, they're now all a simple single call to fork().

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy

___
Nmh-workers mailing list
Nmh-workers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/nmh-workers


Re: [Nmh-workers] Why Attempt fork(2) up to Five Times?

2017-09-08 Thread Robert Elz
    Date:        Fri, 08 Sep 2017 15:57:27 +0100
    From:        Ralph Corderoy
    Message-ID:  <20170908145727.a7d571f...@orac.inputplus.co.uk>

  | Ken wondered why we bother
  | with multiple attempts.  It's a good point.  Historically, they were
  | probably vfork(2), but even so...  If anyone knows a good reason then
  | pipe up promptly,

In modern systems, almost certainly pointless.

vfork() is unrelated - retrying a vfork() was never particularly useful.

Back in 6th edition Unix days, it was quite common for fork() to fail on busy
systems (which was most of them) because one limit or another was temporarily
exceeded (process table full, out of memory, ...).  Resources were in limited
supply back then, and simply retrying (often after a short sleep, though that
was not always required) would often succeed if some other process happened
to have exited in the meantime (sometimes even if it had just exec'd).

It was very common back then to repeat failed fork() calls, particularly when
no other easy recovery mechanism was available.  An editor forking to run a
"!" command shell can just tell the user the fork failed and let them try
again if they want; processes that need to fork without user involvement
really want the fork to work, if it can possibly be made to happen.
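
The idiom looks roughly like this.  It is a sketch in Python rather than the
C that nmh used, and only the "up to five times" and "sleep(5)" figures come
from this thread; the name fork_with_retry, the choice of which errno values
to retry on, and the fake fork used in the demonstration are assumptions:

```python
import errno
import os
import time

def fork_with_retry(attempts=5, delay=5.0, fork=None, sleep=time.sleep):
    """Call fork(), retrying after transient failures; return fork()'s result."""
    if fork is None:
        fork = os.fork  # looked up lazily; os.fork exists only on Unix
    for attempt in range(1, attempts + 1):
        try:
            return fork()
        except OSError as exc:
            # Assumption: only EAGAIN and ENOMEM suggest that waiting might help.
            transient = exc.errno in (errno.EAGAIN, errno.ENOMEM)
            if not transient or attempt == attempts:
                raise
            sleep(delay)

# Demonstration with a stand-in fork that fails twice and then "succeeds",
# so that nothing is really forked and nothing really sleeps.
calls = []
naps = []

def flaky_fork():
    calls.append(1)
    if len(calls) < 3:
        raise OSError(errno.EAGAIN, "Resource temporarily unavailable")
    return 1234

pid = fork_with_retry(fork=flaky_fork, sleep=naps.append)
print(pid, len(calls), naps)
```

Passing the fork and sleep functions in as parameters is only there to make
the retry logic easy to exercise without creating processes.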

MH dates from 6th edition unix times, so ...

kre





Re: [Nmh-workers] Why Does mh-tailor(5) Exist?

2016-11-14 Thread David Levine
Ralph wrote:

> Can mh-tailor get the chop if the references get switched over too?

I'm OK with that.  Though it would upset the consistency of:

   Files Used by nmh Commands
   mh-alias(5)     alias file for nmh message system
   mh-format(5)    format file for nmh message system
   mh-profile(5)   user customization for nmh message system
   mh-tailor(5)    mail transport customization for nmh message system

I'd also be OK with moving everything useful that's left in
mts.conf to the profile, after 1.7.

To answer your question, it looks like mh-tailor was the original
documentation, and mts.conf(5) was added in 2000.

David



Re: [Nmh-workers] Why wasn't I paying attention?

2014-06-05 Thread norm
Ken Hornstein <k...@pobox.com> writes:
>> During the development of nmh-1.6, there was extensive message traffic
>> about mhshow. Wouldn't I have fewer questions and make fewer stupid remarks
>> if I had paid attention to that traffic? Maybe?  I did try to follow
>> those discussions but couldn't understand most of them.
>
> I don't think you (or anyone else) should feel bad about that.  Most of
> the discussions about nmh changes have two parts:
>
> 1) Discussion of the concepts behind an interface change
> 2) How to implement 1).

In practice, to the extent that I understood 1) I didn't have much problem
understanding 2). After all, I was once a competent C hacker. But I had
problems understanding discussions of mhshow, mostly because I don't know
MIME, nor am I very familiar with RFC 5322 et al.


Norman Shapiro



Re: [Nmh-workers] Why wasn't I paying attention?

2014-06-03 Thread Ken Hornstein
> During the development of nmh-1.6, there was extensive message traffic
> about mhshow. Wouldn't I have fewer questions and make fewer stupid remarks
> if I had paid attention to that traffic? Maybe?  I did try to follow
> those discussions but couldn't understand most of them.

I don't think you (or anyone else) should feel bad about that.  Most of
the discussions about nmh changes have two parts:

1) Discussion of the concepts behind an interface change
2) How to implement 1).

Most of the longer discussions end up being 2), which can go off into
the weeds onto very technical things like how the iconv library works
and particular paragraphs of RFCs.

When I suggest an interface change, I usually try to generate a summary
email that covers the broad concepts to explain my ideas.  Here's an
example:

   http://lists.nongnu.org/archive/html/nmh-workers/2014-03/msg00030.html

Most of the later discussion was about nuts and bolts things, like
what to do with particular MIME parts, particular flags, etc etc.  But
if anyone has any questions about anything, you should feel free to
ask!  I usually have a good idea in my head about how the change impacts
users, even if I'm not always so good about explaining it.

--Ken



Re: [Nmh-workers] why at ?

2014-02-14 Thread Jerrad Pierce
My guess is UUCP, which had a !-delimited path: there was still an
intended recipient at a machine, just not written with @.



Re: [Nmh-workers] why at ?

2014-02-14 Thread Ken Hornstein
> this falls way down low in the pet peeve category:
>
> why is it that post, when showing me where my message is
> going to be delivered, says someone at example.com rather
> than some...@example.com?
>
> i'm guessing there's some ancient history surrounding that, but can't
> imagine what it might be.

That's the old RFC 733 address syntax, and yes, the nmh parser supports
it, although I don't know how well it interoperates with some of the
more modern stuff we've added since then.  If I had to guess, that's
just a holdover from those days.  RFC 733 was published in 1977, and one
of the authors was David Crocker, who worked at Rand at the time.
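
To illustrate the two spellings: a small Python sketch that rewrites the
"at" form into the "@" form.  It handles only a bare "local at domain"
string, not the full RFC 733 grammar (no display names, comments, or quoted
strings), it is not how nmh's address parser works, and the name
normalize_at is made up:

```python
import re

# The word "at" surrounded by whitespace, as in "someone at example.com".
_AT_WORD = re.compile(r"\s+at\s+")

def normalize_at(address: str) -> str:
    """Rewrite 'local at domain' as 'local@domain'; leave other input alone."""
    if "@" in address:
        return address  # already in the modern form
    parts = _AT_WORD.split(address.strip())
    if len(parts) != 2:
        return address  # ambiguous or not an address; do not guess
    local, domain = parts
    return f"{local}@{domain}"

print(normalize_at("someone at example.com"))
print(normalize_at("user@example.org"))
```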

(BTW, I've been meaning to try to get in a What does the Fox Say? joke
with you on the nmh mailing list for a while now, but I couldn't quite
pull it together).

--Ken



Re: [Nmh-workers] why at ?

2014-02-14 Thread Paul Fox
ken wrote:
>> this falls way down low in the pet peeve category:
>>
>> why is it that post, when showing me where my message is
>> going to be delivered, says someone at example.com rather
>> than some...@example.com?
>>
>> i'm guessing there's some ancient history surrounding that, but can't
>> imagine what it might be.
>
> That's the old RFC 733 address syntax, and yes, the nmh parser supports

whoa.  who knew?  thanks.  i had no idea it was actual syntax.

> it, although I don't know how well it interoperates with some of the
> more modern stuff we've added since then.  If I had to guess, that's
> just a holdover from those days.  RFC 733 was published in 1977, and one
> of the authors was David Crocker, who worked at Rand at the time.
>
> (BTW, I've been meaning to try to get in a What does the Fox Say? joke
> with you on the nmh mailing list for a while now, but I couldn't quite
> pull it together).

Fraka-kaka-kaka-kaka-kow!!!

of course.

paul
--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 24.3 degrees)



Re: [Nmh-workers] Why aren't papers in nmh distribution?

2006-01-28 Thread Oliver Kiddle
Joel Reicher wrote:
> Does anyone know why the contents of papers/ from mh-6.8.4 aren't in
> the nmh distribution?

I wouldn't know but their content is sufficiently old that they don't
have much value other than historic interest. Jerry's book is more
useful as a tutorial and reference. It's perhaps worth fixing them to
compile and putting PDFs on the website but I'd have thought it was
better to avoid having anything in the nmh distribution that we aren't
prepared to maintain and update. If anything, it'd be good if we could
remove more dead stuff from the distribution, vmh for example.

Oliver

