Hello Wietse,

On 21/09/2020 17:12, Wietse Venema wrote:

> I am not opposed to making bounces thread-friendly, but I wonder,
> Postfix includes a copy of the undelivered message, why is it
> difficult to find out what message was not delivered? Are mail
> systems routinely deleting that information, because spam?

I do not believe that many mailservers routinely remove attachments. On
the wider internet I'd guess that it's mostly either accept/deny.

Enterprise systems however are a completely different matter.
In an unrelated issue, I recently stumbled over an Exchange server that
would routinely repack every single mime attachment, helpfully add
ATTxxxxxxx as filename, throw away the original plain text part and then
try to build a new plain text part from the HTML part by stripping all
tags. At which point spaces and linebreaks got eaten so that e.g. table
cell content is now one long string without spaces in between...
Which left me rather astonished that such an anti-feature exists but
also that it is enabled on outlook365 for a horrifyingly large numnber
of customers.

And yes, that server also mangles with DSN messages, so your suggestion
that parts might get eaten is probably accurate.
Turns out, if the message was S/MIME-Signed, the content would be left
alone. But that would be a bit intrusive and questionable, adding such a
feature to postfix.


In the specific case considering my users however I is just lack of
understanding/lack of ability to read and understand error messages.
There's "tech-mumbo-jumbo" about "some error I didn't understand and
then I stopped reading" and "what do you mean, there's the original
message attached?". Pair that with mobile phone mail clients that have
trouble showing delivery status notifications in the first place and
show broken attachments instead, I can somewhat understand where the
users are coming from.

Adding the Reference information made these things easier and for
example Apple Mail and Thunderbird now nicely show the delivery status
notification as a reply to the original mail.
Now especially on mails CC'd to 35 different people the users are still
sometimes confused figuring out which one of the 35 was undeliverable...
But there's only so much I can do for that...




> The code is not bad for a rusty C programmer. However, I see a new
> vstring_alloc() call without a corresponding vstring_free().

Thank you, you're too kind. :-D I'll see about freeing that.


> I don't think that you should change the code that handles the case
> of "no sender record before message content". This already was
> delicate code, and changing it is likely to introduce a bug. It
> will greatly increase my time for code review (first to determine
> that the old code is correct, and then to determine that the new
> code does not break the old code).

Hmmm. That is an interesting constraint. The only way of _not_ touching
that piece of code - as far as I can see - though is to do a second
parsing run over the data and extract the Message-Id that way. Doing it
in the same loop as the "no sender record before message content" is not
really possible if I read the code correctly.

What's going on previously was:

  - Iterate over every rec_type top to bottom, extract the offset of the
original mail (based on the size attrib), the arrival time and the sender.
  - If the M(essage following) rec_type is hit and there is no offset
because no S rec_type was found yet, use the current fp position as
offset and abort.

The warning about no sender found at that point in the file is just
that, a warning.
In the absence of a sender rec_type, the bounce_info->sender part is
empty which means no X-Postfix-Sender header would be added to the DSN
attachment and the Return-Path header in the attached mail will be
showing the null sender (<>).


What's going on now with my changes is:

  - Iterate over every rec_type top to bottom, extract the offset of the
original mail (based on the size attrib), the arrival time and the sender.
  - If the M(essage following) rec_type is hit and there is offset
because no S attrib was found yet, use the current fp position as offset
and _DO NOT_ abort. This is new. The abort condition has moved to my
newly added code and will be triggered on the empty line between Header
and Body.
  - If the NORM or CONT types are hit, we try to see if it's a Msg-Id
header.
  - If it doesn't look like a header or is a folded line, skip to the
next and try again.

The no sender part and the no size part should be unchanged. But having
a second pair of eyes is certainly appreciated.


> Extracting headers from a record stream is tricky. Unlikely as it
> may be, your code may see message-id: in the middle of a long
> physical line (which Postfix stores as one or more REC_TYPE_CONT
> records followed by REC_TYPE_NORM, each of which could start with
> message-id:). For comparison, I looked at code that I wrote to
> extract Delivered-To headers, and that code misses information when
> some header is very long (does not handle REC_TYPE_CONT). Extracting
> headers correctly requires a state variable that remembers whether
> the last record came from the beginning of a line.

Ohhh. I see your point. Could it also happen that "Message-I" is in the
one record and the next one continues with "d: <whate...@example.com>"?
That'd be a bummer and no, I certainly did not take that into account...

> The only Postfix code that handles long headers correctly is in the
> Postfix MIME parser (which contains such a state variable); the
> cleanup daemon uses that to extract header information. Some of
> that information ends up in the Postfix queue file segment for
> extracted information. We could use that to extract the message-id.

Would it make sense to extract those functions, abstract them and then
use them for the delivered-to header as well?


> Anyway, this is Monday, and it will be a few days before I can do
> a weekend project.

Doing the above change was my weekend project. I just finished Monday
morning at 5am...

cheers,
  Andreas

Reply via email to