Re: [exim] Queue ID format

Yves Goergen via Exim-users Thu, 24 Dec 2020 15:40:57 -0800

Don't worry, I'm not trying to make a meaning out of that ID. I just want to narrow down the recognised pattern to avoid false interpretation of the log entries.

Well, if that log was intended for humans, would it be an interesting idea to write a machine-readable log as well? I'm specifically interested in metrics such as these:

* How many messages are submitted by a specific user/address or for a specific recipient/domain?
* From how many different hosts/IP addresses are messages submitted for a specific user/sender?
* How many remote SMTP errors indicating server reputation issues do we see, and from which remote services?
* How many messages from a specific user/address could not be delivered?
* Does a user have a high sender spam score?

This is all to monitor the quality of the local service and detect hacked accounts or other kinds of misuse of the service.

But my last list item, which relies on a log message from my custom Exim config, already shows that it's probably hard to generate a more parsing-friendly format (e.g. JSON). Every custom log message would need to be annotated for that.

By now, out of 20000 log lines, I can't recognise 30. From all the others it seems I can extract sufficient meaning and data for the necessary metrics. I can live with that; it's just a lot of code required to get there.
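For illustration, the multi-regex approach with a count of unrecognised lines could look roughly like the sketch below. This is not my actual parser: the pattern set, the helper names (classify, summarise) and the sample lines are made up for this example, and a real mainlog has many more line types. It only relies on the usual mainlog layout of timestamp, ID, then a flag such as "<=" (arrival), "=>" (delivery) or "**" (failed delivery).

```python
import re
from collections import Counter

# Assumed common prefix: timestamp, then the queue ID matched leniently
# as a run of non-space characters (no meaning is read into it).
PREFIX = r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<id>[^ ]+) "

# Hypothetical, deliberately small pattern set.
PATTERNS = [
    ("arrival", re.compile(PREFIX + r"<= (?P<sender>[^ ]+)")),
    ("delivery", re.compile(PREFIX + r"=> (?P<recipient>[^ ]+)")),
    ("failure", re.compile(PREFIX + r"\*\* (?P<recipient>[^ ]+)")),
    ("completed", re.compile(PREFIX + r"Completed$")),
]


def classify(line):
    """Return (kind, fields) for the first matching pattern."""
    for kind, pattern in PATTERNS:
        match = pattern.match(line)
        if match:
            return kind, match.groupdict()
    return "unrecognised", {}


def summarise(lines):
    """Count line kinds and the number of messages submitted per sender."""
    kinds = Counter()
    per_sender = Counter()
    for line in lines:
        kind, fields = classify(line.rstrip("\n"))
        kinds[kind] += 1
        if kind == "arrival":
            per_sender[fields["sender"]] += 1
    return kinds, per_sender


# Made-up sample lines, only to exercise the functions.
SAMPLE = [
    "2020-12-24 15:40:57 1ksXyZ-0001aB-Cd <= alice@example.org H=host [192.0.2.1] P=esmtpsa S=1234",
    "2020-12-24 15:40:58 1ksXyZ-0001aB-Cd => bob@example.net R=dnslookup T=remote_smtp",
    "2020-12-24 15:40:58 1ksXyZ-0001aB-Cd Completed",
    "2020-12-24 15:41:00 some custom message",
]

if __name__ == "__main__":
    kinds, per_sender = summarise(SAMPLE)
    print(dict(kinds))
    print(dict(per_sender))
```

Keeping an explicit "unrecognised" bucket makes it easy to see how many lines fall through, which is the 30-out-of-20000 figure in my case.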


-Yves


-------- Original Message --------
From: Jeremy Harris via Exim-users <[email protected]>
Sent: Thursday, 24 December 2020, 23:35 CET
Subject: [exim] Queue ID format

On 24/12/2020 22:17, Yves Goergen via Exim-users wrote:

I'm parsing Exim log files, specifically the mainlog. Man, that's a complex structure and it's hard to find all necessary details from the documentation and by reading my actual log files. I'm using several regular expressions for different kinds of lines. But a stateful parser (the ones used to understand programming languages) would probably have been the better choice here. Apache access logs just require a single regex, for Exim I already have 8, one of which just covers most meaningless messages I don't care about, and lots of detailed post-processing.


The logs are really designed for human use, not for machine consumption.

What assumptions can I make about the format of a queue message ID? For now, I use this regex:


    [^ ]+

Though it seems they always match this regex:

    [0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2}

It may change at any time with future development.
There's a relevant comment in the source:

/* Now build the unique message id. This has changed several times over the
lifetime of Exim. This description was rewritten for Exim 4.14 (February 2003).

...

I *think* that some high-volume sites are at or close to performance limits [1]
that the current format imposes, hence I must reiterate: this (the message_id
format) is not supposed to be an exported interface.  Its only documented
behaviour is that it is unique.

It's fairly reasonable to assume it'll never have an embedded space. I would not
recommend trying to extract meaning from it.
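A minimal sketch of how the two patterns quoted above could be combined, treating the ID as an opaque token: accept anything without an embedded space, and use the stricter observed shape only as a hint that can be logged. The names LENIENT_ID, STRICT_ID and extract_id are made up for this example, and the strict pattern reflects only what was observed in the logs at the time, not a documented guarantee.

```python
import re

# Lenient form: any run of non-space characters.
LENIENT_ID = re.compile(r"[^ ]+")

# Observed shape from the thread; may stop matching after a format change.
STRICT_ID = re.compile(r"[0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2}")


def extract_id(token):
    """Return (id, fits_observed_shape); id is None if the token has a space or is empty."""
    if not LENIENT_ID.fullmatch(token):
        return None, False
    return token, STRICT_ID.fullmatch(token) is not None


if __name__ == "__main__":
    for candidate in ["1ksXyZ-0001aB-Cd", "some-other-token", "has space"]:
        print(repr(candidate), extract_id(candidate))
```

The parser keeps working if the ID format changes, because only the lenient check decides acceptance; the strict check merely flags IDs that no longer look familiar.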



--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
