Hello Gilles,

In article <20190101143249.ga41...@ams-1.poolp.org> Gilles Chehade 
<gil...@poolp.org> wrote:
> On Tue, Jan 01, 2019 at 01:14:54PM +0100, Walter Alejandro Iglesias wrote:
> > On Fri, Dec 21, 2018 at 06:59:58PM +0100, Gilles Chehade wrote:
> > > On Fri, Dec 21, 2018 at 06:56:57PM +0100, Walter Alejandro Iglesias wrote:
> > > > Hello Gilles,
> > > > 
> > > > In article <20181221145201.ga90...@ams-1.poolp.org> Gilles Chehade 
> > > > <gil...@poolp.org> wrote:
> > > > > On Fri, Dec 21, 2018 at 07:41:41AM -0700, Gilles Chehade wrote:
> > > > > > CVSROOT:      /cvs
> > > > > > Module name:  src
> > > > > > Changes by:   gil...@cvs.openbsd.org  2018/12/21 07:41:41
> > > > > > 
> > > > > > Modified files:
> > > > > >       usr.sbin/smtpd : smtp_session.c 
> > > > > > 
> > > > > > Log message:
> > > > > > start simplifying log lines, they're no longer intended to be parseable, we
> > > > > > have a reporting API for tools that want to analyze events, maillog is just
> > > > > > for us, hoomans.
> > > > > > 
> > > > > 
> > > > > that was not the best way to phrase my commit log ... sorry
> > > > > 
> > > > > i meant they're no longer intended to be friendlier to scripts than to
> > > > > humans: they will still be in a format that's easy to quickly script,
> > > > > but they will hold information easily readable by humans, not a lot of
> > > > > unrelated context infos so tools can generate dashboards out of single
> > > > > lines.
> > > > > 
> > > > > logs for humans, event reports for tools.
> > > > > 
> > > > 
> > > > For a long time I've been grepping IPs of spammers and attackers from
> > > > /var/log/maillog, /var/log/authlog and /var/log/daemon using a shell
> > > > script I wrote that automatically adds them to a file read by a pf
> > > > table.  In the case of maillog, it relies on the address="" and host=""
> > > > info currently included.
> > > > 
> > > > Will the sender's IP and hostname still appear in /var/log/maillog after
> > > > this change?
> > > > 
> > > 
> > > yes, you'll still be able to grep that information from maillog
> > 
> > You chose the words in your answer carefully. :-)
> > 
> 
> not really, I don't know what your scripts do and how you wrote them.

I made this clear in my explanation below.  At least the relevant part.

> 
> the sender IP and hostname appear in the log, they are just not repeated
> on every single log line but that shouldn't prevent scripts from keeping
> track of them.

My explanation also made it clear that I understood this.

> 
> anyways, as stated in the commit log and my follow up message:
> 
> "we have a reporting API for tools that want to analyse events, maillog
>  is just for us, hoomans"
> 
> "logs for humans, event reports for tools"

System administrators (i.e. those who will use your software) are also
humans. :-)


> 
> the maillog format is going to go through many changes to simplify it,
> remove redundant information, add missing information, etc... basing a
> script on it is not recommended as we'll break them with every change.
> 
> > Indeed, I still can grep "IP" and "host" in maillog, but they are alone
> > in a first line and the only way to associate them with the following
> > lines containing the from= to= and result= (to know what "happened" with
> > that connection) is by using the connection id, which will *painfully*
> > overcomplicate my scripts.
> > 
> 
> As you imagine, I can't take into account individual scripts.
> 
> Other people have asked that the port or listener tag appear in lines.
> Should these appear on all lines too ?
> And the cipher ? and the authenticated user ?
> Why is the IP/host information more legitimate to be repeated than other
> information on every single line ?
> What about the fcrdns check which will appear on connect lines, does the
> check have to appear on every line now ?
> What about the spf check when it is added at some point ?
> 
> maillog is not a context-free format, where each individual line carries
> all of the information so you don't have to look at previous lines. Each
> line should describe an event and carry information related to THAT event.

I'm aware that nowadays most people out there are a bit childish; that
is not my case.  I sent you this message because I think the issue is
relevant to everyone.  I don't bother developers with my own particular
problems or with "asking for features".  On the contrary, the fewer
"features" the better for me.

My point is, I still think the IP is the most relevant piece of data,
so repeating it was sensible in this case.  The new format is harder
not only to script against but also on the eye.  Now when I see a line
telling me "Invalid command" I have to scan the previous lines with the
same id to identify which IP the attack came from.

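To make the lookup described above concrete, here is a rough sketch of
the two passes: first collect the ids of the sessions that logged an
event, then look up the address those sessions connected from.  The
sample lines, the 16-hex-digit session id and the names SAMPLE,
SESSION_RE, ADDRESS_RE and addresses_for are all assumptions made up
for this illustration, not the real smtpd format; only the address=
field name comes from this thread.

```python
import re

# ASSUMED layout: "<syslog prefix> smtpd[pid]: <session id> <event text>".
# Adjust both patterns to what your maillog really contains.
SESSION_RE = re.compile(r"smtpd\[\d+\]: ([0-9a-f]{16}) ")
ADDRESS_RE = re.compile(r'address="?([^"\s]+)"?')

# Hypothetical sample lines, invented for this sketch.
SAMPLE = [
    'Jan  1 12:00:00 mx smtpd[123]: 0123456789abcdef smtp connected address=192.0.2.10 host=bad.example',
    'Jan  1 12:00:01 mx smtpd[123]: fedcba9876543210 smtp connected address=198.51.100.7 host=ok.example',
    'Jan  1 12:00:02 mx smtpd[123]: 0123456789abcdef smtp failed-command command="FOO" result="500 5.5.1 Invalid command"',
]


def addresses_for(pattern, lines):
    """Return the sorted addresses of sessions that logged a line containing pattern."""
    lines = list(lines)

    # Pass 1: ids of the sessions in which the event shows up.
    wanted = set()
    for line in lines:
        match = SESSION_RE.search(line)
        if match and pattern in line:
            wanted.add(match.group(1))

    # Pass 2: the address logged by any line of those sessions.
    found = set()
    for line in lines:
        match = SESSION_RE.search(line)
        if match and match.group(1) in wanted:
            address = ADDRESS_RE.search(line)
            if address:
                found.add(address.group(1))
    return sorted(found)
```

With the sample above, addresses_for("Invalid command", SAMPLE) picks
out only the session that sent the bad command.  The resulting list
could then be written to the file a pf table reads.
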
But don't worry, I won't die.  I'm able to change my scripts. :-)


> 
> The only guarantee I make on the format is that you can always find what
> you're looking for with at most 2 grep, one to find a session id, one to
> find the event you're looking for.
> 
> That being said, there's a new reporting mechanism which is intended for
> scripts and tools. It comes with a format that's easily parsable, that's
> going to be stabilized, versioned and which actually provides more info
> than maillog. It doesn't solve your context-free issue but it can easily
> be used to script an output that repeats the info you need on all lines,
> to be fed to your existing scripts. I have such scripts myself.
> 
> If you describe how your scripts work, I can probably help you.

Not necessary, but thank you anyways.

> 
> 
> > I don't know what the others think about this change.  I'd highly
> > appreciate it if you included the IP on each line of info again, as
> > before. :-)
> > 
> 
> I didn't put this change to vote :-p
> 
> A lot of people had a bad opinion about the new config format but I knew
> it was an improvement and ultimately it has unlocked so many issues that
> we have had more commits in the last three months than in the last three
> years.
> 
> I know you would prefer that I didn't change the log format but what you
> want is still doable, so I won't revert unless there is a good rationale
> that I actually made some use-cases undoable and unfixable.
> 
> Fixing your scripts to not be context-free only requires a few lines for
> them to catch connect/disconnect events and map that to a session id. It
> is also possible to not fix them but write a script that produces output
> they want from the reporting mechanism.
> 

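The "few lines" suggested above (catch connect/disconnect events and
map them to a session id) could look roughly like this.  It is only a
sketch under assumptions: the line layout, the 16-hex-digit session id,
the word "disconnected" and the names SESSION_RE, ADDRESS_RE and
annotate are hypothetical and not taken from smtpd itself.

```python
import re

# ASSUMED layout: "<syslog prefix> smtpd[pid]: <session id> <event text>".
SESSION_RE = re.compile(r"smtpd\[\d+\]: ([0-9a-f]{16}) ")
ADDRESS_RE = re.compile(r'address="?([^"\s]+)"?')


def annotate(lines):
    """Yield every line, appending address=... when the line itself lacks it."""
    address_by_session = {}
    for line in lines:
        line = line.rstrip("\n")
        match = SESSION_RE.search(line)
        if match:
            session = match.group(1)
            address = ADDRESS_RE.search(line)
            if address:
                # Connect-style line: remember where this session came from.
                address_by_session[session] = address.group(1)
            elif session in address_by_session:
                # Any later line of the session: repeat the remembered address.
                line = f"{line} address={address_by_session[session]}"
            if " disconnected" in line:
                # Assumed end-of-session marker: drop the state we kept.
                address_by_session.pop(session, None)
        yield line
```

Calling annotate() on the lines of a log file and printing each result
yields output where every session line carries the address again, so
scripts that expect context-free lines can read it unchanged.
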
As far as I'm concerned, I recently adopted your software just because I
like OpenBSD and I found it logical to give the tools that come with the
system a try.

I don't doubt that you can do with your creature as you please.  Nor do
I doubt that, in the long term, how bold you are with your changes will
affect you and only you.


Just my opinion, don't get upset with me. :-)



        Walter


P.S.: For some reason unknown to me, this message of yours didn't reach
my server.  I read it on gmane.
