Steve Grubb wrote:
On Tuesday 29 January 2008 17:56:36 John Dennis wrote:

Hence the audit parsing library. The idea is to abstract this away so that anyone wanting to write a tool does not need to study all the messages and figure out the parsing rules.

> The way forward has to be the audit parsing library.

The problem is auparse is just as screwed as anybody else. Unparseable output is just plain wrong and inexcusable. You're suggesting auparse embed all sorts of hacks and heuristics to unravel a problem which should never exist in the first place. It's a house of cards which in time will collapse. You also haven't explained how auparse is going to deal with log data generated by different kernel versions, especially when logs are aggregated.

tools developed around these messages and making wholesale changes will break them.

Break what is already fundamentally broken? That's not an answer ;-)

Any fix will break someone's tool somewhere unless they are coded to the audit parsing library.

auparse is going to break too. In the current situation you can't determine whether a field is encoded just by reading the output; you also have to know the kernel source code. That's wrong.

Auparse is not the answer to irregular kernel audit message

This is the answer in so many ways. In order to make any change, you have to decouple applications from the actual data structure. You cannot normalize the data without breaking somebody somewhere.

Which is why making the output parseable independent of the kernel version is an essential requirement.

For example, suppose we all agreed the data structure is an abomination and had to be fixed. We get all the code into the 2.6.26 kernel. Meanwhile Fedora 9 is released on the 2.6.24 kernel. We get the user space pieces fixed up to be released at the same time as 2.6.26. Then Fedora steps up to the 2.6.25 kernel and then ultimately 2.6.26. The userspace in Fedora 9 was never intended to work with the new format. We can't keep the kernel team from doing what's right for everyone that wants new device drivers. We're stuck.

You're only stuck if the output can only be parsed by one version. If the output were regular the problem would go away. Isn't that the desired result?

auparse_get_field_str() returns the field value in its encoded form,

I would choose the words, raw form.

Yes, raw is a better term. Some raw values are encoded, some aren't; that's the problem.

this is almost never of value to the caller. The caller wants the
field value to be unencoded so it can operate on it.

Sometimes. It depends on the situation.

Very rarely. As an analogy 99.99% of the time you want your email client to decode the contents from the transfer encoding it arrived in, otherwise it's just gibberish. Raw form is really only useful when debugging the encoding/decoding.

If you want the field value to be unencoded you have to call
auparse_interpret_field().

Correct.

But auparse_interpret_field() performs two distinctly different operations,

It does only one thing, that is translate the data from raw to interpreted form.

Wrong :-) It does two entirely different things and those operations cannot be separated. The two operations are:

1) decoding (e.g. decoding a field value encoded in hexadecimal form back into its original string)

2) interpretation (e.g. translating a uid field into a username). I call this interpretation "contextual substitution" because it's taking a field value and substituting in another value, often in a different format. You cannot interpret a field value until it has been decoded.
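
To make the distinction concrete, here is a minimal sketch in Python of the two operations kept as separate steps. It does not use auparse at all; the function names, the sample hex string, and the PASSWD table are made up for illustration and only stand in for whatever a real library would do.

```python
import binascii

def decode_hex_field(raw):
    """Step 1, decoding: undo the hex transfer encoding to recover the original string."""
    return binascii.unhexlify(raw).decode("utf-8", errors="replace")

def interpret_uid(value, passwd):
    """Step 2, interpretation: substitute a user name for an already-decoded uid value."""
    return passwd.get(int(value), "unknown(%s)" % value)

# Hypothetical lookup table standing in for the system password database.
PASSWD = {0: "root", 500: "jdennis"}

# Decoding alone returns the field value the kernel originally had in hand.
decoded = decode_hex_field("2F686F6D652F6D7920646972")

# Interpretation is a different operation: it replaces the value with something else.
interpreted = interpret_uid("0", PASSWD)

print(decoded)
print(interpreted)
```

A caller who only wants the real field value needs step 1 and nothing more, which is exactly the operation that cannot be requested on its own today.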

What if I don't want auparse to change the field value and instead simply return the field value? Currently you can't simply get the field value! Why? Because some fields are encoded, so you either get the raw encoded value (which is meaningless 99.99% of the time, if it had been encoded) or you get something which is completely munged.

So, John, if you want selinux format changes, complain on their mail list. I've already done that and lost. :)

FWIW, I can live with not changing the message contents. But no one can live with a situation where the data can't be parsed; it is simply wrong. Just to be clear, the problem is that you can't determine as you parse whether a field value is encoded, which means you can't decide whether it has to be decoded.

Here is an example from the real world. An audit message has this field:

comm=df

So is the value the string "df" (e.g. disk free) or is this the hexadecimal encoded byte value 223? The only way to know is by looking at the kernel source code and knowing that the "comm" field in a specific audit record is generated by calling audit_log_untrustedstring(). What if it doesn't call that in a different kernel version? What if a new field is added in a new kernel version? How will the parser know which function the kernel used to generate the string? What if in one kernel version the string was output with audit_log_untrustedstring() but in another kernel version it wasn't?
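
The ambiguity is easy to demonstrate. The following Python sketch is not how auparse works; candidate_readings() is a hypothetical helper that simply lists every value a bare token could stand for when nothing in the output says whether it was encoded.

```python
import binascii

def candidate_readings(raw):
    """Return every value the raw token could represent, keyed by how it was read."""
    readings = {"literal": raw.encode("ascii")}
    try:
        # An even-length run of hex digits is also a valid hex encoding of some bytes.
        readings["hex-decoded"] = binascii.unhexlify(raw)
    except (binascii.Error, ValueError):
        # Not valid hex, so only the literal reading is possible.
        pass
    return readings

r = candidate_readings("df")
print(r)
```

For "df" both readings come back, the two-character string and the single byte 0xdf (223), and nothing in the token itself says which one the kernel meant. A token such as "ls" has only the literal reading, so the parser gets lucky there, but luck is not a parsing rule.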

--
John Dennis <[EMAIL PROTECTED]>

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit
