Re: Kernel audit output is inconsistent, hard to parse

John Dennis Wed, 30 Jan 2008 08:19:44 -0800

Steve Grubb wrote:

On Tuesday 29 January 2008 17:56:36 John Dennis wrote:

Hence the audit parsing library. The idea is to abstract this away so that anyone wanting to write a tool does not need to study all the messages and figure out the parsing rules.


> The way forward has to be the audit parsing library.

The problem is auparse is just as screwed as anybody else. Unparseable output is just plain wrong and inexcusable. You're suggesting auparse embed all sorts of hacks and heuristics to unravel a problem which should never exist in the first place. It's a house of cards which in time will collapse. You also haven't explained how auparse is going to deal with log data generated by different kernel versions, especially when logs are aggregated.

tools developed around these messages and making wholesale changes will break them.


Break what is already fundamentally broken? That's not an answer ;-)

Any fix will break someone's tool somewhere unless they are coded to the audit parsing library.

auparse is going to break too. The current situation is you can't determine if a field is encoded or not by reading the output; you also have to know the kernel source code. That's wrong.

Auparse is not the answer to irregular kernel audit messages

This is the answer in so many ways. In order to make any change, you have to decouple applications from the actual data structure. You cannot normalize the data without breaking somebody somewhere.

Which is why making the output so it can be parsed independent of the kernel version is an essential requirement.

For example, suppose we all agreed the data structure is an abomination and had to be fixed. We get all the code into 2.6.26 kernel. Meanwhile Fedora 9 is released on the 2.6.24 kernel. We get the user space pieces fixed up to be released at the same time as 2.6.26. Then Fedora steps up to 2.6.25 kernel and then ultimately 2.6.26. The userspace in Fedora 9 was never intended to work with the new format. We can't keep the kernel team from doing what's right for everyone that wants new device drivers. We're stuck.

You're only stuck if the output can only be parsed by one version; if the output were regular the problem goes away. Isn't that the desired result?

auparse_get_field_str() returns the field value in its encoded form,

I would choose the words, raw form.

Yes, raw is a better term. Some raw values are encoded, some aren't, that's the problem.

this is almost never of value to the caller. The caller wants the
field value to be unencoded so it can operate on it.


Sometimes. It depends on the situation.

Very rarely. As an analogy 99.99% of the time you want your email client to decode the contents from the transfer encoding it arrived in, otherwise it's just gibberish. Raw form is really only useful when debugging the encoding/decoding.

If you want the field value to be unencoded you have to call
auparse_interpret_field().


Correct.

But auparse_interpret_field() performs two distinctly different operations,
It does only one thing, that is translate the data from raw to interpreted form.

Wrong :-) It does two entirely different things and those operations cannot be separated. The two operations are:

1) decoding (e.g. decoding a field value encoded in hexadecimal form back into its original string)

2) interpretation (e.g. translating a uid field into a username). I call this interpretation "contextual substitution" because it's taking a field value and substituting in another value, often in a different format. You cannot interpret a field value until it has been decoded.
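
To make the distinction concrete, here is a minimal sketch of the two operations as separate steps. It is Python rather than auparse code, and every name in it (decode_hex_field, interpret_uid, UID_TO_NAME) is hypothetical and only for illustration. It also assumes the caller already knows the field is hex encoded, which is exactly the knowledge the audit output does not provide.

```python
# Hypothetical helpers, NOT part of auparse. They only illustrate that
# decoding and interpretation are two different operations.

def decode_hex_field(raw: str) -> str:
    """Step 1 (decoding): undo the hexadecimal encoding of a field value."""
    return bytes.fromhex(raw).decode("utf-8", errors="replace")


# Stand-in for a passwd lookup; the entries are made up for this example.
UID_TO_NAME = {"0": "root", "500": "alice"}


def interpret_uid(value: str) -> str:
    """Step 2 (interpretation): substitute a username for a uid.

    Falls back to the unchanged value when the uid is unknown.
    """
    return UID_TO_NAME.get(value, value)


# A caller who only wants the plain field value needs step 1 alone.
raw = "2F746D702F6D792066696C65"      # hex encoding of "/tmp/my file"
plain = decode_hex_field(raw)

# A caller who wants a friendlier rendering applies step 2 on top.
who = interpret_uid("0")
```

With the steps split like this, a caller can ask for the decoded value without also getting a substituted one.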

What if I don't want auparse to change the field value and instead simply return the field value? Currently you can't simply get the field value! Why? Because some fields are encoded, so you either get the raw encoded value (which is meaningless 99.99% of the time, if it had been encoded) or you get something which is completely munged.

So, John, if you want selinux format changes, complain on their mail list. I've already done that and lost. :)

FWIW, I can live with not changing the message contents. But no one can live with a situation where the data can't be parsed, it is simply wrong. Just to be clear the problem is you can't determine as one parses if a field value is encoded or not which means you can't decide if it has to be decoded or not.


Here is an example from the real world, an audit message has this field

comm=df

So is the value the string "df" (e.g. disk free) or is this the hexadecimal encoded byte value 223? The only way to know is by looking at the kernel source code and knowing that the "comm" field in a specific audit record is generated by calling audit_log_untrustedstring(). What if it doesn't call that in a different kernel version? What if a new field is added in a new kernel version, how will the parser know which function the kernel used to generate the string? What if in one kernel version the string was output with audit_log_untrustedstring() but in another kernel version it wasn't?
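
The ambiguity can be shown in a few lines. This is only a sketch in Python, and possible_readings is a hypothetical helper, not anything auparse provides. It lists every reading of a raw value that a parser cannot rule out from the text alone.

```python
def possible_readings(raw: str) -> dict:
    """Return each reading of a raw field value that the text alone permits."""
    # The literal reading is always possible.
    readings = {"literal": raw}
    try:
        # The hex reading is possible whenever the text is valid hexadecimal.
        readings["hex_decoded"] = bytes.fromhex(raw)
    except ValueError:
        # Not valid hex, so only the literal reading remains.
        pass
    return readings


# "df" is a valid command name and also valid hex for the single byte 223.
ambiguous = possible_readings("df")

# "bash" contains non-hex characters, so it has only one reading.
unambiguous = possible_readings("bash")
```

Any value made up only of an even number of hex digits lands in the ambiguous case, and nothing in the record says which reading the kernel intended.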


--
John Dennis <[EMAIL PROTECTED]>

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit
