I was envisioning either unconditionally recording UID/GID, or perhaps
making this configurable at compile time. I hadn't considered runtime
configuration (distinct from registering a changelog consumer). Is this
necessary?

Our site is probably not interested in recording NIDs. Although it would
make sense to include them in an audit trail, I'm concerned about the
possible overhead. Perhaps recording of NIDs could be configurable? I
would prefer to consider this later, if ever.

I had originally hoped to avoid increasing the number of records
generated by a given MPI job, by adding the UID and GID as extensions to
a changelog record type that was already being generated.

However, in my testing, some actions which should be captured in even a
minimal audit trail, such as a sequence of open/read/write/close system
calls, do not generate any changelog records.

As a result, it's my understanding that generation of additional
changelog record(s) is unavoidable. Is this an accurate assessment?
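
For concreteness, the access pattern in question can be reproduced with something like the C sketch below. The helper name open_write_read_close() is made up for illustration, and the path is whatever existing file on a Lustre client mount is under test. Whether any changelog record shows up afterwards will presumably depend on the Lustre version and the changelog mask in use, so treat this as a reproducer outline rather than a definitive test.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open an existing file, overwrite its start with msg, read the data
 * back, and close it. Returns the number of bytes read, or -1 on error.
 * O_CREAT is deliberately not used, so the file must already exist and
 * no create operation is involved.
 */
static ssize_t open_write_read_close(const char *path, const char *msg,
                                     char *buf, size_t buflen)
{
    assert(path != NULL && msg != NULL && buf != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    size_t len = strlen(msg);
    ssize_t n = -1;

    /* Write, rewind to the start, then read back what was written. */
    if (write(fd, msg, len) == (ssize_t)len &&
        lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, buf, buflen);

    if (close(fd) < 0)
        return -1;
    return n;
}
```

Calling this on a pre-existing file and then reading the changelog on the MDS is roughly the experiment described above.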
Regards,
Matt
On 27/06/17 21:35, Dilger, Andreas wrote:
On Jun 27, 2017, at 01:18, Matthew Sanderson <matthew.sander...@anu.edu.au> wrote:

Hi all,

Change logs would form a more complete audit trail if they contained a user ID
(and possibly also a primary group ID, maybe even all of the user's
supplementary group IDs).
Is there a particular reason why this information isn't currently stored in
changelog records?

After some investigation with my colleague (cc'd), it looks like this would be
a comparatively easy change to make. The information is already sent over the
wire to the MDS; it's just not persisted in the changelog.

The additional fields could be added to the userspace 'struct changelog_rec' as
an additional extension, similar to the way renames and job IDs are stored. As
far as I can tell, this wouldn't break compatibility with existing applications
that consume changelogs.
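
To illustrate the idea, here is a rough C sketch of a record with optional extensions gated by flag bits. Every name and field size in it (rec_base, REC_F_UIDGID, ext_uidgid and so on) is hypothetical and chosen only for the example; it is not the actual layout of 'struct changelog_rec'. It assumes each extension is announced by a flag and that extensions follow the base record in a fixed order, so a consumer that does not know a flag never looks at the extra bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits; one per optional extension. */
enum rec_flags {
    REC_F_RENAME = 1u << 0,
    REC_F_JOBID  = 1u << 1,
    REC_F_UIDGID = 1u << 2,   /* the proposed new extension */
};

/* Hypothetical fixed-size base record. */
struct rec_base {
    uint16_t flags;
    uint16_t namelen;
    uint32_t type;
    uint64_t index;
};

/* Hypothetical extensions, stored after the base in this order. */
struct ext_rename { uint64_t src_fid[2]; };
struct ext_jobid  { char jobid[32]; };
struct ext_uidgid { uint64_t uid; uint64_t gid; };

/* Byte offset of extension 'which' given the flags set on a record. */
static size_t ext_offset(uint16_t flags, enum rec_flags which)
{
    size_t off = sizeof(struct rec_base);

    if (which == REC_F_RENAME)
        return off;
    if (flags & REC_F_RENAME)
        off += sizeof(struct ext_rename);
    if (which == REC_F_JOBID)
        return off;
    if (flags & REC_F_JOBID)
        off += sizeof(struct ext_jobid);
    return off;               /* REC_F_UIDGID comes last */
}

/* Copy out the UID/GID extension; returns 0 if the record has none. */
static int rec_get_uidgid(const void *rec, struct ext_uidgid *out)
{
    struct rec_base base;

    assert(rec != NULL && out != NULL);
    memcpy(&base, rec, sizeof(base));
    if (!(base.flags & REC_F_UIDGID))
        return 0;
    memcpy(out, (const char *)rec + ext_offset(base.flags, REC_F_UIDGID),
           sizeof(*out));
    return 1;
}
```

Because the new extension is appended after the existing ones, the offsets of the rename and job ID data are unchanged in this scheme, which is the property the compatibility argument above relies on.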

This definitely sounds interesting. Originally, the ChangeLog was developed
for tracking resync of changes to the filesystem, and the ownership of the
files can be found by looking up the inode by FID. Definitely there has been
some interest in having auditing for Lustre.

Your analysis of the updated ChangeLog format is correct - it was implemented
to allow addition of new fields, and I'd definitely be in support of your
proposal to record the process UID/GID accessing the files, if auditing was
enabled. Would the client NID also need to be recorded?

I was thinking that enabling auditing for Lustre would potentially be too much
overhead, but if this was limited to a single record for each UID/GID opening
each file it would likely be fairly reasonable since only a single record would
be needed for even a large MPI job.
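
One way to picture the "single record per UID/GID per file" limit is a small set keyed on (file, UID, GID) that is consulted before a record is written. The C sketch below is only an illustration under assumptions: the names (seen_set, should_log_open), the fixed table size, and the use of a single 64-bit value to stand in for a FID are all invented here, and nothing in it reflects how the MDS actually tracks opens.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SEEN_SLOTS 1024       /* hypothetical capacity; power of two */

/* Hypothetical key: which file was opened, and by whom. */
struct open_key {
    uint64_t fid;
    uint32_t uid;
    uint32_t gid;
};

/* Open-addressing hash set; must be zero-initialised before use. */
struct seen_set {
    struct open_key keys[SEEN_SLOTS];
    bool used[SEEN_SLOTS];
};

static uint64_t key_hash(const struct open_key *k)
{
    uint64_t h = k->fid * 0x9E3779B97F4A7C15ull;

    h ^= ((uint64_t)k->uid << 32) | k->gid;
    h *= 0x9E3779B97F4A7C15ull;
    return h ^ (h >> 29);
}

/*
 * Returns true the first time a (fid, uid, gid) triple is seen, meaning
 * the caller should emit an audit record, and false on repeats.
 */
static bool should_log_open(struct seen_set *s, uint64_t fid,
                            uint32_t uid, uint32_t gid)
{
    struct open_key k = { fid, uid, gid };
    size_t i = key_hash(&k) & (SEEN_SLOTS - 1);

    for (size_t probes = 0; probes < SEEN_SLOTS; probes++) {
        if (!s->used[i]) {
            s->keys[i] = k;
            s->used[i] = true;
            return true;
        }
        if (s->keys[i].fid == fid && s->keys[i].uid == uid &&
            s->keys[i].gid == gid)
            return false;
        i = (i + 1) & (SEEN_SLOTS - 1);   /* linear probing */
    }
    return true;   /* table full: fall back to logging every open */
}
```

With this shape, every rank of an MPI job running as the same user and opening the same file maps to one key, so only the first open would produce a record.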
Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org