I was envisioning either unconditionally recording UID/GID, or perhaps making this configurable at compile time. I hadn't considered runtime configuration (distinct from registering a changelog consumer). Is this necessary?

Our site is probably not interested in recording NIDs. Although it would make sense to include them in an audit trail, I'm concerned about the possible overhead. Perhaps recording of NIDs could be configurable? I would prefer to consider this later, if ever.

I had originally hoped to avoid increasing the number of records generated by a given MPI job, by adding the UID and GID as extensions to a changelog record type that was already being generated. However, in my testing, some actions which should be captured in even a minimal audit trail, such as a sequence of open/read/write/close system calls, do not generate any changelog records.

As a result, it's my understanding that generation of additional changelog record(s) is unavoidable. Is this an accurate assessment?
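
To make the extension idea concrete, here is a rough, self-contained sketch of how a flag-gated UID/GID extension could be located inside a record. Everything named demo_* or DEMO_* is hypothetical and is not the real 'struct changelog_rec' layout from the Lustre headers. The only thing it borrows from the discussion is that optional extensions follow a fixed header and are signalled by flag bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits: each one means "this extension is present". */
enum {
    DEMO_CLF_RENAME = 1u << 0,
    DEMO_CLF_JOBID  = 1u << 1,
    DEMO_CLF_UIDGID = 1u << 2   /* the proposed new extension */
};

/* Hypothetical fixed header; the real record has more fields. */
struct demo_rec_hdr {
    uint32_t flags;
    uint32_t type;
    uint64_t index;
};

/* Hypothetical extensions, stored back to back after the header. */
struct demo_ext_rename { uint64_t src_fid[2]; };
struct demo_ext_jobid  { char jobid[32]; };
struct demo_ext_uidgid { uint32_t uid; uint32_t gid; };

/* Byte offset of the UID/GID extension: skip the header, then skip
 * every earlier extension whose flag is set. */
static size_t demo_uidgid_offset(uint32_t flags)
{
    size_t off = sizeof(struct demo_rec_hdr);

    if (flags & DEMO_CLF_RENAME)
        off += sizeof(struct demo_ext_rename);
    if (flags & DEMO_CLF_JOBID)
        off += sizeof(struct demo_ext_jobid);
    return off;
}

/* Copy the UID/GID extension into *out.
 * Returns 0 if the record carries it, -1 if the flag is not set. */
static int demo_get_uidgid(const unsigned char *rec,
                           struct demo_ext_uidgid *out)
{
    struct demo_rec_hdr hdr;

    memcpy(&hdr, rec, sizeof(hdr));
    if (!(hdr.flags & DEMO_CLF_UIDGID))
        return -1;
    memcpy(out, rec + demo_uidgid_offset(hdr.flags), sizeof(*out));
    return 0;
}
```

Because the new extension sits after the existing ones in this sketch, a consumer that only knows about the rename and job ID flags computes the same offsets as before, which is the compatibility property I would hope to preserve.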

Regards,

Matt


On 27/06/17 21:35, Dilger, Andreas wrote:
> On Jun 27, 2017, at 01:18, Matthew Sanderson <matthew.sander...@anu.edu.au> wrote:
>> Hi all,
>>
>> Change logs would form a more complete audit trail if they contained a user ID
>> (and possibly also a primary group ID, maybe even all of the user's
>> supplementary group IDs).
>>
>> Is there a particular reason why this information isn't currently stored in
>> changelog records?
>>
>> After some investigation with my colleague (cc'd), it looks like this would be
>> a comparatively easy change to make. The information is already sent over the
>> wire to the MDS; it's just not persisted in the changelog.
>>
>> The additional fields could be added to the userspace 'struct changelog_rec' as
>> an additional extension, similar to the way renames and job IDs are stored. As
>> far as I can tell, this wouldn't break compatibility with existing applications
>> that consume changelogs.
>
> This definitely sounds interesting.  Originally, the ChangeLog was developed
> for tracking resync of changes to the filesystem, and the ownership of the
> files can be found by looking up the inode by FID.  Definitely there has been
> some interest in having auditing for Lustre.
>
> Your analysis of the updated ChangeLog format is correct - it was implemented
> to allow addition of new fields, and I'd definitely be in support of your
> proposal to record the process UID/GID accessing the files, if auditing was
> enabled.  Would the client NID also need to be recorded?
>
> I was thinking that enabling auditing for Lustre would potentially be too much
> overhead, but if this was limited to a single record for each UID/GID opening
> each file it would likely be fairly reasonable since only a single record would
> be needed for even a large MPI job.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel Corporation

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
