Dan,

I have to +1 all of your well-thought-out comments.  As a potential
consumer of this functionality for the Immutable Service Container
project, I consider answers to these questions critical.

I am also interested in whether additional fields can be added to
the output, similar to a "-o field1,field2" scenario.  It would be
nice to have data such as file type and modification time (where
applicable).

Also, will this functionality be able to tell how files were modified?
Things like changes in file ownership, group membership, permissions
and ACLs, size, times, etc.?  Even if this processing is not directly
implemented as part of the zfs diff command, perhaps the fields could
be made available (per -o comment above) to be consumed by layered
tools?
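
To make the layered-tool idea concrete, here is a rough sketch of what
a consumer might look like.  It assumes a purely hypothetical output
format: one record per line, tab-separated, with fields in the order
requested via "-o".  Neither the -o option nor the field names below
exist in the proposal; they are placeholders for illustration only.

```python
# Hypothetical field list, as if requested via "-o change,type,mtime,path".
# None of these names come from the actual zfs diff proposal.
FIELDS = ["change", "type", "mtime", "path"]


def parse_diff_records(text, fields=FIELDS):
    """Parse newline-terminated, tab-separated records into dicts."""
    records = []
    for line in text.split("\n"):
        if not line:
            continue
        # Split at most len(fields) - 1 times so that tabs inside the
        # trailing path field are preserved rather than misparsed.
        parts = line.split("\t", len(fields) - 1)
        if len(parts) != len(fields):
            raise ValueError(f"malformed record: {line!r}")
        records.append(dict(zip(fields, parts)))
    return records


def paths_by_change(records):
    """Group paths by change code (for example '+', '-', 'M')."""
    grouped = {}
    for rec in records:
        grouped.setdefault(rec["change"], []).append(rec["path"])
    return grouped


# Made-up sample data in the assumed format.
sample = (
    "M\tF\t1269900000\t/myfiles/a.txt\n"
    "+\tD\t1269900100\t/myfiles/newdir\n"
)
records = parse_diff_records(sample)
print(paths_by_change(records))
```

Note that a line-oriented format like this still breaks on filenames
containing newlines, which is the concern Matt raised; a real
machine-parseable mode would need some escaping or a different record
terminator.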

g


On 3/30/10 12:54 AM, Dan Price wrote:
> On Mon 29 Mar 2010 at 11:14AM, Bart Smaalders wrote:
>> On 03/29/10 11:01, Matthew Ahrens wrote:
>>
>>> How do commands like ls and find handle printing of filenames with
>>> arbitrary characters (newlines and such)?
>>
>> In general, badly.
>
> Tim,
>
> My concern, which others have hinted at, is that there are a legion of
> people who are going to want to consume this information and there is
> great value in making said information be machine parseable.  Automated
> build systems, tripwires, fancy backup/recovery tools, et cetera.
>
> In summary, the current output seems mostly OK if it's for humans, but
> the case is ambiguous about who the expected consumer is.  It would
> be a tragedy if there wasn't a machine consumable way to get at this
> information.
>
> I also have questions about how intelligent a consuming piece of
> software must be in order to make sense of this information.  Has anyone
> written a proof of concept tool using this?  For example, if a directory
> /foo/a is renamed to /foo/b, then an analyzer would need to stat /foo/b
> in order to discover that /foo/b is a directory, then traverse into it as
> needed.  It would be a shame if everyone who wanted to consume this had
> to write the same thousand lines of code (I'm happy to be convinced that
> this isn't the case).
>
> Some specific questions...
>
> 1) In what order are the changes printed?  If I saw:
>
>       +       /myfiles/rename_dir
>       R       /myfiles/rename_dir ->  /myfiles/rename_dir
>
> My analyzer would need to be smart enough to realize that the second
> must have happened before the first, and that both paths need
> evaluation.  Right?
>
> 2) The meaning of "file/directory" (Don's concern aside) seems ambiguous in
> the proposal.  Are we tracking the filesystem *namespace* entry?  Or the
> actual object?  I found that not being sure of this made the proposal
> hard to evaluate.  Simple thought experiment which confused me:
>
>       snapshot@1
>       rm a/b
>       rm a/c
>       rmdir a
>       echo "foo">  a
>       snapshot@2
>
> Does that yield this?                 Or this?
>
>       -       a/b           |         -       a/b
>       -       a/c           |         -       a/c
>       -       a             |         M       a
>       +       a             |
>
> 3) Output is shown with leading slashes.  Is output shown relative to the
> mount point?  Or something else?  (If the former, what if between @a and
> @b the mountpoint changed?)
>
> 4) I would also vote for a mode which simply outputs a list of
> pathnames to investigate for differences.  This would enable:
>
>     zfs diff -someflag a@1 a@2 | xargs do_some_analysis_on_these
>
>
> Thanks for tackling this,
>
>          -dp
>
