Jeff King <p...@peff.net> writes:

> This is weirdly specific. Can we accomplish the same thing with existing
> tools?
>
> E.g., could:
>
>   git cat-file --batch-all-objects --batch-check='%(objectname)' |
>   shuffle |
>   head -n 100
>
> do the same thing?
>
> I know that "shuffle" isn't available everywhere, but I'd much rather
> see us fill in portability gaps in a general way, rather than
> introducing one-shot C code that needs to be maintained (and you
> wouldn't _think_ that t/helper programs need much maintenance, but try
> perusing "git log t/helper" output; they have to adapt to the same
> tree-wide changes as the rest of the code).

I thought about this a bit more, and came to the conclusion that
"sort -R" and "shuf" are the wrong tools to use.  We would want to
measure with something close to a real-world workload.  For example,
letting

        git rev-list --all --objects

produce the list of objects in traversal order (i.e. very similar
to the order in which "git log -p" needs to access the objects) and
taking as many objects from the front of that list as your test
needs would give you such a list.
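
As a rough sketch only (the helper name "sample_objects" is made up
for illustration, and the "cut" step assumes you want bare object
names without the path that "--objects" appends for trees and
blobs), that could look something like:

```shell
# Hypothetical helper, not an existing test-lib function:
# print the first N object names in traversal order.
sample_objects () {
	git rev-list --all --objects |   # commits, trees and blobs as traversed
	head -n "$1" |                   # keep only the first N entries
	cut -d' ' -f1                    # drop the path that follows the object name
}
```

Calling "sample_objects 100" inside a repository would then feed the
first 100 objects of the traversal to whatever the test measures.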
