This is a re-casting of my previous filter-objects command but without
any of the filtering so it is now just "list-all-objects".

I have retained the "--verbose" option, which outputs the same format as
the default "cat-file --batch-check", because it provides a useful
performance gain over filtering through "cat-file" when this basic
information is all that is needed.

The motivating use case is to enable a script to quickly scan a large
number of repositories for any large objects.
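
As a rough sketch of such a script, here is a version built only on
commands that exist today (the rev-list pipeline timed below). The
function name scan_large_objects and its arguments are hypothetical,
made up for illustration; with this patch the rev-list and cut stages
could be replaced by list-all-objects.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): report objects of at least
# MIN_SIZE bytes in each repository given on the command line.
# Usage: scan_large_objects MIN_SIZE REPO...
scan_large_objects () {
	min_size=$1
	shift
	for repo in "$@"
	do
		# List every reachable object, keep only the object ID,
		# then ask cat-file for "<id> <type> <size>" lines.
		git -C "$repo" rev-list --all --objects |
		cut -d" " -f1 |
		git -C "$repo" cat-file --batch-check |
		awk -v min="$min_size" -v repo="$repo" \
			'$3 >= min { print repo ": " $1 " " $3 }'
	done
}
```

Calling "scan_large_objects 512000 repo1 repo2" would print one line
per large object, prefixed with the repository it was found in.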

I timed a few different commands on a completely packed clone of the
Linux kernel.

        $ time git rev-list --all --objects |
                cut -d" " -f1 |
                git cat-file --batch-check |
                awk '{if ($3 >= 512000) { print $1 }}' |
                wc -l
        958

        real    0m30.823s
        user    0m41.904s
        sys     0m7.728s

list-all-objects gives a significant improvement:

        $ time git list-all-objects |
                git cat-file --batch-check |
                awk '{if ($3 >= 512000) { print $1 }}' |
                wc -l
        958

        real    0m9.585s
        user    0m10.820s
        sys     0m4.960s

Skipping the cat-file filter gives a smaller but still significant
improvement:

        $ time git list-all-objects -v |
                awk '{if ($3 >= 512000) { print $1 }}' |
                wc -l
        958

        real    0m5.637s
        user    0m6.652s
        sys     0m0.156s

The old filter-objects could do the size filter a little bit faster, but
not by much:

        $ time git filter-objects --min-size=500k |
                wc -l
        958

        real    0m4.564s
        user    0m4.496s
        sys     0m0.064s