On Thu, Mar 22, 2018 at 10:46:09PM -0400, Jeff King wrote:
> > which begs the question, how much slower would it be if we
> > replaced the radix-sort with an in-place sort (e.g. heapsort).
> >
> > I hacked up the patch below, just for fun. I don't have any
> > large repos (or enough disk space) to do any meaningful perf
> > tests, but I did at least compile it and it passes the test-suite.
> > (That is no guarantee that I haven't introduced bugs, of course!)
>
> It might have been easier to just revert 8b8dfd5132 (pack-revindex:
> radix-sort the revindex, 2013-07-11). It even includes some performance
> numbers. :)
>
> In short, no, I don't think we want to go back to a comparison-sort. The
> radix sort back then was around 4 times faster for linux.git. And that
> was when there were half as many objects in the repository, so the radix
> sort should continue to improve as the repo size grows.
I was curious whether my hand-waving there was true. It turns out that
it is: the radix sort has stayed about the same speed but the comparison
sort has gotten even slower. Here are best-of-five timings for "git
cat-file --batch-check='%(objectsize:disk)'", which does very little
besides generate the rev-index:
[current master, using radix sort]
real 0m0.104s
user 0m0.088s
sys 0m0.016s
[reverting 8b8dfd5132, going back to qsort]
real 0m1.193s
user 0m1.176s
sys 0m0.016s
So it's now a factor of 11. Yikes.
That number does match some napkin math. The radix sort uses four 16-bit
buckets, but can quit when after two rounds (because none of the offsets
is beyond 2^32). So it's essentially O(2n). Whereas the comparison sort
is O(n log n), and with n around 6M, that puts log(n) right around 22.
It's possible that some other comparison-based sort might be a little
more efficient than qsort, but I don't think you'll be able to beat the
algorithmic speedup.
The revert of 8b8dfd5132 is below for reference (it needed a few
conflict tweaks).
-Peff
---
diff --git a/pack-revindex.c b/pack-revindex.c
index ff5f62c033..c20aa9541b 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -15,102 +15,11 @@
* get the object sha1 from the main index.
*/
-/*
- * This is a least-significant-digit radix sort.
- *
- * It sorts each of the "n" items in "entries" by its offset field. The "max"
- * parameter must be at least as large as the largest offset in the array,
- * and lets us quit the sort early.
- */
-static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t
max)
+static int cmp_offset(const void *a_, const void *b_)
{
- /*
- * We use a "digit" size of 16 bits. That keeps our memory
- * usage reasonable, and we can generally (for a 4G or smaller
- * packfile) quit after two rounds of radix-sorting.
- */
-#define DIGIT_SIZE (16)
-#define BUCKETS (1 << DIGIT_SIZE)
- /*
- * We want to know the bucket that a[i] will go into when we are using
- * the digit that is N bits from the (least significant) end.
- */
-#define BUCKET_FOR(a, i, bits) (((a)[(i)].offset >> (bits)) & (BUCKETS-1))
-
- /*
- * We need O(n) temporary storage. Rather than do an extra copy of the
- * partial results into "entries", we sort back and forth between the
- * real array and temporary storage. In each iteration of the loop, we
- * keep track of them with alias pointers, always sorting from "from"
- * to "to".
- */
- struct revindex_entry *tmp, *from, *to;
- int bits;
- unsigned *pos;
-
- ALLOC_ARRAY(pos, BUCKETS);
- ALLOC_ARRAY(tmp, n);
- from = entries;
- to = tmp;
-
- /*
- * If (max >> bits) is zero, then we know that the radix digit we are
- * on (and any higher) will be zero for all entries, and our loop will
- * be a no-op, as everybody lands in the same zero-th bucket.
- */
- for (bits = 0; max >> bits; bits += DIGIT_SIZE) {
- unsigned i;
-
- memset(pos, 0, BUCKETS * sizeof(*pos));
-
- /*
- * We want pos[i] to store the index of the last element that
- * will go in bucket "i" (actually one past the last element).
- * To do this, we first count the items that will go in each
- * bucket, which gives us a relative offset from the last
- * bucket. We can then cumulatively add the index from the
- * previous bucket to get the true index.
- */
- for (i = 0; i < n; i++)
- pos[BUCKET_FOR(from, i, bits)]++;
- for (i = 1; i < BUCKETS; i++)
- pos[i] += pos[i-1];
-
- /*
- * Now we can drop the elements into their correct buckets (in
- * our temporary array). We iterate the pos counter backwards
- * to avoid using an extra index to count up. And since we are
- * going backwards there, we must also go backwards through the
- * array itself, to keep the sort stable.
- *
- * Note that we use an unsigned iterator to make sure we can
- * handle 2^32-1 objects, even on a 32-bit system. But this
- * means we cannot use the more obvious "i >= 0" loop condition
- * for counting backwards, and must instead check for
- * wrap-around with UINT_MAX.
- */
- for (i = n - 1; i != UINT_MAX; i--)
- to[--pos[BUCKET_FOR(from, i, bits)]] = from[i];
-
- /*
- * Now "to" contains the most sorted list, so we swap "from" and
- * "to" for the next iteration.
- */
- SWAP(from, to);
- }
-
- /*
- * If we ended with our data in the original array, great. If not,
- * we have to move it back from the temporary storage.
- */
- if (from != entries)
- COPY_ARRAY(entries, tmp, n);
- free(tmp);
- free(pos);
-
-#undef BUCKET_FOR
-#undef BUCKETS
-#undef DIGIT_SIZE
+ const struct revindex_entry *a = a_;
+ const struct revindex_entry *b = b_;
+ return (a->offset < b->offset) ? -1 : (a->offset > b->offset) ? 1 : 0;
}
/*
@@ -152,7 +61,7 @@ static void create_pack_revindex(struct packed_git *p)
*/
p->revindex[num_ent].offset = p->pack_size - 20;
p->revindex[num_ent].nr = -1;
- sort_revindex(p->revindex, num_ent, p->pack_size);
+ qsort(p->revindex, num_ent, sizeof(*p->revindex), cmp_offset);
}
void load_pack_revindex(struct packed_git *p)