Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-23 Thread Jeff King
On Fri, Mar 23, 2018 at 04:01:50PM +0000, Ramsay Jones wrote:

> Not that it matters, but I assume this was something like:
> 
>   $ time (echo HEAD | git cat-file --batch-check="%(objectsize:disk)")
> 
> ... and I suspect it was on the linux.git repo, yes?

Yes to both.

> If I do this on my biggest repo (ffmpeg), I get:
> 
>   $ cd ../ffmpeg/
> 
>   $ time (echo HEAD | git cat-file --batch-check="%(objectsize:disk)")
>   227
> 
>   real  0m0.037s
>   user  0m0.020s
>   sys   0m0.004s
> 
>   $ time (echo HEAD | ../git/git-cat-file --batch-check="%(objectsize:disk)")
>   227
> 
>   real  0m0.146s
>   user  0m0.112s
>   sys   0m0.012s
> 
>   $ 
> 
> Where I'm using a version with my patch applied, rather than
> reverting commit 8b8dfd5132. A 395% slowdown is bad enough, but
> not as bad as a factor of 11! I bet you have a much more modern
> system (with a fast SSD) than my old laptop. :-D

Yes, though it was all being run out of disk cache anyway. I also have a
lot of RAM. :)

The ffmpeg repository only has ~550k objects, so log2(n) is about 19,
and we'd expect radix to be something like 8-9x faster than a
comparison sort. But at some point the constants take over, too (in
each O(n) round the radix sort actually has to look at the items
twice, and of course there are negative cache effects from duplicating
the array).
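
Spelling that out (my arithmetic, treating each sort as a pure count
of passes over the array, and taking your heapsort as the comparison
sort):

\[
\text{predicted} \approx \frac{n \log_2 n}{2n}
  = \frac{\log_2(5.5 \times 10^5)}{2} \approx \frac{19}{2} \approx 9.5,
\qquad
\text{observed} \approx \frac{0.146}{0.037} \approx 3.9
\]

with the constant factors and cache effects above eating the
difference.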

So your numbers match what I'd expect.

> Thanks for looking into this, even if it was a wild
> goose chase. :)

No problem. I think it's nice to sanity check my own hand-waving once in
a while. ;)

-Peff


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-23 Thread Ramsay Jones


On 23/03/18 05:50, Jeff King wrote:
> On Thu, Mar 22, 2018 at 10:46:09PM -0400, Jeff King wrote:
[snip]
> I was curious whether my hand-waving there was true. It turns out that
> it is: the radix sort has stayed about the same speed but the comparison
> sort has gotten even slower. Here are best-of-five timings for "git
> cat-file --batch-check='%(objectsize:disk)'", which does very little
> besides generate the rev-index:

Not that it matters, but I assume this was something like:

  $ time (echo HEAD | git cat-file --batch-check="%(objectsize:disk)")

... and I suspect it was on the linux.git repo, yes?

I used to have a copy of the linux repo on disk, but I had to
delete it a while ago to recover some disk space (no matter how
big disks get, they never seem big enough)!

If I do this on my biggest repo (ffmpeg), I get:

  $ cd ../ffmpeg/

  $ time (echo HEAD | git cat-file --batch-check="%(objectsize:disk)")
  227

  real  0m0.037s
  user  0m0.020s
  sys   0m0.004s

  $ time (echo HEAD | ../git/git-cat-file --batch-check="%(objectsize:disk)")
  227

  real  0m0.146s
  user  0m0.112s
  sys   0m0.012s

  $ 

Where I'm using a version with my patch applied, rather than
reverting commit 8b8dfd5132. A 395% slowdown is bad enough, but
not as bad as a factor of 11! I bet you have a much more modern
system (with a fast SSD) than my old laptop. :-D

>   [current master, using radix sort]
>   real  0m0.104s
>   user  0m0.088s
>   sys   0m0.016s
> 
>   [reverting 8b8dfd5132, going back to qsort]
>   real  0m1.193s
>   user  0m1.176s
>   sys   0m0.016s
> 
> So it's now a factor of 11. Yikes.

Thanks for looking into this, even if it was a wild
goose chase. :)

ATB,
Ramsay Jones



Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-23 Thread Ramsay Jones


On 23/03/18 02:46, Jeff King wrote:
> On Fri, Mar 23, 2018 at 01:28:12AM +0000, Ramsay Jones wrote:
> 
>>> Of the used heap after your patches:
>>>
>>>  - ~40% of that is from packlist_alloc()
>>>  - ~17% goes to "struct object"
>>>  - ~10% for the object.c hash table to store all the "struct object"
>>>  - ~7% goes to the delta cache
>>>  - ~7% goes to the pack revindex (actually, there's a duplicate 7%
>>>there, too; I think our peak is when we're sorting the revindex
>>>and have to keep two copies in memory at once)
>>
>> which begs the question, how much slower would it be if we
>> replaced the radix-sort with an in-place sort (e.g. heapsort).
>>
>> I hacked up the patch below, just for fun. I don't have any
>> large repos (or enough disk space) to do any meaningful perf
>> tests, but I did at least compile it and it passes the test-suite.
>> (That is no guarantee that I haven't introduced bugs, of course!)
> 
> It might have been easier to just revert 8b8dfd5132 (pack-revindex:
> radix-sort the revindex, 2013-07-11). It even includes some performance
> numbers. :)

But not as much fun! :)

> In short, no, I don't think we want to go back to a comparison-sort. The
> radix sort back then was around 4 times faster for linux.git. And that
> was when there were half as many objects in the repository, so the radix
> sort should continue to improve as the repo size grows.

Yes, I expected radix-sort to be faster.

> The absolute time savings aren't huge for something as bulky as a
> repack, so it's less exciting in this context. But it's also not that
> much memory (7% of the peak here, but as I showed elsewhere, if we can
> stop holding all of the "struct object" memory once we're done with it,
> then this revindex stuff doesn't even factor into the peak).

I didn't see this post until afterwards. So, if it isn't even a
factor in the peak memory use, then it's clear this specific
space/time trade-off also isn't an issue! ;-)

Thanks.

ATB,
Ramsay Jones


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-23 Thread Duy Nguyen
On Fri, Mar 23, 2018 at 3:46 AM, Jeff King wrote:
> On Fri, Mar 23, 2018 at 01:28:12AM +0000, Ramsay Jones wrote:
>
>> > Of the used heap after your patches:
>> >
>> >  - ~40% of that is from packlist_alloc()
>> >  - ~17% goes to "struct object"
>> >  - ~10% for the object.c hash table to store all the "struct object"
>> >  - ~7% goes to the delta cache
>> >  - ~7% goes to the pack revindex (actually, there's a duplicate 7%
>> >there, too; I think our peak is when we're sorting the revindex
>> >and have to keep two copies in memory at once)
>>
>> which begs the question, how much slower would it be if we
>> replaced the radix-sort with an in-place sort (e.g. heapsort).
>>
>> I hacked up the patch below, just for fun. I don't have any
>> large repos (or enough disk space) to do any meaningful perf
>> tests, but I did at least compile it and it passes the test-suite.
>> (That is no guarantee that I haven't introduced bugs, of course!)
>
> It might have been easier to just revert 8b8dfd5132 (pack-revindex:
> radix-sort the revindex, 2013-07-11). It even includes some performance
> numbers. :)
>
> In short, no, I don't think we want to go back to a comparison-sort. The
> radix sort back then was around 4 times faster for linux.git. And that
> was when there were half as many objects in the repository, so the radix
> sort should continue to improve as the repo size grows.
>
> The absolute time savings aren't huge for something as bulky as a
> repack, so it's less exciting in this context. But it's also not that
> much memory (7% of the peak here, but as I showed elsewhere, if we can
> stop holding all of the "struct object" memory once we're done with it,
> then this revindex stuff doesn't even factor into the peak).

We could probably do something about the revindex after the rev-list
memory is freed up, too. The revindex is used in two places (I'm
ignoring packfile.c): when we look for base objects of ofs-deltas in
check_objects(), and when we write a reused object. The first case
can't be helped; it's why we need the revindex. The second case,
though, only needs the compressed object size so that we can copy the
object.

If we cache the compressed size (e.g. in a uint32_t) then we can free
the revindex once check_objects() is done. Even if we repack
everything, that is still a very good saving (32 bits vs 128 bits per
object). The freed memory could probably be used for the delta cache.
If we ever hit a compressed object size larger than 4GB the revindex
would have to be recreated, but we could detect that during the
check_objects() phase and keep the revindex alive.
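
A minimal sketch of the idea (the field and helper names here are
invented for illustration; none of this is in the series):

#include <stdint.h>
#include <sys/types.h>		/* off_t */

/*
 * Hypothetical: one 32-bit slot per object entry, filled from the
 * revindex during the check_objects() phase. If every on-disk size
 * fits in 32 bits, the revindex itself can be freed before the
 * write phase.
 */
static uint32_t *compressed_size;
static int need_revindex;	/* an entry >= 4GB keeps the revindex alive */

static void cache_compressed_size(uint32_t nr, off_t size)
{
	compressed_size[nr] = (uint32_t)size;
	if ((off_t)compressed_size[nr] != size)
		need_revindex = 1;
}

If need_revindex is still zero once check_objects() is done, the
revindex memory could then be reused, e.g. for the delta cache.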
-- 
Duy


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King
On Thu, Mar 22, 2018 at 10:46:09PM -0400, Jeff King wrote:

> > which begs the question, how much slower would it be if we
> > replaced the radix-sort with an in-place sort (e.g. heapsort).
> > 
> > I hacked up the patch below, just for fun. I don't have any
> > large repos (or enough disk space) to do any meaningful perf
> > tests, but I did at least compile it and it passes the test-suite.
> > (That is no guarantee that I haven't introduced bugs, of course!)
> 
> It might have been easier to just revert 8b8dfd5132 (pack-revindex:
> radix-sort the revindex, 2013-07-11). It even includes some performance
> numbers. :)
> 
> In short, no, I don't think we want to go back to a comparison-sort. The
> radix sort back then was around 4 times faster for linux.git. And that
> was when there were half as many objects in the repository, so the radix
> sort should continue to improve as the repo size grows.

I was curious whether my hand-waving there was true. It turns out that
it is: the radix sort has stayed about the same speed but the comparison
sort has gotten even slower. Here are best-of-five timings for "git
cat-file --batch-check='%(objectsize:disk)'", which does very little
besides generate the rev-index:

  [current master, using radix sort]
  real  0m0.104s
  user  0m0.088s
  sys   0m0.016s

  [reverting 8b8dfd5132, going back to qsort]
  real  0m1.193s
  user  0m1.176s
  sys   0m0.016s

So it's now a factor of 11. Yikes.

That number does match some napkin math. The radix sort uses four
16-bit digits, but can quit after two rounds (because none of the
offsets is beyond 2^32). So it's essentially O(2n), whereas the
comparison sort is O(n log n), and with n around 6M, that puts log(n)
right around 22.
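
For concreteness, that napkin math as a single ratio (counting only
whole passes over the array):

\[
\frac{n \log_2 n}{2n} = \frac{\log_2(6 \times 10^6)}{2}
  \approx \frac{22.5}{2} \approx 11,
\]

which lines up with the measured 1.193s / 0.104s ≈ 11.5.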

It's possible that some other comparison-based sort might be a little
more efficient than qsort, but I don't think you'll be able to beat the
algorithmic speedup.

The revert of 8b8dfd5132 is below for reference (it needed a few
conflict tweaks).

-Peff

---
diff --git a/pack-revindex.c b/pack-revindex.c
index ff5f62c033..c20aa9541b 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -15,102 +15,11 @@
  * get the object sha1 from the main index.
  */
 
-/*
- * This is a least-significant-digit radix sort.
- *
- * It sorts each of the "n" items in "entries" by its offset field. The "max"
- * parameter must be at least as large as the largest offset in the array,
- * and lets us quit the sort early.
- */
-static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max)
+static int cmp_offset(const void *a_, const void *b_)
 {
-   /*
-* We use a "digit" size of 16 bits. That keeps our memory
-* usage reasonable, and we can generally (for a 4G or smaller
-* packfile) quit after two rounds of radix-sorting.
-*/
-#define DIGIT_SIZE (16)
-#define BUCKETS (1 << DIGIT_SIZE)
-   /*
-* We want to know the bucket that a[i] will go into when we are using
-* the digit that is N bits from the (least significant) end.
-*/
-#define BUCKET_FOR(a, i, bits) (((a)[(i)].offset >> (bits)) & (BUCKETS-1))
-
-   /*
-* We need O(n) temporary storage. Rather than do an extra copy of the
-* partial results into "entries", we sort back and forth between the
-* real array and temporary storage. In each iteration of the loop, we
-* keep track of them with alias pointers, always sorting from "from"
-* to "to".
-*/
-   struct revindex_entry *tmp, *from, *to;
-   int bits;
-   unsigned *pos;
-
-   ALLOC_ARRAY(pos, BUCKETS);
-   ALLOC_ARRAY(tmp, n);
-   from = entries;
-   to = tmp;
-
-   /*
-* If (max >> bits) is zero, then we know that the radix digit we are
-* on (and any higher) will be zero for all entries, and our loop will
-* be a no-op, as everybody lands in the same zero-th bucket.
-*/
-   for (bits = 0; max >> bits; bits += DIGIT_SIZE) {
-   unsigned i;
-
-   memset(pos, 0, BUCKETS * sizeof(*pos));
-
-   /*
-* We want pos[i] to store the index of the last element that
-* will go in bucket "i" (actually one past the last element).
-* To do this, we first count the items that will go in each
-* bucket, which gives us a relative offset from the last
-* bucket. We can then cumulatively add the index from the
-* previous bucket to get the true index.
-*/
-   for (i = 0; i < n; i++)
-   pos[BUCKET_FOR(from, i, bits)]++;
-   for (i = 1; i < BUCKETS; i++)
-   pos[i] += pos[i-1];
-
-   /*
-* Now we can drop the elements into their correct buckets (in
-* our temporary array).  We iterate the pos counter backwards
-* to avoid using an extra index to count up. And since we 

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King
On Fri, Mar 23, 2018 at 01:28:12AM +0000, Ramsay Jones wrote:

> > Of the used heap after your patches:
> > 
> >  - ~40% of that is from packlist_alloc()
> >  - ~17% goes to "struct object"
> >  - ~10% for the object.c hash table to store all the "struct object"
> >  - ~7% goes to the delta cache
> >  - ~7% goes to the pack revindex (actually, there's a duplicate 7%
> >there, too; I think our peak is when we're sorting the revindex
> >and have to keep two copies in memory at once)
> 
> which begs the question, how much slower would it be if we
> replaced the radix-sort with an in-place sort (e.g. heapsort).
> 
> I hacked up the patch below, just for fun. I don't have any
> large repos (or enough disk space) to do any meaningful perf
> tests, but I did at least compile it and it passes the test-suite.
> (That is no guarantee that I haven't introduced bugs, of course!)

It might have been easier to just revert 8b8dfd5132 (pack-revindex:
radix-sort the revindex, 2013-07-11). It even includes some performance
numbers. :)

In short, no, I don't think we want to go back to a comparison-sort. The
radix sort back then was around 4 times faster for linux.git. And that
was when there were half as many objects in the repository, so the radix
sort should continue to improve as the repo size grows.

The absolute time savings aren't huge for something as bulky as a
repack, so it's less exciting in this context. But it's also not that
much memory (7% of the peak here, but as I showed elsewhere, if we can
stop holding all of the "struct object" memory once we're done with it,
then this revindex stuff doesn't even factor into the peak).

-Peff


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Ramsay Jones


On 22/03/18 09:32, Jeff King wrote:
> On Wed, Mar 21, 2018 at 04:59:19PM +0100, Duy Nguyen wrote:
> 
>>> I hate to be a wet blanket, but am I the only one who is wondering
>>> whether the tradeoff is worth it? 8% memory reduction doesn't seem
>>> mind-bogglingly good,
>>
>> AEvar measured RSS. If we count objects[] array alone, the saving is
>> 40% (136 bytes per entry down to 80). Some is probably eaten up by
>> mmap in rss.
> 
> Measuring actual heap usage with massif, I get before/after peak heaps
> of 1728 and 1346MB respectively when repacking linux.git. So that's ~22%
> savings overall.
> 
> Of the used heap after your patches:
> 
>  - ~40% of that is from packlist_alloc()
>  - ~17% goes to "struct object"
>  - ~10% for the object.c hash table to store all the "struct object"
>  - ~7% goes to the delta cache
>  - ~7% goes to the pack revindex (actually, there's a duplicate 7%
>there, too; I think our peak is when we're sorting the revindex
>and have to keep two copies in memory at once)

which begs the question, how much slower would it be if we
replaced the radix-sort with an in-place sort (e.g. heapsort).

I hacked up the patch below, just for fun. I don't have any
large repos (or enough disk space) to do any meaningful perf
tests, but I did at least compile it and it passes the test-suite.
(That is no guarantee that I haven't introduced bugs, of course!)

;-)

ATB,
Ramsay Jones

-- >8 --
Subject: [PATCH] pack-revindex: replace radix-sort with in-place heapsort

Signed-off-by: Ramsay Jones 
---
 pack-revindex.c | 60 +
 1 file changed, 60 insertions(+)

diff --git a/pack-revindex.c b/pack-revindex.c
index ff5f62c03..16f17eac1 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -15,6 +15,7 @@
  * get the object sha1 from the main index.
  */
 
+#ifdef DUMMY
 /*
  * This is a least-significant-digit radix sort.
  *
@@ -112,6 +113,65 @@ static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max)
 #undef BUCKETS
 #undef DIGIT_SIZE
 }
+#endif
+
+static inline void swap(struct revindex_entry *a, int i, int j)
+{
+   struct revindex_entry t;
+
+   t = a[i];
+   a[i] = a[j];
+   a[j] = t;
+}
+
+/*
+ * assume that elements first .. last (array index first-1 .. last-1) obey
+ * the partially ordered tree property, except possibly for the children of
+ * the first element. push down the first element until the partially
+ * ordered tree property is restored.
+ */
+static void push_down(struct revindex_entry *a, int first, int last)
+{
+   int parent = first;
+   int last_node = last / 2;
+
+   while (parent <= last_node) {
+
+   int left = 2 * parent;
+   int right = left + 1;
+   int biggest;
+
+   if (right > last) /* parent only has one child */
+   biggest = left;
+   else {
+   if (a[left-1].offset >= a[right-1].offset)
+   biggest = left;
+   else
+   biggest = right;
+
+   }
+
+   if (a[parent-1].offset >= a[biggest-1].offset)
+   break; /* partially ordered tree property, we're done */
+
+   /* push parent down */
+   swap(a, parent-1, biggest-1);
+   parent = biggest;
+   }
+}
+
+static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max)
+{
+   int i;
+
+   for (i = n/2; i > 0; i--)
+   push_down(entries, i, n);
+
+   for (i = n; i > 1; i--) {
+   swap(entries, 0, i-1);
+   push_down(entries, 1, i-1);
+   }
+}
 
 /*
  * Ordered list of offsets of objects in the pack.
-- 
2.16.0



Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Duy Nguyen
On Thu, Mar 22, 2018 at 12:52 PM, Jeff King wrote:
> On Thu, Mar 22, 2018 at 11:57:27AM +0100, Duy Nguyen wrote:
>
>> On Thu, Mar 22, 2018 at 10:32 AM, Jeff King wrote:
>> > That would still mean you could get into a broken state for serving
>> > fetches, but you could at least get out of it by running "git repack".
>>
>> I was puzzled by this "broken state" statement. But you were right! I
>> focused on the repack case and forgot about fetch/clone case. I will
>> probably just drop this patch for now. Then maybe revisit this some
>> time in the future when I find out how to deal with this nicely.
>
> Here's a sketch of the "separate array" concept I mentioned before, in
> case that helps. Not tested at all beyond compiling.

Brilliant! Sorry I couldn't read your suggestion carefully this morning.
-- 
Duy


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King
On Thu, Mar 22, 2018 at 11:57:27AM +0100, Duy Nguyen wrote:

> On Thu, Mar 22, 2018 at 10:32 AM, Jeff King wrote:
> > That would still mean you could get into a broken state for serving
> > fetches, but you could at least get out of it by running "git repack".
> 
> I was puzzled by this "broken state" statement. But you were right! I
> focused on the repack case and forgot about fetch/clone case. I will
> probably just drop this patch for now. Then maybe revisit this some
> time in the future when I find out how to deal with this nicely.

Here's a sketch of the "separate array" concept I mentioned before, in
case that helps. Not tested at all beyond compiling.

---
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4406af640f..e4e308b453 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1090,7 +1090,7 @@ static void create_object_entry(const struct object_id *oid,
else
nr_result++;
if (found_pack) {
-   oe_set_in_pack(entry, found_pack);
+   oe_set_in_pack(&to_pack, entry, found_pack);
entry->in_pack_offset = found_offset;
}
 
diff --git a/pack-objects.h b/pack-objects.h
index 9f8e450e19..b94b9232fa 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -7,6 +7,8 @@
 #define OE_Z_DELTA_BITS16
 #define OE_DELTA_SIZE_BITS 31
 
+#define OE_IN_PACK_EXTENDED ((1 << OE_IN_PACK_BITS) - 1)
+
 /*
  * State flags for depth-first search used for analyzing delta cycles.
  *
@@ -111,8 +113,13 @@ struct packing_data {
uint32_t index_size;
 
unsigned int *in_pack_pos;
-   int in_pack_count;
-   struct packed_git *in_pack[1 << OE_IN_PACK_BITS];
+
+   struct packed_git **in_pack;
+   uint32_t in_pack_count;
+   size_t in_pack_alloc;
+
+   uint32_t *in_pack_extended;
+   size_t in_pack_extended_alloc;
 };
 
 struct object_entry *packlist_alloc(struct packing_data *pdata,
@@ -174,17 +181,13 @@ static inline void oe_set_in_pack_pos(const struct packing_data *pack,
 static inline unsigned int oe_add_pack(struct packing_data *pack,
   struct packed_git *p)
 {
-   if (pack->in_pack_count >= (1 << OE_IN_PACK_BITS))
-   die(_("too many packs to handle in one go. "
- "Please add .keep files to exclude\n"
- "some pack files and keep the number "
- "of non-kept files below %d."),
-   1 << OE_IN_PACK_BITS);
if (p) {
if (p->index > 0)
die("BUG: this packed is already indexed");
p->index = pack->in_pack_count;
}
+   ALLOC_GROW(pack->in_pack, pack->in_pack_count + 1,
+  pack->in_pack_alloc);
pack->in_pack[pack->in_pack_count] = p;
return pack->in_pack_count++;
 }
@@ -192,18 +195,28 @@ static inline unsigned int oe_add_pack(struct packing_data *pack,
 static inline struct packed_git *oe_in_pack(const struct packing_data *pack,
const struct object_entry *e)
 {
-   return pack->in_pack[e->in_pack_idx];
-
+   uint32_t idx = e->in_pack_idx;
+   if (idx == OE_IN_PACK_EXTENDED)
+   idx = pack->in_pack_extended[e - pack->objects];
+   return pack->in_pack[idx];
 }
 
-static inline void oe_set_in_pack(struct object_entry *e,
+static inline void oe_set_in_pack(struct packing_data *pack,
+ struct object_entry *e,
  struct packed_git *p)
 {
if (p->index <= 0)
die("BUG: found_pack should be NULL "
"instead of having non-positive index");
-   e->in_pack_idx = p->index;
-
+   else if (p->index < OE_IN_PACK_EXTENDED)
+   e->in_pack_idx = p->index;
+   else {
+   size_t index = e - pack->objects;
+   ALLOC_GROW(pack->in_pack_extended,
+  index, pack->in_pack_extended_alloc);
+   pack->in_pack_extended[index] = p->index;
+   e->in_pack_idx = OE_IN_PACK_EXTENDED;
+   }
 }
 
 static inline struct object_entry *oe_delta(


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Duy Nguyen
On Thu, Mar 22, 2018 at 10:32 AM, Jeff King wrote:
> That would still mean you could get into a broken state for serving
> fetches, but you could at least get out of it by running "git repack".

I was puzzled by this "broken state" statement. But you were right! I
focused on the repack case and forgot about fetch/clone case. I will
probably just drop this patch for now. Then maybe revisit this some
time in the future when I find out how to deal with this nicely.
-- 
Duy


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King
On Thu, Mar 22, 2018 at 09:23:42AM +0100, Duy Nguyen wrote:

> > I wish you were right about the rarity, but it's unfortunately something
> > I have seen multiple times in the wild (and why I spent time optimizing
> > the many-packs case for pack-objects). Unfortunately I don't know how
> > often it actually comes up, because in theory running "git repack"
> > cleans it up without further ado. But after these patches, not so much.
> 
> It's good to know this case is real and I can start to fix it
> (assuming that the other concern about readability will not stop this
> series).
> 
> I think I'll try to fix this without involving repack. pack-objects
> can produce multiple packs, so if we have more than 16k pack files, we
> produce one new pack per 16k old ones.

I suspect that's going to be hard given the structure of the code.

Could we perhaps just bump to an auxiliary storage in that case?  I.e.,
allocate the final index number to mean "look in this other table", and
then have another table of uint32 indices?

That would mean we can behave as we did previously, but just use a
little more memory in the uncommon >16k case.

-Peff


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King
On Thu, Mar 22, 2018 at 05:32:12AM -0400, Jeff King wrote:

> So 27% of the total heap goes away if you switch to a separate rev-list.
> Though it's mostly just going to a different process, it does help peak
> because that process would have exited by the time we get to the
> revindex bits.
> 
> I suspect you could get the same effect by just teaching pack-objects to
> clear obj_hash and all of the allocated object structs. I think that
> should be safe to do as long as we clear _all_ of the objects, so there
> are no dangling pointers.

The patch below tries that. It's kind of hacky, but it drops my peak
heap for packing linux.git from 1336MB to 1129MB.

That's not quite as exciting as 27%, because it just moves our peak
earlier, to when we do have all of the object structs in memory (so the
savings are really just that we're not holding the revindex, etc at the
same time as the object structs).

But we also hold that peak for a lot shorter period, because we drop the
memory before we do any delta compression (which itself can be memory
hungry[1]), and don't hold onto it during the write phase (which can be
network-limited when serving a fetch). So during that write phase we're
holding only ~900MB instead of ~1250MB.

-Peff

[1] All of my timings are on noop repacks of a single pack, so there's
no actual delta compression. On average, it will use something like
"nr_threads * window * avg_blob_size". For a "normal" repo, that's
only a few megabytes. But the peak will depend on the large blobs,
so it could have some outsize cases. I don't know if it's worth
worrying about too much for this analysis.

---
Here's the patch. It's probably asking for trouble to have this kind of
clearing interface, as a surprising number of things may hold onto
pointers to objects (see the comment below about the bitmap code).

So maybe the separate process is less insane.

diff --git a/alloc.c b/alloc.c
index 12afadfacd..50d444a3b0 100644
--- a/alloc.c
+++ b/alloc.c
@@ -30,15 +30,23 @@ struct alloc_state {
int count; /* total number of nodes allocated */
int nr;/* number of nodes left in current allocation */
void *p;   /* first free node in current allocation */
+
+   /* book-keeping for clearing */
+   void *start;
+   struct alloc_state *prev;
 };
 
-static inline void *alloc_node(struct alloc_state *s, size_t node_size)
+static inline void *alloc_node(struct alloc_state **sp, size_t node_size)
 {
+   struct alloc_state *s = *sp;
void *ret;
 
-   if (!s->nr) {
+   if (!s || !s->nr) {
+   s = xmalloc(sizeof(*s));
s->nr = BLOCKING;
-   s->p = xmalloc(BLOCKING * node_size);
+   s->start = s->p = xmalloc(BLOCKING * node_size);
+   s->prev = *sp;
+   *sp = s;
}
s->nr--;
s->count++;
@@ -48,7 +56,7 @@ static inline void *alloc_node(struct alloc_state *s, size_t node_size)
return ret;
 }
 
-static struct alloc_state blob_state;
+static struct alloc_state *blob_state;
 void *alloc_blob_node(void)
 {
struct blob *b = alloc_node(&blob_state, sizeof(struct blob));
@@ -56,7 +64,7 @@ void *alloc_blob_node(void)
return b;
 }
 
-static struct alloc_state tree_state;
+static struct alloc_state *tree_state;
 void *alloc_tree_node(void)
 {
struct tree *t = alloc_node(&tree_state, sizeof(struct tree));
@@ -64,7 +72,7 @@ void *alloc_tree_node(void)
return t;
 }
 
-static struct alloc_state tag_state;
+static struct alloc_state *tag_state;
 void *alloc_tag_node(void)
 {
struct tag *t = alloc_node(&tag_state, sizeof(struct tag));
@@ -72,7 +80,7 @@ void *alloc_tag_node(void)
return t;
 }
 
-static struct alloc_state object_state;
+static struct alloc_state *object_state;
 void *alloc_object_node(void)
 {
struct object *obj = alloc_node(&object_state, sizeof(union any_object));
@@ -80,7 +88,7 @@ void *alloc_object_node(void)
return obj;
 }
 
-static struct alloc_state commit_state;
+static struct alloc_state *commit_state;
 
 unsigned int alloc_commit_index(void)
 {
@@ -103,7 +111,7 @@ static void report(const char *name, unsigned int count, size_t size)
 }
 
 #define REPORT(name, type) \
-report(#name, name##_state.count, name##_state.count * sizeof(type) >> 10)
+report(#name, name##_state->count, name##_state->count * sizeof(type) >> 10)
 
 void alloc_report(void)
 {
@@ -113,3 +121,22 @@ void alloc_report(void)
REPORT(tag, struct tag);
REPORT(object, union any_object);
 }
+
+static void alloc_clear(struct alloc_state **sp)
+{
+   while (*sp) {
+   struct alloc_state *s = *sp;
+   *sp = s->prev;
+   free(s->start);
+   free(s);
+   }
+}
+
+void alloc_clear_all(void)
+{
+   alloc_clear(&blob_state);
+   alloc_clear(&tree_state);
+   alloc_clear(&commit_state);
+   alloc_clear(&tag_state);
+   alloc_clear(&object_state);
+}

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King
On Wed, Mar 21, 2018 at 04:59:19PM +0100, Duy Nguyen wrote:

> > I hate to be a wet blanket, but am I the only one who is wondering
> > whether the tradeoff is worth it? 8% memory reduction doesn't seem
> > mind-bogglingly good,
> 
> AEvar measured RSS. If we count objects[] array alone, the saving is
> 40% (136 bytes per entry down to 80). Some is probably eaten up by
> mmap in rss.

Measuring actual heap usage with massif, I get before/after peak heaps
of 1728 and 1346MB respectively when repacking linux.git. So that's ~22%
savings overall.

Of the used heap after your patches:

 - ~40% of that is from packlist_alloc()
 - ~17% goes to "struct object"
 - ~10% for the object.c hash table to store all the "struct object"
 - ~7% goes to the delta cache
 - ~7% goes to the pack revindex (actually, there's a duplicate 7%
   there, too; I think our peak is when we're sorting the revindex
   and have to keep two copies in memory at once)
 - ~5% goes to the packlist_find() hash table
 - ~3.5% for the get_object_details() sorting list (this is only held
 for a minute, but again, our peak comes during this sort, which
 in turn loads the revindex)

So 27% of the total heap goes away if you switch to a separate rev-list.
Though it's mostly just going to a different process, it does help peak
because that process would have exited by the time we get to the
revindex bits.

I suspect you could get the same effect by just teaching pack-objects to
clear obj_hash and all of the allocated object structs. I think that
should be safe to do as long as we clear _all_ of the objects, so there
are no dangling pointers.

> About the 16k limit (and some other limits as well), I'm making these
> patches with the assumption that large scale deployment probably will
> go with custom builds anyway. Adjusting the limits back should be
> quite easy while we can still provide reasonable defaults for most
> people.

I think this 16k limit is the thing I _most_ dislike about the series.
If we could tweak that case such that we always made forward progress, I
think I'd be a lot less nervous. I responded elsewhere in the thread
(before seeing that both Junio and you seemed aware of the issues ;) ),
but I think it would be acceptable to have git-repack enforce the limit.

That would still mean you could get into a broken state for serving
fetches, but you could at least get out of it by running "git repack".

-Peff


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Duy Nguyen
On Thu, Mar 22, 2018 at 9:07 AM, Jeff King wrote:
> On Wed, Mar 21, 2018 at 05:31:14PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> > [...]Yes, having that many packs is insane, but that's going to be
>> > small consolation to somebody whose automated maintenance program now
>> > craps out at 16k packs, when it previously would have just worked to
>> > fix the situation[...]
>>
>> That's going to be a super rare (and probably nonexistent) edge case, but
>> (untested) I wonder if something like this on top would alleviate your
>> concerns, i.e. instead of dying we just take the first N packs up to our
>> limit:
>
> I wish you were right about the rarity, but it's unfortunately something
> I have seen multiple times in the wild (and why I spent time optimizing
> the many-packs case for pack-objects). Unfortunately I don't know how
> often it actually comes up, because in theory running "git repack"
> cleans it up without further ado. But after these patches, not so much.

It's good to know this case is real and I can start to fix it
(assuming that the other concern about readability will not stop this
series).

I think I'll try to fix this without involving repack. pack-objects
can produce multiple packs, so if we have more than 16k pack files, we
produce one new pack per 16k old ones.
-- 
Duy


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King
On Wed, Mar 21, 2018 at 05:31:14PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > [...]Yes, having that many packs is insane, but that's going to be
> > small consolation to somebody whose automated maintenance program now
> > craps out at 16k packs, when it previously would have just worked to
> > fix the situation[...]
> 
> That's going to be a super rare (and probably nonexistent) edge case, but
> (untested) I wonder if something like this on top would alleviate your
> concerns, i.e. instead of dying we just take the first N packs up to our
> limit:

I wish you were right about the rarity, but it's unfortunately something
I have seen multiple times in the wild (and why I spent time optimizing
the many-packs case for pack-objects). Unfortunately I don't know how
often it actually comes up, because in theory running "git repack"
cleans it up without further ado. But after these patches, not so much.

I'll admit that my experiences aren't necessarily typical of most git
users. But I wouldn't be surprised if other people hosting their own
repositories run into this, too (e.g., somebody pushing in a loop,
auto-gc disabled or clogged by something silly like the "too many loose
objects" warning).

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 4406af640f..49d467ab2a 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -1065,8 +1065,9 @@ static int want_object_in_pack(const struct object_id *oid,
> 
> want = 1;
>  done:
> -   if (want && *found_pack && !(*found_pack)->index)
> -   oe_add_pack(&to_pack, *found_pack);
> +   if (want && *found_pack && !(*found_pack)->index) {
> +   if (oe_add_pack(&to_pack, *found_pack) == -1)
> +   return 0;

Something like this does seem like a much better fallback, as we'd make
forward progress instead of aborting (and exacerbating whatever caused
the packs to stack up in the first place).

I think the patch as-is does not work, though. You say "oops, too many
packs" and so the "yes we want this object" return becomes "no, we do
not want it". And it is not included in the resulting packfile.

But what happens after that? After pack-objects finishes, we return to
"git repack", which assumes that pack-objects packed everything it was
told to. And with "-d", it then _deletes_ the old packs, knowing that
anything of value was copied to the new pack. So with this patch, we'd
corrupt the repository if this code is ever hit.

You'd need some way to report back to "git repack" that the pack was
omitted. Or probably more sensibly, you'd need "git repack" to count up
the packs and make sure that it marks anybody beyond the limit manually
as .keep (presumably using Duy's new command-line option rather than
actually writing a file).
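
A rough sketch of that repack-side guard (all names here are invented
for illustration, with "--keep-pack" standing in for Duy's proposed
option; this is not code from the thread):

#include "cache.h"
#include "string-list.h"
#include "argv-array.h"

/* the in-pack limit under discussion: (1 << OE_IN_PACK_BITS) - 1 */
#define IN_PACK_LIMIT ((1 << 14) - 1)

/*
 * Pass every pack beyond the representable limit to pack-objects as
 * kept, so it makes forward progress instead of dying; the leftover
 * packs would then be picked up by a later repack.
 */
static void limit_packs_for_pack_objects(const struct string_list *packs,
					 struct argv_array *args)
{
	size_t i;

	for (i = IN_PACK_LIMIT; i < packs->nr; i++)
		argv_array_pushf(args, "--keep-pack=%s",
				 packs->items[i].string);
}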

-Peff


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Junio C Hamano
Duy Nguyen writes:

> And we could even do something like this to make custom builds
> easier. Some more gluing is needed so you can set this from config.mak
> but you get the idea. This removes all limits set by this
> series.

Yes, we _could_, but it would mean we would have many variants of
the codepath that is pretty crucial to the integrity of the data we
keep in the repository, all of which must pretty much be bug-free.

> Readability in pack-objects.c and object_entry struct declaration
> is still a concern though.

Yup, a change like this does not change the readability; personally,
I do not think the original is _too_ bad, though.



Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 5:53 PM, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason writes:
>
>> That's going to be a super rare (and probably nonexistent) edge case, but
>> (untested) I wonder if something like this on top would alleviate your
>> concerns, i.e. instead of dying we just take the first N packs up to our
>> limit:
>>
>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>> index 4406af640f..49d467ab2a 100644
>> --- a/builtin/pack-objects.c
>> +++ b/builtin/pack-objects.c
>> @@ -1065,8 +1065,9 @@ static int want_object_in_pack(const struct object_id *oid,
>>
>> want = 1;
>>  done:
>> -   if (want && *found_pack && !(*found_pack)->index)
>> -   oe_add_pack(&to_pack, *found_pack);
>> +   if (want && *found_pack && !(*found_pack)->index) {
>> +   if (oe_add_pack(&to_pack, *found_pack) == -1)
>> +   return 0;
>>
>> return want;
>>  }
>
> It is probably a small first step in the right direction, but we'd
> need to communicate which packs we ignored with this logic to the
> calling program.  I offhand do not know how we would handle the "-d"
> part of "repack -a -d" without it.

repack will delete all the packs except ones with .keep files and ones
created by pack-objects. So this change alone is not enough. I think I
did mention that we could make this work by making repack run
pack-objects multiple times. But I did not do it because I did not
think it could really happen.
-- 
Duy


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason writes:

> That's going to be a super rare (and probably nonexistent) edge case, but
> (untested) I wonder if something like this on top would alleviate your
> concerns, i.e. instead of dying we just take the first N packs up to our
> limit:
>
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 4406af640f..49d467ab2a 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -1065,8 +1065,9 @@ static int want_object_in_pack(const struct object_id *oid,
>
> want = 1;
>  done:
> -   if (want && *found_pack && !(*found_pack)->index)
> -   oe_add_pack(&to_pack, *found_pack);
> +   if (want && *found_pack && !(*found_pack)->index) {
> +   if (oe_add_pack(&to_pack, *found_pack) == -1)
> +   return 0;
>
> return want;
>  }

It is probably a small first step in the right direction, but we'd
need to communicate which packs we ignored with this logic to the
calling program.  I offhand do not know how we would handle the "-d"
part of "repack -a -d" without it.



Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 04:59:19PM +0100, Duy Nguyen wrote:
> About the 16k limit (and some other limits as well), I'm making these
> patches with the assumption that large scale deployment probably will
> go with custom builds anyway. Adjusting the limits back should be
> quite easy while we can still provide reasonable defaults for most
> people.

And we could even do something like this to make custom builds
easier. Some more gluing is needed so you can set this from config.mak
but you get the idea. This removes all limits set by this
series. Readability in pack-objects.c and object_entry struct
declaration is still a concern though.

-- 8< --
diff --git a/pack-objects.h b/pack-objects.h
index af40211105..b6e84c9b48 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -2,10 +2,17 @@
 #define PACK_OBJECTS_H
 
 #define OE_DFS_STATE_BITS  2
+#ifdef PACK_OBJECTS_BIG_MEMORY
+#define OE_DEPTH_BITS  31
+/* OE_IN_PACK_BITS is not defined */
+#define OE_Z_DELTA_BITS32
+#define OE_DELTA_SIZE_BITS 32
+#else
 #define OE_DEPTH_BITS  12
 #define OE_IN_PACK_BITS14
 #define OE_Z_DELTA_BITS16
 #define OE_DELTA_SIZE_BITS 31
+#endif
 
 /*
  * State flags for depth-first search used for analyzing delta cycles.
@@ -82,7 +89,11 @@ struct object_entry {
 */
uint32_t delta_size_:OE_DELTA_SIZE_BITS; /* delta data size (uncompressed) */
uint32_t delta_size_valid:1;
+#ifdef PACK_OBJECTS_BIG_MEMORY
+   struct packed_git *in_pack; /* already in pack */
+#else
unsigned in_pack_idx:OE_IN_PACK_BITS;   /* already in pack */
+#endif
unsigned size_valid:1;
unsigned z_delta_size:OE_Z_DELTA_BITS;
unsigned type_valid:1;
@@ -112,7 +123,9 @@ struct packing_data {
 
unsigned int *in_pack_pos;
int in_pack_count;
+#ifndef PACK_OBJECTS_BIG_MEMORY
struct packed_git *in_pack[1 << OE_IN_PACK_BITS];
+#endif
 };
 
 struct object_entry *packlist_alloc(struct packing_data *pdata,
@@ -174,6 +187,9 @@ static inline void oe_set_in_pack_pos(const struct packing_data *pack,
 static inline unsigned int oe_add_pack(struct packing_data *pack,
   struct packed_git *p)
 {
+#ifdef PACK_OBJECTS_BIG_MEMORY
+   return 0;
+#else
if (pack->in_pack_count >= (1 << OE_IN_PACK_BITS))
die(_("too many packs to handle in one go. "
  "Please add .keep files to exclude\n"
@@ -187,22 +203,31 @@ static inline unsigned int oe_add_pack(struct packing_data *pack,
}
pack->in_pack[pack->in_pack_count] = p;
return pack->in_pack_count++;
+#endif
 }
 
 static inline struct packed_git *oe_in_pack(const struct packing_data *pack,
const struct object_entry *e)
 {
+#ifdef PACK_OBJECTS_BIG_MEMORY
+   return e->in_pack;
+#else
return pack->in_pack[e->in_pack_idx];
+#endif
 
 }
 
 static inline void oe_set_in_pack(struct object_entry *e,
  struct packed_git *p)
 {
+#ifdef PACK_OBJECTS_BIG_MEMORY
+   e->in_pack = p;
+#else
if (p->index <= 0)
die("BUG: found_pack should be NULL "
"instead of having non-positive index");
e->in_pack_idx = p->index;
+#endif
 
 }
 
-- 8< --


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Ævar Arnfjörð Bjarmason

On Wed, Mar 21 2018, Jeff King wrote:

> On Sun, Mar 18, 2018 at 03:25:15PM +0100, Nguyễn Thái Ngọc Duy wrote:
>
>> v6 fixes the one optimization that I just couldn't get right, fixes
>> two off-by-one error messages and a couple commit message update
>> (biggest change is in 11/11 to record some numbers from AEvar)
>
> [...]Yes, having that many packs is insane, but that's going to be
> small consolation to somebody whose automated maintenance program now
> craps out at 16k packs, when it previously would have just worked to
> fix the situation[...]

That's going to be a super rare (and probably nonexistent) edge case, but
(untested) I wonder if something like this on top would alleviate your
concerns, i.e. instead of dying we just take the first N packs up to our
limit:

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4406af640f..49d467ab2a 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1065,8 +1065,9 @@ static int want_object_in_pack(const struct object_id *oid,

want = 1;
 done:
-   if (want && *found_pack && !(*found_pack)->index)
-   oe_add_pack(&to_pack, *found_pack);
+   if (want && *found_pack && !(*found_pack)->index) {
+   if (oe_add_pack(&to_pack, *found_pack) == -1)
+   return 0;

return want;
 }
diff --git a/pack-objects.h b/pack-objects.h
index 9f8e450e19..50ed2028fb 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -171,15 +171,17 @@ static inline void oe_set_in_pack_pos(const struct packing_data *pack,
pack->in_pack_pos[e - pack->objects] = pos;
 }

-static inline unsigned int oe_add_pack(struct packing_data *pack,
+static inline int oe_add_pack(struct packing_data *pack,
   struct packed_git *p)
 {
-   if (pack->in_pack_count >= (1 << OE_IN_PACK_BITS))
-   die(_("too many packs to handle in one go. "
- "Please add .keep files to exclude\n"
- "some pack files and keep the number "
- "of non-kept files below %d."),
+   if (pack->in_pack_count >= (1 << OE_IN_PACK_BITS)) {
+   warning(_("Too many packs to handle in one go. "
+ "Ran into the limit of %d.\n"
+ "Limping along by pretending packs beyond that"
+ "number have *.keep!"),
1 << OE_IN_PACK_BITS);
+   return -1;
+   }
if (p) {
if (p->index > 0)
die("BUG: this packed is already indexed");


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 5:17 PM, Ævar Arnfjörð Bjarmason wrote:
>>> I think ultimately to work on low-memory machines we'll need a
>>> fundamentally different approach that scales with the objects since the
>>> last pack, and not with the complete history.
>>
>> Absolutely. Which is covered in a separate "gc --auto" series. Some
>> memory reduction here may be still nice to have though. Even on beefy
>> machine, memory can still be reused somewhere other than wasted in
>> unused bits.
>
> FWIW I've been running a combination of these two at work (also keeping
> the big pack), and they've had a sizable impact on packing our monorepo,
> on one of our dev boxes on a real-world checkout with a combo of the
> "base" pack and other packs + loose objects, as measured by
> /usr/bin/time
>
>  * Reduction in user time by 37%
>  * Reduction in system time by 84%
>  * Reduction in RSS by 61%
>  * Reduction in page faults by 58% & 94% (time(1) reports two different numbers)
>  * Reduction in the I of I/O by 58%
>  * Reduction in the O of I/O by 94%

The keep-big-pack changes very likely contribute most of this
reduction, so just to be clear these numbers can't be used as an
argument in favor of this pack-objects series (but otherwise, wow! I
guess I need to finish up the gc series soon, then start the external
rev-list work to reduce even more ;-)
-- 
Duy


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Ævar Arnfjörð Bjarmason

On Wed, Mar 21 2018, Duy Nguyen wrote:

> On Wed, Mar 21, 2018 at 9:24 AM, Jeff King wrote:
>> On Sun, Mar 18, 2018 at 03:25:15PM +0100, Nguyễn Thái Ngọc Duy wrote:
>>
>>> v6 fixes the one optimization that I just couldn't get right, fixes
>>> two off-by-one error messages and a couple commit message update
>>> (biggest change is in 11/11 to record some numbers from AEvar)
>>
>> I was traveling during some of the earlier rounds, so I finally got a
>> chance to take a look at this.
>>
>> I hate to be a wet blanket, but am I the only one who is wondering
>> whether the tradeoff is worth it? 8% memory reduction doesn't seem
>> mind-bogglingly good,
>
> AEvar measured RSS. If we count objects[] array alone, the saving is
> 40% (136 bytes per entry down to 80). Some is probably eaten up by
> mmap in rss.

Yeah, sorry about spreading that confusion.

>> and I'm concerned about two things:
>>
>>   1. The resulting code is harder to read and reason about (things like
>>  the DELTA() macros), and seems a lot more brittle (things like the
>>  new size_valid checks).
>>
>>   2. There are lots of new limits. Some of these are probably fine
>>  (e.g., the cacheable delta size), but things like the
>>  number-of-packs limit don't have very good user-facing behavior.
>>  Yes, having that many packs is insane, but that's going to be small
>>  consolation to somebody whose automated maintenance program now
>>  craps out at 16k packs, when it previously would have just worked
>>  to fix the situation.
>>
>> Saving 8% is nice, but the number of objects in linux.git grew over 12%
>> in the last year. So you've bought yourself 8 months before the problem
>> is back. Is it worth making these changes that we'll have to deal with
>> for many years to buy 8 months of memory savings?
>
> Well, with 40% it buys us a couple more months. The object growth
> affects rev-list --all too so the actual "good months" is probably not
> super far from 8 months.
>
> Is it worth saving? I don't know. I raised the readability point from
> the very first patch and if people believe it makes it much harder to
> read, then no it's not worth it.
>
> While pack-objects is simple from the functionality point of view, it
> has received lots of optimizations and to me is quite fragile.
> Readability does count in this code. Fortunately it still looks quite
> ok to me with this series applied (but then it's subjective)
>
> About the 16k limit (and some other limits as well), I'm making these
> patches with the assumption that large scale deployment probably will
> go with custom builds anyway. Adjusting the limits back should be
> quite easy while we can still provide reasonable defaults for most
> people.
>
>> I think ultimately to work on low-memory machines we'll need a
>> fundamentally different approach that scales with the objects since the
>> last pack, and not with the complete history.
>
> Absolutely. Which is covered in a separate "gc --auto" series. Some
> memory reduction here may be still nice to have though. Even on beefy
> machine, memory can still be reused somewhere other than wasted in
> unused bits.

FWIW I've been running a combination of these two at work (also keeping
the big pack), and they've had a sizable impact on packing our monorepo,
on one of our dev boxes on a real-world checkout with a combo of the
"base" pack and other packs + loose objects, as measured by
/usr/bin/time

 * Reduction in user time by 37%
 * Reduction in system time by 84%
 * Reduction in RSS by 61%
 * Reduction in page faults by 58% & 94% (time(1) reports two different numbers)
 * Reduction in the I of I/O by 58%
 * Reduction in the O of I/O by 94%


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 9:24 AM, Jeff King wrote:
> On Sun, Mar 18, 2018 at 03:25:15PM +0100, Nguyễn Thái Ngọc Duy wrote:
>
>> v6 fixes the one optimization that I just couldn't get right, fixes
>> two off-by-one error messages and a couple commit message update
>> (biggest change is in 11/11 to record some numbers from AEvar)
>
> I was traveling during some of the earlier rounds, so I finally got a
> chance to take a look at this.
>
> I hate to be a wet blanket, but am I the only one who is wondering
> whether the tradeoff is worth it? 8% memory reduction doesn't seem
> mind-bogglingly good,

AEvar measured RSS. If we count objects[] array alone, the saving is
40% (136 bytes per entry down to 80). Some is probably eaten up by
mmap in rss.

> and I'm concerned about two things:
>
>   1. The resulting code is harder to read and reason about (things like
>  the DELTA() macros), and seems a lot more brittle (things like the
>  new size_valid checks).
>
>   2. There are lots of new limits. Some of these are probably fine
>  (e.g., the cacheable delta size), but things like the
>  number-of-packs limit don't have very good user-facing behavior.
>  Yes, having that many packs is insane, but that's going to be small
>  consolation to somebody whose automated maintenance program now
>  craps out at 16k packs, when it previously would have just worked
>  to fix the situation.
>
> Saving 8% is nice, but the number of objects in linux.git grew over 12%
> in the last year. So you've bought yourself 8 months before the problem
> is back. Is it worth making these changes that we'll have to deal with
> for many years to buy 8 months of memory savings?

Well, with 40% it buys us a couple more months. The object growth
affects rev-list --all too so the actual "good months" is probably not
super far from 8 months.

Is it worth saving? I don't know. I raised the readability point from
the very first patch and if people believe it makes it much harder to
read, then no it's not worth it.

While pack-objects is simple from the functionality point of view, it
has received lots of optimizations and to me is quite fragile.
Readability does count in this code. Fortunately it still looks quite
ok to me with this series applied (but then it's subjective)

About the 16k limit (and some other limits as well), I'm making these
patches with the assumption that large scale deployment probably will
go with custom builds anyway. Adjusting the limits back should be
quite easy while we can still provide reasonable defaults for most
people.

> I think ultimately to work on low-memory machines we'll need a
> fundamentally different approach that scales with the objects since the
> last pack, and not with the complete history.

Absolutely. Which is covered in a separate "gc --auto" series. Some
memory reduction here may be still nice to have though. Even on beefy
machine, memory can still be reused somewhere other than wasted in
unused bits.
-- 
Duy


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Jeff King
On Sun, Mar 18, 2018 at 03:25:15PM +0100, Nguyễn Thái Ngọc Duy wrote:

> v6 fixes the one optimization that I just couldn't get right, fixes
> two off-by-one error messages and a couple commit message update
> (biggest change is in 11/11 to record some numbers from AEvar)

I was traveling during some of the earlier rounds, so I finally got a
chance to take a look at this.

I hate to be a wet blanket, but am I the only one who is wondering
whether the tradeoff is worth it? 8% memory reduction doesn't seem
mind-bogglingly good, and I'm concerned about two things:

  1. The resulting code is harder to read and reason about (things like
 the DELTA() macros), and seems a lot more brittle (things like the
 new size_valid checks).

  2. There are lots of new limits. Some of these are probably fine
 (e.g., the cacheable delta size), but things like the
 number-of-packs limit don't have very good user-facing behavior.
 Yes, having that many packs is insane, but that's going to be small
 consolation to somebody whose automated maintenance program now
 craps out at 16k packs, when it previously would have just worked
 to fix the situation.

Saving 8% is nice, but the number of objects in linux.git grew over 12%
in the last year. So you've bought yourself 8 months before the problem
is back. Is it worth making these changes that we'll have to deal with
for many years to buy 8 months of memory savings?
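
One way to sanity-check that figure (my arithmetic, assuming peak
memory scales linearly with the object count): the saving is gone
once growth cancels it, i.e.

\[
0.92 \times 1.12^{t} = 1
\quad\Rightarrow\quad
t = \frac{\ln(1/0.92)}{\ln 1.12} \approx 0.74 \text{ years}
  \approx 9 \text{ months}.
\]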

I think ultimately to work on low-memory machines we'll need a
fundamentally different approach that scales with the objects since the
last pack, and not with the complete history.

-Peff


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-18 Thread Ævar Arnfjörð Bjarmason

On Sun, Mar 18 2018, Nguyễn Thái Ngọc Duy jotted:

> v6 fixes the one optimization that I just couldn't get right, fixes
> two off-by-one error messages and a couple commit message update
> (biggest change is in 11/11 to record some numbers from AEvar)

Thanks, aside from the minor typo I just noted in
https://public-inbox.org/git/878tapcucc@evledraar.gmail.com/ (which
I trust Junio can fix up) this all looks good to me.


[PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-18 Thread Nguyễn Thái Ngọc Duy
v6 fixes the one optimization that I just couldn't get right, fixes
two off-by-one error messages and a couple commit message update
(biggest change is in 11/11 to record some numbers from AEvar)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index fb2aba80bf..4406af640f 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3112,10 +3112,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 
if (depth >= (1 << OE_DEPTH_BITS))
die(_("delta chain depth %d is greater than maximum limit %d"),
-   depth, (1 << OE_DEPTH_BITS));
+   depth, (1 << OE_DEPTH_BITS) - 1);
if (cache_max_small_delta_size >= (1 << OE_Z_DELTA_BITS))
die(_("pack.deltaCacheLimit is greater than maximum limit %d"),
-   1 << OE_Z_DELTA_BITS);
+   (1 << OE_Z_DELTA_BITS) - 1);
 
argv_array_push(&rp, "pack-objects");
if (thin) {
diff --git a/pack-objects.h b/pack-objects.h
index 55358da9f3..af40211105 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -275,7 +275,7 @@ static inline unsigned long oe_size(const struct object_entry *e)
}
 }
 
-static inline int contains_in_32bits(unsigned long limit)
+static inline int oe_fits_in_32bits(unsigned long limit)
 {
uint32_t truncated_limit = (uint32_t)limit;
 
@@ -287,8 +287,8 @@ static inline int oe_size_less_than(const struct object_entry *e,
 {
if (e->size_valid)
return e->size_ < limit;
-   if (contains_in_32bits(limit))
-   return 1;
+   if (oe_fits_in_32bits(limit)) /* limit < 2^32 <= size ? */
+   return 0;
return oe_size(e) < limit;
 }
 
@@ -297,8 +297,8 @@ static inline int oe_size_greater_than(const struct object_entry *e,
 {
if (e->size_valid)
return e->size_ > limit;
-   if (contains_in_32bits(limit))
-   return 0;
+   if (oe_fits_in_32bits(limit)) /* limit < 2^32 <= size ? */
+   return 1;
return oe_size(e) > limit;
 }
 
@@ -307,6 +307,14 @@ static inline void oe_set_size(struct object_entry *e,
 {
e->size_ = size;
e->size_valid = e->size_ == size;
+
+   if (!e->size_valid) {
+   unsigned long real_size;
+
+   if (sha1_object_info(e->idx.oid.hash, &real_size) < 0 ||
+   size != real_size)
+   die("BUG: 'size' is supposed to be the object size!");
+   }
 }
 
 static inline unsigned long oe_delta_size(struct packing_data *pack,

-- 
2.17.0.rc0.347.gf9cf61673a