RE: [PATCH v2 0/5] Allocate cache entries from memory pool

2018-05-03 Thread Jameson Miller


> -----Original Message-----
> From: git-ow...@vger.kernel.org  On Behalf Of
> Stefan Beller
> Sent: Thursday, May 3, 2018 4:59 PM
> To: Duy Nguyen 
> Cc: Jameson Miller ; git@vger.kernel.org;
> gits...@pobox.com; jonathanta...@google.com
> Subject: Re: [PATCH v2 0/5] Allocate cache entries from memory pool
> 
> On Thu, May 3, 2018 at 12:17 PM, Duy Nguyen  wrote:
> >
> >> To me it is also a clear yes when it comes to combining these two
> >> memory pools.
> >
> > I also did not notice that jm/mem-pool already landed in master.
> 
> Oh, thanks for telling me! Now that I look at it, I have my doubts.
> 
> The reason for my doubt is the potential quadratic behavior for new
> allocations: in mem_pool_alloc() we walk all mp_blocks to see if we can fit
> the requested allocation in one of the later blocks. So if we call
> mem_pool_alloc() a million times, we end up with O(n) mp_blocks which we'd
> have to walk in each call.

With the current design, when a new mp_block is allocated, it is
placed at the head of the linked list. This means that the most
recently allocated mp_block is the 1st block that is
searched. The *vast* majority of allocations should be fulfilled
from this 1st block. It is only when the block is full that we
search other mp_blocks in the list. If this is a concern, I think
we have a couple of low-cost options to mitigate it (maybe a flag to
control whether we search past the 1st mp_block for space, or
logic to move blocks out of the search queue when they are
full or fall below a threshold for available space).

If this is of interest, I could contribute a patch to enable one
of these behaviors?
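
For illustration, here is a rough sketch of the latter idea, retiring
blocks from the search list once their free space falls below a
threshold. The types and names are made up for the sketch; this is not
mem-pool.c's actual code.

#include <stdlib.h>
#include <stddef.h>

#define FULL_THRESHOLD 64	/* stop searching a block once < 64 bytes remain */

struct toy_block {
	struct toy_block *next;
	char *next_free;
	char *end;
};

struct toy_pool {
	struct toy_block *search;	/* blocks still worth searching */
	struct toy_block *full;		/* blocks retired from the search */
	size_t block_size;
};

void *toy_pool_alloc(struct toy_pool *pool, size_t len)
{
	struct toy_block **link = &pool->search;
	struct toy_block *b;
	void *p;

	while ((b = *link) != NULL) {
		size_t avail = (size_t)(b->end - b->next_free);

		if (avail < FULL_THRESHOLD) {
			/* Retire nearly-full blocks so later calls skip them. */
			*link = b->next;
			b->next = pool->full;
			pool->full = b;
			continue;
		}
		if (avail >= len) {
			p = b->next_free;
			b->next_free += len;
			return p;
		}
		link = &b->next;
	}

	/* No searchable block had room: push a fresh block on the front. */
	b = malloc(sizeof(*b) + pool->block_size);
	if (!b)
		return NULL;
	b->next_free = (char *)(b + 1);
	b->end = b->next_free + pool->block_size;
	b->next = pool->search;
	pool->search = b;

	p = b->next_free;	/* assumes len <= pool->block_size */
	b->next_free += len;
	return p;
}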

> 
> However, in alloc.c we do know that a slab is full as soon as we take the
> next slab. That is the beauty of knowing 'len' at construction time of the
> allocator.
> 
> So I guess I'll just re-use the mp_block and introduce another struct
> fixed_sized_mem_pool, which will not look into other mp_blocks, only the
> current one.
> 
> 
> > Have you tried measuring it (both memory usage and allocation speed)
> > against alloc.c?
> 
> No, I was about to, but then started reading the code in an attempt to replace
> alloc.c by a mempool and saw the quadratic behavior.
> 
> > Just take some big repo as an example and do count-objects -v to see
> > how many blobs/trees/commits it has, then allocate the same amount
> > with both alloc.c and mem-pool.c and measure both speed/mem.
> > I'm pretty sure you're right that mem-pool.c is a clear yes. I was
> > just being more conservative because we do (slightly) change the
> > allocator's behavior when we make the switch. But it's also very
> > likely that any performance difference will be insignificant.
> >
> > I'm asking this because if mem-pool.c is a clear winner, you can start
> > to update your series to use it now and kill alloc.c in the process.
> 
> I'll implement the fixed_sized_mem_pool and take some measurements.
> 
> >
> > PS. Is Jeff back yet?
> 
> His last email on the public list is from Apr 10th, stating that he'll be
> offline for "a few weeks"; in <20180406175349.gb32...@sigill.intra.peff.net>
> he said the vacation part is 3 weeks. So I think he is done with vacation
> and is just hiding to figure out a nice comeback. ;-)
> 
> > I'm sure Junio is listening and all, but I'm afraid he's too busy being
> > a maintainer, so Jeff's opinion in this area is really valuable. He has
> > all the fun and weird use cases to play with at GitHub.
> 
> OK, I'll cc him on these patches.
> 
> Thanks,
> Stefan


Re: [PATCH v2 0/5] Allocate cache entries from memory pool

2018-05-03 Thread Stefan Beller
On Thu, May 3, 2018 at 12:17 PM, Duy Nguyen  wrote:
>
>> To me it is also a clear yes when it comes to combining these
>> two memory pools.
>
> I also did not notice that jm/mem-pool already landed in master.

Oh, thanks for telling me! Now that I look at it, I have my doubts.

The reason for my doubt is the potential quadratic behavior for new
allocations: in mem_pool_alloc() we walk all mp_blocks to see if we can
fit the requested allocation in one of the later blocks. So if we call
mem_pool_alloc() a million times, we end up with O(n) mp_blocks which
we'd have to walk in each call.
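
To make the pattern concrete, here is a minimal toy sketch of that walk
(simplified types, not the actual mem-pool.c code):

#include <stddef.h>

struct toy_block {
	struct toy_block *next;
	char *next_free;
	char *end;
};

struct toy_pool {
	struct toy_block *head;		/* list of mp_block-like blocks */
};

void *toy_pool_alloc(struct toy_pool *pool, size_t len)
{
	struct toy_block *b;

	/* The walk in question: visit blocks until one has room. */
	for (b = pool->head; b; b = b->next) {
		if ((size_t)(b->end - b->next_free) >= len) {
			void *p = b->next_free;
			b->next_free += len;
			return p;
		}
	}
	return NULL;	/* the real code would allocate and link in a new block */
}

Once the pool has accumulated many exhausted blocks, each of those
million calls can pay for a scan over O(#blocks) nodes before it finds
space, which is where the quadratic worst case comes from.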

However, in alloc.c we do know that a slab is full as soon as we take
the next slab. That is the beauty of knowing 'len' at construction time
of the allocator.

So I guess I'll just re-use the mp_block and introduce another
struct fixed_sized_mem_pool, which will not look into other mp_blocks,
only the current one.
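
One way such a fixed_sized_mem_pool could look (a sketch under the
assumptions above, not the actual patch): because every allocation has
the same size, only the current block ever needs to be examined.

#include <stdlib.h>
#include <stddef.h>

struct fixed_block {
	struct fixed_block *prev;	/* kept only so everything can be freed later */
	size_t used;			/* objects handed out from this block */
	/* 'per_block' objects of 'obj_size' bytes follow this header */
};

struct fixed_sized_mem_pool {
	struct fixed_block *current;
	size_t obj_size;		/* the fixed 'len', known at construction */
	size_t per_block;		/* how many objects fit in one block */
};

void *fixed_pool_alloc(struct fixed_sized_mem_pool *p)
{
	struct fixed_block *b = p->current;

	if (!b || b->used == p->per_block) {
		/* Current block is full: start a new one and never look back. */
		b = malloc(sizeof(*b) + p->obj_size * p->per_block);
		if (!b)
			return NULL;
		b->prev = p->current;
		b->used = 0;
		p->current = b;
	}
	return (char *)(b + 1) + p->obj_size * b->used++;
}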


> Have you tried measuring it (both memory usage and allocation speed)
> against alloc.c?

No, I was about to, but then started reading the code in an attempt to replace
alloc.c by a mempool and saw the quadratic behavior.

> Just take some big repo as an example and do count-objects -v
> to see how many blobs/trees/commits it has, then allocate the same
> amount with both alloc.c and mem-pool.c and measure both speed/mem.
> I'm pretty sure you're right that mem-pool.c is a clear yes. I was
> just being more conservative because we do (slightly) change the
> allocator's behavior when we make the switch. But it's also very
> likely that any performance difference will be insignificant.
>
> I'm asking this because if mem-pool.c is a clear winner, you can start
> to update your series to use it now and kill alloc.c in the process.

I'll implement the fixed_sized_mem_pool and take some measurements.

>
> PS. Is Jeff back yet?

His last email on the public list is from Apr 10th, stating that he'll be
offline for "a few weeks"; in <20180406175349.gb32...@sigill.intra.peff.net>
he said the vacation part is 3 weeks. So I think he is done with vacation
and is just hiding to figure out a nice comeback. ;-)

> I'm sure Junio is listening and all, but I'm
> afraid he's too busy being a maintainer, so Jeff's opinion in this area
> is really valuable. He has all the fun and weird use cases to play
> with at GitHub.

OK, I'll cc him on these patches.

Thanks,
Stefan


Re: [PATCH v2 0/5] Allocate cache entries from memory pool

2018-05-03 Thread Duy Nguyen
On Thu, May 3, 2018 at 7:21 PM, Stefan Beller  wrote:
> On Thu, May 3, 2018 at 9:35 AM, Duy Nguyen  wrote:
>> On Mon, Apr 30, 2018 at 5:31 PM, Jameson Miller  wrote:
>>> This patch series improves the performance of loading indexes by
>>> reducing the number of malloc() calls. Loading the index from disk is
>>> partly dominated by the time in malloc(), which is called for each
>>> index entry. This patch series reduces the number of times malloc() is
>>> called as part of loading the index, and instead allocates a block of
>>> memory upfront that is large enough to hold all of the cache entries,
>>> and chunks this memory itself. This change builds on [1].
>>
>> I have only looked at the mem-pool related patches to see if
>> mem-pool.c is good enough to replace alloc.c. To me, it's a "yes"
>> after we optimize mem_pool_alloc() a bit (not that performance really
>> matters in the alloc.c case, but that may be because it's already so
>> blazingly fast that we never noticed it).
>
> alloc.c was not just about speed, but mostly about dense packing?
> 855419f764a (Add specialized object allocator, 2006-06-19)

I know. I vaguely remembered Linus made that change but did not really
look it up :) That reference should be included when/if you switch
from alloc.c to mem-pool.c.

> To me it is also a clear yes when it comes to combining these
> two memory pools.

I also did not notice that jm/mem-pool already landed in master. Have
you tried measuring it (both memory usage and allocation speed) against
alloc.c? Just take some big repo as an example and do count-objects -v
to see how many blobs/trees/commits it has, then allocate the same
amount with both alloc.c and mem-pool.c and measure both speed/mem.
I'm pretty sure you're right that mem-pool.c is a clear yes. I was
just being more conservative because we do (slightly) change the
allocator's behavior when we make the switch. But it's also very
likely that any performance difference will be insignificant.
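
For example, the shape of such a measurement could look like the
standalone sketch below. It times one malloc() per object against
carving objects out of a single upfront block as stand-ins; a real
comparison would call the alloc.c and mem-pool.c APIs instead and take
the object count from count-objects -v. It only measures speed; memory
usage would have to be read separately (e.g. from the allocator's
bookkeeping or the process's peak RSS).

#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NOBJ 1000000		/* stand-in for the count-objects -v numbers */
#define OBJ_SIZE 96		/* stand-in for a commit/tree/blob struct size */

static double now(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
	double t0, t1, t2;
	size_t i;
	char *block, *next;

	/* Per-object allocation, as with one xmalloc() per entry. */
	t0 = now();
	for (i = 0; i < NOBJ; i++) {
		char *p = malloc(OBJ_SIZE);
		if (!p)
			return 1;
		p[0] = 1;	/* touch the memory so the work is real */
	}
	t1 = now();

	/* Pool style: one big upfront block, carved up with a bump pointer. */
	block = malloc((size_t)NOBJ * OBJ_SIZE);
	if (!block)
		return 1;
	next = block;
	for (i = 0; i < NOBJ; i++) {
		next[0] = 1;
		next += OBJ_SIZE;
	}
	t2 = now();

	printf("per-object malloc: %.3fs\n", t1 - t0);
	printf("one upfront block: %.3fs\n", t2 - t1);
	return 0;
}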

I'm asking this because if mem-pool.c is a clear winner, you can start
to update your series to use it now and kill alloc.c in the process.

PS. Is Jeff back yet? I'm sure Junio is listening and all, but I'm
afraid he's too busy being a maintainer, so Jeff's opinion in this area
is really valuable. He has all the fun and weird use cases to play
with at GitHub.
-- 
Duy


Re: [PATCH v2 0/5] Allocate cache entries from memory pool

2018-05-03 Thread Stefan Beller
On Thu, May 3, 2018 at 9:35 AM, Duy Nguyen  wrote:
> On Mon, Apr 30, 2018 at 5:31 PM, Jameson Miller  wrote:
>> This patch series improves the performance of loading indexes by
>> reducing the number of malloc() calls. Loading the index from disk is
>> partly dominated by the time in malloc(), which is called for each
>> index entry. This patch series reduces the number of times malloc() is
>> called as part of loading the index, and instead allocates a block of
>> memory upfront that is large enough to hold all of the cache entries,
>> and chunks this memory itself. This change builds on [1].
>
> I have only looked at the mem-pool related patches to see if
> mem-pool.c is good enough to replace alloc.c. To me, it's a "yes"
> after we optimize mem_pool_alloc() a bit (not that performance really
> matters in the alloc.c case, but that may be because it's already so
> blazingly fast that we never noticed it).

alloc.c was not just about speed, but mostly about dense packing?
855419f764a (Add specialized object allocator, 2006-06-19)
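
For reference, the dense-packing idea is roughly the following (a
simplified sketch, not the actual alloc.c code): objects of a single
fixed type are handed out back to back from large slabs, so there is no
per-object malloc overhead and no gap between neighbouring objects.

#include <stdlib.h>
#include <stddef.h>

#define BLOCKING 1024	/* objects carved out of each slab */

struct alloc_state {
	int nr;		/* objects left in the current slab */
	void *p;	/* next object to hand out */
};

void *alloc_node(struct alloc_state *s, size_t node_size)
{
	void *ret;

	if (!s->nr) {
		s->p = malloc(BLOCKING * node_size);
		if (!s->p)
			return NULL;	/* the real code would die via xmalloc() */
		s->nr = BLOCKING;
	}
	s->nr--;
	ret = s->p;
	s->p = (char *)s->p + node_size;
	return ret;
}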

To me it is also a clear yes when it comes to combining these
two memory pools.


Re: [PATCH v2 0/5] Allocate cache entries from memory pool

2018-05-03 Thread Duy Nguyen
On Mon, Apr 30, 2018 at 5:31 PM, Jameson Miller  wrote:
> This patch series improves the performance of loading indexes by
> reducing the number of malloc() calls. Loading the index from disk is
> partly dominated by the time in malloc(), which is called for each
> index entry. This patch series reduces the number of times malloc() is
> called as part of loading the index, and instead allocates a block of
> memory upfront that is large enough to hold all of the cache entries,
> and chunks this memory itself. This change builds on [1].
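
As a rough illustration of that approach (invented names, not the
series' actual code): one allocation is made up front, sized from the
entry count in the index header, and the parser takes the next entry
from it instead of calling malloc() per entry.

#include <stdlib.h>
#include <stddef.h>

struct toy_cache_entry {
	unsigned int mode;
	char name[64];		/* real cache entries are variable-sized */
};

struct toy_entry_pool {
	struct toy_cache_entry *entries;
	size_t next;
	size_t nr;
};

/* One upfront allocation, sized for every entry the index will hold. */
int toy_pool_init(struct toy_entry_pool *pool, size_t nr_entries)
{
	pool->entries = calloc(nr_entries, sizeof(*pool->entries));
	if (!pool->entries)
		return -1;
	pool->next = 0;
	pool->nr = nr_entries;
	return 0;
}

/* Replaces the per-entry malloc() while the on-disk index is parsed. */
struct toy_cache_entry *toy_pool_next_entry(struct toy_entry_pool *pool)
{
	if (pool->next == pool->nr)
		return NULL;
	return &pool->entries[pool->next++];
}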

I have only looked at the mem-pool related patches to see if
mem-pool.c is good enough to replace alloc.c. To me, it's a "yes"
after we optimize mem_pool_alloc() a bit (not that performance really
matters in the alloc.c case, but that may be because it's already so
blazingly fast that we never noticed it).

I probably should look at the read-cache.c changes too. Maybe later.
Although after the change to use xmalloc() per entry a few years(?)
ago, it should be straightforward to use a different memory
allocator.
-- 
Duy