Re: Is realloc() between bucket sizes worthwhile with jemalloc?

2018-04-10 Thread Simon Sapin

On 09/04/18 13:58, Henri Sivonen wrote:

> Specifically, is it actually useful that nsStringBuffer
> uses realloc() as opposed to malloc(), memcpy() with actually
> semantically filled amount and free()?
>
> Upon superficial code reading, it seems to me that currently changing
> the capacity of an nsA[C]String might uselessly use realloc to copy
> data that's not semantically live data from the string's point of view
> and wouldn't really need to be preserved. Have I actually discovered
> useless copying or am I misunderstanding?


This came up while discussing allocator APIs for Rust. (The outcome of 
those discussions is https://github.com/rust-lang/rust/pull/49669 and 
https://github.com/rust-lang/rust/issues/49668.)


In our new APIs, we could have an additional parameter to realloc that 
says how many bytes need to be copied. The conclusion was to not add 
that because:


* Not even the most exotic parts of jemalloc’s non-standard API have 
something like this, as far as I can tell.


* For vectors, the two common realloc cases seem to be pushing one item 
while already at capacity, and shrinking (perhaps after removing some 
items) to make the capacity fit the length exactly. In both cases, 
min(old size, new size) already matches the amount of meaningful bytes.
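
The two vector cases above can be checked with a little arithmetic. The following is only a sketch: the function names are made up for illustration and belong to neither jemalloc nor the Rust allocator API. It computes how many bytes a plain realloc() would have to preserve and shows that, in both cases, this equals len * elem_size.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes a plain realloc() has to preserve: min(old size, new size). */
static size_t realloc_preserved_bytes(size_t old_size, size_t new_size)
{
    return old_size < new_size ? old_size : new_size;
}

/* Case 1: push onto a full vector (len == old capacity), growing to new_cap.
 * Hypothetical helper, named for this example only. */
static size_t preserved_on_push_at_capacity(size_t len, size_t new_cap,
                                            size_t elem_size)
{
    assert(new_cap > len);
    return realloc_preserved_bytes(len * elem_size, new_cap * elem_size);
}

/* Case 2: shrink the capacity from old_cap down to exactly len.
 * Hypothetical helper, named for this example only. */
static size_t preserved_on_shrink_to_fit(size_t len, size_t old_cap,
                                         size_t elem_size)
{
    assert(old_cap >= len);
    return realloc_preserved_bytes(old_cap * elem_size, len * elem_size);
}
```

A buffer that is resized while only partly full (length smaller than both the old and the new capacity) is the case where min(old size, new size) overstates the meaningful bytes.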



However these APIs are not stable yet and we can still revisit this if 
there’s new information or arguments.


--
Simon Sapin
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Is realloc() between bucket sizes worthwhile with jemalloc?

2018-04-10 Thread Henri Sivonen
On Mon, Apr 9, 2018 at 10:30 PM, Eric Rahm wrote:
>> Upon superficial code reading, it seems to me that currently changing
>> the capacity of an nsA[C]String might uselessly use realloc to copy
>> data that's not semantically live data from the string's point of view
>> and wouldn't really need to be preserved. Have I actually discovered
>> useless copying or am I misunderstanding?
>
>
> In this case I think you're right. In the string code we use a doubling
> strategy up to 8MiB so they'll always be in a new bucket/chunk. After 8MiB
> we grow by 1.125 [2], but always round up to the nearest MiB. Our
> HugeRealloc logic always makes a new allocation if the difference is greater
> than or equal to 1MiB [3] so that's always going to get hit. I should note
> that on OSX we use some sort of 'pages_copy' when the realloc is large
> enough, this is probably more efficient than memcpy.

Thanks. Being able to avoid useless copying for most strings probably
outweighs the loss of the pages_copy optimization for huge strings on
Mac.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Re: Is realloc() between bucket sizes worthwhile with jemalloc?

2018-04-09 Thread Eric Rahm
On Mon, Apr 9, 2018 at 4:58 AM, Henri Sivonen wrote:

> My understanding is that under some "huge" size, jemalloc returns
> allocations from particularly-sized buckets.
>

The mozjemalloc source has a decent ascii-art table [1].


> This makes me expect that realloc() between bucket sizes is going to
> always copy the data instead of just adjusting allocated metadata,
> because to do otherwise would mess up the bucketing.
>

Sure, but not all reallocs are between bucket sizes. You can realloc from
1KB to 1.99KB and end up in the same bucket.
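
To make the "same bucket" point concrete, here is a toy model. It assumes power-of-two size classes with a 16-byte minimum, which is NOT mozjemalloc's real layout (the table in [1] is authoritative), and both function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy size-class rule (an assumption for illustration only): round the
 * request up to the next power of two, with a 16-byte minimum. */
static size_t toy_size_class(size_t request)
{
    size_t cls = 16;
    /* The second condition stops before cls * 2 could overflow. */
    while (cls < request && cls <= SIZE_MAX / 2) {
        cls *= 2;
    }
    return cls;
}

/* Nonzero when both requests land in the same toy bucket, i.e. when a
 * realloc() between them would not need to move the data in this model. */
static int toy_same_bucket(size_t old_request, size_t new_request)
{
    assert(old_request > 0 && new_request > 0);
    return toy_size_class(old_request) == toy_size_class(new_request);
}
```

Under this toy rule, requests of 1100 and 2000 bytes share the 2048-byte class, while growing from 2000 to 4000 bytes crosses into the 4096-byte class.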


> Is this so? Specifically, is it actually useful that nsStringBuffer
> uses realloc() as opposed to malloc(), memcpy() with actually
> semantically filled amount and free()?
>
> Upon superficial code reading, it seems to me that currently changing
> the capacity of an nsA[C]String might uselessly use realloc to copy
> data that's not semantically live data from the string's point of view
> and wouldn't really need to be preserved. Have I actually discovered
> useless copying or am I misunderstanding?
>

In this case I think you're right. In the string code we use a doubling
strategy up to 8MiB so they'll always be in a new bucket/chunk. After 8MiB
we grow by 1.125 [2], but always round up to the nearest MiB. Our
HugeRealloc logic always makes a new allocation if the difference is
greater than or equal to 1MiB [3] so that's always going to get hit. I
should note that on OSX we use some sort of 'pages_copy' when the realloc
is large enough, this is probably more efficient than memcpy.
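
As a rough sketch of the growth policy just described (not the actual nsTSubstring code in [2], which also takes the requested capacity and header space into account), the next capacity could be computed as below. The names next_capacity and step_is_at_least_one_mib are invented here, and treating exactly 8 MiB as the start of the slow-growth phase is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MIB ((size_t)1 << 20)

/* Simplified model: double below 8 MiB; from 8 MiB on, grow by 1.125x and
 * round up to a whole number of MiB. */
static size_t next_capacity(size_t cur)
{
    assert(cur > 0);
    if (cur < 8 * MIB) {
        return cur * 2;                     /* doubling phase */
    }
    size_t grown = cur + cur / 8;           /* 1.125x growth */
    return (grown + MIB - 1) / MIB * MIB;   /* round up to the next MiB */
}

/* Nonzero when one growth step adds at least 1 MiB, which is the condition
 * under which the message above says HugeRealloc makes a new allocation. */
static int step_is_at_least_one_mib(size_t cur)
{
    return next_capacity(cur) - cur >= MIB;
}
```

In this model 8 MiB grows to 9 MiB and 9 MiB grows to 11 MiB, so every step in the slow-growth phase adds at least 1 MiB.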

-e

[1]
https://searchfox.org/mozilla-central/rev/7ccb618f45a1398e31a086a009f87c8fd3a790b6/memory/build/mozjemalloc.cpp#59-88
[2]
https://searchfox.org/mozilla-central/rev/7ccb618f45a1398e31a086a009f87c8fd3a790b6/xpcom/string/nsTSubstring.cpp#88-119
[3]
https://searchfox.org/mozilla-central/rev/7ccb618f45a1398e31a086a009f87c8fd3a790b6/memory/build/mozjemalloc.cpp#3811-3874



Is realloc() between bucket sizes worthwhile with jemalloc?

2018-04-09 Thread Henri Sivonen
My understanding is that under some "huge" size, jemalloc returns
allocations from particularly-sized buckets.

This makes me expect that realloc() between bucket sizes is going to
always copy the data instead of just adjusting allocated metadata,
because to do otherwise would mess up the bucketing.

Is this so? Specifically, is it actually useful that nsStringBuffer
uses realloc() as opposed to malloc(), memcpy() with actually
semantically filled amount and free()?

Upon superficial code reading, it seems to me that currently changing
the capacity of an nsA[C]String might uselessly use realloc to copy
data that's not semantically live data from the string's point of view
and wouldn't really need to be preserved. Have I actually discovered
useless copying or am I misunderstanding?
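
For concreteness, the alternative I have in mind looks roughly like the sketch below. The helper regrow_copying_live is hypothetical and is not how nsStringBuffer is written today. It copies only live_len bytes, whereas realloc() has to preserve min(old capacity, new capacity) bytes whenever it moves the block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: move a buffer to a new allocation of new_cap bytes,
 * copying only the first live_len bytes. Returns NULL on failure and, like
 * realloc(), leaves old_buf valid in that case. */
static void *regrow_copying_live(void *old_buf, size_t live_len, size_t new_cap)
{
    assert(old_buf != NULL || live_len == 0);
    if (new_cap == 0 || live_len > new_cap) {
        return NULL;    /* caller error: nothing allocated, old_buf untouched */
    }
    void *new_buf = malloc(new_cap);
    if (new_buf == NULL) {
        return NULL;    /* out of memory: old_buf is still valid */
    }
    if (live_len > 0) {
        memcpy(new_buf, old_buf, live_len);   /* only the live bytes */
    }
    free(old_buf);
    return new_buf;
}
```

The trade-off is that this version always allocates and copies, so it gives up the cases where realloc() could have resized in place.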

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/