Re: [PATCH] improve performance of std::allocator::deallocate

2019-05-20 Thread Jonathan Wakely

On 20/05/19 19:21 +0200, François Dumont wrote:

On 5/20/19 1:14 PM, Jonathan Wakely wrote:

-  r1->deallocate(p, 2);
+  r1->deallocate(p, 2, alignof(char));
+  __builtin_printf("%d\n", (int)bytes_allocated);


Was this last line really intended to be added ?


No, and I've already removed it locally. I'm going to make another
change to that test shortly, so was going to wait and remove the line
with the other change.




Re: [PATCH] improve performance of std::allocator::deallocate

2019-05-20 Thread François Dumont

On 5/20/19 1:14 PM, Jonathan Wakely wrote:

-  r1->deallocate(p, 2);
+  r1->deallocate(p, 2, alignof(char));
+  __builtin_printf("%d\n", (int)bytes_allocated);


Was this last line really intended to be added ?



Re: [PATCH] improve performance of std::allocator::deallocate

2019-05-20 Thread Jonathan Wakely

On 20/05/19 10:44 +0100, Jonathan Wakely wrote:

On 20/05/19 09:17 +, Pádraig Brady wrote:



On 04/02/2019 07:33 PM, Padraig Brady wrote:

On 03/07/2019 03:43 AM, Jonathan Wakely wrote:

OK, that makes me feel better about it. It's presumably much easier to
upgrade to 5.2 from 5.0 or 5.1 than it would be from 4.x.


How complicated is the fix to prevent the crashes? Would it be
feasible for distros to backport that fix? I see that RHEL8 has
jemalloc 5.0.1 for example, but if the fix could be backported to that
release then it's less of a problem.

The patch set is simple enough:
https://github.com/jemalloc/jemalloc/pull/1341/commits

Thanks. That does seem reasonable for distros and other packagers to
backport, if they want to support 5.0 or 5.1 for their users.

I'm leaning towards accepting the patch for gcc-9 (and if not, we
should do it early in the gcc-10 cycle).


FYI jemalloc 5.2 is released with the fix for zero sized deallocations:
https://github.com/jemalloc/jemalloc/releases/tag/5.2.0


Friendly ping.
I can create a bug to track if you prefer.


No thanks, patches belong on these lists.

Now that we're in GCC 10 development stage 1 I'm happy to apply the
patch. I think by the time GCC 10 is released it will be reasonable to
expect people to use the fixed version of jemalloc.

I'll do the change today.


Tested powerpc64le-linux (without jemalloc) and committed to trunk.

I also had to fix a latent testcase bug that this change revealed.


commit 77cfde812ee7d83610f2b53fe63f7ad2a7a36c3e
Author: Jonathan Wakely 
Date:   Mon May 20 11:46:57 2019 +0100

std::allocator::deallocate support sized-deallocation

Pass the size to the allocator so that it may optimize deallocation.
This was seen to significantly reduce the work required in jemalloc,
with about 40% reduction in CPU cycles in the free path.

Note jemalloc >= 5.2 is required to fix a crash with 0 sizes.

2019-05-20  P??draig Brady  

* libstdc++-v3/include/ext/new_allocator.h (deallocate): Pass the size
to the deallocator with -fsized-deallocation.

diff --git a/libstdc++-v3/include/ext/new_allocator.h b/libstdc++-v3/include/ext/new_allocator.h
index e24539100f9..f1ff7da530b 100644
--- a/libstdc++-v3/include/ext/new_allocator.h
+++ b/libstdc++-v3/include/ext/new_allocator.h
@@ -116,16 +116,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // __p is not permitted to be a null pointer.
   void
-  deallocate(pointer __p, size_type)
+  deallocate(pointer __p, size_type __t)
   {
 #if __cpp_aligned_new
 	if (alignof(_Tp) > __STDCPP_DEFAULT_NEW_ALIGNMENT__)
 	  {
-	::operator delete(__p, std::align_val_t(alignof(_Tp)));
+	::operator delete(__p,
+# if __cpp_sized_deallocation
+			  __t * sizeof(_Tp),
+# endif
+			  std::align_val_t(alignof(_Tp)));
 	return;
 	  }
 #endif
-	::operator delete(__p);
+	::operator delete(__p
+#if __cpp_sized_deallocation
+			  , __t * sizeof(_Tp)
+#endif
+			 );
   }
 
   size_type

commit d2327fd8f69ab126f73b5f65c46845f4cdc14cd8
Author: Jonathan Wakely 
Date:   Mon May 20 12:10:15 2019 +0100

Fix test bug with mismatched alignment in allocate/deallocate

* testsuite/experimental/memory_resource/new_delete_resource.cc: Fix
test by passing correct alignment to deallocate function.

diff --git a/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
index 5c5a0a4d7e3..41232b6fc67 100644
--- a/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
+++ b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
@@ -103,7 +103,6 @@ test02()
 
 void
 test03()
-
 {
   using std::max_align_t;
   using std::size_t;
@@ -123,7 +122,8 @@ test03()
   p = r1->allocate(2, alignof(char));
   VERIFY( bytes_allocated == 2 );
   VERIFY( aligned(p) );
-  r1->deallocate(p, 2);
+  r1->deallocate(p, 2, alignof(char));
+  __builtin_printf("%d\n", (int)bytes_allocated);
   VERIFY( bytes_allocated == 0 );
 
   p = r1->allocate(3, alignof(short));


Re: [PATCH] improve performance of std::allocator::deallocate

2019-05-20 Thread Jonathan Wakely

On 20/05/19 09:17 +, Pádraig Brady wrote:



On 04/02/2019 07:33 PM, Padraig Brady wrote:

On 03/07/2019 03:43 AM, Jonathan Wakely wrote:

OK, that makes me feel better about it. It's presumably much easier to
upgrade to 5.2 from 5.0 or 5.1 than it would be from 4.x.


How complicated is the fix to prevent the crashes? Would it be
feasible for distros to backport that fix? I see that RHEL8 has
jemalloc 5.0.1 for example, but if the fix could be backported to that
release then it's less of a problem.

The patch set is simple enough:
https://github.com/jemalloc/jemalloc/pull/1341/commits

Thanks. That does seem reasonable for distros and other packagers to
backport, if they want to support 5.0 or 5.1 for their users.

I'm leaning towards accepting the patch for gcc-9 (and if not, we
should do it early in the gcc-10 cycle).


FYI jemalloc 5.2 is released with the fix for zero sized deallocations:
https://github.com/jemalloc/jemalloc/releases/tag/5.2.0


Friendly ping.
I can create a bug to track if you prefer.


No thanks, patches belong on these lists.

Now that we're in GCC 10 development stage 1 I'm happy to apply the
patch. I think by the time GCC 10 is released it will be reasonable to
expect people to use the fixed version of jemalloc.

I'll do the change today.



Re: [PATCH] improve performance of std::allocator::deallocate

2019-05-20 Thread Pádraig Brady



On 04/02/2019 07:33 PM, Padraig Brady wrote:
> On 03/07/2019 03:43 AM, Jonathan Wakely wrote:
>> OK, that makes me feel better about it. It's presumably much easier to
>> upgrade to 5.2 from 5.0 or 5.1 than it would be from 4.x.
>>
 How complicated is the fix to prevent the crashes? Would it be
 feasible for distros to backport that fix? I see that RHEL8 has
 jemalloc 5.0.1 for example, but if the fix could be backported to that
 release then it's less of a problem.
>>> The patch set is simple enough:
>>> https://github.com/jemalloc/jemalloc/pull/1341/commits
>> Thanks. That does seem reasonable for distros and other packagers to
>> backport, if they want to support 5.0 or 5.1 for their users.
>>
>> I'm leaning towards accepting the patch for gcc-9 (and if not, we
>> should do it early in the gcc-10 cycle).
>>
> FYI jemalloc 5.2 is released with the fix for zero sized deallocations:
> https://github.com/jemalloc/jemalloc/releases/tag/5.2.0
>
Friendly ping.
I can create a bug to track if you prefer.

cheers,
Pádraig


Re: [PATCH] improve performance of std::allocator::deallocate

2019-04-02 Thread Pádraig Brady



On 03/07/2019 03:43 AM, Jonathan Wakely wrote:
> On 06/03/19 22:27 +, Pádraig Brady wrote:
>>
>>
>> On 03/06/2019 01:44 AM, Jonathan Wakely wrote:
>>> On 06/03/19 09:20 +, Pádraig Brady wrote:
 On 03/06/2019 12:50 AM, Jonathan Wakely wrote:
> On 06/03/19 02:43 +, Pádraig Brady wrote:
>>
>>
>> On 02/26/2019 04:23 PM, Padraig Brady wrote:
>>>
 Note jemalloc >= 5.1 is required to fix a bug with 0 sizes.

 How serious is the bug? What are the symptoms?

>>> I've updated the commit summary to say it's a crash.
>>> Arguably that's better than mem corruption.
>>>
 It looks like 5.1.0 is less than a year old, so older versions are
 presumably still in wide use.

 We could potentially workaround it by making
 new_allocator::allocate(0) call ::operator new(1) when
 __cpp_sized_deallocation is defined, and deallocate(p, 0) call
 ::operator delete(p, 1). Obviously I'd prefer not to do that,
 because
 the default operator new already has an equivalent check, and only
 programs linking to jemalloc require the workaround.

>>> Right the jemalloc fix was released May 2018.
>>> It would be great to avoid the extra workarounds.
>>> Given this would be released about a year after the jemalloc fix
>>> was
>>> released,
>>> and that this would only manifest upon program rebuild,
>>> I would be inclined to not add any workarounds.
>>> For reference tcmalloc and glibc malloc were seen to work fine with
>>> this.
>>
>> Actually the jemalloc issue will only be fixed with the release
>> of 5.2
>> (a few weeks away).
>> I've updated the commit message in the attached accordingly.
>
> Hmm, I'm a bit nervous about making a change that will cause crashes
> unless using an unreleased version (I know it will be released by the
> time GCC 9.1 is released, but some people might upgrade GCC without
> upgrading jemalloc).
 Yes it's not ideal. It does make it a lot less risky that one has to
 rebuild programs to get the new functionality, so existing programs
 will be unaffected. Also -fsized-deallocation is only enabled by
 default on gcc with -std >= c++14.
>>>
>>> The default is -std=gnu++14 so it's enabled unless you explicitly
>>> choose an older dialect or add -fno-sized-deallocation.
>> Good point :)
>>>
> On the other hand, zero sized allocations should be rare in practice.
 Yes they were rare in testing here

 So programs _rebuilt_ against the following would need to update
 to jemalloc 5.2:

  zero sized allocs, jemalloc<5.2, c++>=14, GCC>=9.1

 Hopefully that's a small enough set.
>>>
>>> Which versions of jemalloc replace operator delete(void*, size_t) ?
>>>
>>> Was that something new in 5.0 or did older versions already provide a
>>> replacement for the sized operator delete?
>>>
>>> If it was introduced in 5.0 then there won't be a problem for 3.x and
>>> 4.x because they'll use the default definition from libstdc++ which
>>> just calls ::operator delete(p).
>> Again very good point. The replacements were only added in 5.0:
>> https://github.com/jemalloc/jemalloc/commit/2319152d
>
> OK, that makes me feel better about it. It's presumably much easier to
> upgrade to 5.2 from 5.0 or 5.1 than it would be from 4.x.
>
>>>
>>> How complicated is the fix to prevent the crashes? Would it be
>>> feasible for distros to backport that fix? I see that RHEL8 has
>>> jemalloc 5.0.1 for example, but if the fix could be backported to that
>>> release then it's less of a problem.
>> The patch set is simple enough:
>> https://github.com/jemalloc/jemalloc/pull/1341/commits
>
> Thanks. That does seem reasonable for distros and other packagers to
> backport, if they want to support 5.0 or 5.1 for their users.
>
> I'm leaning towards accepting the patch for gcc-9 (and if not, we
> should do it early in the gcc-10 cycle).
>
FYI jemalloc 5.2 is released with the fix for zero sized deallocations:
https://github.com/jemalloc/jemalloc/releases/tag/5.2.0

cheers,
Pádraig


Re: [PATCH] improve performance of std::allocator::deallocate

2019-03-07 Thread Jonathan Wakely

On 06/03/19 22:27 +, Pádraig Brady wrote:



On 03/06/2019 01:44 AM, Jonathan Wakely wrote:

On 06/03/19 09:20 +, Pádraig Brady wrote:

On 03/06/2019 12:50 AM, Jonathan Wakely wrote:

On 06/03/19 02:43 +, Pádraig Brady wrote:



On 02/26/2019 04:23 PM, Padraig Brady wrote:



Note jemalloc >= 5.1 is required to fix a bug with 0 sizes.

How serious is the bug? What are the symptoms?


I've updated the commit summary to say it's a crash.
Arguably that's better than mem corruption.


It looks like 5.1.0 is less than a year old, so older versions are
presumably still in wide use.

We could potentially workaround it by making
new_allocator::allocate(0) call ::operator new(1) when
__cpp_sized_deallocation is defined, and deallocate(p, 0) call
::operator delete(p, 1). Obviously I'd prefer not to do that,
because
the default operator new already has an equivalent check, and only
programs linking to jemalloc require the workaround.


Right the jemalloc fix was released May 2018.
It would be great to avoid the extra workarounds.
Given this would be released about a year after the jemalloc fix was
released,
and that this would only manifest upon program rebuild,
I would be inclined to not add any workarounds.
For reference tcmalloc and glibc malloc were seen to work fine with
this.


Actually the jemalloc issue will only be fixed with the release of 5.2
(a few weeks away).
I've updated the commit message in the attached accordingly.


Hmm, I'm a bit nervous about making a change that will cause crashes
unless using an unreleased version (I know it will be released by the
time GCC 9.1 is released, but some people might upgrade GCC without
upgrading jemalloc).

Yes it's not ideal. It does make it a lot less risky that one has to
rebuild programs to get the new functionality, so existing programs
will be unaffected. Also -fsized-deallocation is only enabled by
default on gcc with -std >= c++14.


The default is -std=gnu++14 so it's enabled unless you explicitly
choose an older dialect or add -fno-sized-deallocation.

Good point :)



On the other hand, zero sized allocations should be rare in practice.

Yes they were rare in testing here

So programs _rebuilt_ against the following would need to update
to jemalloc 5.2:

 zero sized allocs, jemalloc<5.2, c++>=14, GCC>=9.1

Hopefully that's a small enough set.


Which versions of jemalloc replace operator delete(void*, size_t) ?

Was that something new in 5.0 or did older versions already provide a
replacement for the sized operator delete?

If it was introduced in 5.0 then there won't be a problem for 3.x and
4.x because they'll use the default definition from libstdc++ which
just calls ::operator delete(p).

Again very good point. The replacements were only added in 5.0:
https://github.com/jemalloc/jemalloc/commit/2319152d


OK, that makes me feel better about it. It's presumably much easier to
upgrade to 5.2 from 5.0 or 5.1 than it would be from 4.x.



How complicated is the fix to prevent the crashes? Would it be
feasible for distros to backport that fix? I see that RHEL8 has
jemalloc 5.0.1 for example, but if the fix could be backported to that
release then it's less of a problem.

The patch set is simple enough:
https://github.com/jemalloc/jemalloc/pull/1341/commits


Thanks. That does seem reasonable for distros and other packagers to
backport, if they want to support 5.0 or 5.1 for their users.

I'm leaning towards accepting the patch for gcc-9 (and if not, we
should do it early in the gcc-10 cycle).



Re: [PATCH] improve performance of std::allocator::deallocate

2019-03-06 Thread Pádraig Brady



On 03/06/2019 01:44 AM, Jonathan Wakely wrote:
> On 06/03/19 09:20 +, Pádraig Brady wrote:
>> On 03/06/2019 12:50 AM, Jonathan Wakely wrote:
>>> On 06/03/19 02:43 +, Pádraig Brady wrote:


 On 02/26/2019 04:23 PM, Padraig Brady wrote:
>
>> Note jemalloc >= 5.1 is required to fix a bug with 0 sizes.
>>
>> How serious is the bug? What are the symptoms?
>>
> I've updated the commit summary to say it's a crash.
> Arguably that's better than mem corruption.
>
>> It looks like 5.1.0 is less than a year old, so older versions are
>> presumably still in wide use.
>>
>> We could potentially workaround it by making
>> new_allocator::allocate(0) call ::operator new(1) when
>> __cpp_sized_deallocation is defined, and deallocate(p, 0) call
>> ::operator delete(p, 1). Obviously I'd prefer not to do that,
>> because
>> the default operator new already has an equivalent check, and only
>> programs linking to jemalloc require the workaround.
>>
> Right the jemalloc fix was released May 2018.
> It would be great to avoid the extra workarounds.
> Given this would be released about a year after the jemalloc fix was
> released,
> and that this would only manifest upon program rebuild,
> I would be inclined to not add any workarounds.
> For reference tcmalloc and glibc malloc were seen to work fine with
> this.

 Actually the jemalloc issue will only be fixed with the release of 5.2
 (a few weeks away).
 I've updated the commit message in the attached accordingly.
>>>
>>> Hmm, I'm a bit nervous about making a change that will cause crashes
>>> unless using an unreleased version (I know it will be released by the
>>> time GCC 9.1 is released, but some people might upgrade GCC without
>>> upgrading jemalloc).
>> Yes it's not ideal. It does make it a lot less risky that one has to
>> rebuild programs to get the new functionality, so existing programs
>> will be unaffected. Also -fsized-deallocation is only enabled by
>> default on gcc with -std >= c++14.
>
> The default is -std=gnu++14 so it's enabled unless you explicitly
> choose an older dialect or add -fno-sized-deallocation.
Good point :)
>
>>> On the other hand, zero sized allocations should be rare in practice.
>> Yes they were rare in testing here
>>
>> So programs _rebuilt_ against the following would need to update
>> to jemalloc 5.2:
>>
>>  zero sized allocs, jemalloc<5.2, c++>=14, GCC>=9.1
>>
>> Hopefully that's a small enough set.
>
> Which versions of jemalloc replace operator delete(void*, size_t) ?
>
> Was that something new in 5.0 or did older versions already provide a
> replacement for the sized operator delete?
>
> If it was introduced in 5.0 then there won't be a problem for 3.x and
> 4.x because they'll use the default definition from libstdc++ which
> just calls ::operator delete(p).
Again very good point. The replacements were only added in 5.0:
https://github.com/jemalloc/jemalloc/commit/2319152d
>
> How complicated is the fix to prevent the crashes? Would it be
> feasible for distros to backport that fix? I see that RHEL8 has
> jemalloc 5.0.1 for example, but if the fix could be backported to that
> release then it's less of a problem.
The patch set is simple enough:
https://github.com/jemalloc/jemalloc/pull/1341/commits

cheers,
Pádraig.


Re: [PATCH] improve performance of std::allocator::deallocate

2019-03-06 Thread Jonathan Wakely

On 06/03/19 09:20 +, Pádraig Brady wrote:

On 03/06/2019 12:50 AM, Jonathan Wakely wrote:

On 06/03/19 02:43 +, Pádraig Brady wrote:



On 02/26/2019 04:23 PM, Padraig Brady wrote:



Note jemalloc >= 5.1 is required to fix a bug with 0 sizes.

How serious is the bug? What are the symptoms?


I've updated the commit summary to say it's a crash.
Arguably that's better than mem corruption.


It looks like 5.1.0 is less than a year old, so older versions are
presumably still in wide use.

We could potentially workaround it by making
new_allocator::allocate(0) call ::operator new(1) when
__cpp_sized_deallocation is defined, and deallocate(p, 0) call
::operator delete(p, 1). Obviously I'd prefer not to do that, because
the default operator new already has an equivalent check, and only
programs linking to jemalloc require the workaround.


Right the jemalloc fix was released May 2018.
It would be great to avoid the extra workarounds.
Given this would be released about a year after the jemalloc fix was
released,
and that this would only manifest upon program rebuild,
I would be inclined to not add any workarounds.
For reference tcmalloc and glibc malloc were seen to work fine with
this.


Actually the jemalloc issue will only be fixed with the release of 5.2
(a few weeks away).
I've updated the commit message in the attached accordingly.


Hmm, I'm a bit nervous about making a change that will cause crashes
unless using an unreleased version (I know it will be released by the
time GCC 9.1 is released, but some people might upgrade GCC without
upgrading jemalloc).

Yes it's not ideal. It does make it a lot less risky that one has to
rebuild programs to get the new functionality, so existing programs
will be unaffected. Also -fsized-deallocation is only enabled by
default on gcc with -std >= c++14.


The default is -std=gnu++14 so it's enabled unless you explicitly
choose an older dialect or add -fno-sized-deallocation.


On the other hand, zero sized allocations should be rare in practice.

Yes they were rare in testing here

So programs _rebuilt_ against the following would need to update
to jemalloc 5.2:

 zero sized allocs, jemalloc<5.2, c++>=14, GCC>=9.1

Hopefully that's a small enough set.


Which versions of jemalloc replace operator delete(void*, size_t) ?

Was that something new in 5.0 or did older versions already provide a
replacement for the sized operator delete?

If it was introduced in 5.0 then there won't be a problem for 3.x and
4.x because they'll use the default definition from libstdc++ which
just calls ::operator delete(p).

How complicated is the fix to prevent the crashes? Would it be
feasible for distros to backport that fix? I see that RHEL8 has
jemalloc 5.0.1 for example, but if the fix could be backported to that
release then it's less of a problem.




Re: [PATCH] improve performance of std::allocator::deallocate

2019-03-06 Thread Pádraig Brady
On 03/06/2019 12:50 AM, Jonathan Wakely wrote:
> On 06/03/19 02:43 +, Pádraig Brady wrote:
>>
>>
>> On 02/26/2019 04:23 PM, Padraig Brady wrote:
>>>
 Note jemalloc >= 5.1 is required to fix a bug with 0 sizes.

 How serious is the bug? What are the symptoms?

>>> I've updated the commit summary to say it's a crash.
>>> Arguably that's better than mem corruption.
>>>
 It looks like 5.1.0 is less than a year old, so older versions are
 presumably still in wide use.

 We could potentially workaround it by making
 new_allocator::allocate(0) call ::operator new(1) when
 __cpp_sized_deallocation is defined, and deallocate(p, 0) call
 ::operator delete(p, 1). Obviously I'd prefer not to do that, because
 the default operator new already has an equivalent check, and only
 programs linking to jemalloc require the workaround.

>>> Right the jemalloc fix was released May 2018.
>>> It would be great to avoid the extra workarounds.
>>> Given this would be released about a year after the jemalloc fix was
>>> released,
>>> and that this would only manifest upon program rebuild,
>>> I would be inclined to not add any workarounds.
>>> For reference tcmalloc and glibc malloc were seen to work fine with
>>> this.
>>
>> Actually the jemalloc issue will only be fixed with the release of 5.2
>> (a few weeks away).
>> I've updated the commit message in the attached accordingly.
>
> Hmm, I'm a bit nervous about making a change that will cause crashes
> unless using an unreleased version (I know it will be released by the
> time GCC 9.1 is released, but some people might upgrade GCC without
> upgrading jemalloc).
Yes it's not ideal. It does make it a lot less risky that one has to
rebuild programs to get the new functionality, so existing programs
will be unaffected. Also -fsized-deallocation is only enabled by
default on gcc with -std >= c++14.
>
> On the other hand, zero sized allocations should be rare in practice.
Yes they were rare in testing here

So programs _rebuilt_ against the following would need to update
to jemalloc 5.2:

  zero sized allocs, jemalloc<5.2, c++>=14, GCC>=9.1

Hopefully that's a small enough set.

thanks,
Pádraig


Re: [PATCH] improve performance of std::allocator::deallocate

2019-03-06 Thread Jonathan Wakely

On 06/03/19 02:43 +, Pádraig Brady wrote:



On 02/26/2019 04:23 PM, Padraig Brady wrote:



Note jemalloc >= 5.1 is required to fix a bug with 0 sizes.

How serious is the bug? What are the symptoms?


I've updated the commit summary to say it's a crash.
Arguably that's better than mem corruption.


It looks like 5.1.0 is less than a year old, so older versions are
presumably still in wide use.

We could potentially workaround it by making
new_allocator::allocate(0) call ::operator new(1) when
__cpp_sized_deallocation is defined, and deallocate(p, 0) call
::operator delete(p, 1). Obviously I'd prefer not to do that, because
the default operator new already has an equivalent check, and only
programs linking to jemalloc require the workaround.


Right the jemalloc fix was released May 2018.
It would be great to avoid the extra workarounds.
Given this would be released about a year after the jemalloc fix was
released,
and that this would only manifest upon program rebuild,
I would be inclined to not add any workarounds.
For reference tcmalloc and glibc malloc were seen to work fine with this.


Actually the jemalloc issue will only be fixed with the release of 5.2
(a few weeks away).
I've updated the commit message in the attached accordingly.


Hmm, I'm a bit nervous about making a change that will cause crashes
unless using an unreleased version (I know it will be released by the
time GCC 9.1 is released, but some people might upgrade GCC without
upgrading jemalloc).

On the other hand, zero sized allocations should be rare in practice.



Re: [PATCH] improve performance of std::allocator::deallocate

2019-03-05 Thread Pádraig Brady


On 02/26/2019 04:23 PM, Padraig Brady wrote:
>
>> Note jemalloc >= 5.1 is required to fix a bug with 0 sizes.
>>
>> How serious is the bug? What are the symptoms?
>>
> I've updated the commit summary to say it's a crash.
> Arguably that's better than mem corruption.
>
>> It looks like 5.1.0 is less than a year old, so older versions are
>> presumably still in wide use.
>>
>> We could potentially workaround it by making
>> new_allocator::allocate(0) call ::operator new(1) when
>> __cpp_sized_deallocation is defined, and deallocate(p, 0) call
>> ::operator delete(p, 1). Obviously I'd prefer not to do that, because
>> the default operator new already has an equivalent check, and only
>> programs linking to jemalloc require the workaround.
>>
> Right the jemalloc fix was released May 2018.
> It would be great to avoid the extra workarounds.
> Given this would be released about a year after the jemalloc fix was
> released,
> and that this would only manifest upon program rebuild,
> I would be inclined to not add any workarounds.
> For reference tcmalloc and glibc malloc were seen to work fine with this.

Actually the jemalloc issue will only be fixed with the release of 5.2
(a few weeks away).
I've updated the commit message in the attached accordingly.

thanks,
Pádraig
From fec53f1500f5db6a2fe1b81c36a6695fd334758e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Fri, 22 Feb 2019 17:21:57 -0800
Subject: [PATCH] std::allocator::deallocate support sized-deallocation

Pass the size to the allocator so that it may optimize deallocation.
This was seen to significantly reduce the work required in jemalloc,
with about 40% reduction in CPU cycles in the free path.
Note jemalloc >= 5.2 is required to fix a crash with 0 sizes.

* libstdc++-v3/include/ext/new_allocator.h (deallocate): Pass the size
to the deallocator with -fsized-deallocation.
---
 libstdc++-v3/include/ext/new_allocator.h | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/ext/new_allocator.h b/libstdc++-v3/include/ext/new_allocator.h
index e245391..f1ff7da 100644
--- a/libstdc++-v3/include/ext/new_allocator.h
+++ b/libstdc++-v3/include/ext/new_allocator.h
@@ -116,16 +116,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // __p is not permitted to be a null pointer.
   void
-  deallocate(pointer __p, size_type)
+  deallocate(pointer __p, size_type __t)
   {
 #if __cpp_aligned_new
 	if (alignof(_Tp) > __STDCPP_DEFAULT_NEW_ALIGNMENT__)
 	  {
-	::operator delete(__p, std::align_val_t(alignof(_Tp)));
+	::operator delete(__p,
+# if __cpp_sized_deallocation
+			  __t * sizeof(_Tp),
+# endif
+			  std::align_val_t(alignof(_Tp)));
 	return;
 	  }
 #endif
-	::operator delete(__p);
+	::operator delete(__p
+#if __cpp_sized_deallocation
+			  , __t * sizeof(_Tp)
+#endif
+			 );
   }
 
   size_type
-- 
2.5.5



Re: [PATCH] improve performance of std::allocator::deallocate

2019-02-26 Thread Pádraig Brady


On 02/26/2019 05:50 AM, Jonathan Wakely wrote:
> On 23/02/19 02:04 +, Pádraig Brady wrote:
>> Attached is a simple patch which has been extensively tested within
>> Facebook,
>> and is enabled by default in our code base.
>>
>> Passing the size to the allocator allows it to optimize deallocation,
>> and this was seen to significantly reduce the work required in jemalloc,
>> with about 40% reduction in CPU cycles in the free path.
>
> Thanks, the patch looks good.
>
> I would prefer to only change the function body, as otherwise LTO
> sometimes produces link-time warnings about ODR violations, because
> the same function is defined on two different lines. The ODR detection
> heuristics aren't smart enough to complain when the function
> declarator is always defined on the same line, but with two different
> function bodies.

We've not noticed issues here with LTO, though that could be
due to global enablement of -fsized-deallocation.
Anyway your suggestion is neater. Updated patch attached.
> Note jemalloc >= 5.1 is required to fix a bug with 0 sizes.
>
> How serious is the bug? What are the symptoms?
>
I've updated the commit summary to say it's a crash.
Arguably that's better than mem corruption.

> It looks like 5.1.0 is less than a year old, so older versions are
> presumably still in wide use.
>
> We could potentially workaround it by making
> new_allocator::allocate(0) call ::operator new(1) when
> __cpp_sized_deallocation is defined, and deallocate(p, 0) call
> ::operator delete(p, 1). Obviously I'd prefer not to do that, because
> the default operator new already has an equivalent check, and only
> programs linking to jemalloc require the workaround.
>
Right the jemalloc fix was released May 2018.
It would be great to avoid the extra workarounds.
Given this would be released about a year after the jemalloc fix was
released,
and that this would only manifest upon program rebuild,
I would be inclined to not add any workarounds.
For reference tcmalloc and glibc malloc were seen to work fine with this.

thanks!
Pádraig.
From 54b69ee9ac7a188fa938108b4eac46a66416d0ce Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Fri, 22 Feb 2019 17:21:57 -0800
Subject: [PATCH] std::allocator::deallocate support sized-deallocation

Pass the size to the allocator so that it may optimize deallocation.
This was seen to significantly reduce the work required in jemalloc,
with about 40% reduction in CPU cycles in the free path.
Note jemalloc >= 5.1 is required to fix a crash with 0 sizes.

* libstdc++-v3/include/ext/new_allocator.h (deallocate): Pass the size
to the deallocator with -fsized-deallocation.
---
 libstdc++-v3/include/ext/new_allocator.h | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/ext/new_allocator.h b/libstdc++-v3/include/ext/new_allocator.h
index e245391..f1ff7da 100644
--- a/libstdc++-v3/include/ext/new_allocator.h
+++ b/libstdc++-v3/include/ext/new_allocator.h
@@ -116,16 +116,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // __p is not permitted to be a null pointer.
   void
-  deallocate(pointer __p, size_type)
+  deallocate(pointer __p, size_type __t)
   {
 #if __cpp_aligned_new
 	if (alignof(_Tp) > __STDCPP_DEFAULT_NEW_ALIGNMENT__)
 	  {
-	::operator delete(__p, std::align_val_t(alignof(_Tp)));
+	::operator delete(__p,
+# if __cpp_sized_deallocation
+			  __t * sizeof(_Tp),
+# endif
+			  std::align_val_t(alignof(_Tp)));
 	return;
 	  }
 #endif
-	::operator delete(__p);
+	::operator delete(__p
+#if __cpp_sized_deallocation
+			  , __t * sizeof(_Tp)
+#endif
+			 );
   }
 
   size_type
-- 
2.5.5



Re: [PATCH] improve performance of std::allocator::deallocate

2019-02-26 Thread Jonathan Wakely

On 23/02/19 02:04 +, Pádraig Brady wrote:

Attached is a simple patch which has been extensively tested within
Facebook,
and is enabled by default in our code base.

Passing the size to the allocator allows it to optimize deallocation,
and this was seen to significantly reduce the work required in jemalloc,
with about 40% reduction in CPU cycles in the free path.


Thanks, the patch looks good.

I would prefer to only change the function body, as otherwise LTO
sometimes produces link-time warnings about ODR violations, because
the same function is defined on two different lines. The ODR detection
heuristics aren't smart enough to complain when the function
declarator is always defined on the same line, but with two different
function bodies.

i.e. define it as:

 // __p is not permitted to be a null pointer.
 void
 deallocate(pointer __p,
size_type __t __attribute__((__unused__)))
 {
#if __cpp_sized_deallocation
   // ...
#else
   // ..
#endif
 }

Or to reduce the duplication, maybe:

 // __p is not permitted to be a null pointer.
 void
 deallocate(pointer __p, size_type __t)
 {
#if __cpp_aligned_new
if (alignof(_Tp) > __STDCPP_DEFAULT_NEW_ALIGNMENT__)
  {
::operator delete(__p,
#if __cpp_sized_deallocation
  __t * sizeof(_Tp),
#endif
  std::align_val_t(alignof(_Tp)));
return;
  }
#endif
::operator delete(__p
#if __cpp_sized_deallocation
  , __t * sizeof(_Tp)
#endif
   );
 }



Note jemalloc >= 5.1 is required to fix a bug with 0 sizes.


How serious is the bug? What are the symptoms?

It looks like 5.1.0 is less than a year old, so older versions are
presumably still in wide use.

We could potentially workaround it by making
new_allocator::allocate(0) call ::operator new(1) when
__cpp_sized_deallocation is defined, and deallocate(p, 0) call
::operator delete(p, 1). Obviously I'd prefer not to do that, because
the default operator new already has an equivalent check, and only
programs linking to jemalloc require the workaround.