Re: [boost] Re: shifted_ptr w/ lock mechanism
From: David B. Held [EMAIL PROTECTED]
> Philippe A. Bouchard [EMAIL PROTECTED] wrote in message news:b19hhg$i2m$[EMAIL PROTECTED]...
> > [...]
> > list<shifted_ptr<T> > took 7.1966276647 seconds to reconstruct 2000 times.
> > [...]
> > list<shared_ptr<T> > took 14.0157271000 seconds to reconstruct 2000 times.
>
> Looks like your lead is getting eroded by the day. ;) And that's just with a quick hack. You better be worried about a serious small object allocator.

To be fair, a factor of two improvement cannot just be shrugged off. But one point to keep in mind is that

    shared_ptr<X> px(new X);

performs two allocations. We can optimize the count allocation until we're blue in the face, but in a real project the whole expression will probably remain a bottleneck, so it's likely that X will acquire a class-specific operator new. And an X with a class-specific new can no longer be used with shifted_ptr.

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
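The "two allocations" point can be checked with a small counting experiment. This is only a sketch under stated assumptions: it uses std::shared_ptr as a stand-in for boost::shared_ptr, and the counter `g_allocations` and the helper `allocations_for_shared_ptr_from_new` are names invented for this illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical counter, bumped by the replacement global operator new below.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

struct X { int value = 0; };

// Returns how many heap allocations one `shared_ptr<X> px(new X)` performs.
// Assumption: std::shared_ptr stands in for boost::shared_ptr here.
std::size_t allocations_for_shared_ptr_from_new() {
    const std::size_t before = g_allocations;
    std::shared_ptr<X> px(new X);  // one allocation for X, one for the count
    return g_allocations - before;
}
```

Only the second of those two allocations (the count) is under the smart pointer's control, which is why tuning it alone leaves the `new X` cost untouched.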
Re: [boost] Re: shifted_ptr w/ lock mechanism
From: David Abrahams [EMAIL PROTECTED]
> Would you indulge me and try the benchmark again with the enclosed shared_count patch applied and #undef BOOST_SP_USE_STD_ALLOCATOR? I don't really know what's going on under the covers in the SGI allocator; this is basically just the same hack I threw at the problem years ago.

I've taken the liberty to convert the patch into detail/quick_allocator.hpp. #define BOOST_SP_USE_QUICK_ALLOCATOR to make shared_ptr use it. shared_ptr_alloc_test.cpp has been updated, too. You can now compare quick_allocator vs SGI std::allocator yourself. :-)

quick_allocator doesn't compile on VC6 or Borland 5.5.1, though. I haven't investigated too deeply.

> My patch doesn't pretend to work for a threaded implementation, so only the no-threads test applies.

I added a lightweight_mutex lock here and there, but haven't made any extensive multithreaded tests.
Re: [boost] Re: shifted_ptr w/ lock mechanism
Peter Dimov [EMAIL PROTECTED] writes:

> From: David Abrahams [EMAIL PROTECTED]
> > Would you indulge me and try the benchmark again with the enclosed shared_count patch applied and #undef BOOST_SP_USE_STD_ALLOCATOR? I don't really know what's going on under the covers in the SGI allocator; this is basically just the same hack I threw at the problem years ago.
>
> I've taken the liberty to convert the patch into detail/quick_allocator.hpp. #define BOOST_SP_USE_QUICK_ALLOCATOR to make shared_ptr use it. shared_ptr_alloc_test.cpp has been updated, too. You can now compare quick_allocator vs SGI std::allocator yourself. :-)

I'm not all set up to run those tests and measure the times, which is why I was hoping Philippe would check it out.

> quick_allocator doesn't compile on VC6 or Borland 5.5.1, though. I haven't investigated too deeply.

You must have changed the code I sent then. Works fine with VC6 for me; Borland has the usual problems with ICEs and scope qualification.

> > My patch doesn't pretend to work for a threaded implementation, so only the no-threads test applies.
>
> I added a lightweight_mutex lock here and there, but haven't made any extensive multithreaded tests.

This was just a quick hack; I'm surprised you thought it was worthwhile modifying your sources to accommodate it despite the lack of *any* test data... or did you try it?

--
David Abrahams [EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
Re: [boost] Re: shifted_ptr w/ lock mechanism
At 06:23 AM 1/30/2003, Peter Dimov wrote:
> From: David B. Held [EMAIL PROTECTED]
> > Philippe A. Bouchard [EMAIL PROTECTED] wrote in message news:b19hhg$i2m$[EMAIL PROTECTED]...
> > > [...]
> > > list<shifted_ptr<T> > took 7.1966276647 seconds to reconstruct 2000 times.
> > > [...]
> > > list<shared_ptr<T> > took 14.0157271000 seconds to reconstruct 2000 times.
> >
> > Looks like your lead is getting eroded by the day. ;) And that's just with a quick hack. You better be worried about a serious small object allocator.
>
> To be fair, a factor of two improvement cannot just be shrugged off. But one point to keep in mind is that
>
>     shared_ptr<X> px(new X);
>
> performs two allocations. We can optimize the count allocation until we're blue in the face, but in a real project the whole expression will probably remain a bottleneck, so it's likely that X will acquire a class-specific operator new. And an X with a class-specific new can no longer be used with shifted_ptr.

I read a paper yesterday from the latest OOPSLA proceedings that argued that a class-specific new is almost never a win compared to a high-quality general purpose allocator like LEA.
Re: [boost] Re: shifted_ptr w/ lock mechanism
Greg Colvin [EMAIL PROTECTED] writes:

> I read a paper yesterday from the latest OOPSLA proceedings that argued that a class-specific new is almost never a win compared to a high-quality general purpose allocator like LEA.

In real code, I'm sure that's true. However, for the kind of meaningless benchmark-rustling we're engaged in now, I bet the class-specific allocator works great ;-)

--
David Abrahams [EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
Re: [boost] Re: shifted_ptr w/ lock mechanism
From: Greg Colvin [EMAIL PROTECTED]
> At 06:23 AM 1/30/2003, Peter Dimov wrote:
> > To be fair, a factor of two improvement cannot just be shrugged off. But one point to keep in mind is that
> >
> >     shared_ptr<X> px(new X);
> >
> > performs two allocations. We can optimize the count allocation until we're blue in the face, but in a real project the whole expression will probably remain a bottleneck, so it's likely that X will acquire a class-specific operator new. And an X with a class-specific new can no longer be used with shifted_ptr.
>
> I read a paper yesterday from the latest OOPSLA proceedings that argued that a class-specific new is almost never a win compared to a high-quality general purpose allocator like LEA.

This is the argument I've been using every time the question of adding an optimized count allocator to shared_ptr came up.
Re: [boost] Re: shifted_ptr w/ lock mechanism
At 08:16 AM 1/30/2003, David Abrahams wrote:
> Peter Dimov [EMAIL PROTECTED] writes:
> > From: David Abrahams [EMAIL PROTECTED]
> > > Would you indulge me and try the benchmark again with the enclosed shared_count patch applied and #undef BOOST_SP_USE_STD_ALLOCATOR? I don't really know what's going on under the covers in the SGI allocator; this is basically just the same hack I threw at the problem years ago.
> >
> > I've taken the liberty to convert the patch into detail/quick_allocator.hpp. #define BOOST_SP_USE_QUICK_ALLOCATOR to make shared_ptr use it. shared_ptr_alloc_test.cpp has been updated, too. You can now compare quick_allocator vs SGI std::allocator yourself. :-)
>
> I'm not all set up to run those tests and measure the times, which is why I was hoping Philippe would check it out.

It would be interesting to see the test results for intrusive_ptr as well.
Re: [boost] Re: shifted_ptr w/ lock mechanism
Peter Dimov [EMAIL PROTECTED] writes:

> > I'm not all set up to run those tests and measure the times, which is why I was hoping Philippe would check it out.
>
> There is a test in libs/smart_ptr/test called shared_ptr_alloc_test.cpp that you can use.

OK.

> > > quick_allocator doesn't compile on VC6 or Borland 5.5.1, though. I haven't investigated too deeply.
> >
> > You must have changed the code I sent then. Works fine with VC6 for me; Borland has the usual problems with ICEs and scope qualification.
>
> I did.

OK.

> > This was just a quick hack; I'm surprised you thought it was worthwhile modifying your sources to accommodate it despite the lack of *any* test data...
>
> The modification to shared_count.hpp is small and macro-guarded, so it won't affect existing code. Seeing how well a quick hack performs can be valuable information, given how often the optimized count allocator question comes up.

Sure; I was just surprised you wanted to check it in this early.

> > or did you try it?
>
> This is almost an insult.

Unintentional. Please don't be insulted; I guess I'm just afraid I'll be blamed for adding the little optimization that didn't ;-)

--
David Abrahams [EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
Re: [boost] Re: shifted_ptr w/ lock mechanism
At 09:12 AM 1/30/2003, Peter Dimov wrote:
> From: Greg Colvin [EMAIL PROTECTED]
> > At 06:23 AM 1/30/2003, Peter Dimov wrote:
> > > To be fair, a factor of two improvement cannot just be shrugged off. But one point to keep in mind is that
> > >
> > >     shared_ptr<X> px(new X);
> > >
> > > performs two allocations. We can optimize the count allocation until we're blue in the face, but in a real project the whole expression will probably remain a bottleneck, so it's likely that X will acquire a class-specific operator new. And an X with a class-specific new can no longer be used with shifted_ptr.
> >
> > I read a paper yesterday from the latest OOPSLA proceedings that argued that a class-specific new is almost never a win compared to a high-quality general purpose allocator like LEA.
>
> This is the argument I've been using every time the question of adding an optimized count allocator to shared_ptr came up.

The counter-argument is that the allocators supplied by many vendors are nowhere near as good as LEA. So it might be a good idea to optionally use boost::pool in shared_ptr. And it would be a good Boost project to provide a high-quality replacement operator new.
Re: [boost] Re: shifted_ptr w/ lock mechanism
Peter Dimov [EMAIL PROTECTED] writes:

> From: David Abrahams [EMAIL PROTECTED]
> > Peter Dimov [EMAIL PROTECTED] writes:
> > > I've taken the liberty to convert the patch into detail/quick_allocator.hpp. #define BOOST_SP_USE_QUICK_ALLOCATOR to make shared_ptr use it. shared_ptr_alloc_test.cpp has been updated, too. You can now compare quick_allocator vs SGI std::allocator yourself. :-)
> >
> > I'm not all set up to run those tests and measure the times, which is why I was hoping Philippe would check it out.
>
> There is a test in libs/smart_ptr/test called shared_ptr_alloc_test.cpp that you can use.

Your test doesn't seem to terminate for me in a reasonable amount of time (minutes) in any configuration.

--
David Abrahams [EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
Re: [boost] Re: shifted_ptr w/ lock mechanism
Peter Dimov [EMAIL PROTECTED] writes:

> From: David Abrahams [EMAIL PROTECTED]
> > Peter Dimov [EMAIL PROTECTED] writes:
> > > There is a test in libs/smart_ptr/test called shared_ptr_alloc_test.cpp that you can use.
> >
> > Your test doesn't seem to terminate for me in a reasonable amount of time (minutes) in any configuration.
>
> You might need to use a lower n. Here's what I get (randomly choosing g++/mingw):
>
> GNU C++ version 2.95.3-6 (mingw special)
> Win32
> SGI standard library
> BOOST_HAS_THREADS: (not defined)
                     ^^^
How did you arrange that? I had to tweak our config file to get there.

--
David Abrahams [EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
Re: [boost] Re: shifted_ptr w/ lock mechanism
From: David Abrahams [EMAIL PROTECTED]
> Peter Dimov [EMAIL PROTECTED] writes:
> > From: David Abrahams [EMAIL PROTECTED]
> > > Peter Dimov [EMAIL PROTECTED] writes:
> > > > There is a test in libs/smart_ptr/test called shared_ptr_alloc_test.cpp that you can use.
> > >
> > > Your test doesn't seem to terminate for me in a reasonable amount of time (minutes) in any configuration.
> >
> > You might need to use a lower n. Here's what I get (randomly choosing g++/mingw):
> >
> > GNU C++ version 2.95.3-6 (mingw special)
> > Win32
> > SGI standard library
> > BOOST_HAS_THREADS: (not defined)
>                      ^^^
> How did you arrange that? I had to tweak our config file to get there.

With -DBOOST_DISABLE_THREADS.
Re: [boost] Re: shifted_ptr w/ lock mechanism
Peter Dimov [EMAIL PROTECTED] writes:

> You might need to use a lower n. Here's what I get (randomly choosing g++/mingw):
>
> GNU C++ version 2.95.3-6 (mingw special)
> Win32
> SGI standard library
> BOOST_HAS_THREADS: (not defined)
> BOOST_SP_USE_STD_ALLOCATOR: (not defined)
> BOOST_SP_USE_QUICK_ALLOCATOR: (not defined)
> 1048576 shared_ptr<int> allocations + deallocations:
>   1.582 seconds. 1.713 seconds. 1.702 seconds.
> 1048576 shared_ptr<X> allocations + deallocations:
>   1.151 seconds. 1.593 seconds. 1.603 seconds.
> 1048576 shared_ptr<Y> allocations + deallocations:
>   1.652 seconds. 1.071 seconds. 1.072 seconds.
> 1048576 shared_ptr<Z> allocations + deallocations:
>   1.713 seconds. 1.732 seconds. 1.722 seconds.
>
> GNU C++ version 2.95.3-6 (mingw special)
> Win32
> SGI standard library
> BOOST_HAS_THREADS: (not defined)
> BOOST_SP_USE_STD_ALLOCATOR: (not defined)
> BOOST_SP_USE_QUICK_ALLOCATOR: (defined)
> 1048576 shared_ptr<int> allocations + deallocations:
>   1.151 seconds. 0.961 seconds. 0.931 seconds.
> 1048576 shared_ptr<X> allocations + deallocations:
>   0.571 seconds. 0.551 seconds. 0.571 seconds.
> 1048576 shared_ptr<Y> allocations + deallocations:
>   0.761 seconds. 0.55 seconds. 0.54 seconds.
> 1048576 shared_ptr<Z> allocations + deallocations:
>   0.941 seconds. 0.921 seconds. 0.951 seconds.

Wow, that's a much bigger improvement than I saw! I wonder why?

--
David Abrahams [EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
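For readers who want to reproduce numbers of this kind, the following is a rough sketch of what an "n allocations + deallocations" measurement might look like. It is not the code of shared_ptr_alloc_test.cpp, whose contents are not shown in this thread; the function name `time_alloc_dealloc` is hypothetical, and std::shared_ptr and std::chrono are used as stand-ins for the Boost facilities of the time.

```cpp
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: creates n shared_ptr<T> objects via `new T()`,
// releases them all, and returns the elapsed wall-clock time in seconds.
template <class T>
double time_alloc_dealloc(std::size_t n) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    {
        std::vector<std::shared_ptr<T>> v;
        v.reserve(n);  // keep vector growth out of the measured loop
        for (std::size_t i = 0; i < n; ++i)
            v.push_back(std::shared_ptr<T>(new T()));
    }  // every object and every count is released here
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count();
}
```

Calling something like `time_alloc_dealloc<int>(1048576)` three times in a row would give three timings per type, matching the shape of the reports above; the first run typically includes allocator warm-up.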
Re: [boost] Re: shifted_ptr w/ lock mechanism
Peter Dimov [EMAIL PROTECTED] writes:

> From: David Abrahams [EMAIL PROTECTED]
> > Peter Dimov [EMAIL PROTECTED] writes:
> > > There is a test in libs/smart_ptr/test called shared_ptr_alloc_test.cpp that you can use.
> >
> > Your test doesn't seem to terminate for me in a reasonable amount of time (minutes) in any configuration.
>
> You might need to use a lower n. Here's what I get (randomly choosing g++/mingw):
>
> GNU C++ version 2.95.3-6 (mingw special)
> Win32
> SGI standard library
> BOOST_HAS_THREADS: (not defined)
> BOOST_SP_USE_STD_ALLOCATOR: (not defined)
> BOOST_SP_USE_QUICK_ALLOCATOR: (not defined)
> 1048576 shared_ptr<int> allocations + deallocations:
>   1.582 seconds. 1.713 seconds. 1.702 seconds.
> 1048576 shared_ptr<X> allocations + deallocations:
>   1.151 seconds. 1.593 seconds. 1.603 seconds.
> 1048576 shared_ptr<Y> allocations + deallocations:
>   1.652 seconds. 1.071 seconds. 1.072 seconds.
> 1048576 shared_ptr<Z> allocations + deallocations:
>   1.713 seconds. 1.732 seconds. 1.722 seconds.
>
> GNU C++ version 2.95.3-6 (mingw special)
> Win32
> SGI standard library
> BOOST_HAS_THREADS: (not defined)
> BOOST_SP_USE_STD_ALLOCATOR: (not defined)
> BOOST_SP_USE_QUICK_ALLOCATOR: (defined)
> 1048576 shared_ptr<int> allocations + deallocations:
>   1.151 seconds. 0.961 seconds. 0.931 seconds.
> 1048576 shared_ptr<X> allocations + deallocations:
>   0.571 seconds. 0.551 seconds. 0.571 seconds.
> 1048576 shared_ptr<Y> allocations + deallocations:
>   0.761 seconds. 0.55 seconds. 0.54 seconds.
> 1048576 shared_ptr<Z> allocations + deallocations:
>   0.941 seconds. 0.921 seconds. 0.951 seconds.

At first I was surprised that you're seeing such a big difference, but I guess I just overlooked them in my case, because it does show similar results:

myjam -sTOOLS=mingw --verbose-test "-sBUILD=release <define>BOOST_SP_USE_QUICK_ALLOCATOR <threading>single" -a shared_ptr_alloc_test
...
GNU C++ version 2.95.3-5 (mingw special)
Win32
SGI standard library
BOOST_HAS_THREADS: (not defined)
BOOST_SP_USE_STD_ALLOCATOR: (not defined)
BOOST_SP_USE_QUICK_ALLOCATOR: (defined)
1048576 shared_ptr<int> allocations + deallocations:
  1.562 seconds. 1.322 seconds. 1.131 seconds.
1048576 shared_ptr<X> allocations + deallocations:
  0.871 seconds. 0.691 seconds. 0.841 seconds.
1048576 shared_ptr<Y> allocations + deallocations:
  1.001 seconds. 0.862 seconds. 0.611 seconds.
1048576 shared_ptr<Z> allocations + deallocations:
  1.322 seconds. 1.152 seconds. 1.332 seconds.

myjam -sTOOLS=mingw --verbose-test "-sBUILD=release <define>BOOST_SP_USE_STD_ALLOCATOR" -a shared_ptr_alloc_test
...
GNU C++ version 2.95.3-5 (mingw special)
Win32
SGI standard library
BOOST_HAS_THREADS: (defined)
BOOST_SP_USE_STD_ALLOCATOR: (defined)
BOOST_SP_USE_QUICK_ALLOCATOR: (not defined)
1048576 shared_ptr<int> allocations + deallocations:
  1.722 seconds. 2.103 seconds. 1.783 seconds.
1048576 shared_ptr<X> allocations + deallocations:
  1.502 seconds. 1.161 seconds. 1.462 seconds.
1048576 shared_ptr<Y> allocations + deallocations:
  1.583 seconds. 1.623 seconds. 1.212 seconds.
1048576 shared_ptr<Z> allocations + deallocations:
  2.073 seconds. 1.763 seconds. 2.123 seconds.

--
David Abrahams [EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
Re: [boost] Re: shifted_ptr w/ lock mechanism
David Abrahams [EMAIL PROTECTED] writes:

> Peter Dimov [EMAIL PROTECTED] writes:
> > You might need to use a lower n. Here's what I get (randomly choosing g++/mingw):
> >
> > GNU C++ version 2.95.3-6 (mingw special)
> > Win32
>
> Wow, that's a much bigger improvement than I saw! I wonder why?

Improvements with MinGW-3.2 are much less pronounced (no big surprise there), and we do a bit worse in the int case for reasons I can readily guess at ;-):

GNU C++ version 3.2 (mingw special 20020817-1)
Win32
GNU libstdc++ version 20020816
BOOST_HAS_THREADS: (not defined)
BOOST_SP_USE_STD_ALLOCATOR: (defined)
BOOST_SP_USE_QUICK_ALLOCATOR: (not defined)
1048576 shared_ptr<int> allocations + deallocations:
  1.021 seconds. 1.242 seconds. 1.172 seconds.
1048576 shared_ptr<X> allocations + deallocations:
  0.631 seconds. 0.531 seconds. 0.611 seconds.
1048576 shared_ptr<Y> allocations + deallocations:
  0.761 seconds. 0.651 seconds. 0.531 seconds.
1048576 shared_ptr<Z> allocations + deallocations:
  1.232 seconds. 1.142 seconds. 1.252 seconds.

--

GNU C++ version 3.2 (mingw special 20020817-1)
Win32
GNU libstdc++ version 20020816
BOOST_HAS_THREADS: (not defined)
BOOST_SP_USE_STD_ALLOCATOR: (not defined)
BOOST_SP_USE_QUICK_ALLOCATOR: (defined)
1048576 shared_ptr<int> allocations + deallocations:
  1.291 seconds. 0.952 seconds. 1.191 seconds.
1048576 shared_ptr<X> allocations + deallocations:
  0.531 seconds. 0.541 seconds. 0.531 seconds.
1048576 shared_ptr<Y> allocations + deallocations:
  0.802 seconds. 0.561 seconds. 0.731 seconds.
1048576 shared_ptr<Z> allocations + deallocations:
  0.982 seconds. 0.961 seconds. 0.971 seconds.

--
David Abrahams [EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
Re: [boost] Re: shifted_ptr w/ lock mechanism
From: David Abrahams [EMAIL PROTECTED]
> Your test doesn't seem to terminate for me in a reasonable amount of time (minutes) in any configuration.
>
> That was Cygwin GCC-3.2.

You made me download it.

GNU C++ version 3.2 20020927 (prerelease)
Cygwin
GNU libstdc++ version 20020927
BOOST_HAS_THREADS: (not defined)
BOOST_SP_USE_STD_ALLOCATOR: (not defined)
BOOST_SP_USE_QUICK_ALLOCATOR: (not defined)
1048576 shared_ptr<int> allocations + deallocations:
  1.281 seconds. 1.552 seconds. 1.232 seconds.
1048576 shared_ptr<X> allocations + deallocations:
  0.901 seconds. 0.912 seconds. 0.922 seconds.
1048576 shared_ptr<Y> allocations + deallocations:
  1.542 seconds. 0.992 seconds. 1.072 seconds.
1048576 shared_ptr<Z> allocations + deallocations:
  1.272 seconds. 1.572 seconds. 1.222 seconds.

GNU C++ version 3.2 20020927 (prerelease)
Cygwin
GNU libstdc++ version 20020927
BOOST_HAS_THREADS: (not defined)
BOOST_SP_USE_STD_ALLOCATOR: (defined)
BOOST_SP_USE_QUICK_ALLOCATOR: (not defined)
1048576 shared_ptr<int> allocations + deallocations:
  1.041 seconds. 1.111 seconds. 1.091 seconds.
1048576 shared_ptr<X> allocations + deallocations:
  0.901 seconds. 0.821 seconds. 0.831 seconds.
1048576 shared_ptr<Y> allocations + deallocations:
  1.442 seconds. 0.831 seconds. 0.831 seconds.
1048576 shared_ptr<Z> allocations + deallocations:
  1.022 seconds. 1.092 seconds. 1.011 seconds.

GNU C++ version 3.2 20020927 (prerelease)
Cygwin
GNU libstdc++ version 20020927
BOOST_HAS_THREADS: (not defined)
BOOST_SP_USE_STD_ALLOCATOR: (not defined)
BOOST_SP_USE_QUICK_ALLOCATOR: (defined)
1048576 shared_ptr<int> allocations + deallocations:
  1.632 seconds. 1.052 seconds. 1.032 seconds.
1048576 shared_ptr<X> allocations + deallocations:
  0.862 seconds. 0.761 seconds. 0.771 seconds.
1048576 shared_ptr<Y> allocations + deallocations:
  1.432 seconds. 0.761 seconds. 0.761 seconds.
1048576 shared_ptr<Z> allocations + deallocations:
  0.962 seconds. 1.062 seconds. 1.062 seconds.

std::allocator is good, but if you replace the quick hack deque with something that allocates bigger pages, you may get rid of the warm-up cost.
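One way to read the "bigger pages" suggestion is a free list that carves blocks out of large chunks instead of growing a deque one element at a time. The sketch below is an assumption about what such a replacement could look like, not code from the thread; `paged_free_list` and `BlocksPerPage` are hypothetical names, it is not thread-safe, and pages are deliberately never returned to the system.

```cpp
#include <cstddef>
#include <new>

// Hypothetical page-based free list: blocks are carved out of pages of
// BlocksPerPage blocks each, and freed blocks are recycled via a linked list.
template <std::size_t Size, std::size_t Align, std::size_t BlocksPerPage = 512>
class paged_free_list {
    static_assert(Align <= alignof(std::max_align_t),
                  "plain operator new cannot satisfy this alignment");

    union block {
        alignas(Align) unsigned char bytes[Size];
        block* next;  // valid only while the block sits on the free list
    };

    block* free_ = nullptr;             // recycled blocks
    block* page_ = nullptr;             // page currently being carved up
    std::size_t used_ = BlocksPerPage;  // blocks already taken from page_

public:
    void* alloc() {
        if (block* b = free_) {         // reuse a freed block if possible
            free_ = b->next;
            return b;
        }
        if (used_ == BlocksPerPage) {   // current page exhausted: get a new one
            page_ = static_cast<block*>(
                ::operator new(sizeof(block) * BlocksPerPage));
            used_ = 0;
        }
        return &page_[used_++];
    }

    void dealloc(void* p) {
        block* b = static_cast<block*>(p);
        b->next = free_;
        free_ = b;
    }
};
```

With one system allocation per 512 blocks, the cost of growing the pool is spread over many count objects, which is the warm-up effect the message refers to.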
[boost] Re: shifted_ptr w/ lock mechanism
Philippe A. Bouchard [EMAIL PROTECTED] wrote in message news:b1a99m$fil$[EMAIL PROTECTED]...

> [...]
>
> shifted_ptr only works with shifted objects allocated with placement operator new (size_t, shifted_type const &). In theory it would be possible to displace operator delete (void *) to properly handle addresses not pointing to the beginning of a block (hack); to implement this directly in a compiler; etc.

I guess it would also be possible to allocate a shifted object into some specific memory page, so operator delete will be able to quickly detect whether the object is shifted or not. This way it would be possible to overload the main operator new.

Philippe A. Bouchard
Re: [boost] Re: shifted_ptr w/ lock mechanism
Philippe A. Bouchard wrote:

> [snip]
>
> I guess it would also be possible to allocate a shifted object into some specific memory page, so operator delete will be able to quickly detect whether the object is shifted or not. This way it would be possible to overload the main operator new.

I think this is what BW does to distinguish between pointers produced by GC_malloc_uncollectable and gc_malloc. Also, I think cmm does something similar (see ftp://ftp.di.unipi.it/pub/Papers/attardi/SPE.ps.gz ).
[boost] Re: shifted_ptr w/ lock mechanism
David Abrahams [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

> Philippe A. Bouchard [EMAIL PROTECTED] writes:
> > Lock mechanism was added to shifted_ptr:
> > http://groups.yahoo.com/group/boost/files/shifted_ptr.zip
> >
> > Benchmarks are also updated. Still shifted_ptr is using less memory and twice faster for reconstruction time.
>
> Almost.

benchmark-nothreads.txt is more precise than benchmark-yesthreads.txt because there is no switching between threads. But yes it is almost doubled (90%).

> > Notes:
> > - The first memory map report is not precise (shifted_ptr<U>).
> > - The reports were reordered (shifted_ptr<U>, shifted_ptr<T> & shared_ptr<T>).
> >
> > I believe there is not that much left to do besides optimizations.
>
> Have you tried a comparison against a shared_ptr using an optimized count allocator? Nobody has invested as much effort in optimizing shared_ptr as you are pouring into shifted_ptr, but an experiment I did years ago made a huge difference in the efficiency of shared_ptr just by implementing a crude allocator for count objects (took me about 10 minutes to code up).
>
> For me to find shifted_ptr convincing I'd have to see a noticeable performance improvement over using an optimized count allocator with shared_ptr.

I think the best way to optimize counts using shared_ptr is to derive the subject class from counted_base. This way shared_ptr should logically be exactly like shifted_ptr in terms of CPU cycles and memory usages. But you won't be able to use typenames easily and complex hierarchies will highly likely confront ambiguity for multiple ownership if virtual inheritance is not used.

It is just another solution involving different costs / benefits.

Regards,

Philippe A. Bouchard
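The intrusive-count idea mentioned above can be illustrated with a hand-rolled handle. This is a minimal sketch, not boost::detail::counted_base or boost::intrusive_ptr: `ref_counted`, `intrusive_handle` and `Widget` are hypothetical names, and the count is a plain `long` with no thread safety.

```cpp
#include <utility>

// Hypothetical base class that stores the count inside the object itself,
// so no separate count allocation is needed.
struct ref_counted {
    long use_count = 0;
    virtual ~ref_counted() = default;
};

// Hypothetical handle; T must derive (unambiguously) from ref_counted.
template <class T>
class intrusive_handle {
    T* p_;

public:
    explicit intrusive_handle(T* p = nullptr) : p_(p) {
        if (p_) ++p_->use_count;
    }
    intrusive_handle(const intrusive_handle& other) : p_(other.p_) {
        if (p_) ++p_->use_count;
    }
    // Copy-and-swap: the by-value parameter releases the old pointee.
    intrusive_handle& operator=(intrusive_handle other) {
        std::swap(p_, other.p_);
        return *this;
    }
    ~intrusive_handle() {
        if (p_ && --p_->use_count == 0) delete p_;
    }

    T* operator->() const { return p_; }
    long use_count() const { return p_ ? p_->use_count : 0; }
};

struct Widget : ref_counted {
    int value = 42;
};
```

The cost Philippe mentions shows up directly: if a class reached `ref_counted` through two non-virtual base paths, `p_->use_count` would be ambiguous, and types that cannot be modified to inherit from the base cannot be managed at all.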
[boost] Re: shifted_ptr w/ lock mechanism
Peter Dimov [EMAIL PROTECTED] wrote in message news:000901c2c7dc$e76195e0$1d00a8c0@pdimov2...

> From: Peter Dimov [EMAIL PROTECTED]
> > One easy way to estimate the impact of an optimized allocator is to #define BOOST_SP_USE_STD_ALLOCATOR, to make shared_ptr use std::allocator. On SGI derived STLs, std::allocator is usually faster than plain new.
>
> I tried to do that myself but benchmark.cpp doesn't compile for me, there's probably no timespec on Windows.

I have defined BOOST_SP_USE_STD_ALLOCATOR in benchmark.cpp (gcc 2.95):

Resources required by list<shifted_ptr<T> > (4000):
Arena 0:
system bytes     = 226820
in use bytes     = 226052
Total (incl. mmap):
system bytes     = 226820
in use bytes     = 226052
max mmap regions = 0
max mmap bytes   = 0
list<shifted_ptr<T> > took 0.0002951000 seconds to construct.
list<shifted_ptr<T> > took 7.1966276647 seconds to reconstruct 2000 times.
list<shifted_ptr<T> > took 5.0495961000 seconds to copy 2000 times.
list<shifted_ptr<T> > took 4.0016951000 seconds to sort 4000 times.
list<shifted_ptr<T> > took 0.1382300647 seconds to swap 500 times.
list<shifted_ptr<T> > took 0.0003241000 seconds to destroy.

Resources required by list<shared_ptr<T> > (4000):
Arena 0:
system bytes     = 325124
in use bytes     = 321988
Total (incl. mmap):
system bytes     = 325124
in use bytes     = 321988
max mmap regions = 0
max mmap bytes   = 0
list<shared_ptr<T> > took 0.0004259000 seconds to construct.
list<shared_ptr<T> > took 14.0157271000 seconds to reconstruct 2000 times.
list<shared_ptr<T> > took 5.0331178000 seconds to copy 2000 times.
list<shared_ptr<T> > took 4.0376376000 seconds to sort 4000 times.
list<shared_ptr<T> > took 0.1449102647 seconds to swap 500 times.
list<shared_ptr<T> > took 0.0004831000 seconds to destroy.

Philippe A. Bouchard
Re: [boost] Re: shifted_ptr w/ lock mechanism
Philippe A. Bouchard [EMAIL PROTECTED] writes:

> > Have you tried a comparison against a shared_ptr using an optimized count allocator? Nobody has invested as much effort in optimizing shared_ptr as you are pouring into shifted_ptr, but an experiment I did years ago made a huge difference in the efficiency of shared_ptr just by implementing a crude allocator for count objects (took me about 10 minutes to code up).
> >
> > For me to find shifted_ptr convincing I'd have to see a noticeable performance improvement over using an optimized count allocator with shared_ptr.
>
> I think the best way to optimize counts using shared_ptr is to derive the subject class from counted_base. This way shared_ptr should logically be exactly like shifted_ptr in terms of CPU cycles and memory usages. But you won't be able to use typenames easily and complex hierarchies will highly likely confront ambiguity for multiple ownership if virtual inheritance is not used.
>
> It is just another solution involving different costs / benefits.

That may be the best way to optimize counts using shared_ptr if you don't care about complex hierarchies and using typenames easily (not that I understand what it means). The things you have to give up by using an intrusive count with shared_ptr are things I care about, though. AFAICT, the fact that shifted_ptr is testing faster than shared_ptr when those features are available is the argument for shifted_ptr's existence in the first place.

What I'd like to know is whether shifted_ptr still provides any significant benefits over shared_ptr if you apply a simple optimization tweak to shared_ptr and use it in its most-flexible configuration. I don't think it's an unreasonable question, and I'll need the answer in order to decide whether something like shifted_ptr is worth accepting into Boost.

--
David Abrahams [EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
[boost] Re: shifted_ptr w/ lock mechanism
Philippe A. Bouchard [EMAIL PROTECTED] wrote in message news:b19hhg$i2m$[EMAIL PROTECTED]...

> [...]
> list<shifted_ptr<T> > took 0.0002951000 seconds to construct.
> list<shifted_ptr<T> > took 7.1966276647 seconds to reconstruct 2000 times.
> list<shifted_ptr<T> > took 5.0495961000 seconds to copy 2000 times.
> list<shifted_ptr<T> > took 4.0016951000 seconds to sort 4000 times.
> list<shifted_ptr<T> > took 0.1382300647 seconds to swap 500 times.
> list<shifted_ptr<T> > took 0.0003241000 seconds to destroy.
> [...]
> list<shared_ptr<T> > took 0.0004259000 seconds to construct.
> list<shared_ptr<T> > took 14.0157271000 seconds to reconstruct 2000 times.
> list<shared_ptr<T> > took 5.0331178000 seconds to copy 2000 times.
> list<shared_ptr<T> > took 4.0376376000 seconds to sort 4000 times.
> list<shared_ptr<T> > took 0.1449102647 seconds to swap 500 times.
> list<shared_ptr<T> > took 0.0004831000 seconds to destroy.

Looks like your lead is getting eroded by the day. ;) And that's just with a quick hack. You better be worried about a serious small object allocator. Not only that, but the items that seem most important to me are copy, sort, and swap, since those are the most frequent or computationally intensive. And in those three categories, you have virtually no advantage. In fact, shared_ptr beats shifted_ptr in copy??? What happened? The sort speed amounts to 1% difference. Even swap amounts to about a 5% diff. This is very telling. The biggest speed difference is during construction, and that is where shared_ptr is least optimized.

Dave
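The percentages quoted in this message can be re-derived from the timings above. The following is just a worked-arithmetic sketch; the `timing` struct, the `kTimings` table and `percent_slower` are names made up for the illustration, while the numbers are copied from the benchmark report.

```cpp
// One row per benchmark operation, times in seconds as reported above.
struct timing {
    const char* operation;
    double shifted_ptr_s;
    double shared_ptr_s;
};

constexpr timing kTimings[] = {
    {"reconstruct", 7.1966276647, 14.0157271000},
    {"copy",        5.0495961000,  5.0331178000},
    {"sort",        4.0016951000,  4.0376376000},
    {"swap",        0.1382300647,  0.1449102647},
};

// Percentage by which shared_ptr is slower than shifted_ptr for one row;
// a negative result means shared_ptr was the faster of the two.
constexpr double percent_slower(const timing& t) {
    return (t.shared_ptr_s - t.shifted_ptr_s) / t.shifted_ptr_s * 100.0;
}
```

Evaluated on these rows, reconstruct comes out at roughly 95% slower for shared_ptr, sort at about 0.9%, swap at about 4.8%, and copy at about -0.3% (shared_ptr marginally ahead), which matches the "1%", "about 5%" and "beats shifted_ptr in copy" remarks.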
Re: [boost] Re: shifted_ptr w/ lock mechanism
Philippe A. Bouchard [EMAIL PROTECTED] writes:

> Peter Dimov [EMAIL PROTECTED] wrote in message news:000901c2c7dc$e76195e0$1d00a8c0@pdimov2...
> > From: Peter Dimov [EMAIL PROTECTED]
> > > One easy way to estimate the impact of an optimized allocator is to #define BOOST_SP_USE_STD_ALLOCATOR, to make shared_ptr use std::allocator. On SGI derived STLs, std::allocator is usually faster than plain new.
> >
> > I tried to do that myself but benchmark.cpp doesn't compile for me, there's probably no timespec on Windows.
>
> I have defined BOOST_SP_USE_STD_ALLOCATOR in benchmark.cpp (gcc 2.95):
>
> Resources required by list<shifted_ptr<T> > (4000):
> Arena 0:
> system bytes     = 226820
> in use bytes     = 226052
> Total (incl. mmap):
> system bytes     = 226820
> in use bytes     = 226052
> max mmap regions = 0
> max mmap bytes   = 0
> list<shifted_ptr<T> > took 0.0002951000 seconds to construct.
> list<shifted_ptr<T> > took 7.1966276647 seconds to reconstruct 2000 times.
> list<shifted_ptr<T> > took 5.0495961000 seconds to copy 2000 times.
> list<shifted_ptr<T> > took 4.0016951000 seconds to sort 4000 times.
> list<shifted_ptr<T> > took 0.1382300647 seconds to swap 500 times.
> list<shifted_ptr<T> > took 0.0003241000 seconds to destroy.
>
> Resources required by list<shared_ptr<T> > (4000):
> Arena 0:
> system bytes     = 325124
> in use bytes     = 321988
> Total (incl. mmap):
> system bytes     = 325124
> in use bytes     = 321988
> max mmap regions = 0
> max mmap bytes   = 0
> list<shared_ptr<T> > took 0.0004259000 seconds to construct.
> list<shared_ptr<T> > took 14.0157271000 seconds to reconstruct 2000 times.
> list<shared_ptr<T> > took 5.0331178000 seconds to copy 2000 times.
> list<shared_ptr<T> > took 4.0376376000 seconds to sort 4000 times.
> list<shared_ptr<T> > took 0.1449102647 seconds to swap 500 times.
> list<shared_ptr<T> > took 0.0004831000 seconds to destroy.

Would you indulge me and try the benchmark again with the enclosed shared_count patch applied and #undef BOOST_SP_USE_STD_ALLOCATOR?
I don't really know what's going on under the covers in the SGI allocator; this is basically just the same hack I threw at the problem years ago. My patch doesn't pretend to work for a threaded implementation, so only the no-threads test applies.

*** shared_count.hpp.~1.31.~	Sun Jan 19 10:14:15 2003
--- shared_count.hpp	Wed Jan 29 17:28:11 2003
***************
*** 37,45 ****
--- 37,104 ----
  # pragma warn -8027     // Functions containing try are not expanded inline
  #endif
  
+ # include <deque>
+ # include <boost/type_traits/type_with_alignment.hpp>
+ # include <boost/type_traits/alignment_of.hpp>
+ 
  namespace boost
  {
+   namespace aux_
+   {
+ //    # include <iostream>
+ 
+     template <unsigned sz, unsigned align>
+     union freeblock
+     {
+         typename boost::type_with_alignment<align>::type aligner;
+         char bytes[sz];
+         freeblock *next;
+     };
+ 
+     template <unsigned sz, unsigned align>
+     struct allocator_impl
+     {
+         typedef freeblock<sz,align> block;
+ 
+         static std::deque<block> store;
+         static block* free;
+ 
+         static inline void* alloc()
+         {
+             block* x = free;
+             if (x)
+             {
+                 free = x->next;
+                 return x;
+             }
+             else
+             {
+                 store.resize(store.size() + 1);
+                 return &store.back();
+             }
+         }
+ 
+         static inline void dealloc(void* p_)
+         {
+             block* p = static_cast<block*>(p_);
+             p->next = free;
+             free = p;
+         }
+     };
+ 
+ 
+     template <unsigned sz, unsigned align>
+     std::deque<freeblock<sz,align> > allocator_impl<sz,align>::store;
+ 
+     template <unsigned sz, unsigned align>
+     freeblock<sz,align>* allocator_impl<sz,align>::free = 0;
+ 
+     template <class T>
+     struct quick_allocator : allocator_impl<sizeof(T), boost::alignment_of<T>::value>
+     {
+     };
+   }
  
  // Debug hooks
  #if defined(BOOST_ENABLE_SP_DEBUG_HOOKS)
***************
*** 272,277 ****
--- 331,346 ----
      void operator delete(void * p)
      {
          std::allocator<this_type>().deallocate(static_cast<this_type *>(p), 1);
+     }
+ #else
+     void * operator new(std::size_t)
+     {
+         return aux_::quick_allocator<this_type>::alloc();
+     }
+ 
+     void operator delete(void * p)
+     {
+         aux_::quick_allocator<this_type>::dealloc(p);
      }
  #endif

--
David Abrahams [EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
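For readers who want to try the free-list idea from the patch outside of shared_count.hpp, here is a minimal standalone sketch. It assumes a C++11 compiler and uses `alignas` in place of the Boost type traits; `free_block`, `block_pool` and `counter` are illustrative names, not the names used in Boost. Like the patch, it is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// A fixed-size block that doubles as a free-list link while it is unused.
template <std::size_t Size, std::size_t Align>
union free_block
{
    alignas(Align) unsigned char bytes[Size];
    free_block* next;
};

// One pool per (size, alignment) pair. Single-threaded only.
template <std::size_t Size, std::size_t Align>
struct block_pool
{
    typedef free_block<Size, Align> block;

    // Function-local statics avoid out-of-class static member definitions.
    static std::deque<block>& store() { static std::deque<block> s; return s; }
    static block*& free_list() { static block* head = nullptr; return head; }

    static void* alloc()
    {
        block*& head = free_list();
        if (block* x = head)
        {
            head = x->next;       // pop a previously freed block
            return x;
        }
        store().emplace_back();   // growing a deque at the end never moves old blocks
        return &store().back();
    }

    static void dealloc(void* p)
    {
        block* b = static_cast<block*>(p);
        b->next = free_list();    // push the block back onto the free list
        free_list() = b;
    }
};

// Hypothetical stand-in for the reference-count object.
struct counter
{
    long use_count = 1;

    static void* operator new(std::size_t n)
    {
        assert(n == sizeof(counter));
        return block_pool<sizeof(counter), alignof(counter)>::alloc();
    }

    static void operator delete(void* p)
    {
        if (p) block_pool<sizeof(counter), alignof(counter)>::dealloc(p);
    }
};
```

With this in place, `new counter` followed by `delete` and another `new counter` hands back the same block, since the most recently freed block is reused first. Memory is only returned to the system when the deque is destroyed at program exit.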
[boost] Re: shifted_ptr w/ lock mechanism
David Abrahams [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

Peter Dimov [EMAIL PROTECTED] writes:

From: David Abrahams [EMAIL PROTECTED]

Philippe A. Bouchard [EMAIL PROTECTED] writes:

Lock mechanism was added to shifted_ptr: http://groups.yahoo.com/group/boost/files/shifted_ptr.zip

Benchmarks are also updated. shifted_ptr still uses less memory and is twice as fast at reconstruction.

[snip]

One easy way to estimate the impact of an optimized allocator is to #define BOOST_SP_USE_STD_ALLOCATOR, to make shared_ptr use std::allocator. On SGI derived STLs, std::allocator is usually faster than plain new.

Yeah; I'm pretty sure that my specialized allocator was faster still, since it just allocated fixed-sized blocks and linked them back into a free-list. It was pretty trivial to implement on top of a std::deque of POD unions.

The latest C++ compilers come with fairly good allocators for small objects. I played with Loki's SmallObjAllocator, and even a heavily sped-up version didn't match the native allocators used in BCB or Intel C++ (it was still 30% slower and had no MT safety). I guess small-object optimisation was already provided, maybe mixed with tricks such as assembler optimisation or cache wizardry. OTOH, GCC 2.95.* was significantly slower than Loki.

$0.02
/Pavel
[boost] Re: shifted_ptr w/ lock mechanism
David Abrahams [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

Philippe A. Bouchard [EMAIL PROTECTED] writes:

I believe there is not that much left to do besides optimizations.

Have you tried a comparison against a shared_ptr using an optimized count allocator? Nobody has invested as much effort in optimizing shared_ptr as you are pouring into shifted_ptr, but an experiment I did years ago made a huge difference in the efficiency of shared_ptr just by implementing a crude allocator for count objects (took me about 10 minutes to code up). For me to find shifted_ptr convincing I'd have to see a noticeable performance improvement over using an optimized count allocator with shared_ptr.

My understanding is that shifted_ptr mandates allocating *your* objects inside *its* doped memory area by using placement new. (Philippe, please correct me if I'm wrong.)

This slashes many good uses of smart pointers, which include, but are not limited to (damn, I feel I talk like a lawyer :o)): object factories (unless they use shifted_ptr from the get-go), the Template Method pattern (ditto), cross-DLL communication, object brokers (COM/CORBA), working nicely with other allocators. All these idioms require you to grab a pointer to an object that was constructed elsewhere. If my understanding is correct, shifted_ptr is not able to do that, while shared_ptr is.

These are use cases that I personally find important, so, although there existed some precedent back in 1999, I did not consider using a custom allocator (that would allocate the object plus extra memory for bookkeeping) in mc++d. At any rate, anyone comparing shifted_ptr with shared_ptr should keep in mind that one has an apple flavor and the other an orange flavor.

It would be interesting to see how much the implementation of shifted_ptr is simplified if it's put into an Ownership policy of the Borg, er, loki::smart_ptr. I guess a lot. That would nicely let the users decide whether they can live with allocating objects with placement new and enjoy the additional performance. Of course, all that *after* fixing smart_ptr's buggy constructor :o).

Andrei
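The "pointer to an object that was constructed elsewhere" case can be illustrated with a short sketch. It uses std::shared_ptr, the standardized descendant of boost::shared_ptr, as an assumption for convenience; `widget`, `create_widget` and `destroy_widget` are hypothetical names standing in for a factory, DLL or object broker that owns the allocation policy.

```cpp
#include <cassert>
#include <memory>

struct widget { int id; };

// Hypothetical factory pair: the caller has no say in how the object is allocated.
inline widget* create_widget(int id) { return new widget{id}; }

// Counts destructions so the behaviour can be observed.
inline int& destroyed_count() { static int n = 0; return n; }

inline void destroy_widget(widget* w)
{
    ++destroyed_count();
    delete w;
}

// shared_ptr can adopt the foreign pointer because its count lives in a
// separate allocation; the custom deleter routes destruction to the factory.
inline std::shared_ptr<widget> adopt_widget(int id)
{
    return std::shared_ptr<widget>(create_widget(id), &destroy_widget);
}
```

A pointer that stores its count in front of the object cannot do this, because an object created by someone else's `new` has no bookkeeping header in front of it. The price shared_ptr pays for this flexibility is the second allocation discussed elsewhere in the thread.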
[boost] Re: shifted_ptr w/ lock mechanism
Andrei Alexandrescu [EMAIL PROTECTED] wrote in message news:b1a0uv$lju$[EMAIL PROTECTED]...

[...]

My understanding is that shifted_ptr mandates allocating *your* objects inside *its* doped memory area by using placement new. (Philippe, please correct me if I'm wrong.)

No, it works exactly this way. The reference count is allocated at the same time the object is. The returned pointer is shifted to the starting address of the object in question.

This slashes many good uses of smart pointers, which include, but are not limited to (damn, I feel I talk like a lawyer :o)): object factories (unless they use shifted_ptr from the get-go), the Template Method pattern (ditto), cross-DLL communication, object brokers (COM/CORBA), working nicely with other allocators. All these idioms require you to grab a pointer to an object that was constructed elsewhere. If my understanding is correct, shifted_ptr is not able to do that, while shared_ptr is.

shifted_ptr only works with shifted objects allocated with placement operator new (size_t, shifted_type const &). In theory it would be possible to replace operator delete (void *) so that it properly handles addresses not pointing to the beginning of a block (a hack), to implement this directly in a compiler, etc.

These are use cases that I personally find important, so, although there existed some precedent back in 1999, I did not consider using a custom allocator (that would allocate the object plus extra memory for bookkeeping) in mc++d. At any rate, anyone comparing shifted_ptr with shared_ptr should keep in mind that one has an apple flavor and the other an orange flavor.

Yes, exactly. shared_ptr has its advantages and so does shifted_ptr. I am not a judge either. I like shifted_ptr because the only cost involved is the use of the placement operator new.

It would be interesting to see how much the implementation of shifted_ptr is simplified if it's put into an Ownership policy of the Borg, er, loki::smart_ptr. I guess a lot. That would nicely let the users decide whether they can live with allocating objects with placement new and enjoy the additional performance. Of course, all that *after* fixing smart_ptr's buggy constructor :o).

That would be great, agreed. But I think it would require additional policies... @:)

Regards,

Philippe A. Bouchard
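The "count allocated together with the object, pointer shifted past it" layout described in this message can be sketched as follows. This is only an illustration of the general technique under stated assumptions, not the shifted_ptr implementation: `shifted_tag`, `shifted_header`, `header_of` and `shifted_destroy` are hypothetical names, and the real library selects its placement form with its own type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical tag that selects the placement form of operator new below.
struct shifted_tag {};

// Bookkeeping stored immediately before the object. Aligning it to
// max_align_t keeps the object that follows suitably aligned.
struct alignas(std::max_align_t) shifted_header
{
    long count;
};

// One allocation holds header + object; the returned pointer is shifted
// past the header so it points at the start of the object.
inline void* operator new(std::size_t size, shifted_tag)
{
    void* raw = std::malloc(sizeof(shifted_header) + size);
    if (!raw) throw std::bad_alloc();
    shifted_header* h = ::new (raw) shifted_header{0};
    return h + 1;
}

// Matching placement delete; called automatically only if a constructor throws.
inline void operator delete(void* p, shifted_tag) noexcept
{
    std::free(static_cast<shifted_header*>(p) - 1);
}

// Step back from the object to its header.
inline shifted_header* header_of(void* object)
{
    return static_cast<shifted_header*>(object) - 1;
}

// Destroy the object and release the whole block, starting at the header.
template <class T>
void shifted_destroy(T* p)
{
    p->~T();
    std::free(header_of(p));
}
```

An object would be created with `new (shifted_tag{}) int(42)` and its count reached through `header_of`. Calling `header_of` on a pointer that came from plain `new` would read memory that is not a header, which is the limitation raised in the quoted text: only objects created through the placement form can be managed this way.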
[boost] Re: shifted_ptr w/ lock mechanism
David B. Held [EMAIL PROTECTED] wrote in message news:b19io8$o05$[EMAIL PROTECTED]...

[...]

Looks like your lead is getting eroded by the day. ;) And that's just with a quick hack. You better be worried about a serious small object allocator.

Not only that, but the items that seem most important to me are copy, sort, and swap, since those are the most frequent or computationally intensive. And in those three categories, you have virtually no advantage. In fact, shared_ptr beats shifted_ptr in copy??? What happened? The sort speed amounts to 1% difference. Even swap amounts to about a 5% diff. This is very telling. The biggest speed difference is during construction, and that is where shared_ptr is least optimized.

Let's not forget that constructions and destructions are optimized by the compiler when used consecutively. Reconstruction is a more concrete example. Thus I can play with construction and destruction as well.

Swaps and copies should logically be similar, since only the count is incremented; the memory blocks are not affected.

Let's also not forget that every time I run the benchmark the results are different (I should use nice --20 as root), and there is a margin of error of +/-1 second.

Philippe A. Bouchard
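Since run-to-run variation is the concern here, one common way to tame it is to repeat each measurement and look at the spread rather than a single figure. The sketch below assumes C++11 and std::chrono::steady_clock; `timing_summary` and `time_repeated` are hypothetical names, and this is not how benchmark.cpp measures time.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Minimum, median and maximum wall-clock time over several runs, in seconds.
struct timing_summary
{
    double min_s;
    double median_s;
    double max_s;
};

// Runs `work` the requested number of times and summarizes the samples.
template <class F>
timing_summary time_repeated(F work, int runs)
{
    assert(runs > 0);
    std::vector<double> samples;
    samples.reserve(runs);
    for (int i = 0; i < runs; ++i)
    {
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(stop - start).count());
    }
    std::sort(samples.begin(), samples.end());
    return { samples.front(), samples[samples.size() / 2], samples.back() };
}
```

If the gap between two candidates is smaller than the spread between `min_s` and `max_s` for either of them, the difference is within the noise. The minimum is usually the least disturbed by other processes on the machine.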
[boost] Re: shifted_ptr w/ lock mechanism
Philippe A. Bouchard [EMAIL PROTECTED] wrote in message news:b16m42$7pv$[EMAIL PROTECTED]...

Lock mechanism was added to shifted_ptr: http://groups.yahoo.com/group/boost/files/shifted_ptr.zip

Benchmarks are also updated. shifted_ptr still uses less memory and is twice as fast at reconstruction.

Notes:
- The first memory map report is not precise (shifted_ptr<U>).
- The reports were reordered (shifted_ptr<U>, shifted_ptr<T>, shared_ptr<T>).

I believe there is not that much left to do besides optimizations.

... and besides true garbage collection.

Philippe A. Bouchard