Re: [opensource-dev] Mesh viewers and tcmalloc issues
I am wondering if anyone from Linden Lab can comment on this thread. This is such an important issue for mesh viewers and the mesh-based economy. Many mesh builders are reverting products back to sculpty versions because the majority of their clients refuse to use the latest viewers that support mesh, owing to the significant fps drops.

On Mon, Oct 3, 2011 at 8:38 AM, Twisted Laws twisted_l...@hotmail.com wrote:

> The only thing I don't understand about all this is that if you clone and build viewer-development and 3p-google-perftools, LL_USE_TCMALLOC is undefined in both.
>
> In GooglePerfTools.cmake:
>
>     set(TCMALLOC_FLAG -ULL_USE_TCMALLOC=1)
>
> and in the llcommon property pages, under Undefine Preprocessor Definitions:
>
>     LL_USE_TCMALLOC=1
>
> So it would appear that, at least by default, tcmalloc is not enabled. Am I understanding this wrong? Or does LL build their viewer with it defined and send the viewer out to the open source world with it disabled?
>
> I did follow Henri's message and built a viewer with EnvToDouble("TCMALLOC_RELEASE_RATE", 10000.0) (in 3p-google-perftools), and it seems to behave exactly the same as without it.
>
> Twisted
>
> Date: Sat, 1 Oct 2011 22:24:56 +0200
> From: sl...@free.fr
> To: opensource-dev@lists.secondlife.com
> Subject: [opensource-dev] Mesh viewers and tcmalloc issues
>
> Greetings,
>
> I noticed that all the viewers using tcmalloc (the mesh viewers under Linux and Windows, since those use SSE2 and need memory-aligned malloc() and new() calls that their standard library does not provide) suffer from a serious problem: they never release memory back to the system. After visiting a crowded place and camming around a lot, the viewer can occupy 2.5 GB of memory, and even after you TP out to a skybox with almost no objects or textures and no avatars around, the viewer keeps the full amount of allocated memory for itself.
>
> Worse, should you manage to keep the viewer from crashing for a full hour or so, its allocated memory will get so badly fragmented that it slows to a crawl and finally crashes, even in quiet sims.
>
> Bao Linden recently worked on private memory pools to work around these issues, but so far, despite his hard work, the result is less than satisfactory: the memory is still never released to the system, and the viewers using private memory pools crash every few minutes after issuing a warning:
>
>     LLPluginProcessParent::poll: apr_pollset_poll failed with status 4
>
> Well, be happy, since I found an easy workaround for these problems while working on the Cool VL Viewer v1.26.1 (the mesh branch).
>
> tcmalloc is actually supposed to release back to the system the memory freed by the application using it, but it does so only after a certain number of memory blocks have been freed. There is an environment variable (TCMALLOC_RELEASE_RATE) that you can set to adjust the rate at which tcmalloc releases the freed blocks back to the system. In fact, this is not really a rate but a divisor: the number of freed blocks is divided by the rate (when it is non-zero; a rate of 0 means never release memory) and compared to a threshold. If the result is below the threshold, the freed blocks are released.
>
> The tcmalloc documentation says that "Reasonable rates are in the range [0,10]", but even with a rate of 10, you never get the viewer to release more than a couple of hundred megabytes out of 2+ GB of allocated memory. It occurred to me that the algorithm tcmalloc uses is simply crippled!
>
> The good news is that if you pass an unreasonable rate, tcmalloc will finally release memory (the more unreasonable the rate, the more memory is released). With a rate of 10000 (yes, ten thousand), you get the viewer to release everything when it doesn't need it any more, which matches the behaviour of tcmalloc-less viewers.
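As a rough illustration of why a larger value releases memory sooner, here is a toy model of the divisor behaviour described above. It is a sketch under stated assumptions, not tcmalloc's code: the function name `pages_until_next_release` and all constants are invented for this example. The model assumes that, after each release, the allocator waits for a number of freed pages proportional to 1000/rate before it tries to release again.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Toy model only. Every name and constant below is an assumption made for
// illustration; none of them is copied from the tcmalloc sources.
const double kWaitScale = 1000.0;              // assumed scale factor
const std::int64_t kMaxWaitPages = 1 << 20;    // assumed upper bound on the wait

// Number of pages that must be freed before the next release is attempted,
// given the configured rate and how many pages the last release returned.
std::int64_t pages_until_next_release(double rate, std::int64_t released_pages) {
    if (rate <= 0.0) {
        // A rate of 0 means "never release": model it as an unbounded wait.
        return std::numeric_limits<std::int64_t>::max();
    }
    // The rate acts as a divisor: a larger rate gives a shorter wait.
    double wait = (kWaitScale / rate) * static_cast<double>(released_pages);
    wait = std::min(wait, static_cast<double>(kMaxWaitPages));
    return static_cast<std::int64_t>(wait);
}
```

In this model, with 8 pages returned by the last release, the wait is 8000 freed pages at a rate of 1, 800 at a rate of 10, and 0 at a rate of 10000, which is at least consistent with the observation that only an "unreasonable" rate makes the release effectively immediate.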
>
> Since the Windows builds don't use a wrapper script to launch the viewer, it is however best to hardcode this new rate as the default in tcmalloc itself. This is what I did for the Cool VL Viewer, and it works like a charm. There is only one line to change in the tcmalloc source, in src/page_heap.cc:
>
>     DEFINE_double(tcmalloc_release_rate,
>                   EnvToDouble("TCMALLOC_RELEASE_RATE", 10000.0),   <--- HERE
>                   "Rate at which we release unused memory to the system. "
>                   "Zero means we never release memory back to the system. "
>                   "Increase this flag to return memory faster; decrease it "
>                   "to return memory slower. Reasonable rates are in the "
>                   "range [0,10]");
>
> Now the viewer runs rock stable (just like the non-mesh, tcmalloc-less version) and uses very reasonable amounts of memory. It also doesn't suffer from memory fragmentation any more, since that is transparently taken care of by the OS (via the page table and the PMMU of the CPU, something neither tcmalloc nor Bao's private memory pools can do, since they are userspace code).
>
> For what it is worth...
>
> Henri.
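The default-with-override pattern in that one-line change can be sketched in a few lines of standard C++. The helper `env_to_double` below is a hypothetical stand-in written for this example; it is not tcmalloc's own EnvToDouble, whose definition is not shown in this thread. It returns the parsed value of an environment variable, or the compiled-in default when the variable is unset.

```cpp
#include <cstdlib>

// Hypothetical stand-in for illustration; not tcmalloc's EnvToDouble.
// Returns the value of the environment variable `name` parsed as a double,
// or `fallback` when the variable is not set.
double env_to_double(const char* name, double fallback) {
    const char* value = std::getenv(name);
    if (value == nullptr) {
        return fallback;
    }
    return std::strtod(value, nullptr);
}

// With the hardcoded default raised to 10000.0, an unset
// TCMALLOC_RELEASE_RATE yields 10000.0, while a user can still override it
// by setting the variable before launching the viewer.
double release_rate() {
    return env_to_double("TCMALLOC_RELEASE_RATE", 10000.0);
}
```

Changing the fallback rather than relying on the environment is what makes this work on Windows, where no wrapper script is available to set the variable before the viewer starts.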
Re: [opensource-dev] Mesh viewers and tcmalloc issues
http://dl.dropbox.com/u/7833186/libtcmalloc_minimal.dll.JPG (build directory image)

Is it possible that tcmalloc is used through this DLL, and that the -U and undefine only prevent linking it statically?

From: Moriz Gupte moriz.gu...@gmail.com
To: SLDEV opensource-dev@lists.secondlife.com
Sent: Monday, October 3, 2011 3:17 PM
Subject: Re: [opensource-dev] Mesh viewers and tcmalloc issues

> I am wondering if anyone from Linden Lab can comment on this thread. This is such an important issue for mesh viewers and the mesh-based economy. Many mesh builders are reverting products back to sculpty versions because the majority of their clients refuse to use the latest viewers that support mesh, owing to the significant fps drops.
Re: [opensource-dev] Mesh viewers and tcmalloc issues
> In GooglePerfTools.cmake:
>
>     set(TCMALLOC_FLAG -ULL_USE_TCMALLOC=1)
>
> and in the llcommon property pages, under Undefine Preprocessor Definitions:
>
>     LL_USE_TCMALLOC=1
>
> So it would appear that, at least by default, tcmalloc is not enabled. Am I understanding this wrong? Or does LL build their viewer with it defined and send the viewer out to the open source world with it disabled?

That define is really only used for some statistics code in llallocator.cpp. It does not influence whether tcmalloc is used or not.

___
Policies and (un)subscribe information available here: http://wiki.secondlife.com/wiki/OpenSource-Dev
Please read the policies before posting to keep unmoderated posting privileges
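To illustrate the distinction drawn in this reply: a preprocessor symbol can gate one optional code path without deciding which allocator ends up linked into the binary. The sketch below is hypothetical; `gated_path_compiled_in` and `describe_build` are invented names, and nothing here is taken from llallocator.cpp.

```cpp
#include <string>

// Hypothetical sketch; not the actual contents of llallocator.cpp.
// The macro only selects which branch is compiled. Whether tcmalloc
// replaces malloc()/free() is decided separately, at link time.
bool gated_path_compiled_in() {
#if defined(LL_USE_TCMALLOC) && LL_USE_TCMALLOC
    return true;   // optional code path compiled in
#else
    return false;  // optional code path compiled out
#endif
}

// Small helper that reports which branch this build contains.
std::string describe_build() {
    return gated_path_compiled_in() ? "gated path: on" : "gated path: off";
}
```

Built without the symbol defined, both functions report that the gated path is off, yet the program's allocator is whatever the linker was given.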
Re: [opensource-dev] Mesh viewers and tcmalloc issues
> That define is really only used for some statistics code in llallocator.cpp. It does not influence whether tcmalloc is used or not.

It's not clear where, why, or when it is used. On 32-bit Linux, none of the code paths that include tcmalloc are compiled, yet the viewer crashes if tcmalloc isn't linked. In my opinion, that at least needs a comment stating *that* just linking it makes a difference. Even better would be a comment with, e.g., a link to further information about *why* just linking it makes a difference.

On 64-bit Linux, not linking tcmalloc makes no difference (everything is aligned anyway, and I didn't see any difference in speed).

Armin
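The alignment remark can be checked directly. The sketch below uses a hypothetical helper, `is_aligned`, to test whether an address is a multiple of 16 bytes, which is what aligned SSE2 loads and stores require. It only demonstrates the check and makes no claim about what any particular malloc() returns.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper for illustration: true when `ptr` is a multiple of
// `alignment` bytes. `alignment` must be a power of two.
bool is_aligned(const void* ptr, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// A buffer forced to 16-byte alignment, as aligned SSE2 loads/stores need.
alignas(16) unsigned char g_sse_buffer[32];
```

Calling `is_aligned(std::malloc(64), 16)` in a 32-bit and in a 64-bit build would be one way to observe the difference described above.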