Re: [opensource-dev] Mesh viewers and tcmalloc issues

2011-10-03 Thread Moriz Gupte
I am wondering whether anyone from Linden Lab can comment on this thread. This is
such an important issue for mesh viewers and the mesh-based economy. Many
mesh builders are reverting products to sculpty versions because the
majority of clients refuse to use the latest viewers that support mesh
because of the significant FPS drops.

On Mon, Oct 3, 2011 at 8:38 AM, Twisted Laws twisted_l...@hotmail.com wrote:

 The only thing I don't understand about all this is that if you clone and build
 viewer-development and 3p-google-perftools, LL_USE_TCMALLOC is
 undefined in both.

 In GooglePerfTools.cmake:
 set(TCMALLOC_FLAG -ULL_USE_TCMALLOC=1)
 and in the llcommon property pages:
 Undefine Preprocessor Definitions: LL_USE_TCMALLOC=1
 So it would appear that, at least by default, tcmalloc is not enabled. Am
 I misunderstanding this? Or does LL build their viewer with it defined
 and send the viewer out to the open-source world with it disabled?

 I followed Henri's message and built a viewer with
 EnvToDouble("TCMALLOC_RELEASE_RATE", 10000.0) (in 3p-google-perftools), and
 it seems to behave exactly the same as without it.

 Twisted

  Date: Sat, 1 Oct 2011 22:24:56 +0200
  From: sl...@free.fr
  To: opensource-dev@lists.secondlife.com
  Subject: [opensource-dev] Mesh viewers and tcmalloc issues

 
  Greetings,
 
  I noticed that all the viewers using tcmalloc (mesh viewers under Linux
  and Windows, since those use SSE2 and need memory-aligned malloc()
  and new() calls that their standard library does not provide) suffer
  from a serious problem: they never release memory back to the system,
  meaning that after visiting a crowded place and camming around a lot,
  the viewer can occupy 2.5 GB of memory, and even after you TP out to a
  skybox with almost no objects/textures and no avatar around, the viewer
  retains the full amount of allocated memory for itself.
 
  Worse, should you manage to keep the viewer from crashing for a full
  hour or so, its allocated memory will get so badly fragmented that it
  slows to a crawl and finally crashes, even in quiet sims.
 
  Bao Linden recently worked on private memory pools to work around these
  issues, but so far, despite his hard work, the result is less than
  satisfactory: the memory is still never released to the system, and the
  viewers using private memory pools crash every few minutes after issuing
  a warning:
  LLPluginProcessParent::poll: apr_pollset_poll failed with status 4
 
  Well, be happy, since I found an easy workaround for these problems
  while working on the Cool VL Viewer v1.26.1 (the mesh branch).
 
  tcmalloc is actually supposed to release back to the system the memory
  freed by the application using it, but it does so only after a certain
  number of memory blocks have been freed. There is an environment
  variable that you can set (TCMALLOC_RELEASE_RATE) to adjust the rate
  at which tcmalloc will release the freed blocks back to the system.
  In fact, this is not really a rate but a divisor: the number of freed
  blocks is divided by the rate (when it is non-zero; a rate of 0 means
  never release memory) and compared to a threshold. If the result is
  below the threshold, the freed blocks are released.
  The documentation for tcmalloc says that "Reasonable rates are in the
  range [0,10]", but even with a rate of 10, you never get the viewer to
  release more than a couple of hundred megabytes out of 2+ GB of allocated
  memory. It occurred to me that the algorithm tcmalloc uses is simply
  crippled!
 
  The good news is that if you pass an unreasonable rate, tcmalloc
  will finally release memory (the more unreasonable the rate, the more
  memory is released). With a rate of 10000 (yes, ten thousand), you
  get the viewer to release everything when it doesn't need it any more,
  which matches the behaviour of tcmalloc-less viewers.
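
To make the divisor behaviour described above concrete, here is a toy C++ model. It is not tcmalloc's actual code: the function name `pages_until_next_release` and both constants are assumptions chosen for illustration. The only point is that a larger rate shortens the wait between releases.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy model only: both constants are illustrative assumptions.
constexpr double kDelayScale = 1000.0;             // assumed scale factor
constexpr std::int64_t kMaxDelayPages = 1 << 20;   // assumed upper bound

// Returns how many pages must be freed before the next release attempt,
// or -1 when the rate is zero or negative (never release).
std::int64_t pages_until_next_release(double rate, std::int64_t released_pages) {
    assert(released_pages >= 0);
    if (rate <= 0.0) {
        return -1;
    }
    // The rate acts as a divisor: a bigger rate means a shorter wait.
    const double wait = (kDelayScale / rate) * static_cast<double>(released_pages);
    return static_cast<std::int64_t>(
        std::min(wait, static_cast<double>(kMaxDelayPages)));
}
```

Under these assumptions, a rate of 10 still waits for 100 freed pages per released page, whereas a rate of 10000 makes the wait round down to zero, which would fit the observation that only very large rates return memory promptly.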
 
  Since the Windows builds don't use a wrapper script to launch the
  viewer, it is however best to hardcode this new rate as the default
  one in tcmalloc itself. This is what I did for the Cool VL Viewer
  and it works like a charm. There is only one line to change in
  the tcmalloc source, in src/page_heap.cc:
  DEFINE_double(tcmalloc_release_rate,
                EnvToDouble("TCMALLOC_RELEASE_RATE", 10000.0),  // <--- HERE
                "Rate at which we release unused memory to the system.  "
                "Zero means we never release memory back to the system.  "
                "Increase this flag to return memory faster; decrease it "
                "to return memory slower.  Reasonable rates are in the "
                "range [0,10]");
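
The hardcoded default in that line only applies when the environment variable is unset. A minimal sketch of that lookup-with-fallback pattern follows. `env_to_double` is a hypothetical helper written for illustration, not the macro from the tcmalloc sources.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: read a floating-point setting from the environment,
// falling back to a compiled-in default when the variable is not set.
double env_to_double(const char* name, double fallback) {
    assert(name != nullptr);
    const char* value = std::getenv(name);
    return value ? std::strtod(value, nullptr) : fallback;
}
```

With a helper like this, a Linux wrapper script can still override the value per launch, while a Windows build that has no wrapper simply gets the compiled-in default.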
 
  Now, the viewer runs rock-stable (just like the non-mesh, tcmalloc-less
  version) and uses very reasonable amounts of memory. It also doesn't
  suffer from memory fragmentation any more, since that is transparently
  taken care of by the OS (via the page table and the PMMU of the CPU,
  something neither tcmalloc nor Bao's private memory pool can do since
  these are userspace code).
 
  For what it is worth...
 
  Henri.

Re: [opensource-dev] Mesh viewers and tcmalloc issues

2011-10-03 Thread Nicky Perian
http://dl.dropbox.com/u/7833186/libtcmalloc_minimal.dll.JPG (build directory
image) Is it possible that tcmalloc is used through this DLL, and that the -U
and the undefine prevent linking it statically?





Re: [opensource-dev] Mesh viewers and tcmalloc issues

2011-10-03 Thread Nicky D.
 In GooglePerfTools.cmake:
 set(TCMALLOC_FLAG -ULL_USE_TCMALLOC=1)
 and in the llcommon property pages:
 Undefine Preprocessor Definitions: LL_USE_TCMALLOC=1
 So it would appear that, at least by default, tcmalloc is not enabled. Am
 I misunderstanding this? Or does LL build their viewer with it defined
 and send the viewer out to the open-source world with it disabled?


That define is really only used for some statistics code in
llallocator.cpp. It does not influence whether tcmalloc is used.
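
The point that a preprocessor define only selects which code gets compiled, and says nothing about which allocator gets linked, can be sketched as follows. The function `guarded_code_compiled_in` is hypothetical and the guarded body is a placeholder, not code from llallocator.cpp.

```cpp
#include <cassert>

// When the build passes -U for the macro (or never defines it), the macro
// is absent. Treat that case as 0 for this sketch.
#ifndef LL_USE_TCMALLOC
#define LL_USE_TCMALLOC 0
#endif

// Hypothetical placeholder for code guarded by the define. Only this body
// changes with the macro; which malloc() the process calls is decided by
// what gets linked or loaded, not by this flag.
bool guarded_code_compiled_in() {
#if LL_USE_TCMALLOC
    return true;   // optional code path is part of this build
#else
    return false;  // optional code path was compiled out
#endif
}
```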
___
Policies and (un)subscribe information available here:
http://wiki.secondlife.com/wiki/OpenSource-Dev
Please read the policies before posting to keep unmoderated posting privileges


Re: [opensource-dev] Mesh viewers and tcmalloc issues

2011-10-03 Thread Armin Weatherwax
 That define is really only used for some statistics code in
 llallocator.cpp. It does not influence whether tcmalloc is used.
It's not clear where/why/when it is used. Any code path including tcmalloc
on Linux 32-bit isn't compiled, though the viewer crashes if it isn't linked,
which in my opinion at least needs a comment saying *that* just linking it makes a
difference. Even better would be a comment with e.g. a link to further
information about *why* just linking it makes a difference.
For Linux 64-bit, not linking tcmalloc makes no difference (everything is aligned
anyway, and I didn't see any difference in speed).
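
Whether an allocator returns pointers that satisfy the 16-byte alignment that aligned SSE2 loads and stores require can be checked directly. This is a small sketch; `is_aligned` and `kSseAlignment` are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Aligned SSE2 loads and stores require 16-byte-aligned addresses.
constexpr std::size_t kSseAlignment = 16;

// Returns true when ptr is a multiple of alignment.
// alignment must be a non-zero power of two.
bool is_aligned(const void* ptr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}
```

Logging `is_aligned(p, kSseAlignment)` for pointers returned by malloc() in a 32-bit and a 64-bit build would be one way to see whether the allocator in use makes the difference described here.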

Armin