Hi Pete,

On Apr 28, 2010, at 19:22 , Peter Elmer wrote:
> Hi,
>
> On Apr 28, 2010, at 18:58, Stephan Wiesand <stephan.wies...@desy.de> wrote:
>
>> On Apr 27, 2010, at 00:15 , Brett Viren wrote:
>>
>>> We recently started running our C++ analysis code on 64-bit SL5.3 and
>>> have been surprised to find the memory usage is about 2x what we are
>>> used to when running it on 32 bits. Comparing a few basic applications
>>> like sleep(1) shows similar memory usage. Others, like sshd, show only
>>> a 30% size increase (maybe that is subject to configuration differences
>>> between the two hosts).
>>>
>>> I understand that pointers must double in size, but the bulk of our
>>> objects are made of ints and floats, and these are 32/64-bit invariant.
>>> I found[1] that poorly defined structs containing pointers can bloat
>>> even on non-pointer data members, due to the padding needed to keep
>>> everything properly aligned. It would kind of surprise me if this is
>>> what is behind what we see.
>>>
>>> Does anyone have experience in understanding, or maybe even combating,
>>> this increase in a program's memory footprint when going to 64 bits?
>>
>> Is it real or virtual memory usage that's increasing beyond expectations?
>>
>> Example: glibc's locale handling code will behave quite differently in
>> the 64-bit case. In 32-bit mode, even virtual address space is a scarce
>> resource, while in 64-bit mode it isn't. So in the latter case, they
>> simply mmap the whole file providing the info for the locale in use,
>> while in the former they use a small address window they slide to the
>> appropriate position. The 64-bit case is simpler and thus probably less
>> code, more robust and easier to maintain. And it's probably faster. The
>> 32-bit case uses less *virtual* memory - but *real* memory usage is
>> about the same, since only those pages actually read will ever be paged
>> in. This has a dramatic effect on the VSZ of "hello world in python".
>> It does not on anything that really matters - in particular, checking
>> the memory footprints of sleep & co. is not very useful, because they're
>> really small compared to typical HEP analysis apps anyway.
>
> You can work around the locale thing for any batch application (for
> which that usually should not matter) by setting the LANG envvar to "C".
> For a single process this will only be about 50MB, though.

Yes, this is what I meant to say with "not anything that really matters".

> The big difference most of us saw was due to the linker forcing shared
> library text/data to align to 2MB, while we have very many very small
> (<<2MB) libraries.

Ah. I really didn't know about this one yet.

> You should see this explicitly if you do a 'pmap' of your process once
> it is running and has loaded all libraries. You'll see memory sections
> with no permissions next to those corresponding to libraries. Assuming
> you aren't using huge memory pages in your application, there is a
> linker option (I don't recall the name off the top of my head) in the
> SL5 binutils ld which allows you to reduce this.
>
> But what both of these things say is that VSIZE for 64bit is not a very
> good measure of how much memory an app really needs.

Right. Unfortunately, it's the only value that can actually be attributed to a single process on a system running multiple jobs.

> Taking out fake accounting things like the two above, our estimate is
> that our (CMS) applications typically only need 20-25% more memory at
> 64bit relative to 32bit. (From the small code size increase, data type
> increases for ptr's and whatnot, and some increase from
> overhead/alignment for live objects in the heap.)

Yes, this seems reasonable.

> We are actually preparing some proposals/recommendations about measuring
> memory use, as in addition to this VSIZE/64bit confusion, the
> introduction of "multicore" applications which share memory also
> misleads people...

Really looking forward to those.
This is a serious problem.

- Stephan

-- 
Stephan Wiesand
DESY -DV-
Platanenallee 6
15738 Zeuthen, Germany