Re: [CentOS] High memory needs
> Anyone using VIRT to make decisions about resource utilization is
> completely ignorant of its function.

I agree. I don't understand why Sun Grid Engine does exactly that.

I posted a full explanation of this problem and the solution we used here:
https://www.centos.org/modules/newbb/viewtopic.php?topic_id=39499

Thanks for your feedback, it was useful in isolating the problem.

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] High memory needs [SOLVED]
> did you test something like this?
>
> ls -al /usr/lib/locale/locale-archive
> lrwxrwxrwx 1 root root 9 Sep 3 12:35 /usr/lib/locale/locale-archive ->
> /dev/null

I didn't test it. I thought that at least the languages set to be used
must remain accessible.

Jeremie

> --
> LF
Re: [CentOS] High memory needs [SOLVED]
I finally found a solution to our problem. I think some people running,
like us, a combination of 64-bit CentOS and Sun Grid Engine could
encounter the same situation. Here is a detailed explanation, hope it
can be useful to someone!

The file /usr/lib/locale/locale-archive is a memory-mapped file used by
glibc (the GNU C library). It contains the locales used across the
system (for instance, for man pages). Memory-mapping lets a process
read the file as if it were in memory, avoiding the system calls needed
for disk reads, and therefore gives much faster access.

Memory-mapped files (as well as shared libraries) are counted as part
of the virtual memory of a process (see the "VIRT" field of the "top"
command). Therefore the part of the locale-archive file mapped to
memory adds to the virtual memory of every process that uses glibc
(basically everything), even though the file is in physical memory only
once. In other words, for every process the virtual memory
overestimates the real memory use by, at least, the part of the
locale-archive file that is memory-mapped.

Apparently, on 32-bit distros, and on most 64-bit distros, only a part
of the locale-archive file is mmapped. For instance on 32-bit CentOS:

$ pmap $$
b7689000 2048K r /usr/lib/locale/locale-archive

Only 2MB of the file is mmapped, while the file is actually ~54MB. Some
distros only install a small subset of languages; for instance on my
Ubuntu 12.04 this file is only ~3MB.

I still don't get the reason, but CentOS x86_64 6.2 (and 6.3) mmaps the
*entire* file:

$ pmap $$
7f217fae4000 96836K r /usr/lib/locale/locale-archive

Consequence: the virtual memory of every process that uses glibc is
overestimated by ~100MB. That, alone, is not a problem at all. It is
known that virtual memory is a bad estimate of the memory actually
used; the fact that it is largely overestimated does not matter much in
most cases.
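The mapping described above can also be inspected without pmap, reading
/proc directly. A minimal sketch (whether a locale-archive mapping shows
up, and its size, varies by distro):

```shell
#!/bin/sh
# Show this shell's VIRT (VmSize) and any locale-archive mappings it
# holds.  Reads /proc directly, so pmap is not required.
vsz_kb=$(awk '/^VmSize:/ {print $2}' /proc/$$/status)
echo "VIRT (VmSize): ${vsz_kb} kB"
# On distros that mmap the whole archive, one large mapping shows up
# here; on others the mapping is small or absent.
grep locale-archive /proc/$$/maps || echo "no locale-archive mapping"
```

On an affected CentOS 6.2/6.3 x86_64 box the grep line should show a
mapping of roughly the full ~95MB file.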
The problem, however, is that if you deal with a computing cluster that
uses Sun Grid Engine (we have version 6.2u5), it uses this value to
check whether jobs exceed their memory limit or not. The consequence is
a false impression that jobs require a huge amount of memory: SGE is
misled, and kills them.

A solution could be to restrict the number of languages when installing
glibc-common (which provides the locale-archive file), to get a smaller
locale-archive file. This should be possible with:

$ echo "%_install_langs ::" >> /etc/rpm/macros.lang
$ rpm -e glibc-common --nodeps
$ rm /usr/lib/locale/locale-archive
$ rpm -i

I don't know why, but we couldn't get glibc-common (2.12-1.80) to take
it into account: all languages were always installed. I read somewhere
that this is done on purpose, not sure...

The solution that worked for us is a post-install fix that consists of
removing all the languages we don't use. We checked with "locale" which
languages are used on the system, then removed all the others:

$ localedef --list-archive | grep -v -e "en_US" -e "de_DE" -e "en_GB" \
    | xargs localedef --delete-from-archive
$ mv /usr/lib/locale/locale-archive /usr/lib/locale/locale-archive.tmpl
$ build-locale-archive

After this, the size of the locale-archive file is ~4MB, and running a
single Bash instance does not show "107MB" to SGE anymore :-)

Et voilà!

Jérémie

2012/9/27 Les Mikesell :
> On Thu, Sep 27, 2012 at 10:46 AM, Gordon Messmer wrote:
>>
>>> I understand it may not be very precise, however I still don't
>>> understand the difference compared to other x64 distributions,
>>> under CentOS the value is 7 times higher!
>
> This might explain it:
> https://bugzilla.redhat.com/show_bug.cgi?id=156477
> The mmapped locale-archive contains all languages even though only the
> ones you use are accessed from it. Other distros split them and the
> installers only install what you want.
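For anyone wondering what SGE is effectively doing with that number,
here is an illustrative sketch of the check. The variable names and the
100MB limit are made up for the demo; this is not SGE code:

```shell
#!/bin/sh
# Illustrative only: compare a process's VIRT against a pretend
# h_vmem-style limit, the way the thread describes SGE doing it.
limit_kb=$((100 * 1024))   # pretend the job requested 100M
vsz_kb=$(awk '/^VmSize:/ {print $2}' /proc/$$/status)
if [ "$vsz_kb" -gt "$limit_kb" ]; then
    verdict="over limit"    # SGE would kill the job here
else
    verdict="within limit"
fi
echo "VIRT ${vsz_kb} kB, limit ${limit_kb} kB: ${verdict}"
```

With the full locale-archive mmapped, even a bare shell lands "over
limit" against a 100MB request, which is exactly the false kill
described above.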
>
>> 64 bit system:
>> 7f8ae84b7000 102580K r /usr/lib/locale/locale-archive
>>
>> 32 bit:
>> b7689000 2048K r /usr/lib/locale/locale-archive
>
> That's an interesting difference on its own, since the underlying
> files are about 95M and 54M respectively. Does the 32 bit kernel use
> some tricks to sparsely map files where the 64 bit one does it
> directly with page tables?
>
> --
> Les Mikesell
> lesmikes...@gmail.com
Re: [CentOS] High memory needs
We have a computing cluster running Sun Grid Engine, which uses this
value to check whether a process exceeds its memory limit or not. So
somehow I'm bound to consider it.

I installed a machine from scratch with CentOS 6.2 x64, nothing else.
I open a terminal, I run this simple bash script, and VIRT goes beyond
100MB for it.

I understand it may not be very precise, however I still don't
understand the difference compared to other x64 distributions: under
CentOS the value is 7 times higher!

2012/9/27 Gordon Messmer :
> On 09/26/2012 09:14 AM, Jérémie Dubois-Lacoste wrote:
>> 1. Run a python script and check the memory that
>> it requires (field "VIRT" of the "top" command).
>
> Don't use VIRT as a reference for memory used. RES is a better
> indication, but even that won't tell you anything useful about shared
> memory, and will lead you to believe that a process is using more
> memory than it is.
Re: [CentOS] High memory needs
You may have misunderstood. The detailed numbers I gave were obtained
on distributions that are not CentOS, and OK, a difference can make
sense between 32 and 64 bits. But on CentOS 6.2, 64-bit, I obtain:

SH:     103MB
PYTHON: 114MB
R:      200MB

This is from a freshly installed CentOS 6.2 machine, without anything
else, so the other components of our cluster are not involved here.
This machine has 16 cores.

This is MUCH more than other 64-bit distributions; as I wrote, the
ratios are between 2 and 7.

2012/9/26 Michael Hennebry :
> On Wed, 26 Sep 2012, m.r...@5-cent.us wrote:
>
>> Jérémie Dubois-Lacoste wrote:
>
>>> Python script:
>>>            Avg    Min    Max
>>> 32 bits   8500   5004  11132
>>> 64 bits  32800      3  36336
>>
>> 8500 * 2 = 17000
>> 5004 * 2 = 10008
>> 11132 * 2 = 22264
>>
>> So that ranges from 2-2.5 larger.
>
> Huh?
> 3*8500=25500 < 32800
> 3*5004=15012 < 3
> 3*11132=33396 < 36336
>
>>> R:
>>>            Avg    Min    Max
>>> 32 bits  26900  21000  33452
>>> 64 bits 100200  93008  97496
>
> 3*26900=80700 < 100200
> 4*21000=84000 < 93008
> 2.9*33452 = 97011 < 97496
>
> --
> Michael henne...@web.cs.ndsu.nodak.edu
> "On Monday, I'm gonna have to tell my kindergarten class,
> whom I teach not to run with scissors,
> that my fiance ran me through with a broadsword." -- Lily
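The 64-bit/32-bit ratios under discussion can be recomputed from the
posted averages (awk used purely as a calculator here):

```shell
# Recompute the 64-bit vs 32-bit ratio of the average VIRT figures
# posted in the thread, one per benchmark.
awk 'BEGIN {
    printf "bash:   %.1f\n", 12900 / 5400
    printf "python: %.1f\n", 32800 / 8500
    printf "R:      %.1f\n", 100200 / 26900
}'
```

This gives roughly 2.4x, 3.9x, and 3.7x, which supports Hennebry's
point that the gap is closer to 3x than the claimed 2-2.5x.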
Re: [CentOS] High memory needs
Hm, interesting suggestion. But it didn't change anything. :(

Thanks anyway,

Jérémie

2012/9/26 Adrian Sevcenco :
> On 09/26/12 19:14, Jérémie Dubois-Lacoste wrote:
>>
>> Dear All,
>
> Hi!
>
>> We recently reinstalled our computing cluster. We were using CentOS
>> 5.3 (32 bits). It is now CentOS 6.3 (64 bits), installed from the
>> CentOS 6.2 x64 CD, then upgraded to 6.3.
>>
>> We have some issues with the memory needs of our running jobs. They
>> require much more than before; it may be due to the switch from 32
>> to 64 bits, but to me this cannot explain the whole difference.
>
> it would seem that there is a malloc(glibc) behaviour ...
> i have seen on another list the advice to use:
> export MALLOC_ARENA_MAX=1
> export MALLOC_MMAP_THRESHOLD=131072
>
> in order to decrease the used memory ..
>
> HTH,
> Adrian
>
>> Here are our investigations.
>>
>> We used the following simple benchmark:
>>
>> 1. Run a python script and check the memory that
>> it requires (field "VIRT" of the "top" command).
>> This script is:
>>
>> import time
>> time.sleep(30)
>> print("done")
>>
>> 2. Similarly, run and check the memory of a simple
>> bash script:
>>
>> #!/bin/bash
>> sleep 30
>> echo "done"
>>
>> 3. Open an R session and check the memory used
>>
>> I asked 10 of our users to run these three things on their personal
>> PCs. They are running different distributions (mainly Ubuntu,
>> Slackware); half of them use a 32-bit system, the other half a
>> 64-bit one. Here is a summary of the results:
>>
>> Bash script:
>>            Avg    Min    Max
>> 32 bits   5400   4192   9024
>> 64 bits  12900      1  16528
>>
>> Python script:
>>            Avg    Min    Max
>> 32 bits   8500   5004  11132
>> 64 bits  32800      3  36336
>>
>> R:
>>            Avg    Min    Max
>> 32 bits  26900  21000  33452
>> 64 bits 100200  93008  97496
>>
>> (as a side remark, the difference between 32 and 64 bits is
>> surprisingly big to me...).
>>
>> Then we ran the same things on our CentOS cluster, getting
>> surprisingly high results.
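The suggested tunables can be tried per-process by exporting them
before launching the job; a quick sketch to check whether they change
VIRT at all (they made no difference in this case):

```shell
#!/bin/sh
# Set the glibc malloc tunables, then read the VmSize of a child
# process (awk here) that inherits them.  MALLOC_ARENA_MAX and
# MALLOC_MMAP_THRESHOLD are real glibc environment variables; the
# effect on VIRT depends entirely on the workload.
export MALLOC_ARENA_MAX=1
export MALLOC_MMAP_THRESHOLD=131072
vsz_kb=$(awk '/^VmSize:/ {print $2}' /proc/self/status)
echo "VmSize with tunables set: ${vsz_kb} kB"
```

Running the same command without the two exports and comparing the
figures shows whether the tunables help on a given box.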
>> I installed a machine from scratch with the CentOS CD (6.2 x64) to
>> be sure another component of the cluster was not playing a role. On
>> this freshly installed machine I get the following results:
>>
>> SH:     103MB
>> PYTHON: 114MB
>> R:      200MB
>>
>> So, compared to the highest of our users (among the 64-bit ones), we
>> have ratios of ~7, ~3, ~2, respectively.
>>
>> It is very problematic for us because many jobs now cannot run
>> properly: they lack memory on most of our computing nodes. So we
>> really cannot stand the situation...
>>
>> Do you see any reason for this? Do you have suggestions?
>>
>> Sincerely,
>>
>> Jérémie
[CentOS] High memory needs
Dear All,

We recently reinstalled our computing cluster. We were using CentOS
5.3 (32 bits). It is now CentOS 6.3 (64 bits), installed from the
CentOS 6.2 x64 CD, then upgraded to 6.3.

We have some issues with the memory needs of our running jobs. They
require much more than before; it may be due to the switch from 32 to
64 bits, but to me this cannot explain the whole difference.

Here are our investigations. We used the following simple benchmark:

1. Run a python script and check the memory that it requires (field
"VIRT" of the "top" command). This script is:

import time
time.sleep(30)
print("done")

2. Similarly, run and check the memory of a simple bash script:

#!/bin/bash
sleep 30
echo "done"

3. Open an R session and check the memory used.

I asked 10 of our users to run these three things on their personal
PCs. They are running different distributions (mainly Ubuntu,
Slackware); half of them use a 32-bit system, the other half a 64-bit
one. Here is a summary of the results:

Bash script:
           Avg    Min    Max
32 bits   5400   4192   9024
64 bits  12900      1  16528

Python script:
           Avg    Min    Max
32 bits   8500   5004  11132
64 bits  32800      3  36336

R:
           Avg    Min    Max
32 bits  26900  21000  33452
64 bits 100200  93008  97496

(As a side remark, the difference between 32 and 64 bits is
surprisingly big to me...)

Then we ran the same things on our CentOS cluster, getting
surprisingly high results. I installed a machine from scratch with the
CentOS CD (6.2 x64) to be sure another component of the cluster was
not playing a role. On this freshly installed machine I get the
following results:

SH:     103MB
PYTHON: 114MB
R:      200MB

So, compared to the highest of our users (among the 64-bit ones), we
have ratios of ~7, ~3, ~2, respectively.

It is very problematic for us because many jobs now cannot run
properly: they lack memory on most of our computing nodes. So we
really cannot stand the situation...

Do you see any reason for this? Do you have suggestions?
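The three measurements above can also be taken non-interactively
instead of reading top by eye. A sketch for the bash case (the same
pattern applies to the python and R tests):

```shell
#!/bin/sh
# Start the test program in the background and read its VmSize from
# /proc; VmSize is the same figure top reports as VIRT.
sleep 30 &
pid=$!
vsz_kb=$(awk '/^VmSize:/ {print $2}' "/proc/$pid/status")
echo "sleep: VIRT ${vsz_kb} kB"
kill "$pid"
```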
Sincerely,

Jérémie
Re: [CentOS] stupid bash question
I guess you could also avoid the expansion with:

if test -n "$(find . -maxdepth 1 -name "$NAME" -print -quit)"

2012/8/15 Steve Thompson :
> On Wed, 15 Aug 2012, Craig White wrote:
>
>> the relevant snippet is...
>>
>> NAME="*.mov"
>> cd $IN
>> if test -n "$(find . -maxdepth 1 -name $NAME -print -quit)"
>>
>> and if there is one file in this directory - ie test.mov, this works
>> fine
>>
>> but if there are two (or more) files in this directory - test.mov,
>> test2.mov
>>
>> then I get an error...
>> find: paths must precede expression
>
> The substitution of $NAME is expanding the wild card, giving you a
> single -name with two arguments. You probably want something like:
>
> NAME="\*.mov"
>
> Steve
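A self-contained reproduction of the pitfall and the quoted fix; the
directory and file names are made up for the demo:

```shell
#!/bin/sh
# Unquoted $NAME is glob-expanded by the shell before find runs, so
# find receives two patterns and fails with
# "find: paths must precede expression".  Quoting the variable keeps
# the pattern intact (inner quotes nest correctly inside "$(...)").
dir=$(mktemp -d)
touch "$dir/test.mov" "$dir/test2.mov"
NAME="*.mov"
if test -n "$(find "$dir" -maxdepth 1 -name "$NAME" -print -quit)"; then
    echo "found at least one .mov"
fi
rm -r "$dir"
```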