Re: [CentOS] High memory needs

2012-10-01 Thread Jérémie Dubois-Lacoste
> Anyone using VIRT to make decisions about resource utilization is
> completely ignorant of its function.

I agree. I don't understand why Sun Grid Engine does exactly that.
I posted a full explanation of this problem and the solution we used here:
https://www.centos.org/modules/newbb/viewtopic.php?topic_id=39499

Thanks for your feedback, it was useful to isolate the problem.
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] High memory needs [SOLVED]

2012-09-28 Thread Jérémie Dubois-Lacoste
> did you test something like this?
>
> ls -al /usr/lib/locale/locale-archive
> lrwxrwxrwx 1 root root 9 Sep  3 12:35 /usr/lib/locale/locale-archive -> 
> /dev/null

I didn't test it. I assumed that at least the locales configured for
use must remain accessible.

Jeremie

>
> --
> LF
>
>


Re: [CentOS] High memory needs [SOLVED]

2012-09-28 Thread Jérémie Dubois-Lacoste
I finally found a solution to our problem. I think other people who,
like us, run Sun Grid Engine on 64-bit CentOS could encounter the
same situation.
Here is a detailed explanation; I hope it is useful to someone!

The file /usr/lib/locale/locale-archive is a memory-mapped file used
by glibc (the GNU C library).  It contains the locale data for all
languages installed on the system.  Memory-mapping lets a process
read the file as if it were in memory, avoiding the system calls
otherwise needed for each disk read, and therefore gives much faster
access.  Memory-mapped files (as well as shared libraries) are
counted as part of the virtual memory of a process (see the "VIRT"
field of the "top" command).  Therefore, the mapped part of the
locale-archive file adds to the virtual memory of every process that
uses glibc (basically everything), even though it exists only once in
physical memory.  In other words, for every process, the virtual
memory overestimates the real memory of the process by at least the
size of the mapped part of the locale-archive file.

Apparently, on 32-bit distros, and on most 64-bit distros, only part
of the locale-archive file is mmapped. For instance, on 32-bit CentOS:
 $ pmap $$
 b7689000   2048K r  /usr/lib/locale/locale-archive
Only 2MB of the file is mmapped, while the file is actually ~54MB.
Some distros install only a small subset of languages; for instance,
on my Ubuntu 12.04 this file is only ~3MB.

I still don't understand why, but CentOS x86_64 6.2 (and 6.3) mmaps
the *entire* file:
 $ pmap $$
 7f217fae4000  96836K r  /usr/lib/locale/locale-archive
Consequence: the virtual memory of every process that uses glibc is
overestimated by ~100MB.
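
To see how much of a process's VIRT comes from this one mapping, the
pmap output can be summed with a small helper. This is only a sketch:
the function name locale_archive_kb is made up for illustration, and it
assumes pmap prints the mapping size in the second column with a
trailing "K", as in the output above.

```shell
# Sum the size (in kB) of all locale-archive mappings in pmap-style output.
# Expects lines of the form: ADDRESS SIZEK PERMS PATH, read from stdin.
locale_archive_kb() {
    awk '$NF ~ /locale-archive$/ { sub(/K$/, "", $2); total += $2 }
         END { print total + 0 }'
}

# Usage on a live system, for the current shell (needs pmap from procps).
if command -v pmap >/dev/null 2>&1; then
    pmap "$$" | locale_archive_kb
fi
```

On the CentOS x86_64 machine described above this should print a value
close to 96836; on a system that maps only part of the file it will be
much smaller.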

That alone is not a problem at all. Virtual memory is known to be a
poor estimate of the memory actually used, so the fact that it is
largely overestimated does not matter much in most cases.

The problem, however, is that Sun Grid Engine (we have version 6.2u5)
uses this value to check whether jobs on a computing cluster exceed
their memory limit. The consequence is that SGE is misled into
believing that jobs require a huge amount of memory, and kills them.

A solution could be to restrict the set of languages when installing
glibc-common (which provides the locale-archive file), so that the
locale-archive file is smaller.  This should be possible with:
 $ echo "%_install_langs ::" >> /etc/rpm/macros.lang
 $ rpm -e glibc-common --nodeps
 $ rm /usr/lib/locale/locale-archive
 $ rpm -i 
I don't know why, but we couldn't get glibc-common (2.12-1.80) to take
it into account. All languages were always installed.  I read
somewhere that this is done on purpose, but I'm not sure...

The solution that worked for us is a post-install fix that consists of
removing all the languages that we don't use.  We checked with "locale"
which languages are used on the system, then removed all the others:
 $ localedef --list-archive | grep -v -e "en_US" -e "de_DE" -e "en_GB" \
     | xargs localedef --delete-from-archive
 $ mv /usr/lib/locale/locale-archive /usr/lib/locale/locale-archive.tmpl
 $ build-locale-archive
After this, the size of the locale-archive file is ~4MB, and a single
Bash instance no longer shows up as "107MB" for SGE :-) Et voilà!
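
Before deleting anything, it may be worth checking which locales the
grep filter above would select. Here is a minimal sketch of such a dry
run. The function name locales_to_delete is hypothetical; it takes the
locale names to keep as arguments and reads the archive listing on
stdin, using the same substring matching as the grep command above.

```shell
# Print the locales that would be deleted, i.e. every input line that
# contains none of the names given as arguments.
locales_to_delete() {
    # One fixed-string pattern per line; grep treats a newline-separated
    # pattern list as alternatives.
    keep=$(printf '%s\n' "$@")
    grep -v -F -e "$keep"
}

# Demo on a short hand-written sample instead of the real archive list.
printf '%s\n' aa_DJ en_US.utf8 de_DE fr_FR | locales_to_delete en_US de_DE en_GB
```

On a real system the input would come from "localedef --list-archive"
instead of the sample list, and nothing is modified until the output is
passed on to "localedef --delete-from-archive".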


   Jérémie


2012/9/27 Les Mikesell :
> On Thu, Sep 27, 2012 at 10:46 AM, Gordon Messmer  wrote:
>>
>>> I understand it may not be very precise, however I still don't
>>> understand the difference compared to other x64 distributions,
>>> under CentOS the value is 7 times higher!
>
> This might explain it:
> https://bugzilla.redhat.com/show_bug.cgi?id=156477
> The mmapped local-archive contains all languages even though only the
> ones you use are accessed from it.  Other distros split them and the
> installers only install what you want.
>
>> 64 bit system:
>> 7f8ae84b7000 102580K r  /usr/lib/locale/locale-archive
>>
>> 32 bit:
>> b7689000   2048K r  /usr/lib/locale/locale-archive
>
> That's an interesting difference on its own, since the underlying
> files are about 95M and 54M respectively.  Does the 32 bit kernel use
> some tricks to sparsely map files where the 64 bit one does it
> directly with page tables?
>
> --
>   Les Mikesell
>  lesmikes...@gmail.com


Re: [CentOS] High memory needs

2012-09-27 Thread Jérémie Dubois-Lacoste
We have a computing cluster running Sun Grid Engine, which
uses this value to check whether a process exceeds its memory
limit. So I am bound to consider it.

I installed a machine from scratch with CentOS 6.2 x64 and nothing
else. When I open a terminal and run this simple bash script, VIRT
goes beyond 100MB for it.
I understand it may not be very precise; however, I still don't
understand the difference compared to other x64 distributions:
under CentOS the value is 7 times higher!




2012/9/27 Gordon Messmer :
> On 09/26/2012 09:14 AM, Jérémie Dubois-Lacoste wrote:
>> 1. Run a python script and check the memory that
>> it requires (field "VIRT" of the "top" command).
>
> Don't use VIRT as a reference for memory used.  RES is a better
> indication, but even that won't tell you anything useful about shared
> memory, and will lead you to believe that a process is using more memory
> than it is.
>


Re: [CentOS] High memory needs

2012-09-26 Thread Jérémie Dubois-Lacoste
You may have misunderstood.
The detailed numbers I gave were obtained on distributions other
than CentOS, and OK, the difference between 32 and 64 bits can
make sense there.
But on 64-bit CentOS 6.2, I obtain:
SH:      103MB
PYTHON:  114MB
R:       200MB

This is from a freshly installed CentOS 6.2 machine, without
anything else, so the other components of our cluster are not
involved here. This machine has 16 cores.
This is MUCH more than on other 64-bit distributions: as I wrote,
the ratios are between 2 and 7.


2012/9/26 Michael Hennebry :
> On Wed, 26 Sep 2012, m.r...@5-cent.us wrote:
>
>
>> Jérémie Dubois-Lacoste wrote:
>
>
>>> Python script:
>>>            Avg     Min    Max
>>> 32 bits    8500    5004   11132
>>> 64 bits    32800   3      36336
>>
>>
>> 8500 * 2 = 17000
>> 5004 * 2 = 10007
>> 11132 * 2 = 22264
>>
>> So that ranges from 2-2.5 larger.
>
>
> Huh?
> 3*8500=25500
>  < 32800
> 3*5004=15012
>  < 3
> 3*11132=33396
>   < 36336
>
>
>>> R:
>>>            Avg     Min    Max
>>> 32 bits    26900   21000  33452
>>> 64 bits    100200  93008  97496
>
>
> 3*26900= 80700
>   < 100200
> 4*21000=84000
>   < 93008
> 2.9*33452< 97011
>  < 97496
>
> --
> Michael   henne...@web.cs.ndsu.nodak.edu
> "On Monday, I'm gonna have to tell my kindergarten class,
> whom I teach not to run with scissors,
> that my fiance ran me through with a broadsword."  --  Lily


Re: [CentOS] High memory needs

2012-09-26 Thread Jérémie Dubois-Lacoste
Hm, interesting suggestion, but it didn't change anything. :(
Thanks anyway,

Jérémie



2012/9/26 Adrian Sevcenco :
> On 09/26/12 19:14, Jérémie Dubois-Lacoste wrote:
>>
>> Dear All,
>
> Hi!
>
>
>> We recently reinstalled our computing cluster.  We were using CentOS
>> 5.3 (32 bits).  It is now CentOS 6.3 (64 bits), installed from the
>> CentOS 6.2 x64 CD, then upgraded to 6.3.
>>
>> We have some issues with the memory needs of our running jobs. They
>> require much more than before, it may be due to the switch from 32 to
>> 64 bits, but to me this cannot explain the whole difference.
>
> it would seem that this is a malloc (glibc) behaviour ...
> I have seen advice on another list to use:
> export MALLOC_ARENA_MAX=1
> export MALLOC_MMAP_THRESHOLD_=131072
>
> in order to decrease the memory used ..
>
> HTH,
> Adrian


[CentOS] High memory needs

2012-09-26 Thread Jérémie Dubois-Lacoste
Dear All,

We recently reinstalled our computing cluster.  We were using CentOS
5.3 (32 bits).  It is now CentOS 6.3 (64 bits), installed from the
CentOS 6.2 x64 CD, then upgraded to 6.3.

We have some issues with the memory needs of our running jobs. They
require much more than before. This may be due to the switch from 32
to 64 bits, but to me that cannot explain the whole difference.

Here are our investigations.

We used the following simple benchmark:

1. Run a python script and check the memory that
it requires (field "VIRT" of the "top" command).
This script is:

import time
time.sleep(30)
print("done")


2. Similarly, run and check the memory of a simple
bash script:

#!/bin/bash
sleep 30
echo "done"


3. Open an R session and check the memory used.
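
The VIRT reading used in the steps above can also be taken without an
interactive top session. The following is a minimal sketch, assuming a
ps that supports the "vsz" and "rss" output fields (as procps does); it
measures the shell that runs it, and the variable names are only for
illustration.

```shell
# Virtual size (VSZ, shown as VIRT by top) and resident size (RSS, shown
# as RES by top) of the current shell, both in kB.
vsz_kb=$(ps -o vsz= -p "$$" | tr -d ' ')
rss_kb=$(ps -o rss= -p "$$" | tr -d ' ')
echo "VIRT=${vsz_kb} kB RES=${rss_kb} kB"
```

To measure one of the benchmark scripts instead, start it in the
background and pass its PID to ps in place of "$$".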


I asked 10 of our users to run these three things on their personal
PCs. They run different distributions (mainly Ubuntu and Slackware);
half of them use a 32-bit system, the other half a 64-bit one.  Here
is a summary of the results:

Bash script:
           Avg     Min    Max
32 bits    5400    4192   9024
64 bits    12900   1      16528

Python script:
           Avg     Min    Max
32 bits    8500    5004   11132
64 bits    32800   3      36336

R:
           Avg     Min    Max
32 bits    26900   21000  33452
64 bits    100200  93008  97496

(as a side remark, the difference between 32 and 64 is surprisingly
big to me...).

Then we ran the same things on our CentOS cluster, getting
surprisingly high results. I installed a machine from scratch with the
CentOS CD (6.2 x64) to be sure another component of the cluster was
not playing a role. On this freshly installed machine I get the
following results:
SH:      103MB
PYTHON:  114MB
R:       200MB

So, compared to the highest of our users (among the 64 bits ones), we
have a ratio of ~7, ~3, ~2, respectively.
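
The ratios can be recomputed from the figures above. This sketch
assumes that the users' values are in kB (the unit top uses for VIRT by
default) and that 1MB is 1024 kB; neither is stated in the tables. The
helper name ratio is made up for illustration.

```shell
# Ratio between a CentOS VIRT value (MB) and the highest 64-bit value
# reported by the other users (assumed to be in kB).
ratio() {
    awk -v mb="$1" -v kb="$2" 'BEGIN { printf "%.1f\n", mb * 1024 / kb }'
}

ratio 103 16528   # bash script
ratio 114 36336   # Python script
ratio 200 97496   # R session
```

Under these assumptions the three ratios come out at about 6.4, 3.2 and
2.1, consistent with the rounded values quoted above.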


This is very problematic for us: many jobs can no longer run
properly because they run out of memory on most of our computing
nodes. We really cannot leave things as they are...

Do you see any reason for this? Do you have suggestions?

Sincerely,

   Jérémie


Re: [CentOS] stupid bash question

2012-08-15 Thread Jérémie Dubois-Lacoste
I guess you could also avoid the expansion by quoting the variable:
if test -n "$(find . -maxdepth 1 -name "$NAME" -print -quit)"
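
Here is a small self-contained check of the quoted form, as a sketch
only: it creates a scratch directory with the two file names from the
report (the directory and the variable "first" are just for this
demonstration) and relies on the same -maxdepth and -quit options of
find as the original snippet.

```shell
# Scratch directory with two matching files, the case that failed.
dir=$(mktemp -d)
touch "$dir/test.mov" "$dir/test2.mov"

NAME="*.mov"
# Quoted expansion: the shell does not glob, so find receives the
# pattern *.mov as a single argument and does the matching itself.
first=$(cd "$dir" && find . -maxdepth 1 -name "$NAME" -print -quit)
if test -n "$first"; then
    echo "found: $first"
fi

rm -rf "$dir"
```

With the unquoted $NAME, the shell would expand the pattern to both file
names before find runs, which is what triggers the "paths must precede
expression" error.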

2012/8/15 Steve Thompson :
> On Wed, 15 Aug 2012, Craig White wrote:
>
>> the relevant snippet is...
>>
>> NAME="*.mov"
>> cd $IN
>> if test -n "$(find . -maxdepth 1 -name $NAME -print -quit)"
>>
>> and if there is one file in this directory - ie test.mov, this works fine
>>
>> but if there are two (or more) files in this directory - test.mov, test2.mov
>>
>> then I get an error...
>> find: paths must precede expression
>
> The substitution of $NAME is expanding the wild card, giving you a single
> -name with two arguments. You probably want something like:
>
> NAME="\*.mov"
>
> Steve