On Wed, Oct 05, 2016 at 03:28:06PM +0200, Raimo Niskanen wrote:

> On Mon, Oct 03, 2016 at 04:13:58PM +0200, Otto Moerbeek wrote:
> > On Mon, Oct 03, 2016 at 02:56:05PM +0200, Raimo Niskanen wrote:
> > 
> > > On Fri, Sep 30, 2016 at 01:02:10PM +0200, Otto Moerbeek wrote:
> > > > On Fri, Sep 30, 2016 at 09:10:21AM +0200, Raimo Niskanen wrote:
> > > > 
> > > > > On Wed, Sep 28, 2016 at 09:19:51AM +0200, Raimo Niskanen wrote:
> > > > > > Dear misc@
> > > > > > 
> > > > > > > I have searched the archives and read the documentation of
> > > > > > > login.conf(5), ksh(1):ulimit and can not find how to limit
> > > > > > > the amount of physical memory a process may use.
> > > > > > 
> > > > > > > I have the following limits where I have set down ulimit -m
> > > > > > > and ulimit -l to 10000 kbytes in an attempt to limit the
> > > > > > > process I spawn which is the Erlang VM.
> > > > > > 
> > > > > > $ ulimit -a
> > > > > > time(cpu-seconds)    unlimited
> > > > > > file(blocks)         unlimited
> > > > > > coredump(blocks)     unlimited
> > > > > > data(kbytes)         33554432
> > > > > > stack(kbytes)        8192
> > > > > > lockedmem(kbytes)    10000
> > > > > > memory(kbytes)       10000
> > > > > > nofiles(descriptors) 1024
> > > > > > processes            1024
> > > > > > 
> > > > > > > Note that the machine has got 8 GB of physical memory and
> > > > > > > 8 GB of swap and that I have set datasize=infinity in
> > > > > > > /etc/login.conf. I got datasize=33554432 which seems to be
> > > > > > > the same as kern.shminfo.shmmax.  The datasize is twice the
> > > > > > > physical memory + swap.
> > > > > > 
> > > > > > > Then I start the Erlang VM and tell it to allocate an address
> > > > > > > block of 30000 MByte for future use where it will store all
> > > > > > > literal data in the same block (this is a garbage collector
> > > > > > > optimization).  Not much of this data is actually used.
> > > > > > 
> > > > > > >  68196 beam     CALL  mmap(0,0x753000000,0<PROT_NONE>,0x1002<MAP_PRIVATE|MAP_ANON>,-1,0)
> > > > > >  68196 beam     RET   mmap 11871265173504/0xacbfe8b3000
> > > > > > 
> > > > > > > Note the protection flags on the block.  No access is allowed.
> > > > > > > This trick works just fine; here is what top says:
> > > > > > 
> > > > > > > load averages:  0.15,  0.13,  0.09         frerin.otp.ericsson.se 08:49:46
> > > > > > > 48 processes: 47 idle, 1 on processor                             up 13:49
> > > > > > > CPU0 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> > > > > > > CPU1 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> > > > > > > Memory: Real: 43M/636M act/tot Free: 7028M Cache: 508M Swap: 0K/8155M
> > > > > > > 
> > > > > > >   PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
> > > > > > > 68196 raimo      2    0   29G   15M sleep     poll      0:00  1.42% beam
> > > > > > 
> > > > > > > So I have a process with a data size of 29 GB on a machine
> > > > > > > with 16 GB memory + swap.  I have also tried to start an
> > > > > > > additional Erlang VM that also allocates 29 GB of virtual
> > > > > > > memory which also works.
> > > > > > 
> > > > > > > That this is allowed is just fine for me - this trick of
> > > > > > > allocating a "large enough" PROT_NONE memory to get one address
> > > > > > > range for some special data type is very useful for the Erlang
> > > > > > > VM.  But I wonder how to limit the actual memory use?  Setting
> > > > > > > down ulimit -m and ulimit -l to 10000 kbytes did not prevent
> > > > > > > this process from getting 15 MByte of "RES" memory...
> > > > > > 
> > > > > > > Is there some way to limit the actual amount of memory for a
> > > > > > > process when I need to set up the datasize to allow for large
> > > > > > > unused virtual memory blocks?
> > > > > 
> > > > > I have found clues in getrlimit,setrlimit(2):
> > > > > 
> > > > > >      RLIMIT_DATA     The maximum size (in bytes) of the data segment for a
> > > > > >                      process; this includes memory allocated via malloc(3)
> > > > > >                      and all other anonymous memory mapped via mmap(2).
> > > > > > :
> > > > > >      RLIMIT_RSS      The maximum size (in bytes) to which a process's
> > > > > >                      resident set size may grow.  This imposes a limit
> > > > > >                      on the amount of physical memory to be given to a
> > > > > >                      process; if memory is tight, the system will prefer
> > > > > >                      to take memory from processes that are exceeding
> > > > > >                      their declared resident set size.
> > > > > 
> > > > > > Now I try to figure out the implications of this...  If I set
> > > > > > the data size so the sum of the data sizes for all processes in
> > > > > > the system is larger than physical memory + swap, then any
> > > > > > process may allocate the last block of memory in the system so a
> > > > > > more important process later will fail to allocate?
> > > > 
> > > > yes.
> > > > 
> > > > > 
> > > > > > And the memoryuse limit is rather toothless since there is no
> > > > > > immediate check of this limit.  When the system gets low on
> > > > > > memory; is all that happens that processes that exceed their
> > > > > > memoryuse limit probably will get blocks swapped out?
> > > > 
> > > > > RLIMIT_DATA *is* enforced, but it could be that PROT_NONE memory is
> > > > > not counted. I don't know atm.
> > > 
> > > That PROT_NONE is not counted sounds just as we want it to be...
> > > 
> > > That RLIMIT_DATA *is* enforced does not rhyme with what I saw, or I do not
> > > > know what I saw...  As you can see above I had set ulimit -m 10000 (kbytes)
> > > and yet top reports RES 15M.  Is that not over the limit?  The PROT_NONE
> > > memory is reported in the 29GB entry by top.  I can easily within the
> > > erlang emulator construct a large list of integers that can not be in
> > > PROT_NONE memory and squeeze the RES entry up to above 10000M...
> > 
> > > RES vs non-RES is not something that matters for RLIMIT_DATA.
> > > RLIMIT_DATA only applies to anonymous mappings. SIZE and RES show both
> > > anonymous and non-anonymous data. I expect that you have
> > > some non-anonymous mappings as well. procmap(8) shows more details.
> 
> I have now investigated such a process looking like this in top:
>   PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
>   46266 raimo      2    0 1745M  528M idle      poll      0:08  0.00% beam
> 
> It has 737732 KB of read/write [anon] in 8335 blocks,
> and 1047284 KB of noprot [anon] in 5 blocks.
> 
> Both are way above ulimit -m of 10000 (kbytes).
> 
> > 
> > Looking again at your ulimit numbers. You have a high limit on data. So it is
> > no surprise you do not hit that limit.
> 
> Correct, that is no surprise.
> 
> > 
> > AFAIK, the memory limit *is* applied to RES, but only if there is a general
> > shortage of physical mem.
> 
> I think this must be the reason.  There is no general shortage on physical
> memory in this test.
> 
> But does that mean that if a process with ridiculously high data size but
> low memory limit would try to exhaust the physical+swap memory it would get
> failed mmaps when the system starts getting short on physical memory, so it
> would not be able to exhaust the physical memory?
> 
> And can I count on that even if there is a general shortage of physical
> memory a program with high data size will still be able to mmap large
> PROT_NONE memory areas?
> 
> -- 
> 
> / Raimo Niskanen, Erlang/OTP, Ericsson AB

I tried finding the code responsible for maintaining the resident set
limits in the kernel but have failed so far. So I cannot answer this in
more detail; maybe the code isn't there, or maybe I looked in the
wrong places.
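
As a sanity check on the figures quoted above, here is a minimal Python
sketch. It is not taken from the kernel or from the Erlang VM. The helper
names (fmt_limit, current_limits) are made up for illustration, and it
assumes Python's standard resource module, which wraps getrlimit(2). It
only shows that the 30000 MByte reservation fits under the data limit
while the reported RES is above the memory limit.

```python
import resource

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def fmt_limit(value):
    """Render an rlimit value the way ulimit does (hypothetical helper)."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // KIB} kbytes"


def current_limits():
    """Soft data and RSS limits of the calling process (hypothetical helper)."""
    data_soft, _hard = resource.getrlimit(resource.RLIMIT_DATA)
    rss_soft, _hard = resource.getrlimit(resource.RLIMIT_RSS)
    return {"data": fmt_limit(data_soft), "memory": fmt_limit(rss_soft)}


# Figures copied from the thread.
reservation = 0x753000000   # length argument of the mmap call in the ktrace output
datasize = 33554432 * KIB   # data(kbytes) from ulimit -a
memoryuse = 10000 * KIB     # memory(kbytes) from ulimit -a
res = 15 * MIB              # RES column for beam in top

if __name__ == "__main__":
    print("reservation:", reservation // MIB, "MiB")
    print("data limit: ", datasize // GIB, "GiB")
    print("reservation fits under data limit:", reservation < datasize)
    print("RES above memory limit:", res > memoryuse)
    print("limits of this process:", current_limits())
```

The comparison only restates the numbers in the thread. It says nothing
about whether the kernel counts PROT_NONE mappings against RLIMIT_DATA or
when RLIMIT_RSS is acted on.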

        -Otto
