On 24.02.2015 at 17:44, Simon Andrews <[email protected]> wrote:
> On 24/02/2015 16:34, "Reuti" <[email protected]> wrote:
> 
>> 
>>> 
>>> Thanks for the reply.  I kind of guessed this might be the answer.  I
>>> think the issue is that we're assuming that the allocated memory will be
>>> available as a single block, whereas this isn't actually the case.  A
>>> lot
>>> of our applications rely on being able to allocate very large arrays
>>> which
>>> absolutely have to be contiguous, and when you get a lot of the same job
>>> type running on the same node then the effective memory can be exhausted
>>> long before the actual physical memory has gone.
>> 
>> I never ran into this with our applications. But I wouldn't say that one
>> job tampers with another one. Each process gets its own virtual address
>> space. Where this is in real memory (and contiguous or not) should be
>> transparent to the application.
>> 
>> Does the application allocate the memory once for the runtime of the job
>> or are there several malloc() and free() inside each application's run?
> 
> OK, then something odd is going on.  We're getting jobs which have very
> large h_vmem set, well over what they need, and applications are
> generating errors trying to allocate contiguous blocks of memory.
> Examples would be trying to allocate a very large vector in R (several GB)
> or allocating an internal array of 2GB in one of our DNA mapping programs.
> These same programs run fine when they're the only thing on the machine, so
> it looks like a clash with some other running applications.

With the same setting of h_vmem?
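As far as I know, h_vmem is enforced per process via setrlimit() on the virtual address space (RLIMIT_VMEM, i.e. RLIMIT_AS on Linux), so a single allocation that needs a contiguous range larger than what is still free below that limit will fail even though the machine has physical RAM to spare. A minimal sketch of what your applications may be running into (the 6 GiB figure is only an illustrative value, not taken from your jobs):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* Try to get one contiguous 6 GiB block, roughly what a large R
       vector or a mapping index would need. 6 GiB is only an example
       size. */
    size_t want = 6ULL * 1024 * 1024 * 1024;
    char *p = malloc(want);

    if (p == NULL) {
        /* Fails when the remaining *address space* under the limit is
           too small, even if free still shows plenty of physical RAM. */
        perror("malloc");
        return 1;
    }
    memset(p, 1, want);   /* touch it so it is really backed by memory */
    free(p);
    puts("got the block");
    return 0;
}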

-- Reuti


> free(1) says
> that the machine has plenty of available RAM, and qacct says that the jobs
> are nowhere near the limit they asked for.
> 
> Simon.
> 
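free(1) reports machine-wide physical memory and qacct only shows the job's maxvmem after the fact, while the failed allocation is checked against the per-process address-space limit at the moment of the malloc(). Something along these lines, run from inside a job script, would show the two numbers that actually matter here (a Linux-only sketch, again assuming the h_vmem request ends up in RLIMIT_AS):

#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;
    char line[256];
    FILE *f;

    /* The limit a large malloc() is checked against (soft limit). */
    if (getrlimit(RLIMIT_AS, &rl) == 0)
        printf("RLIMIT_AS soft: %llu kB\n",
               (unsigned long long)rl.rlim_cur / 1024);

    /* ...versus the address space the process has already mapped. */
    f = fopen("/proc/self/status", "r");
    if (f != NULL) {
        while (fgets(line, sizeof line, f) != NULL)
            if (strncmp(line, "VmSize:", 7) == 0)
                fputs(line, stdout);
        fclose(f);
    }
    return 0;
}

Running that right before the R or mapping step fails should show whether it's the limit or the physical memory that is exhausted.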

