On 24/02/2015 16:34, "Reuti" <[email protected]> wrote:
>
>>
>> Thanks for the reply. I kind of guessed this might be the answer. I
>> think the issue is that we're assuming that the allocated memory will be
>> available as a single block, whereas this isn't actually the case. A lot
>> of our applications rely on being able to allocate very large arrays which
>> absolutely have to be contiguous, and when you get a lot of the same job
>> type running on the same node then the effective memory can be exhausted
>> long before the actual physical memory has gone.
>
>I never ran into this with our applications. But I wouldn't say that one
>job tampers an other one. Each process gets its own virtual address
>space. Where this is in real memory (and contiguous or not) should be
>transparent to the application.
>
>Does the application allocate the memory once for the runtime of the job
>or are there several malloc() and free() inside each application's run?

OK, then something odd is going on. We're getting jobs which have a very
large h_vmem set, well over what they need, and the applications are
generating errors when they try to allocate contiguous blocks of memory.
Examples would be trying to allocate a very large vector in R (several GB)
or allocating an internal array of 2GB in one of our DNA mapping programs.

These same programs run fine when they're the only thing on the machine, so
it looks like a clash with some other running applications. The free
command says that the machine has plenty of available RAM, and qacct says
that the jobs are nowhere near the limit they asked for.

Simon.
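
One way to narrow this down is to check whether a single large allocation fails because of a per-process address-space limit rather than a shortage of physical RAM. The following is only a minimal sketch, assuming Linux and assuming that h_vmem reaches the job as an address-space limit (RLIMIT_AS), which this thread does not confirm. The helper name try_alloc and the sizes used are hypothetical, chosen for illustration.

```python
import subprocess
import sys

# Code run in a child process, so the limit never affects the caller.
# Assumption: Linux, where RLIMIT_AS caps the process's virtual address space.
CHILD = r"""
import resource
import sys

limit = int(sys.argv[1])
size = int(sys.argv[2])

# Lower only the soft limit, and never ask for more than the hard limit.
_, hard = resource.getrlimit(resource.RLIMIT_AS)
soft = limit if hard == resource.RLIM_INFINITY else min(limit, hard)
resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

try:
    buf = bytearray(size)  # one contiguous block of `size` bytes
    print("ok")
except MemoryError:
    print("MemoryError")
"""


def try_alloc(limit_bytes, size_bytes):
    """Try one contiguous allocation under an address-space limit.

    Hypothetical helper: returns "ok" or "MemoryError" as reported
    by the child process.
    """
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(limit_bytes), str(size_bytes)],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    gib = 1 << 30
    # A 2 GiB array under a 1 GiB address-space limit.
    print(try_alloc(1 * gib, 2 * gib))
    # A 64 MiB array under the same limit.
    print(try_alloc(1 * gib, 64 << 20))
```

If the same failure appears inside a job but not in a login shell, comparing the output of ulimit -v in both places would show whether a limit of this kind is in effect.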
