Re: fork on processes with lots of memory

2016-02-26 Thread lwoodman

On 01/27/2016 10:09 PM, Hugh Dickins wrote:
> On Tue, 26 Jan 2016, Felix von Leitner wrote:
> > > Dear Linux kernel devs,
> > >
> > > I talked to someone who uses large Linux based hardware to run a
> > > process with huge memory requirements (think 4 GB), and he told me that
> > > if they do a fork() syscall on that process, the whole system comes to
> > > standstill. And not just for a second or two. He said they measured a 45
> > > minute (!) delay before the system became responsive again.
> >
> > I'm sorry, I meant 4 TB not 4 GB.
> > I'm not used to working with that kind of memory sizes.
> >
> > > Their working theory is that all the pages need to be marked copy-on-write
> > > in both processes, and if you touch one page, a copy needs to be made,
> > > and that just takes a while if you have a billion pages.
> > >
> > > I was wondering if there is any advice for such situations from the
> > > memory management people on this list.
> > >
> > > In this case the fork was for an execve afterwards, but I was going to
> > > recommend fork to them for something else that can not be tricked around
> > > with vfork.
> > >
> > > Can anyone comment on whether the 45 minute number sounds like it could
> > > be real? When I heard it, I was flabbergasted. But the other person
> > > swore it was real. Can a fork cause this much of a delay? Is there a way
> > > to work around it?
> > >
> > > I was going to recommend the fork to create a boundary between the
> > > processes, so that you can recover from memory corruption in one
> > > process. In fact, after the fork I would want to munmap almost all of
> > > the shared pages anyway, but there is no way to tell fork that.
>
> You might find madvise(addr, length, MADV_DONTFORK) helpful:
> that tells fork not to duplicate the given range in the child.
>
> Hugh


I don't know exactly what program they are running, but we test RHEL
with up to 24TB of memory and have not seen this problem.  I have
mmap()'d 12TB of private memory into a parent process, touched every
page, then forked a child which wrote to every page, thereby incurring
tons of ZFOD and COW faults.  It takes a while to process the 6 billion
faults, but the system didn't come to a halt.  The time I do see
significant pauses is when we overcommit RAM and swap space and get
into an OOMkill storm.

Attached is the program:




> > > Thanks,
> > > Felix
> > > PS: Please put me on Cc if you reply, I'm not subscribed to this mailing
> > > list.


/* Header names were stripped by the archive; these are inferred from the
 * visible calls and variables. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
    unsigned long siz, procs, itterations, cow;
    char *ptr1;
    char *i;
    int pid, j, k, status;

    /* argv[1] through argv[4] are all read below, so require all four. */
    if (argc != 5) {
        printf("bad args, usage: forkoff  #children #itterations cow:0|1\n");
        exit(-1);
    }
    /* argv[1] is the mapping size in GB. */
    siz = (unsigned long)atol(argv[1])*1024*1024*1024;
    procs = atol(argv[2]);
    itterations = atol(argv[3]);
    cow = atol(argv[4]);
    printf("mmaping %lu anonymous bytes\n", siz);
    ptr1 = (char *)mmap((void *)0, siz, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
    if (ptr1 == (char *)MAP_FAILED) {
        printf("ptr1 = %p\n", (void *)ptr1);
        perror("");
    }
    if (cow) {
        printf("priming parent for child COW faults\n");
        // This will cause the ZFOD faults in the parent & COW faults in the children.
        for (i=ptr1; i

Re: fork on processes with lots of memory

2016-01-27 Thread Hugh Dickins
On Tue, 26 Jan 2016, Felix von Leitner wrote:
> > Dear Linux kernel devs,
> 
> > I talked to someone who uses large Linux based hardware to run a
> > process with huge memory requirements (think 4 GB), and he told me that
> > if they do a fork() syscall on that process, the whole system comes to
> > standstill. And not just for a second or two. He said they measured a 45
> > minute (!) delay before the system became responsive again.
> 
> I'm sorry, I meant 4 TB not 4 GB.
> I'm not used to working with that kind of memory sizes.
> 
> > Their working theory is that all the pages need to be marked copy-on-write
> > in both processes, and if you touch one page, a copy needs to be made,
> > > and that just takes a while if you have a billion pages.
> 
> > I was wondering if there is any advice for such situations from the
> > memory management people on this list.
> 
> > In this case the fork was for an execve afterwards, but I was going to
> > recommend fork to them for something else that can not be tricked around
> > with vfork.
> 
> > Can anyone comment on whether the 45 minute number sounds like it could
> > > be real? When I heard it, I was flabbergasted. But the other person
> > swore it was real. Can a fork cause this much of a delay? Is there a way
> > to work around it?
> 
> > I was going to recommend the fork to create a boundary between the
> > processes, so that you can recover from memory corruption in one
> > process. In fact, after the fork I would want to munmap almost all of
> > the shared pages anyway, but there is no way to tell fork that.

You might find madvise(addr, length, MADV_DONTFORK) helpful:
that tells fork not to duplicate the given range in the child.

Hugh

> 
> > Thanks,
> 
> > Felix
> 
> > PS: Please put me on Cc if you reply, I'm not subscribed to this mailing
> > list.
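
A minimal sketch of the madvise() suggestion above, not taken from the
thread. It assumes Linux with glibc; the helper names map_region() and
child_sees_range() are hypothetical. The child probes the range with
msync(), which fails with ENOMEM when the range is not mapped, so the
check does not depend on catching a segfault.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map len bytes of private anonymous memory and touch every byte.
 * If dontfork is nonzero, mark the range MADV_DONTFORK.
 * Returns NULL on failure. (Hypothetical helper, not from the thread.) */
static char *map_region(size_t len, int dontfork)
{
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return NULL;
    memset(p, 0x5a, len);
    if (dontfork && madvise(p, len, MADV_DONTFORK) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}

/* Fork and let the child probe the range with msync().
 * Returns 1 if the child still has the range mapped, 0 if it does not,
 * and -1 on error. */
static int child_sees_range(char *p, size_t len)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(msync(p, len, MS_ASYNC) == 0 ? 1 : 0);
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With a page-aligned length, a region created with dontfork set should
report 0 from child_sees_range(), while an ordinary region reports 1.
The parent keeps its own mapping in both cases.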


Re: fork on processes with lots of memory

2016-01-26 Thread Mikael Pettersson
Felix von Leitner writes:
 > > Dear Linux kernel devs,
 > 
 > > I talked to someone who uses large Linux based hardware to run a
 > > process with huge memory requirements (think 4 GB), and he told me that
 > > if they do a fork() syscall on that process, the whole system comes to
 > > standstill. And not just for a second or two. He said they measured a 45
 > > minute (!) delay before the system became responsive again.
 > 
 > I'm sorry, I meant 4 TB not 4 GB.
 > I'm not used to working with that kind of memory sizes.

Make sure you have >>4TB physical if you're going to fork from a process
with a 4TB virtual address space.  (I'm assuming it's not sparse, but all
actually being used.)

Disable transparent hugepages (THP).  The internal book-keeping mechanisms
have been known to run amok with large RAM sizes causing severe performance
issues.  Maybe 4.x kernels are better, I haven't checked.

If you're using explicit hugepages and these kinds of RAM sizes, don't
bother with RHEL 6 or 7 kernels -- they're broken.  Vanilla 4.x kernels work.

We're also in the TB range, though not quite 4TB, and fork()ing from inside
such processes definitely works for us.  We do disable THP since it kills us
otherwise.

 > 
 > > Their working theory is that all the pages need to be marked copy-on-write
 > > in both processes, and if you touch one page, a copy needs to be made,
 > > and that just takes a while if you have a billion pages.
 > 
 > > I was wondering if there is any advice for such situations from the
 > > memory management people on this list.
 > 
 > > In this case the fork was for an execve afterwards, but I was going to
 > > recommend fork to them for something else that can not be tricked around
 > > with vfork.
 > 
 > > Can anyone comment on whether the 45 minute number sounds like it could
 > > be real? When I heard it, I was flabbergasted. But the other person
 > > swore it was real. Can a fork cause this much of a delay? Is there a way
 > > to work around it?
 > 
 > > I was going to recommend the fork to create a boundary between the
 > > processes, so that you can recover from memory corruption in one
 > > process. In fact, after the fork I would want to munmap almost all of
 > > the shared pages anyway, but there is no way to tell fork that.
 > 
 > > Thanks,
 > 
 > > Felix
 > 
 > > PS: Please put me on Cc if you reply, I'm not subscribed to this mailing
 > > list.

-- 
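
A rough sketch related to the THP advice above, under stated assumptions.
The sysfs path is the one used by mainline kernels (some distribution
kernels use a different name), and the prctl(PR_SET_THP_DISABLE) call is
a per-process alternative that the thread does not mention; it needs
Linux 3.15 or later. The function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Path used by mainline kernels; an assumption for any given system. */
#define THP_ENABLED_PATH "/sys/kernel/mm/transparent_hugepage/enabled"

/* Extract the active setting from a line such as "always madvise [never]".
 * Returns 0 and copies the bracketed word to out, or -1 if none fits. */
static int parse_thp_setting(const char *line, char *out, size_t outlen)
{
    const char *open = strchr(line, '[');
    const char *close = open ? strchr(open, ']') : NULL;
    size_t n;

    if (!open || !close)
        return -1;
    n = (size_t)(close - open - 1);
    if (n >= outlen)
        return -1;
    memcpy(out, open + 1, n);
    out[n] = '\0';
    return 0;
}

/* Read the system-wide THP mode into out. Returns 0 on success. */
static int read_thp_mode(char *out, size_t outlen)
{
    char line[128];
    FILE *f = fopen(THP_ENABLED_PATH, "r");

    if (!f)
        return -1;
    if (!fgets(line, sizeof line, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_thp_setting(line, out, outlen);
}

/* Ask the kernel not to use THP for this process. The flag is inherited
 * by children created with fork(). Returns 0 on success. */
static int disable_thp_for_this_process(void)
{
#ifdef PR_SET_THP_DISABLE
    return prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0);
#else
    return -1;
#endif
}
```

Disabling THP system-wide, which is what the message above describes,
is normally done by writing "never" to that sysfs file as root; the
prctl() route only affects the calling process and its descendants.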


Re: fork on processes with lots of memory

2016-01-26 Thread Borislav Petkov
+ linux-mm

On Tue, Jan 26, 2016 at 05:28:53PM +0100, Felix von Leitner wrote:
> > Dear Linux kernel devs,
> 
> > I talked to someone who uses large Linux based hardware to run a
> > process with huge memory requirements (think 4 GB), and he told me that
> > if they do a fork() syscall on that process, the whole system comes to
> > standstill. And not just for a second or two. He said they measured a 45
> > minute (!) delay before the system became responsive again.
> 
> I'm sorry, I meant 4 TB not 4 GB.
> I'm not used to working with that kind of memory sizes.
> 
> > Their working theory is that all the pages need to be marked copy-on-write
> > in both processes, and if you touch one page, a copy needs to be made,
> > and that just takes a while if you have a billion pages.
> 
> > I was wondering if there is any advice for such situations from the
> > memory management people on this list.
> 
> > In this case the fork was for an execve afterwards, but I was going to
> > recommend fork to them for something else that can not be tricked around
> > with vfork.
> 
> > Can anyone comment on whether the 45 minute number sounds like it could
> > be real? When I heard it, I was flabbergasted. But the other person
> > swore it was real. Can a fork cause this much of a delay? Is there a way
> > to work around it?
> 
> > I was going to recommend the fork to create a boundary between the
> > processes, so that you can recover from memory corruption in one
> > process. In fact, after the fork I would want to munmap almost all of
> > the shared pages anyway, but there is no way to tell fork that.
> 
> > Thanks,
> 
> > Felix
> 
> > PS: Please put me on Cc if you reply, I'm not subscribed to this mailing
> > list.
> 

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


Re: fork on processes with lots of memory

2016-01-26 Thread Felix von Leitner
> Dear Linux kernel devs,

> I talked to someone who uses large Linux based hardware to run a
> process with huge memory requirements (think 4 GB), and he told me that
> if they do a fork() syscall on that process, the whole system comes to
> standstill. And not just for a second or two. He said they measured a 45
> minute (!) delay before the system became responsive again.

I'm sorry, I meant 4 TB not 4 GB.
I'm not used to working with that kind of memory sizes.

> Their working theory is that all the pages need to be marked copy-on-write
> in both processes, and if you touch one page, a copy needs to be made,
> and that just takes a while if you have a billion pages.

> I was wondering if there is any advice for such situations from the
> memory management people on this list.

> In this case the fork was for an execve afterwards, but I was going to
> recommend fork to them for something else that can not be tricked around
> with vfork.

> Can anyone comment on whether the 45 minute number sounds like it could
> be real? When I heard it, I was flabbergasted. But the other person
> swore it was real. Can a fork cause this much of a delay? Is there a way
> to work around it?

> I was going to recommend the fork to create a boundary between the
> processes, so that you can recover from memory corruption in one
> process. In fact, after the fork I would want to munmap almost all of
> the shared pages anyway, but there is no way to tell fork that.

> Thanks,

> Felix

> PS: Please put me on Cc if you reply, I'm not subscribed to this mailing
> list.
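
As a sanity check on the "billion pages" figure for the corrected 4 TB
size, here is a small sketch of the arithmetic. It assumes 4 KiB base
pages and 8-byte page-table entries, as on x86-64; both are assumptions,
and the function names are hypothetical.

```c
#include <stdint.h>

/* Number of pages needed to cover `bytes`, rounding up. */
static uint64_t pages_needed(uint64_t bytes, uint64_t page_bytes)
{
    return (bytes + page_bytes - 1) / page_bytes;
}

/* Bytes of last-level page-table entries for that many pages. */
static uint64_t pte_bytes(uint64_t pages, uint64_t entry_bytes)
{
    return pages * entry_bytes;
}
```

For 4 TiB, pages_needed(4ULL << 40, 4096) is 2^30, about 1.07 billion
pages, and pte_bytes() of that with 8-byte entries is 8 GiB of
last-level page tables. With 2 MiB huge pages the same range needs only
2^21 (about 2.1 million) entries.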


fork on processes with lots of memory

2016-01-26 Thread Felix von Leitner
Dear Linux kernel devs,

I talked to someone who uses large Linux based hardware to run a
process with huge memory requirements (think 4 GB), and he told me that
if they do a fork() syscall on that process, the whole system comes to
standstill. And not just for a second or two. He said they measured a 45
minute (!) delay before the system became responsive again.

Their working theory is that all the pages need to be marked copy-on-write
in both processes, and if you touch one page, a copy needs to be made,
and that just takes a while if you have a billion pages.

I was wondering if there is any advice for such situations from the
memory management people on this list.

In this case the fork was for an execve afterwards, but I was going to
recommend fork to them for something else that can not be tricked around
with vfork.

Can anyone comment on whether the 45 minute number sounds like it could
be real? When I heard it, I was flabbergasted. But the other person
swore it was real. Can a fork cause this much of a delay? Is there a way
to work around it?

I was going to recommend the fork to create a boundary between the
processes, so that you can recover from memory corruption in one
process. In fact, after the fork I would want to munmap almost all of
the shared pages anyway, but there is no way to tell fork that.

Thanks,

Felix

PS: Please put me on Cc if you reply, I'm not subscribed to this mailing
list.
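
For the fork-then-execve case mentioned above, a minimal sketch of the
vfork() route. It is an illustration only, assuming Linux with glibc;
run_with_vfork() is a hypothetical name. The child borrows the parent's
address space until it calls exec or _exit, so the parent's page tables
are not copied.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] via vfork() + execvp(). Returns the program's exit status,
 * 127 if the exec failed, or -1 on other errors. */
static int run_with_vfork(char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* The child shares the parent's memory: only exec or _exit here. */
        execvp(argv[0], argv);
        _exit(127);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The parent is suspended until the child has called exec or _exit, which
is why this only helps when the child's sole job is to start another
program.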

