Re: fork/exec for hooks scripts with a large FSFS cache

2012-11-19 Thread Thomas Åkesson
On 14 nov 2012, at 01:44, Daniel Shahaf wrote:
 Philip Martin wrote on Tue, Nov 13, 2012 at 21:30:00 +:
 Perhaps we could start up a separate hook script process before
 allocating the large FSFS cache and then delegate the fork/exec to that
 smaller process?
 
 If so, let's have that daemon handle all forks we might do, not just
 those related to hook scripts.  (Otherwise we'll run into the same
 problem as soon as we have another use-case for fork() in the server
 code --- such as calling svn_sysinfo__release_name().)

Looking at it from another perspective, perhaps the cache should live within a 
separate daemon? That would address not only the hook script problem but also 
the challenges of prefork processes typically required when combining 
Subversion and PHP.

Just a thought,
/Thomas Å.

Re: fork/exec for hooks scripts with a large FSFS cache

2012-11-19 Thread Stefan Fuhrmann
On Mon, Nov 19, 2012 at 3:45 PM, Thomas Åkesson tho...@akesson.cc wrote:

 On 14 nov 2012, at 01:44, Daniel Shahaf wrote:
  Philip Martin wrote on Tue, Nov 13, 2012 at 21:30:00 +:
  Perhaps we could start up a separate hook script process before
  allocating the large FSFS cache and then delegate the fork/exec to that
  smaller process?
 
  If so, let's have that daemon handle all forks we might do, not just
  those related to hook scripts.  (Otherwise we'll run into the same
  problem as soon as we have another use-case for fork() in the server
  code --- such as calling svn_sysinfo__release_name().)

 Looking at it from another perspective, perhaps the cache should live
 within a separate daemon? That would address not only the hook script
 problem but also the challenges of prefork processes typically required
 when combining Subversion and PHP.


For non-obvious reasons listed at the end of
http://svn.haxx.se/dev/archive-2012-11/0148.shtml,
this is not trivial to do. I may give it a try in 1.9
but there might be lock cleanup edge cases that
simply can't be handled in a portable way.

-- Stefan^2.

-- 
Certified & Supported Apache Subversion Downloads:
http://www.wandisco.com/subversion/download


Re: fork/exec for hooks scripts with a large FSFS cache

2012-11-16 Thread Philip Martin
Peter Samuelson pe...@p12n.org writes:

 [Philip Martin]
 If I configure a server with an FSFS cache that uses about 50% of
 available memory and then I use the server so the cache is in use I
 find that hook scripts fail to run because the fork/exec cannot
 allocate memory.

 You didn't say what OS this is.  It matters.  Some are configured to
 overcommit RAM allocation, because a lot of software tends to allocate
 far more RAM than it actually needs.  (fork() is only one case of
 this.)

I was using Linux with heuristic overcommit (overcommit_memory=0,
overcommit_ratio=50) and no swap.
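For context, the overcommit policy and the current commit accounting can be inspected on such a Linux box (standard procfs paths; the values are whatever the kernel reports):

```shell
# Overcommit policy: 0 = heuristic, 1 = always allow, 2 = strict accounting
cat /proc/sys/vm/overcommit_memory
# overcommit_ratio only takes effect in strict mode (overcommit_memory=2)
cat /proc/sys/vm/overcommit_ratio
# How much address space is committed vs. the kernel's limit
grep -E 'CommitLimit|Committed_AS' /proc/meminfo
```

Under heuristic overcommit with no swap, fork() momentarily doubles the parent's commitment, so a process already holding a cache of ~50% of RAM is exactly the case the heuristic may reject.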



Re: fork/exec for hooks scripts with a large FSFS cache

2012-11-15 Thread Peter Samuelson

[Philip Martin]
 If I configure a server with an FSFS cache that uses about 50% of
 available memory and then I use the server so the cache is in use I
 find that hook scripts fail to run because the fork/exec cannot
 allocate memory.

You didn't say what OS this is.  It matters.  Some are configured to
overcommit RAM allocation, because a lot of software tends to allocate
far more RAM than it actually needs.  (fork() is only one case of
this.)

The other reason the OS matters is that some OSes support a vfork()
call which would avoid this issue entirely.  Unfortunately it seems
apr_proc_create() has no option to use vfork() instead of fork(), even
for OSes (Linux and most Unixes) that support it.

Peter


Re: fork/exec for hooks scripts with a large FSFS cache

2012-11-15 Thread Greg Stein
On Nov 15, 2012 9:29 AM, Peter Samuelson pe...@p12n.org wrote:


 [Philip Martin]
  If I configure a server with an FSFS cache that uses about 50% of
  available memory and then I use the server so the cache is in use I
  find that hook scripts fail to run because the fork/exec cannot
  allocate memory.

 You didn't say what OS this is.  It matters.  Some are configured to
 overcommit RAM allocation, because a lot of software tends to allocate
 far more RAM than it actually needs.  (fork() is only one case of
 this.)

 The other reason the OS matters is that some OSes support a vfork()
 call which would avoid this issue entirely.  Unfortunately it seems
 apr_proc_create() has no option to use vfork() instead of fork(), even
 for OSes (Linux and most Unixes) that support it.

We can fix APR. If you're running a big cache, and hitting fork errors,
please upgrade to APR 1.6

Cheers,
-g


Re: fork/exec for hooks scripts with a large FSFS cache

2012-11-14 Thread Philip Martin
Stefan Fuhrmann stefan.fuhrm...@wandisco.com writes:

 philip.mar...@wandisco.com wrote:

 Perhaps we could start up a separate hook script process before
 allocating the large FSFS cache and then delegate the fork/exec to that
 smaller process?


 I wonder whether there is a way to pass a different
 cache setting to the sub-process.

I don't think this would work.  It's the fork itself that is failing: the
child process never comes into existence, so it never gets the chance to
use a smaller memory footprint.

Having hooks run in a separate process is complicated.  The process
would need to be multi-threaded, or multi-process, to avoid hooks
running in serial.  stdin/out/err would need to be handled
somehow. Pipes perhaps?  By passing file descriptors across a Unix
domain socket?

For now I think we just have to recommend that the system has sufficient
swap for the fork to work.  Once the child execs the hook the memory
footprint of the process goes down.  So as far as I can tell on my Linux
system nothing gets written to swap, it just has to exist when fork is
called.



Re: fork/exec for hooks scripts with a large FSFS cache

2012-11-14 Thread Greg Stein
On Wed, Nov 14, 2012 at 8:49 AM, Philip Martin
philip.mar...@wandisco.com wrote:
...
 Having hooks run in a separate process is complicated.  The process
 would need to be multi-threaded, or multi-process, to avoid hooks
 running in serial.  stdin/out/err would need to be handled
 somehow. Pipes perhaps?  By passing file descriptors across a Unix
 domain socket?

We could do whatever mod_cgid is doing.

But with that said: most hooks don't generate stdout or stderr. We
could ship over parameters and a stdin blob, and run the hook. This
simplified model would only work if it was acceptable to *not* return
stdout/err to the client. (anything could still be logged on the
server)

You don't really need multiprocess or multithread, if you run an async
network loop such as serf does. The child exit signal would pop the
network loop, allowing for examination of the result. The daemon would
get a hook request, fork/exec, and return the exit code. (heck, if the
stdout/err is small enough, it could be captured and returned in the
response)

IIRC, Apache httpd even has a subsystem to monitor these kinds of
daemons and keep them running.

Cheers,
-g


Re: fork/exec for hooks scripts with a large FSFS cache

2012-11-13 Thread Stefan Fuhrmann
On Tue, Nov 13, 2012 at 10:30 PM, Philip Martin
philip.mar...@wandisco.com wrote:

 If I configure a server with an FSFS cache that uses about 50% of
 available memory and then I use the server so the cache is in use I find
 that hook scripts fail to run because the fork/exec cannot allocate
 memory. The user sees:


I only discovered a few days ago that hook scripts
get their own host process instance. If they use
the same cache setup, that's a problem. Even if
there is enough memory, allocating the memory
alone may cause a performance hit.


 $ svn mkdir -mm http://localhost:/obj/repo/A
 svn: E165002: Failed to start
 '/home/pm/sw/subversion/obj/repo/hooks/pre-commit' hook

 and the apache log contains:

 [Tue Nov 13 21:14:28 2012] [error] [client ::1] Could not MERGE resource
 /obj/repo/!svn/txn/0-1 into /obj/repo.  [409, #0]
 [Tue Nov 13 21:14:28 2012] [error] [client ::1] An error occurred while
 committing the transaction.  [409, #165002]
 [Tue Nov 13 21:14:28 2012] [error] [client ::1] Failed to start
 '/home/pm/sw/subversion/obj/repo/hooks/pre-commit' hook  [409, #165002]
 [Tue Nov 13 21:14:28 2012] [error] [client ::1] Can't start process
 '/home/pm/sw/subversion/obj/repo/hooks/pre-commit': Cannot allocate memory
  [409, #12]

 If I increase the available memory by adding swap the hook script
 starts.

 Perhaps we could start up a separate hook script process before
 allocating the large FSFS cache and then delegate the fork/exec to that
 smaller process?


I wonder whether there is a way to pass a different
cache setting to the sub-process.

-- Stefan^2.



Re: fork/exec for hooks scripts with a large FSFS cache

2012-11-13 Thread Daniel Shahaf
Philip Martin wrote on Tue, Nov 13, 2012 at 21:30:00 +:
 Perhaps we could start up a separate hook script process before
 allocating the large FSFS cache and then delegate the fork/exec to that
 smaller process?

If so, let's have that daemon handle all forks we might do, not just
those related to hook scripts.  (Otherwise we'll run into the same
problem as soon as we have another use-case for fork() in the server
code --- such as calling svn_sysinfo__release_name().)

 


Re: fork/exec for hooks scripts with a large FSFS cache

2012-11-13 Thread Greg Stein
This problem is why mod_cgid exists. Been there. Done that. It creates a
daemon early, before threads/mem are allocated. The child CGI processes are
then fast to spawn.

Cheers,
-g

(*) http://httpd.apache.org/docs/2.4/mod/mod_cgid.html
On Nov 13, 2012 6:44 PM, Daniel Shahaf d...@daniel.shahaf.name wrote:

 Philip Martin wrote on Tue, Nov 13, 2012 at 21:30:00 +:
  Perhaps we could start up a separate hook script process before
  allocating the large FSFS cache and then delegate the fork/exec to that
  smaller process?

 If so, let's have that daemon handle all forks we might do, not just
 those related to hook scripts.  (Otherwise we'll run into the same
 problem as soon as we have another use-case for fork() in the server
 code --- such as calling svn_sysinfo__release_name().)

 