On Mon, Mar 19, 2012 at 10:20 AM, Matt Oates (Home) <[email protected]> wrote:
>
> On 16 March 2012 00:32, Ole Tange <[email protected]> wrote:
> > One of the problems with --load is that it only limits how many jobs
> > are started. So you may start way too many. This will give you a load
> > of 100:
> >
> > seq 100 | nice parallel -j0 --load 2.00 burnP6
> >
> > and that is most likely not what you want.
>
> Am I wrong in thinking you can just do -j 100% so that you never spawn
> more than maxload processes assuming one process load 1.0 on a single
> core? Can you not use -j 100% in conjunction with --load to prevent
> the overload on startup?

For CPU-hungry programs like 'burnP6' that would be true. But if the
program only uses 10% CPU (because it is waiting for network or disk
I/O), then we should be able to spawn more - preferably automatically
figuring out the "right" amount.

> > While some programs run multiple threads (and thus can give a load > 1
> > each) that is the exception. So in general I think we can assume one
> > job will at most give a load of 1.
>
> It would be nice to explicitly state the likely load per process
> though especially if you are the one setting it. I frequently run hmm
> building with concurrent threading per process and just do the maths
> myself, and am lucky that all the hosts have the same number of CPUs.
> Perhaps a flag like --is-threaded=4 or something to indicate the
> likely load per job?

I am not too happy about that. I would much prefer some automated way
of doing the right thing.

> > Currently load is only computed every 10 seconds. So we could
> > recompute every 10 seconds:
> >
> > number_of_concurrent_jobs = max_load - current_load +
> > number_of_concurrent_jobs
>
> Looks good, though I have a couple of questions: If this is negative
> are you going to kill processes rather than start them? What if it's
> always 0 even from the start are you just never going to run on this
> host?

As a user I would be very surprised if GNU Parallel started to kill my
jobs, and I try to design GNU Parallel to adhere to POLA:
http://en.wikipedia.org/wiki/Principle_of_least_astonishment

So if the result is < 1 it would mean: do not spawn new jobs, but wait
for running jobs to complete.

> > I believe it would be better than the current, but I am very open to
> > even better ideas.
>
> You are starting to get into the realm of needing to understand
> scheduling per host... Load might be reported for something with a
> different nice value than what you want to submit. So 100% load for
> something with <0 nice and you want to put something in for +19. In
> your equation above I would just add in something looking at the
> difference between parallel's jobs that are running and those that are
> ready/waiting. If all our jobs are running even under high load who
> cares, we have priority here so keep up with the max load. If half of
> our jobs are waiting then we might as well reduce spawning by half.

I did not understand this part.

> Best,
> Matt.

/Ole
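
To make the recompute rule discussed above concrete, here is a minimal sketch in Python (GNU Parallel itself is written in Perl, and this is not its implementation). The function names, the rounding down, and the clamping at zero are assumptions made for illustration; the only parts taken from the thread are the formula and the rule that a result below 1 means "spawn nothing and wait, never kill".

```python
import math


def next_job_limit(max_load, current_load, running_jobs):
    """Apply the proposed rule once (it would be re-run every 10 seconds).

    number_of_concurrent_jobs = max_load - current_load + number_of_concurrent_jobs

    The result is rounded down and clamped at 0 (both are assumptions).
    """
    target = max_load - current_load + running_jobs
    return max(0, math.floor(target))


def jobs_to_spawn(max_load, current_load, running_jobs):
    """How many new jobs to start right now.

    Never negative: when the limit is below the number of running jobs
    we only wait for jobs to finish, we do not kill anything.
    """
    limit = next_job_limit(max_load, current_load, running_jobs)
    return max(0, limit - running_jobs)


if __name__ == "__main__":
    # Idle-ish host, nothing running yet: room for one job.
    print(jobs_to_spawn(max_load=2.0, current_load=0.5, running_jobs=0))
    # The overload case from the thread: 100 jobs giving a load of 100.
    print(jobs_to_spawn(max_load=2.0, current_load=100.0, running_jobs=100))
    # Host already above max_load because of other users: wait.
    print(jobs_to_spawn(max_load=2.0, current_load=3.0, running_jobs=0))
```

Note that in this sketch a host whose load stays above max_load from the start never receives a job, which is the situation the second question in the quoted reply asks about.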
