Hi Brian,

> I wonder is a 1MB buffer per socket a little aggressive?

Not as aggressive as I'd like it to be: the library does
not use the permission I gave it and still sends everything
in 8k chunks... ;-)

(But still a lot better than one system call per byte.)
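For illustration (the daemons in question are Haskell, but the mechanism is the same), here is a minimal Python sketch of requesting a 1MiB per-socket send buffer on a Unix domain socket. Note that the kernel only treats the request as a hint: on Linux it caps the value at net.core.wmem_max and then doubles it, so the effective size can differ from what was asked for.

```python
import socket

# Sketch: ask for a 1 MiB kernel send buffer on a Unix domain socket.
# The kernel may adjust the value (Linux caps it at net.core.wmem_max,
# then doubles it), and the library doing the writes may still split
# its payload into smaller chunks regardless of the buffer size.
sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1024 * 1024)

# Query the (possibly adjusted) effective size the kernel granted:
effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
sock.close()
```

This is why the buffer is only an upper bound on memory use: the kernel allocates it lazily, and a sender writing 8k at a time never fills it.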

> On a large-ish cluster you might have connections open from each node and 
> from each
> running job, so this could add up quickly, especially if you've limited
> RAM in dom0.

Note that the limit only applies to Haskell(!) daemons
talking over domain sockets(!) (and, in fact, only to the
sending side). So the maximum that can happen in parallel
is the following:

- one connection per non-queued job; with default settings, this
  is limited to 20. But we want to increase the number of jobs
  running in parallel, so let's say 50.
  (wconfd -> job)

- any number of parallel RAPI requests asking for large output;
  realistically there won't be that many "list instances" requests
  at the same time, say 5
  (luxid -> rapi)

- ditto for any interactive operations on the command line asking
  for lengthy output, say another 5
  (luxid -> cli)

Connections not affected include noded answering luxid
(receiving(!) over TCP), luxid asking for replication of job
files (only to master candidates (default: 10), over TCP, and
job files are small anyway), and jobs talking to nodes (python
to python).

So, even if worst comes to worst and the 1MiB is fully
used, we should be below 60MiB, which still fits into dom0.
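The worst case above works out as follows (all figures are the estimates from this mail, not measurements):

```python
# Back-of-the-envelope worst case for fully-used send buffers:
BUFFER_MIB = 1       # send buffer per connection
wconfd_to_job = 50   # parallel non-queued jobs (raised from the default 20)
luxid_to_rapi = 5    # simultaneous large RAPI listings
luxid_to_cli = 5     # simultaneous lengthy CLI queries

connections = wconfd_to_job + luxid_to_rapi + luxid_to_cli
total_mib = connections * BUFFER_MIB
print(total_mib)  # prints 60
```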

But, if you prefer, we can also reduce that size to something like
8k or 32k. I just decided to go for the maximum buffering we can
afford while still safely staying within the current dom0 limits.

What do you think?

Thanks,
Klaus

-- 
Klaus Aehlig
Google Germany GmbH, Dienerstr. 12, 80331 Muenchen
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschaeftsfuehrer: Matthew Scott Sucherman, Paul Terence Manicle
