Hi Brian,

> I wonder is a 1MB buffer per socket a little aggressive?
Not as aggressive as I'd like it to be, as the library does not use the
permission I gave it and still sends everything in 8k chunks... ;-)
(But still a lot better than one system call per byte.)

> On a large-ish cluster you might have connections open from each node and
> from each running job, so this could add up quickly, especially if you've
> limited RAM in dom0.

Note that this limit only applies to Haskell(!) daemons talking over domain
sockets(!) (and, in fact, only to the sending side of those daemons). So the
maximum that can happen in parallel is the following:

- one connection per non-queued job; with default settings, this is limited
  to 20, but we want to increase the number of jobs running in parallel, so
  let's say 50 (wconfd -> job)
- any number of parallel RAPI requests asking for large output; realistically
  there won't be that many "list instances" requests at the same time, say 5
  (luxid -> rapi)
- ditto for any interactive operations on the command line asking for lengthy
  output, say another 5 (luxid -> cli)

Connections not affected include noded answering luxid (receiving(!) over
TCP), luxid asking for replication of job files (only to master candidates
(default: 10), over TCP; also, job files are small), and jobs talking to
nodes (Python to Python).

So, even if the worst comes to the worst and the 1MiB is fully used, we
should stay below 60MiB, which still fits into dom0. But if you prefer, we
can also reduce that size to something like 8k or 32k; I just decided to go
for the maximum buffering we can afford while still safely staying within
the current dom0 limits.

What do you think?

Thanks,
Klaus

-- 
Klaus Aehlig
Google Germany GmbH, Dienerstr. 12, 80331 Muenchen
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschaeftsfuehrer: Matthew Scott Sucherman, Paul Terence Manicle
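P.S.: the worst-case budget above can be double-checked with a quick sketch;
the per-category connection counts are the assumptions from this mail, not
measured values:

```python
# Per-connection send buffer, in bytes (1 MiB).
BUFFER_SIZE = 1024 * 1024

# Assumed worst-case counts of affected (Haskell, domain-socket, sending)
# connections, taken from the estimates in this mail:
connections = {
    "wconfd -> job": 50,   # parallel non-queued jobs
    "luxid -> rapi": 5,    # simultaneous large RAPI replies
    "luxid -> cli": 5,     # interactive CLI requests with lengthy output
}

# Total buffer memory if every connection's buffer is fully used.
total_bytes = sum(connections.values()) * BUFFER_SIZE
print(total_bytes // (1024 * 1024), "MiB")  # 60 MiB, within the dom0 budget
```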
