Re: Recommended resources for master / scheduler machines

2015-01-10 Thread Shuai Lin
Hi Itamar, You should really run ZooKeeper on more than one node (3, 5, or 7 nodes are common choices). Otherwise, in your case, if the node running your ZooKeeper service goes down for any reason, your whole Mesos installation will stop working until you bring that node back. Regards, Shuai
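
A minimal sketch of the arithmetic behind the odd ensemble sizes mentioned above, assuming ZooKeeper's majority-quorum rule (the ensemble stays available only while more than half of its servers are up). The function names are hypothetical and are not part of any ZooKeeper or Mesos API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare a single node with the commonly used odd sizes,
    # plus one even size to show it adds no extra fault tolerance.
    for n in (1, 3, 4, 5, 7):
        print(f"{n} servers: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under this rule a single-node setup tolerates no failures, and a 4-node ensemble tolerates no more failures than a 3-node one, which is why odd sizes are the usual choice.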

Re: Recommended resources for master / scheduler machines

2015-01-08 Thread Itamar Ostricher
Thanks Tomas. We're still quite far from the 10k-20k machines limit :-) Currently, our framework scheduler generates millions of mostly small tasks (some take ~100ms, some a few seconds). I understand that the network is the main bottleneck, but we sometimes experience lost tasks,

Re: Recommended resources for master / scheduler machines

2015-01-08 Thread Tomas Barton
Is ZooKeeper running in distributed mode? ZooKeeper periodically writes all data to disk (transaction log), so the bottleneck could be ZooKeeper rather than a shortage of CPUs. ZooKeeper limits each key to 1MB; typically 512MB should be enough for ZooKeeper (or 4GB might not be enough, depends on

Recommended resources for master / scheduler machines

2015-01-06 Thread Itamar Ostricher
Are there recommendations regarding master / scheduler machine resources as a function of cluster size? Say I have a cluster with hundreds of slave machines and thousands of CPUs, with a single framework that will schedule millions of tasks. How does the strength of the master scheduler machines