Hi Itamar,
You should really run ZooKeeper on more than one node (3, 5, or 7 nodes
are common). Otherwise, in your case, if the node running your
ZooKeeper service goes down for any reason, your whole Mesos installation
would stop working until you bring that node back.
Regards,
Shuai
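
A note on why the suggested ensemble sizes above are odd: ZooKeeper needs a strict majority of its servers to be up in order to keep serving. The sketch below is only an illustration of that majority arithmetic, not anything from Mesos or ZooKeeper itself; the function names quorum_size and tolerated_failures are made up for this example.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)


# Compare a few ensemble sizes (hypothetical helper names, plain arithmetic).
for n in (1, 2, 3, 4, 5, 6, 7):
    print(f"{n} node(s): quorum={quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Under this arithmetic a single node tolerates no failures, and an even-sized ensemble tolerates no more failures than the odd size just below it (4 nodes tolerate one failure, the same as 3), which is presumably why 3, 5, or 7 are the usual choices.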
Thanks Tomas.
We're still quite far from the 10k-20k machines limit :-)
Currently, our framework scheduler generates many (millions of) mostly
small tasks (some take ~100ms, some a few seconds).
I understand that the network is the main bottleneck, but we sometimes
experience lost tasks,
Is ZooKeeper running in distributed mode?
ZooKeeper periodically writes all data to disk (transaction log), so the
bottleneck could be ZooKeeper rather than
a shortage of CPUs. ZooKeeper limits each key to 1MB; typically 512MB should
be enough for ZooKeeper (or 4GB
might not be enough, depends on
Are there recommendations regarding master / scheduler machine resources
as a function of cluster size?
Say I have a cluster with hundreds of slave machines and thousands of CPUs,
with a single framework that will schedule millions of tasks.
How does the strength of the master scheduler machines