Hi Hayden,

This is a talk from Flink Forward that may be of help to you:
https://www.youtube.com/watch?v=8l8dCKMMWkw
and here are the slides:
http://www.slideshare.net/FlinkForward/flink-forward-berlin-2017-robert-metzger-keep-it-going-how-to-reliably-and-efficiently-operate-apache-flink/3

Kostas

> On Dec 7, 2017, at 6:36 PM, Kostas Kloudas <k.klou...@data-artisans.com> wrote:
>
> Hi Hayden,
>
> It would be nice if you could share a bit more detail about your use case
> and the load that you expect to have, as this would give us a better view
> of your needs.
>
> As a general set of rules:
> 1) The bigger your cluster (in terms of resources, not necessarily
>    machines), the better.
> 2) The more RAM per machine, the better, as this allows more things to fit
>    in memory without spilling to disk.
> 3) In the dilemma between a few powerful machines and a lot of small ones,
>    I would lean towards the former, as this keeps network delays smaller.
>
> Once again, the above rules are just general recommendations; more details
> about your workload will give us more information to work with.
>
> In the documentation here:
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/yarn_setup.html#background--internals
> you can find some details about deployment, monitoring, etc.
>
> I hope this helps,
> Kostas
>
>> On Dec 7, 2017, at 1:53 PM, Marchant, Hayden <hayden.march...@citi.com> wrote:
>>
>> Hi,
>>
>> I'm looking for guidelines on a reference hardware architecture for a
>> small/medium Flink cluster. We'll be installing on in-house bare-metal
>> servers. I'm looking for guidance on:
>>
>> 1. Number and spec of CPUs
>> 2. RAM
>> 3. Disks
>> 4. Network
>> 5. Proximity of servers to each other
>>
>> (Most likely, we will choose YARN as a cluster manager for Flink)
>>
>> If someone can share a document or link with relevant information, I will
>> be very grateful.
>>
>> Thanks,
>> Hayden Marchant