It is obviously not one size fits all; it depends on several factors: how much
data you will be ingesting; what the data source is (a firehose, a web front
end, or an app that batches messages); how much processing you will be doing
in the Storm/Kafka layer; and the rate at which you will persist data to your
sink. All of these factors will determine your topology. Storm and Spark are
memory intensive, but if you are streaming, as would be the case with Kafka,
that should not be much of an issue.
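As a rough illustration of how those factors interact, here is a back-of-envelope sizing sketch. Every number in it (ingest rate, per-node throughput, replication factor) is a hypothetical assumption for illustration only, not a recommendation; you would plug in your own measured rates.

```python
import math

# All figures below are hypothetical assumptions -- measure your own workload.
ingest_mb_per_sec = 50                # rate arriving from the source (assumed)
replication_factor = 2                # Kafka topic replication (assumed)
broker_write_mb_per_sec = 30          # sustained write rate one broker handles (assumed)
worker_process_mb_per_sec = 20        # rate one Storm worker node can process (assumed)

# Brokers must absorb the ingest rate multiplied by the replication overhead.
kafka_brokers = math.ceil(ingest_mb_per_sec * replication_factor / broker_write_mb_per_sec)

# Storm workers are sized by per-node processing capacity against the ingest rate.
storm_workers = math.ceil(ingest_mb_per_sec / worker_process_mb_per_sec)

print(f"Kafka brokers: {kafka_brokers}, Storm workers: {storm_workers}")
```

With these made-up numbers the arithmetic suggests 4 brokers and 3 workers, which is exactly why the answer to "how many machines?" depends on the rates above rather than on a fixed count.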
 

     On Sunday, March 8, 2015 11:26 PM, "Adaryl "Bob" Wakefield, MBA" 
<adaryl.wakefi...@hotmail.com> wrote:

Let’s say you put together a real-time streaming solution using Storm, Kafka,
the necessary ZooKeeper, and whatever storage tech you decide on. Is it true
that these applications are so resource intensive that they each need to live
on their own machine? Put another way, for the ingestion portion, is the
minimum number of machines required here 9?

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData 

