Thanks for answering and sorry for not being more clear.
I will try to clarify more.

All topologies are running simple logic.
It is an event-driven approach, and I am trying to figure out how people
conceptually design/organize topologies on Apache Storm.

So far I have created one Kafka topic per event (example: OrderCreated,
OrderUpdated) and one topology per event (example: OrderCreatedTopology). Each
topology has one KafkaSpout that receives data from the Kafka topic and passes
it to one bolt that writes the data to Cassandra.


My question is: is this topology-per-event layout the way to go, or would
experienced Storm developers build one topology per business domain, like
OrderTopology, with that topology holding all “Order”-related KafkaSpouts and
bolts?
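
To make the two layouts concrete, here is a rough sketch in plain Java. It does
not use the Storm API; TopologyLayouts and its methods are hypothetical names
made up for illustration. It assumes each topology is submitted with a fixed
number of worker processes, and relies on the fact that a Storm worker process
(a JVM) serves exactly one topology, so the number of topologies sets a floor
on the number of worker JVMs running on the supervisors.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not Storm API: a layout maps a topology name
// to the Kafka topics (events) that topology consumes.
final class TopologyLayouts {

    // One topology per event, e.g. OrderCreated -> OrderCreatedTopology.
    static Map<String, List<String>> perEvent(List<String> events) {
        Map<String, List<String>> layout = new LinkedHashMap<>();
        for (String event : events) {
            layout.put(event + "Topology", List.of(event));
        }
        return layout;
    }

    // One topology per business domain, holding all of that domain's events.
    static Map<String, List<String>> perDomain(String domain, List<String> events) {
        Map<String, List<String>> layout = new LinkedHashMap<>();
        layout.put(domain + "Topology", List.copyOf(events));
        return layout;
    }

    // Assumption: every topology gets workersPerTopology worker processes,
    // and worker processes are never shared between topologies.
    static int minWorkerProcesses(Map<String, List<String>> layout, int workersPerTopology) {
        return layout.size() * workersPerTopology;
    }

    public static void main(String[] args) {
        List<String> orderEvents = List.of("OrderCreated", "OrderUpdated", "OrderCancelled");
        System.out.println("per event:  " + minWorkerProcesses(perEvent(orderEvents), 1));
        System.out.println("per domain: " + minWorkerProcesses(perDomain("Order", orderEvents), 1));
    }
}
```

This only counts processes; it says nothing about how much CPU an idle worker
actually uses, which would have to be measured on the cluster.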

Thanks
IPVP


On May 2, 2017 at 5:22:45 PM, Dmitry Semenov ([email protected]) wrote:

It's hard to understand your question or recommend a solution.

If you put too much activity (business logic / processing) in a single task,
it will be hard for you to scale up the topology, and your hardware
utilization will be very high. Make tasks atomic and small, and use batched
inserts to the DB if possible. Analyze whether Cassandra becomes a bottleneck.
Cache data in the task's memory to avoid lookup queries to the DB.
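
A minimal sketch of the batching and caching ideas above, in plain Java with no
Storm or Cassandra dependency. BatchBuffer and LookupCache are hypothetical
helper names; the sink and loader callbacks stand in for whatever would
actually write to or read from the DB. Neither class is thread-safe, so this
assumes each instance is used from a single thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical helper: collects rows and hands them to a sink in batches.
final class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink; // stand-in for one batched DB insert
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be >= 1");
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Buffers the row and flushes once batchSize rows have accumulated.
    void add(T row) {
        pending.add(row);
        if (pending.size() >= batchSize) flush();
    }

    // Sends any buffered rows to the sink; does nothing if the buffer is empty.
    void flush() {
        if (pending.isEmpty()) return;
        sink.accept(new ArrayList<>(pending)); // copy so the sink owns its batch
        pending.clear();
    }
}

// Hypothetical bounded in-memory cache with least-recently-used eviction.
final class LookupCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader; // stand-in for a DB lookup query

    LookupCache(int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true keeps the least recently used entry first,
        // so removeEldestEntry evicts it when the cache grows past maxEntries.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached value, calling the loader only on a miss.
    // A null result from the loader is not remembered as a hit.
    V get(K key) {
        V cached = entries.get(key);
        if (cached != null) return cached;
        V loaded = loader.apply(key);
        entries.put(key, loaded);
        return loaded;
    }
}
```

Inside a bolt, flush() would probably also need to run on a timer so that a
partial batch does not wait indefinitely, and tuples would be acked only after
their batch has been written.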

On Tue, May 2, 2017 at 7:44 AM, I PVP <[email protected]> wrote:
What is the high-level best practice on Apache Storm?

a) To create an OrderTopology that would receive and process data from all
Order-related topics/spouts, like OrderCreated, OrderUpdated, OrderCancelled,
and so on

OR

b) To create individual Topologies like OrderCreatedTopology, 
OrderUpdatedTopology, OrderCancelledTopology

The reason I am asking is that processing power is 100% consumed on all
supervisor machines/instances, no matter how big the machines/instances are or
how many topologies are running.
The overhead required to run a topology seems to be the point of concern, as
CPUs on the supervisors are at 100% even when there is no data coming into the
spouts or going out to the bolts.

Our application has topologies in which a KafkaSpout receives data and bolts
write it to Cassandra. So far there are 32 topologies.

Should I focus on consolidating all "business domain" (like Order, Payment)
activities within the same topology (like OrderTopology, PaymentTopology)?

How do Storm-based solutions “design” their topologies?
Aside from individual logging, what are the pros and cons from an Apache Storm
perspective?


thanks

IPVP

