Re: Does Storm work with Spring

2015-10-11 Thread Ankur Garg
Thanks for replying, Ravi.

I think your suggestion to write a wrapper that reads JSON or XML is a very
nice idea indeed.

But the problem for me is having the context (with all beans loaded and
initialized) available inside the spouts and bolts, and that means inside
every running instance of the spouts and bolts, which may be running on
different machines and in different JVMs.

I agree that when defining the topology I don't need the Spring context, as I
just have to define the spouts and bolts there. I used the context here to
pass it to the spouts and bolts through their constructors, but it appears
from the comments above that this won't work on a distributed cluster.
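
One likely reason constructor-passed state does not survive on a real cluster
is that Storm serializes spout and bolt instances at submission time and
recreates them inside worker processes. Below is a minimal, library-free
sketch of that effect using plain Java serialization. FakeContext,
ConstructorInjectedBolt, LazilyPreparedBolt and SerializationDemo are
hypothetical names invented for illustration; none of them are Storm or Spring
classes, and prepare() here only imitates the role of a bolt's prepare hook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a heavyweight, non-serializable context object.
class FakeContext {
    String lookup(String name) { return "bean:" + name; }
}

// Keeps the context in an ordinary field, as when it is passed via the constructor.
class ConstructorInjectedBolt implements Serializable {
    private static final long serialVersionUID = 1L;
    private final FakeContext context;
    ConstructorInjectedBolt(FakeContext context) { this.context = context; }
}

// Keeps the field transient and builds the context after deserialization.
class LazilyPreparedBolt implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient FakeContext context;

    // Imitates a prepare()/open() style hook that runs on the worker side.
    void prepare() { context = new FakeContext(); }

    String handle(String name) { return context.lookup(name); }
}

class SerializationDemo {
    // Serializes and deserializes a value, as happens when a component is shipped.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True when the value cannot be shipped because a field is not serializable.
    static boolean failsToSerialize(Serializable value) {
        try {
            roundTrip(value);
            return false;
        } catch (UncheckedIOException e) {
            return e.getCause() instanceof NotSerializableException;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsToSerialize(new ConstructorInjectedBolt(new FakeContext())));
        LazilyPreparedBolt shipped = roundTrip(new LazilyPreparedBolt());
        shipped.prepare(); // the context is created on the receiving side
        System.out.println(shipped.handle("dataSource"));
    }
}
```

The design point is that only lightweight, serializable configuration travels
with the component, while anything heavy is built after it arrives.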

So, is there some way that, once the topology has been submitted to run in a
distributed cluster, I can initialize my context there and somehow make it
available to all spouts and bolts? Basically, some shared location where my
application context can be initialized (once and only once) and then accessed
by all instances of the spouts and bolts?
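
Since each worker is a separate JVM, "once and only once" can realistically
mean once per worker JVM rather than once per cluster. A minimal sketch of
that idea, assuming a lazily initialized static holder that every component in
the same JVM shares. AppContext, AppContextHolder and SketchBolt are
hypothetical names; in a real Spring setup the holder would presumably build
something like an AnnotationConfigApplicationContext instead of the stand-in
class used here, and prepare() would be the bolt's or spout's own setup hook.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive application context.
class AppContext {
    static final AtomicInteger CREATED = new AtomicInteger();
    AppContext() { CREATED.incrementAndGet(); }
    String getBean(String name) { return "bean:" + name; }
}

// Initialization-on-demand holder: class initialization rules ensure that
// INSTANCE is built at most once per JVM (per class loader), even when
// several threads ask for it at the same time.
final class AppContextHolder {
    private AppContextHolder() {}

    private static final class Lazy {
        static final AppContext INSTANCE = new AppContext();
    }

    static AppContext get() { return Lazy.INSTANCE; }
}

// Sketch of a component that fetches the shared context in a prepare-style hook.
class SketchBolt {
    private AppContext context;

    void prepare() { context = AppContextHolder.get(); }

    String handle(String beanName) { return context.getBean(beanName); }
}

class HolderDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            // Each thread plays the role of one spout or bolt instance in this JVM.
            threads[i] = new Thread(() -> {
                SketchBolt bolt = new SketchBolt();
                bolt.prepare();
                bolt.handle("eventService");
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("contexts created: " + AppContext.CREATED.get());
    }
}
```

Every worker JVM would still build its own copy, so beans that must be unique
across the whole cluster (for example a single consumer) need a different
mechanism.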

Thanks

On Sun, Oct 11, 2015 at 11:20 AM, Ravi Sharma wrote:

> Basically you will have two contexts, defined at different times/phases.
>
> When you are about to submit the topology, you need to build the topology;
> that context only needs information about spouts and bolts. You don't need
> any application beans such as database accessors or your services, because
> at this level you are not running your application, you are just creating a
> topology and defining how the bolts and spouts are connected to each other.
>
> Once the topology is submitted, it will be moved to one of the supervisor
> nodes and will start running, and all spouts and bolts will be initialized.
> At this moment you will need your application context, which doesn't need
> your earlier topology context.
>
> So I suggest keeping the two contexts separate.
>
> A topology is not complex to build. A smaller topology can be built in code
> alone, i.e. which bolt listens to which spout, but if you want a good
> design, I'd say just write a small wrapper that reads some JSON in which
> you define your bolts and spouts, and use that to build the topology (you
> can use Spring, but it's not really needed).
>
> In the past I have done it using both JSON settings (without Spring) and XML
> settings (with Spring); both work well.
>
> Ravi
> On 11 Oct 2015 06:38, "Ankur Garg" wrote:
>
>> Oh, the problem here is that I have many beans which need to be
>> initialized (some read configuration from YAML files, some set up database
>> connections, thread pools, etc.).
>>
>> I have written a Spring Boot application which takes care of all of the
>> above, and I define my topology inside one of the beans. Here is my bean:
>>
>>     @Autowired
>>     ApplicationContext appContext;
>>
>>     @Bean
>>     public void submitTopology()
>>             throws AlreadyAliveException, InvalidTopologyException {
>>
>>         TopologyBuilder builder = new TopologyBuilder();
>>
>>         builder.setSpout("rabbitMqSpout",
>>                 new RabbitListnerSpout(appContext), 10);
>>
>>         builder.setBolt("mapBolt", new GroupingBolt(appContext), 10)
>>                 .shuffleGrouping("rabbitMqSpout");
>>
>>         builder.setBolt("reduceBolt", new PublishingBolt(appContext), 10)
>>                 .shuffleGrouping("mapBolt");
>>
>>         Config conf = new Config();
>>
>>         // To be registered with Kryo for Storm
>>         conf.registerSerialization(EventBean.class);
>>         conf.registerSerialization(InputQueueManagerImpl.class);
>>
>>         conf.setDebug(true);
>>         conf.setMessageTimeoutSecs(200);
>>
>>         LocalCluster cluster = new LocalCluster();
>>         cluster.submitTopology("test", conf, builder.createTopology());
>>     }
>>
>> When this bean is initialized, appContext has already been initialized by
>> my Spring Boot application. So I am using Spring Boot to initialize and
>> load my context with all beans.
>>
>> This is the context which I want to leverage in my spouts and bolts.
>>
>> So, if what I suggested earlier does not work on a distributed Storm
>> cluster, I need to find some way of initializing my AppContext :(
>>
>> I would be really thankful if anyone here can help me :(
>>
>> Thanks
>>
>> Ankur
>>
>> On Sun, Oct 11, 2015 at 5:54 AM, Javier Gonzalez wrote:
>>
>>> The local cluster runs completely within a single JVM, AFAIK. The local
>>> cluster is useful for development, testing your topology, etc. The real
>>> deployment has to go through Nimbus, run on workers started by supervisors
>>> on one or more nodes, etc. Kind of difficult to simulate all that on a
>>> single box.
>>>
>>> On Sat, Oct 10, 2015 at 1:45 PM, Ankur Garg wrote:
>>>
>>>> Oh... so I will have to test it in a cluster.
>>>>
>>>> Having said that, how is the local cluster we use so different from a
>>>> normal cluster? Ideally, it should simulate a normal cluster.
>>>>
>>>> On Oct 10, 2015 7:51 PM, "Ravi Sharma" wrote:
>>>>
>>>>> Hi Ankur,
>>>>> Locally it may be working, but it won't work in an actual cluster.

Re: Multiple Spouts in Same topology or Topology per spout

2015-10-11 Thread Ravi Sharma
That depends on whether the spout error has affected the JVM or is just a
normal application error.

As for performance when there are a lot of errors: I don't think the errors
themselves cause any issue, but of course if you are retrying these messages
on failure, you will be processing more messages than normal and overall
throughput will go down.
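
To put rough numbers on the retry cost, here is a small back-of-the-envelope
sketch. It assumes a simplified model that is not taken from Storm itself:
every processing attempt fails independently with probability p, and failed
messages are replayed until they succeed. Under that assumption the expected
number of attempts per message is 1 / (1 - p). RetryCost and its method names
are hypothetical.

```java
class RetryCost {
    // Expected processing attempts per message when each attempt fails
    // independently with probability p and failures are replayed until success.
    static double expectedAttempts(double p) {
        if (p < 0.0 || p >= 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1)");
        }
        return 1.0 / (1.0 - p);
    }

    // Successfully completed messages per second, given how many attempts
    // per second the topology can execute in total.
    static double effectiveThroughput(double attemptsPerSecond, double p) {
        return attemptsPerSecond / expectedAttempts(p);
    }

    public static void main(String[] args) {
        double capacity = 1000.0; // assumed attempts per second, for illustration
        for (double p : new double[] {0.0, 0.1, 0.2, 0.5}) {
            System.out.printf("p=%.1f attempts=%.2f throughput=%.0f%n",
                    p, expectedAttempts(p), effectiveThroughput(capacity, p));
        }
    }
}
```

Under this model a 20% failure rate costs about 20% of the useful throughput,
which matches the point that the replays, not the errors themselves, are what
slow things down.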

Ravi

If your topology has acknowledgment enabled, that means the spout will always
receive


Re:Re: “Topology summary” in storm ui is empty

2015-10-11 Thread zhwei_cn
Yes, there is a supervisor, and I can see the supervisor's ID under
"Supervisor summary" in the Storm UI.

What's more, when I run the following command, I can see from the output that
the topology is running:
python storm jar ../examples/storm-starter/storm-starter-topologies-0.9.5.jar storm.starter.WordCountTopology

Thanks,
wei






At 2015-10-11 02:57:35, soozandjohny...@gmail.com wrote:

Check to see if you have any supervisors running

Sent from my iPhone

On Oct 10, 2015, at 1:46 PM, zhwei_cn  wrote:


Hi,
When I submit a topology to Storm, the “Topology summary” in the Storm UI is
empty. The Nimbus log stops at “Starting Nimbus server...”. Does that mean
Nimbus has not started successfully?
Thanks,
wei




 





 

Re: Multiple Spouts in Same topology or Topology per spout

2015-10-11 Thread Ankur Garg
Thanks for the reply, Abhishek and Ravi.

One question though: going with one topology with multiple spouts, what if
something goes wrong in one spout or its associated bolts? Does it impact the
other spouts as well?

Thanks
Ankur



Multiple Spouts in Same topology or Topology per spout

2015-10-11 Thread Ankur Garg
Hi,

I have a situation where I want to read messages from different queues
hosted on a RabbitMQ server.

There are three ways I can think of to leverage Apache Storm here:

1) Use the same spout (say Spout A) to read messages from the different
queues and, based on the messages received, emit them to different bolts.

2) Use different spouts (Spout A, Spout B, and so on) within the same
topology (say Topology A) to read messages from the different queues.

3) Use different spouts, one per topology (Topology A, Topology B, and so
on), to read messages from the different queues.

Which is the best way to do this, considering I want high throughput (more
queue messages processed concurrently)?

Also, if I use the same topology for all spouts (currently the requirement is
for 2 spouts), will a failure in one spout (or its associated bolts) affect
the second, or will they both continue working separately even if there is
some failure in Spout B?

Cost-wise, how much would it take to maintain two different topologies?

I am looking for input from members here.

Thanks
Ankur


Re: Multiple Spouts in Same topology or Topology per spout

2015-10-11 Thread Abhishek priya
I guess this is a question with no really correct answer. I'd certainly avoid
#1, as it is better to keep the logic separate and lightweight.

If your downstream bolts are the same, then it makes sense to keep them in the
same topology, but if they are totally different, I'd keep them in two
different topologies. That will allow me to deploy and scale each topology
independently. But if the rest of the logic is the same, I think topology
scaling and resource utilization will be better with one topology.

I hope this helps.

Sent somehow



Re: Multiple Spouts in Same topology or Topology per spout

2015-10-11 Thread Ravi Sharma
There are no 100% right answers; you will have to test and see what fits.

Personally I would suggest multiple spouts in one topology, and if you have N
nodes where the topology will be running, then each spout (reading from one
queue) should run N times in parallel.

If you have 2 queues and, say, 4 nodes, then use one topology with:
4 spouts reading from Queue1 on different nodes
4 spouts reading from Queue2 on different nodes

Ravi.



Re: Multiple Spouts in Same topology or Topology per spout

2015-10-11 Thread Rudraneel chakraborty
Can you give me an example of a situation where multiple dependent topologies
have been used, say where different topologies together infer a big complex
event?


-- 
Rudraneel Chakraborty
Carleton University Real Time and Distributed Systems Research


same jar does work well on online storm

2015-10-11 Thread Yang Nian
The same jar works well in local mode,
but when it is submitted to the online environment, it does not work.

The jar is simple and depends only on storm.jar.