Spring bolts

2013-12-25 Thread Michal Singer
Hi, I am trying to understand how to use Spring beans as bolts/spouts.

Suppose I have a definition in Spring that is initialized once, when the bolt
or spout is initialized.

But when creating a topology I need to do: new Bolt()….

So I cannot get the instance from Spring.

What is the right way to do this?



Thanks, Michal


Re: Spring bolts

2013-12-25 Thread Michael Rose
Make a base Spring bolt and inject the members in its prepare method. That's
the best I've come up with, as prepare happens server-side, whereas topology
config and static initializers happen client-side at deploy time.
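
A minimal sketch of the lifecycle described above, written against the JDK only so that it runs without Storm or Spring on the classpath. The names InjectedComponent, GreetingComponent and GreetingService are hypothetical, and the no-argument prepare() only stands in for a bolt's prepare method, which in Storm also receives the configuration, the topology context and the collector. It shows why members set up in prepare survive the trip to the worker while transient members set up earlier would not.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a Spring-managed service; deliberately not Serializable.
class GreetingService {
    String greet(String name) {
        return "hello " + name;
    }
}

// Hypothetical base class mimicking the bolt lifecycle:
// constructed on the client, serialized, then prepared on the worker.
abstract class InjectedComponent implements Serializable {
    private static final long serialVersionUID = 1L;

    // Stands in for the bolt's prepare method, called after deserialization.
    final void prepare() {
        injectMembers();
    }

    // Subclasses obtain their collaborators here rather than in the constructor.
    protected abstract void injectMembers();
}

class GreetingComponent extends InjectedComponent {
    private static final long serialVersionUID = 1L;

    // transient: never shipped with the serialized component.
    private transient GreetingService service;

    @Override
    protected void injectMembers() {
        // Assumption: in a real topology this would be a lookup in a shared context.
        service = new GreetingService();
    }

    boolean isReady() {
        return service != null;
    }

    String handle(String input) {
        return service.greet(input);
    }
}

class LifecycleDemo {
    // Serializes and deserializes a value, roughly what happens between submit and worker start.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        GreetingComponent onWorker = roundTrip(new GreetingComponent());
        System.out.println(onWorker.isReady());      // no service yet on the "worker"
        onWorker.prepare();
        System.out.println(onWorker.handle("storm")); // service injected by prepare()
    }
}
```

Keeping the injected field transient means the component can be serialized even though the service itself cannot.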


RE: Spring bolts

2013-12-25 Thread Michal Singer
I am not sure I understand.

Spring beans are defined in the Spring configuration files. How can I
inject them into the members?

What I thought of doing is this: the bolts will not be Spring beans, and in
the prepare method I will initialize the Spring context.

This way, the bolts will call other Spring beans, which are not bolts and are
initialized by Spring. But of course this is a very limited solution.


RE: Spring bolts

2013-12-25 Thread Michael Rose
Yes, you'll need a Spring context in prepare. Given that you have multiple
bolts per JVM, it's worth ensuring that only one of them creates the context
in prepare and then shares it with the others.

We do this with Guice injectors and double-checked locking.

Each bolt uses the singleton injector to inject its members. I imagine
Spring has a similar concept once you have a context.
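
The shared-context idea can be sketched with plain double-checked locking. This is an illustration under assumptions, not the code used at FullContact: AppContext is a hypothetical stand-in for whatever container is created (a Guice injector, or in Spring's case something like a ClassPathXmlApplicationContext), and ContextHolder is a hypothetical name.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for an expensive container such as a Spring context.
final class AppContext {
    // Counts constructions so the demo can show that only one context is built.
    static final AtomicInteger CREATED = new AtomicInteger();

    AppContext() {
        CREATED.incrementAndGet();
    }
}

// Hypothetical holder: one context per JVM, created lazily by whichever bolt asks first.
final class ContextHolder {
    // volatile is required for double-checked locking to be safe in Java.
    private static volatile AppContext context;

    private ContextHolder() {
    }

    static AppContext get(Supplier<AppContext> factory) {
        AppContext local = context;              // first check, without the lock
        if (local == null) {
            synchronized (ContextHolder.class) {
                local = context;                 // second check, under the lock
                if (local == null) {
                    local = factory.get();
                    context = local;
                }
            }
        }
        return local;
    }
}

final class ContextHolderDemo {
    // Calls get() from several threads, as several bolts' prepare methods would,
    // and returns how many contexts have been built in this JVM.
    static int contextsBuiltBy(int threadCount) {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> ContextHolder.get(AppContext::new));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return AppContext.CREATED.get();
    }

    public static void main(String[] args) {
        System.out.println(contextsBuiltBy(8));
    }
}
```

Each bolt would call ContextHolder.get(...) from its prepare method and then look up or inject its own members from the returned context.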

The life cycle of bolts is quite strange in Storm, given that they're
constructed before deployment and then serialized. There are quite a few
gotchas. Bolt constructors can't be trusted, hence prepare.

There may be a Spring/Storm example out there somewhere.

Merry Christmas!


RE: Spring bolts

2013-12-25 Thread Michal Singer
Thanks, Merry Christmas to you!





Storm Topology Pattern: Batching problem

2013-12-25 Thread 鞠大升
Hi all,

I am using the Storm topology batching pattern to move data from Kafka to HDFS.

I have a spout reading data from Kafka. The bolt caches received tuples in a
local variable, and after some time interval, or once the cache reaches a
certain size, we flush the cached tuples to HDFS.

I have 2 problems:

1. Where should I check the time or the amount?
Method A is to check the time or amount in the bolt's execute() method. But
when no data arrives, execute() is not called, so the time check does not
work.
Method B is for the bolt to spawn a fixed scheduler thread that checks the
time and amount. But this method has problems: what if the scheduler thread
dies? How is the data shared by the bolt thread and the scheduler thread
protected?


2. How do you handle the last batch before the topology is killed, and how do
you close the HDFS file before the topology is killed?
The cleanup() method will not be called in cluster mode, so is there any other
way to do cleanup work?

Thanks for your help.

-- 
dashengju
+86 13810875910
dashen...@gmail.com


Re: Storm Topology Pattern: Batching problem

2013-12-25 Thread Philip O'Toole
One standard pattern is to do the writing in the bolt. You should look into
tick tuples, as they may help you with the cache flush. A separate timer will
work, but you may not need it.
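
A rough sketch of the flush logic that tick tuples make possible, written against the JDK only. BatchBuffer and its parameters are hypothetical names, and the sink stands in for the HDFS write. In Storm, tick tuples are requested with the topology.tick.tuple.freq.secs setting and arrive in execute() like any other tuple, so the bolt would call add(...) for data tuples and onTick(...) for tick tuples. Since both calls happen on the bolt's own thread, this sketch uses no locking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: buffers items and flushes on size or on age.
final class BatchBuffer<T> {
    private final int maxSize;
    private final long maxAgeMillis;
    private final Consumer<List<T>> sink;   // stands in for the HDFS writer
    private final List<T> items = new ArrayList<>();
    private long oldestMillis;              // arrival time of the oldest buffered item

    BatchBuffer(int maxSize, long maxAgeMillis, Consumer<List<T>> sink) {
        this.maxSize = maxSize;
        this.maxAgeMillis = maxAgeMillis;
        this.sink = sink;
    }

    // Call for every data tuple; flushes as soon as the batch is full.
    void add(T item, long nowMillis) {
        if (items.isEmpty()) {
            oldestMillis = nowMillis;
        }
        items.add(item);
        if (items.size() >= maxSize) {
            flush();
        }
    }

    // Call for every tick tuple; flushes a partial batch once it is old enough.
    void onTick(long nowMillis) {
        if (!items.isEmpty() && nowMillis - oldestMillis >= maxAgeMillis) {
            flush();
        }
    }

    private void flush() {
        sink.accept(new ArrayList<>(items)); // hand over a copy, then reset the buffer
        items.clear();
    }
}

final class BatchBufferDemo {
    public static void main(String[] args) {
        List<List<String>> flushed = new ArrayList<>();
        BatchBuffer<String> buffer = new BatchBuffer<>(3, 1000, flushed::add);

        buffer.add("a", 0);
        buffer.add("b", 10);
        buffer.onTick(500);    // oldest item is 500 ms old: nothing happens
        buffer.onTick(1000);   // oldest item is 1000 ms old: partial batch is flushed
        buffer.add("c", 1100);
        buffer.add("d", 1200);
        buffer.add("e", 1300); // third item: full batch is flushed

        System.out.println(flushed); // [[a, b], [c, d, e]]
    }
}
```

Passing the current time in as a parameter keeps the age check testable without a real clock.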

As for closing the file handle, you can't guarantee it explicitly. But perhaps 
the OS will do it if the worker JVM goes away. 

Philip 
