Re: Storm topology using all the Max connections of db

2016-05-04 Thread Sai Dilip Reddy Kiralam
Hi, I'm closing and reopening the connection after my bolt executes each insert query. Do I need to increase the db's max connections to somewhere between 500 and 1000? The supervisor log tells me the connection made by the bolt is closed and a new connection is then taken. I think after

Storm build: Missing Artifacts

2016-05-04 Thread Le Xu
Hello! I'm trying to build Storm from source (both 0.96 and 1.0) and I got a Maven build error saying that I'm missing (lots of) artifacts. Since so many artifacts are missing, I wonder if there is a faster way to restore all these artifacts without manually downloading and

How to let a topology know that it's time to stop?

2016-05-04 Thread Navin Ipe
Hi, I know Storm is designed to run forever. I also know about Trident's technique of aggregation. But shouldn't Storm have a way to let bolts know that a certain bunch of processing has been completed? Consider this topology: Spout-->Bolt-A-->Bolt-B |

Re: Storm topology using all the Max connections of db

2016-05-04 Thread Spico Florin
Hi! You have 9 bolts with 50 max db connections, so each bolt gets its own connection pool. Try to decrease this number, for example to 5, and check if your performance is fine with your db. Regards, Florin On Wednesday, May 4, 2016, Sai Dilip Reddy Kiralam < dkira...@aadhya-analytics.com> wrote:

Getting Kafka Offset in Storm Bolt

2016-05-04 Thread Milind Vaidya
Is there any way I can know what Kafka offset corresponds to the current tuple I am processing in a bolt? Use case: I need to batch events from Kafka, persist them to a local file and eventually upload it to S3. To manage failure cases, I need to know the Kafka offset for a message, so that it

Re: Storm 1.0.0 upgrade Serialization issue

2016-05-04 Thread KB
Thanks for your reply, Samuel. I have set up a very simple topology and am not using ObjectMapper or any other Jackson classes, although we are using the Jackson libraries Jackson-core-2.6.2 and Jackson-databind-2.4.5, and these versions have not changed between Storm version 0.9 and 1.0.0. Please let me know

Re: [DISCUSS] Would like to make collective intelligence about Metrics on Storm

2016-05-04 Thread Jungtaek Lim
Kevin, It makes sense, but I can't imagine a clean way to address that case. If we change the view from task to worker, IMO worker-level metrics shouldn't look up task-level objects, because that crosses the layer, and the worker doesn't even know which instances have which fields. If there's a way to

Re: [DISCUSS] Would like to make collective intelligence about Metrics on Storm

2016-05-04 Thread Kevin Conaway
>For specific task, you can register your own metrics which resides per task. Exactly, that's the problem. Something like a JDBC connection pool (or a Cassandra cluster/session) is not tied to any one task or component; it is usually shared amongst all tasks in the JVM. Reporting on those

Re: How to you store database connections in a Spout or Bolt without serialization problems?

2016-05-04 Thread Navin Ipe
Hmm...ok thanks. In this case I need to preserve state, so can't use transient. Anyway, I redesigned the classes to keep the connection strings elsewhere, and now everything is working fine. Thanks a lot! On Wed, May 4, 2016 at 3:59 PM, Jungtaek Lim wrote: > Declare them as

Re: How to you store database connections in a Spout or Bolt without serialization problems?

2016-05-04 Thread Jungtaek Lim
Declare them as "class fields" but as transient (not mandatory) and initialize them in prepare() or open(). Leaving them uninitialized until prepare() or open() gets called doesn't cause any issue because of the task lifecycle in Apache Storm. On Wednesday, May 4, 2016, Navin Ipe
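
A minimal sketch of the advice above, written without any Storm dependency so it runs on its own. It relies on the fact that a topology's spouts and bolts are Java-serialized when the topology is submitted. Everything named here (TransientFieldSketch, SketchBolt, FakeConnection, the JDBC URL) is hypothetical and only stands in for a real bolt and a real java.sql.Connection; the prepare() below takes no arguments, unlike Storm's actual method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientFieldSketch {

    // Hypothetical stand-in for a non-serializable resource such as java.sql.Connection.
    static class FakeConnection {
        final String url;
        FakeConnection(String url) { this.url = url; }
    }

    // Hypothetical stand-in for a bolt: serializable settings plus a transient resource.
    static class SketchBolt implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String jdbcUrl;          // plain state, survives serialization
        private transient FakeConnection conn; // skipped by Java serialization

        SketchBolt(String jdbcUrl) { this.jdbcUrl = jdbcUrl; }

        // Plays the role of prepare()/open(): called after the object has been deserialized.
        void prepare() { conn = new FakeConnection(jdbcUrl); }

        boolean isConnected() { return conn != null; }

        String connectedUrl() { return conn == null ? null : conn.url; }
    }

    // Serializes and deserializes the bolt, imitating the trip from submitter to worker.
    static SketchBolt roundTrip(SketchBolt bolt) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(bolt);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SketchBolt) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SketchBolt shipped = roundTrip(new SketchBolt("jdbc:mysql://db.example/test"));
        System.out.println("connected before prepare(): " + shipped.isConnected());
        shipped.prepare();
        System.out.println("connected after prepare(): " + shipped.isConnected());
    }
}
```

The connection string travels with the object as ordinary state, while the connection itself is only ever created on the receiving side. Without the transient keyword, writeObject would fail with a NotSerializableException as soon as the field held a FakeConnection.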

Re: How to you store database connections in a Spout or Bolt without serialization problems?

2016-05-04 Thread Navin Ipe
Yes, I know they should be initialized in open() or prepare(). But I'm referring to the declaration. If I do this: @Override public void prepare(Map map, TopologyContext tc, OutputCollector oc) { private Connection connRef; private Statement stmt; private ResultSet

RE: How to you store database connections in a Spout or Bolt without serialization problems?

2016-05-04 Thread Sinnema, Remon
Hi Navin, A DB connection goes from one machine to another, so how do you expect to share it between spouts and/or bolts that run on multiple machines? You should really set up the connection in open() or prepare(), so that it is specific to the machine that the spout or bolt runs on. Thanks,

Re: How to you store database connections in a Spout or Bolt without serialization problems?

2016-05-04 Thread Jungtaek Lim
Navin, The lifecycle of a Spout and a Bolt ensures that fields initialized in prepare() can be used safely in execute(), nextTuple(), ack(), fail(). In other words, prepare() will be called before the other methods. So please declare them as transient and initialize them in prepare(). Hope this

How to you store database connections in a Spout or Bolt without serialization problems?

2016-05-04 Thread Navin Ipe
Hi, I know that if a MySQL database connection is instantiated in the constructor of a Spout or Bolt, it won't work. It should be instantiated in open() or prepare(). Problem is, when I store this database connection as a member of a class which is a member of a bolt. Eg: *public

Re: Storm topology using all the Max connections of db

2016-05-04 Thread Sai Dilip Reddy Kiralam
Sorry! My db had not started yet, so it gave me the error. But when I added that statement and ran the topology, it used more connections than the specified number. *Best regards,* *K.Sai Dilip Reddy.* On Wed, May 4, 2016 at 11:22 AM, Sai Dilip Reddy Kiralam <