Re: logback file in jar

2015-05-14 Thread Benjamin Cuthbert
Hi Jeff, It seems to pick up that file on the classpath by default and then attempts to use it. Regards > On 14 May 2015, at 22:32, Jeffery Maass wrote: > > I think that question is probably better asked of the developer's group : > http://mail-archives.apache.org/mod_mbox/storm-dev/ >

Re: Should I always make a copy of the data in Tuple during execute()?

2015-05-14 Thread Banias H
Thanks Nathan. Sorry for the confusion. Bolt_A actually generates a new SimplePojo object in every execute() function. So I shouldn't be reusing the same pojo object in Bolt_A. To give you more details, from the same worker log file I am seeing (without the copy workaround): t=1: Bolt_A generate

RE: storm topology logs: aggregating and processing them

2015-05-14 Thread Rajesh_Kalluri
Dell - Internal Use - Confidential Thanks Jeffery, great to see an enthusiastic community around Storm. Do you have any suggestions on log retention and indexing mechanisms (logstash etc.)? From: Jeffery Maass [mailto:maas...@gmail.com] Sent: Thursday, May 14, 2015 4:42 PM To: user@storm.apache.

No. of Records Processed by Topology per Second?

2015-05-14 Thread Ashish Soni
How do I know how many records are processed per second by the Storm topology? Please see the screenshot below from the UI. Everything is running as a single-node cluster (Hortonworks Sandbox). Please help me understand the metrics below: Complete Latency, Execute Latency and Process Latency for below s
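As a rough guide (an inference from how the UI time windows work, not official documentation): records per second is simply the acked count shown for a window divided by that window's length in seconds, e.g. the "10m 0s" window count divided by 600.

```java
public class ThroughputCalc {
    // Records/sec from a Storm UI window: acked count over the window length.
    // E.g. for the "10m 0s" window, divide the acked count by 600 seconds.
    static double recordsPerSecond(long ackedInWindow, long windowSeconds) {
        return (double) ackedInWindow / windowSeconds;
    }
}
```

So 60,000 tuples acked in the 10-minute window would be 100 records/sec.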

Re: Should I always make a copy of the data in Tuple during execute()?

2015-05-14 Thread Nathan Leung
It sounds like you reuse the same SimplePojo object in Bolt_A. You should avoid doing this; even with your copy workaround it's possible to run into more subtle race conditions. On May 14, 2015 8:51 PM, "Banias H" wrote: > In prototyping an application, I use a simple pojo (see below) to send >
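Nathan's warning can be illustrated without Storm at all. The sketch below (plain Java; `SimplePojo` here is a hypothetical stand-in for the poster's class) shows why reusing one mutable object across emits corrupts earlier "tuples", while allocating a fresh instance per emit, as Bolt_A should, keeps them independent.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {
    // Stand-in for the poster's SimplePojo (hypothetical field).
    static class SimplePojo {
        long value;
        SimplePojo(long value) { this.value = value; }
    }

    // BAD: one shared instance mutated for every "emit".
    static List<Long> emitReused(int n) {
        List<SimplePojo> downstream = new ArrayList<>();
        SimplePojo shared = new SimplePojo(0);
        for (int i = 0; i < n; i++) {
            shared.value = i;           // later mutations rewrite earlier "tuples"
            downstream.add(shared);
        }
        List<Long> seen = new ArrayList<>();
        for (SimplePojo p : downstream) seen.add(p.value);
        return seen;                    // every element shows only the last value
    }

    // GOOD: a fresh instance per emit.
    static List<Long> emitFresh(int n) {
        List<SimplePojo> downstream = new ArrayList<>();
        for (int i = 0; i < n; i++) downstream.add(new SimplePojo(i));
        List<Long> seen = new ArrayList<>();
        for (SimplePojo p : downstream) seen.add(p.value);
        return seen;                    // each element keeps its own value
    }
}
```

With a shared instance, a downstream consumer sees [2, 2, 2] instead of [0, 1, 2]; in a real topology the interleaving is nondeterministic, which is why the copy workaround only hides the race.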

Should I always make a copy of the data in Tuple during execute()?

2015-05-14 Thread Banias H
In prototyping an application, I use a simple pojo (see below) to send data between bolts. In an earlier version of my bolt, I would simply get the pojo using tuple.getValue() inside the execute() function, like: SimplePojo pojo = (SimplePojo) tuple.getValue(0); But when I scale up and have many

Re: Best way of scaling with a single spout

2015-05-14 Thread Javier Gonzalez
Hi Jeff, What makes us believe that is that, when using a spouts-only topology (we needed to coordinate transactions between three systems and ensure exactly once semantics) we had better performance, even when feeding from a slower input. When we added bolts, performance degraded as stuff has to

Re: Cleanup method not called for the BaseBasicBolt when the topology is killed

2015-05-14 Thread Javier Gonzalez
Isn't the cleanup method guaranteed to be called only while running as a local topology? On May 13, 2015 9:20 AM, "Jeffery Maass" wrote: > Bolts which implement IBolt have a method called cleanup() which is called > by the Storm framework. > > https://storm.apache.org/apidocs/backtype/storm/task/IB

Re: storm topology logs: aggregating and processing them

2015-05-14 Thread Jeffery Maass
To separate out the worker logs per topology, you need to:
* create separate cluster.xml files per application
** currently, cluster.xml is hardcoded to the storm worker application
** cluster.xml is found in ${STORM_HOME}/logback
** deploy the custom xml files to all of the storm nodes
** change th

Re: logback file in jar

2015-05-14 Thread Jeffery Maass
I think that question is probably better asked of the developer's group : http://mail-archives.apache.org/mod_mbox/storm-dev/ Incidentally, how is the inclusion of the logback.xml file causing you a problem? Thank you for your time! + Jeff Maass linkedin.com/in/jeffmaass sta

Re: Streaming data from HDFS to Storm

2015-05-14 Thread Jeffery Maass
I googled and found these: https://github.com/ptgoetz/storm-hdfs/blob/master/src/test/java/org/apache/storm/hdfs/trident/FixedBatchSpout.java https://github.com/jerrylam/storm-hdfs Thank you for your time! + Jeff Maass linkedin.com/in/jeffmaass stackoverflow.com/users/373418

Re: handling Kafka log roll

2015-05-14 Thread Jeffery Maass
In this specific case, what do you mean by kafka log roll? How did you correlate the storm problems with the kafka log roll? Do you have any logs to show? Thank you for your time! + Jeff Maass linkedin.com/in/jeffmaass stackoverflow.com/users/373418/maassql ++

Topology disconnected from Nimbus node after scheduled maintenance on AWS

2015-05-14 Thread Yashwant Ganti
Hello folks, We have a topology currently running in AWS. The Nimbus node was scheduled for maintenance, which was completed a few hours earlier. After this, the topology is still running (since the supervisors weren't affected by the maintenance), but the Nimbus UI isn't reachable and I am assuming the

bin/storm in storm distribution

2015-05-14 Thread Mark Tomko
Does anyone know why the storm command-line program in the bin directory of the downloads is the storm python script rather than the shell script that's found in the storm github repo? We have a minor problem with this script because it explicitly invokes /usr/bin/python, which is not standard pyt

Fwd: Kafka Spout can't read message

2015-05-14 Thread bigdata hadoop
Hi all, I am trying to write data to HDFS through the Kafka spout. I have the topology created but the data is not being acked. When I looked at the nimbus log it gives me a transport plugin error which is causing it to fail. I tried adding the .storm dir and added the client_jaas and storm.yaml with foll

Re: Maximum recommended Spout TTL

2015-05-14 Thread Iván García
Any ideas? I just want to know the maximum recommended TTL for a spout, if there is one. Thanks! 2015-05-07 11:41 GMT-07:00 Iván García : > Good morning, > > I have a Storm project that aggregates Entities into Groups. Each Group is > considered "ready to process" if it didn't receive an Ev

RE: Multiple Workers in local mode gives error

2015-05-14 Thread Rajesh_Kalluri
Dell - Internal Use - Confidential Assuming your parallelism hints default to 1, you have a combined parallelism of 6, and with 6 workers each would get 1 executor. So each bolt and spout task will run on a different worker (if you have >6 available slots on your cluster). Checkout the awesome
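The even spread Rajesh describes can be pictured as round-robin assignment of executors to workers. The sketch below is an illustration of the idea only, not Storm's actual scheduler code:

```java
public class EvenScheduler {
    // Round-robin assignment of executors to workers (idea sketch, not
    // Storm's real EvenScheduler): executor e lands on worker e % workers.
    static int[] assign(int executors, int workers) {
        int[] perWorker = new int[workers];
        for (int e = 0; e < executors; e++) perWorker[e % workers]++;
        return perWorker;
    }
}
```

With 6 executors and 6 workers, every worker gets exactly one; with 6 executors and 4 workers, two workers carry a second executor.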

Re: Multiple Workers in local mode gives error

2015-05-14 Thread Asif Ihsan
If I create 6 workers for 5 bolts and 1 spout, how will the spout and bolts be distributed among the workers? On Thu, May 14, 2015 at 5:34 PM, 임정택 wrote: > Yes, right. > A worker is a JVM instance which can contain one or more executors; each > executor is a spout, bolt, acker, etc. > > May 2015

Re: I need your help for storm-hdfs

2015-05-14 Thread Bobby Evans
I am adding the user mailing list so more people can benefit from it.  I have not done what you have asked myself, but in theory it should be possible by overriding the file name format, although it is not the cleanest way to do it, so I would suggest you file a JIRA for us to clean it up. File

Re: Multiple Workers in local mode gives error

2015-05-14 Thread Asif Ihsan
I am using shuffleGrouping. With one worker it is running well. On Thu, May 14, 2015 at 4:18 PM, wrote: > *Dell - Internal Use - Confidential * > > Regarding the NotSerializableException: > > > > I would like to see what grouping you have defined between the two bolts > that you have experienced

Re: Hibernate + Storm

2015-05-14 Thread Fan Jiang
One thing to note is that you should try to avoid JDBC operations in a bolt, as they may block the bolt and affect the topology's performance. Try to do the database access asynchronously, or create a separate thread for JDBC operations. 2015-05-14 10:30 GMT-04:00 Mason Yu : > Interesting. H
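A minimal sketch of the pattern Fan suggests, assuming a plain `ExecutorService` thread pool; the JDBC insert itself is replaced by a stand-in counter, since the point is only that `execute()` returns without blocking on the database:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncDbSketch {
    private final ExecutorService dbPool = Executors.newFixedThreadPool(4);
    final AtomicInteger writesCompleted = new AtomicInteger();

    // Called from the bolt's execute(); hands the blocking work to the pool
    // and returns immediately, so the bolt thread keeps consuming tuples.
    void writeAsync(String row, CountDownLatch done) {
        dbPool.submit(() -> {
            // A real bolt would run the JDBC insert here (stand-in below).
            writesCompleted.incrementAndGet();
            done.countDown();
        });
    }

    static int demo(int n) {
        AsyncDbSketch sketch = new AsyncDbSketch();
        CountDownLatch done = new CountDownLatch(n);
        for (int i = 0; i < n; i++) sketch.writeAsync("row-" + i, done);
        try {
            done.await(5, TimeUnit.SECONDS);   // wait for background writes
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        sketch.dbPool.shutdown();
        return sketch.writesCompleted.get();
    }
}
```

Note that with real Storm acking you would only ack the tuple from the background task once the write succeeds, otherwise a crash could lose data that was already acked.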

Re: Hibernate + Storm

2015-05-14 Thread Mason Yu
Interesting. Hibernate hooks inside a J2EE container or Spring, which requires a specific OR mapping to a 20th-century RDBMS. Storm works in a Linux distributed environment which does not need an RDBMS. RDBMSs do not work in a distributed environment. Mason Yu Jr. CEO Big Data Architects, LLC

Re: Hibernate + Storm

2015-05-14 Thread Fan Jiang
It makes sense to me since the bolts can be distributed over different supervisors and may have different DB connections. In this case, you have to detach the hibernate entities from one connection and then re-attach them to the other, if you want to pass them between bolts. — Sincerely, Fan Jia

Re: Hibernate + Storm

2015-05-14 Thread Enno Shioji
The reason objects are serialized is so that they can be shipped to another process. As long as that's what you want, it follows that you'd have to share the sessions across processes. I don't think this is possible or wise! On Thu, May 14, 2015 at 2:58 PM, Stephen Powis wrote: > Hello everyone

Hibernate + Storm

2015-05-14 Thread Stephen Powis
Hello everyone! I'm currently toying around with a prototype built on top of Storm and have run into some rough going while trying to work with Hibernate and Storm. I was hoping to get input on whether this is just a case of "I'm doing it wrong" or maybe get some useful tips. In my prot

Unsubscribe from mailing list

2015-05-14 Thread Nedim Sabic
Hi, I am trying to unsubscribe from this mailing list (I want to use another account), but there seems to be no way. I have already sent a few mails to user-unsubscr...@storm.apache.org, but nothing happened. I also sent a mail to the mailing list administrator and didn't get any response. Any tips? Than

RE: Re: How much is the overhead of time to deploy a system on Storm ?

2015-05-14 Thread Nathan Leung
Maybe I will make an analogy. Think of spout executors as people wrapping presents. Think of spout tasks as tables where people can wrap presents. If you have 10 tasks and 1 executor, then you have 10 tasks and 1 person. The person will wrap a present at one table, then go to the next, wrap a pres
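Nathan's present-wrapping analogy can be sketched in plain Java: one executor (the person) is a single loop visiting each of its tasks (the tables) in turn. The `Task` class below is hypothetical, not Storm's internals; the point is that more tasks than executors just means the same thread does more work per pass.

```java
public class ExecutorLoop {
    // Hypothetical stand-in for a spout task (real tasks wrap an ISpout).
    static class Task {
        int tuplesEmitted = 0;
        void nextTuple() { tuplesEmitted++; }   // "wrap one present"
    }

    // One executor (one person) visiting each of its tasks (tables) in turn.
    static int[] runLoop(int numTasks, int iterations) {
        Task[] tasks = new Task[numTasks];
        for (int i = 0; i < numTasks; i++) tasks[i] = new Task();
        for (int iter = 0; iter < iterations; iter++)
            for (Task t : tasks) t.nextTuple(); // same thread, more tasks per loop
        int[] counts = new int[numTasks];
        for (int i = 0; i < numTasks; i++) counts[i] = tasks[i].tuplesEmitted;
        return counts;
    }
}
```

With 10 tasks and 1 executor, each pass of the loop services all 10 tables; throughput is still bounded by the single thread.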

RE: Re: How much is the overhead of time to deploy a system on Storm ?

2015-05-14 Thread Rajesh_Kalluri
Dell - Internal Use - Confidential Nathan, Can you explain in a little more detail what you mean by “When you have more tasks than executors, the spout thread does the same logic, it just does it for more tasks during its main loop.” I thought the spout thread emits tuples based on the max spo

Re: Re: How much is the overhead of time to deploy a system on Storm ?

2015-05-14 Thread Nathan Leung
I would expect that it depends on how many executors you have. In storm, an executor corresponds to an OS thread while a task is more of a logical unit of work. The only situation where I would personally use more tasks than executors is if I wanted to over provision the tasks so that I can rebalan

Re: Storm topology getting stuck

2015-05-14 Thread Nathan Leung
When you send a tuple from the spout, it sets the message id in the acker. When a message fans out, the acker does a logical XOR of the new tuple ids with the existing message id, and when tuples get acked their message id is XOR'd into the same value again. This fanning out of tuples should be quite lightweight.
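The XOR mechanism Nathan describes can be demonstrated without Storm: each tuple's random id is XORed into a tracked value once when the tuple is anchored and once when it is acked, so the value returns to zero exactly when every tuple in the tree has been acked. A plain-Java sketch:

```java
import java.util.Random;

public class AckerSketch {
    private long ackVal = 0;          // Storm keeps one such value per spout tuple

    void emitted(long tupleId) { ackVal ^= tupleId; }  // fan-out: anchor a new tuple
    void acked(long tupleId)   { ackVal ^= tupleId; }  // a downstream bolt acks it
    boolean fullyAcked()       { return ackVal == 0; } // zero => whole tree done

    public static boolean simulate(int fanOut) {
        AckerSketch acker = new AckerSketch();
        Random rnd = new Random(42);
        long[] ids = new long[fanOut];
        for (int i = 0; i < fanOut; i++) {
            ids[i] = rnd.nextLong();  // each fanned-out tuple gets a random id
            acker.emitted(ids[i]);
        }
        for (long id : ids) acker.acked(id);   // every tuple eventually acked
        return acker.fullyAcked();
    }
}
```

Because each id is XOR'd in exactly twice, the acker only stores a single 64-bit value per spout tuple regardless of how far the tree fans out, which is why the mechanism is so cheap.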

Re: Need Your Help Urgent!*

2015-05-14 Thread Paul Poulosky
The best documentation is here: Guaranteeing Message Processing, on storm.apache.org.

Re: Multiple Workers in local mode gives error

2015-05-14 Thread 임정택
Yes, right. A worker is a JVM instance which can contain one or more executors; each executor is a spout, bolt, acker, etc. On Thursday, May 14, 2015, Asif Ihsan wrote: > You are right about it. What does multiple workers means. Does it mean > that when I run topology with single worker than i will see o

Need Your Help Urgent!*

2015-05-14 Thread prasad ch
Hi, In Storm we have a normal topology, meaning we have spouts and bolts etc. In the documentation they mention it is at-least-once processing, meaning there may be a chance of processing more than once? How can I test that Storm has at-least-once processing? Please help me. While in Trident it is exactly-once processin
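One way to see at-least-once behavior outside Storm is to simulate a replay: a failed or timed-out tuple is re-emitted by the spout, so a naive sink observes the same message twice, while a sink that tracks processed ids (the idea Trident's transactional state builds on) processes each message once. A plain-Java sketch with hypothetical message ids:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AtLeastOnceSketch {
    // Replay stream: message "m2" failed once and was re-emitted by the spout.
    static List<String> replayedStream() {
        return java.util.Arrays.asList("m1", "m2", "m3", "m2");
    }

    // Naive sink: counts the duplicate (at-least-once semantics).
    static int naiveCount() { return replayedStream().size(); }

    // Dedup sink: idempotent by message id, so the replay is processed once.
    static int dedupCount() {
        Set<String> seen = new HashSet<>();
        int processed = 0;
        for (String id : replayedStream())
            if (seen.add(id)) processed++;   // skip already-processed ids
        return processed;
    }
}
```

To test this in a real topology, you can fail (or simply not ack) a tuple in a bolt and watch the spout re-emit it after the message timeout.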

RE: Multiple Workers in local mode gives error

2015-05-14 Thread Rajesh_Kalluri
Dell - Internal Use - Confidential Regarding the NotSerializableException: I would like to see what grouping you have defined between the two bolts that you have experienced the NotSerializableException with. I am guessing it is localOrShuffleGrouping, can you change it to shuffleGrouping and r

handling Kafka log roll

2015-05-14 Thread Benjamin Cuthbert
Does anyone have any suggestions on how to handle the Kafka log roll? When this happens my topology just stops processing.

Re: Multiple Workers in local mode gives error

2015-05-14 Thread Asif Ihsan
You are right about it. What do multiple workers mean? Does it mean that when I run the topology with a single worker I will see one single process handling all the bolts, and when running 6 workers for 5 bolts and 1 spout I will see 6 separate processes, each running a bolt or the spout? Am I right

Re: Cleanup method not called for the BaseBasicBolt when the topology is killed

2015-05-14 Thread Richards Peter
Thank you all for sharing your thoughts/comments. Richards Peter.

Re: Are Hooks the right place to process acknowledgement and failures at each bolt level?

2015-05-14 Thread Richards Peter
Hi Any thoughts/comments about this topic? Thanks, Richards Peter.

Re: Multiple Workers in local mode gives error

2015-05-14 Thread 임정택
Hi. Storm serializes tuples when a tuple has to be sent to another (remote) worker. In other words, Storm doesn't serialize tuples when the destination is a local task. That's why you didn't hit the error when testing with 1 worker. MapEventBean seems to be not serializable, so you need to convert to other da
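A quick way to check, outside Storm, whether a payload will survive inter-worker transfer is to run it through Java serialization yourself. The sketch below uses a hypothetical `FakeEventBean` as a stand-in for Esper's MapEventBean (which, per the thread, is not serializable):

```java
import java.io.ByteArrayOutputStream;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.util.HashMap;

public class SerializationCheck {
    // Hypothetical stand-in for a non-serializable event bean.
    static class FakeEventBean {
        final HashMap<String, Object> props = new HashMap<>();
    }

    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);       // same path a remote emit would take
            return true;
        } catch (NotSerializableException e) {
            return false;             // would fail once a tuple crosses workers
        } catch (java.io.IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Converting the EventBean's underlying property map into a plain serializable structure (e.g. a HashMap of simple values) before emitting is one way around the error.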

Re: Multiple Workers in local mode gives error

2015-05-14 Thread Asif Ihsan
I am using Esper in one of the bolts of the Storm topology. The bolt emits a MapEventBean array (EventBean[]). With a single worker the topology runs smoothly. With multiple workers it gives the following error in the emit function call. May 14, 2015 2:37:07 PM clojure.tools.logging$eval1$fn__7 invoke SEVERE: Async

When Tuples are Failed In Trident?

2015-05-14 Thread prasad ch
Hi, I want to know, in a Trident topology, the documentation says a tuple may fail in one batch etc. I want to know what the possible scenarios are for tuples to fail? And if we want to maintain exactly-once processing, should we use Trident state? When I run a Storm application in a normal topology

Streaming data from HDFS to Storm

2015-05-14 Thread Spico Florin
Hello! I would like to know if there is any spout implementation for streaming data from HDFS to Storm (something similar to Spark Streaming from HDFS). I know that there is bolt implementation to write data into HDFS ( https://github.com/ptgoetz/storm-hdfs and http://docs.hortonworks.com/HDPDoc