Facing Error in storm-deploy

2014-02-26 Thread Gaurav Taank
Hi everyone, I am trying to deploy Storm on an AWS cluster and am getting the following error. I am using a Mac, so these are the steps I followed: 1. downloaded lein, made it executable, moved it to /usr/local/bin and executed it. 2. did a git clone of the storm-deploy code. 3. cd into s

Re: Spout missing Acks when a Bolt uses JRuby

2014-02-26 Thread Jonathan Nilsson
Thanks Taylor. I was afraid creating the JRuby runtime this way might be expensive. Initially I did create it inside of the prepare() method but I ran into some trouble because the Ruby class is not serializable. I played around with it a little more today and I had some success creating a static c

Trident Topology only deploying to One of the running workers

2014-02-26 Thread Sangwa Simfukwe
I've been working on a proof of concept with Storm and got it working in Local Mode. I'm running into problems when I try to deploy though. When I launch Storm UI, I see that all bolts are running on only one of the workers! I can't for the life of me figure out why. I.e. when I look at StormUI, I

Re: Spout missing Acks when a Bolt uses JRuby

2014-02-26 Thread P. Taylor Goetz
Hi Jonathan, I've used jruby fairly extensively with storm (though with the trident API), but it's been a while so I'm rusty. Initializing the jruby runtime is very expensive, so you should do that in the prepare() method of your bolt. That means you'll have to store it as an instance variable
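
The advice above (build the expensive runtime in prepare() and keep it in an instance variable) can be sketched without Storm or JRuby on the classpath. Everything named here is hypothetical: `ExpensiveRuntime` stands in for the JRuby runtime, `ScriptingBolt` for a bolt class, and `roundTrip` mimics a bolt being serialized and shipped to a worker. Marking the field `transient` is an assumption about how the non-serializable runtime would be kept out of the serialized bolt; the message is cut off before it says so.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Locale;

/** Hypothetical stand-in for a costly, non-serializable runtime (e.g. a JRuby interpreter). */
final class ExpensiveRuntime {
    String eval(String input) {
        return input.toUpperCase(Locale.ROOT);
    }
}

/** Hypothetical bolt-like class; a real bolt would implement one of Storm's bolt interfaces. */
final class ScriptingBolt implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String name;

    // transient: skipped by Java serialization, so it is null after deserialization.
    private transient ExpensiveRuntime runtime;

    ScriptingBolt(String name) {
        this.name = name;
    }

    /** Called once on the worker, after deserialization: pay the startup cost here. */
    void prepare() {
        runtime = new ExpensiveRuntime();
    }

    /** Called per tuple: reuses the runtime built in prepare(). */
    String execute(String input) {
        if (runtime == null) {
            throw new IllegalStateException("prepare() was not called");
        }
        return name + ":" + runtime.eval(input);
    }

    /** Serializes and deserializes the bolt, imitating shipment to a worker. */
    static ScriptingBolt roundTrip(ScriptingBolt bolt) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(bolt);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ScriptingBolt) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }
}
```

With this shape the runtime is created once per bolt instance rather than once per tuple, and the bolt itself stays serializable even though the runtime is not.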

Re: [DISCUSS] Pulling "Contrib" Modules into Apache

2014-02-26 Thread P. Taylor Goetz
I purposely left out storm-starter from the discussion to keep things focused, and because it’s a different animal. But I also feel it should be pulled in, albeit differently. I was thinking something along the lines of an “examples” directory, and that all committers would share collective own

RE: [DISCUSS] Pulling "Contrib" Modules into Apache

2014-02-26 Thread Huang, Roger
Bobby, I vote to include both storm-yarn and storm-deploy. Roger -Original Message- From: Brian O'Neill [mailto:boneil...@gmail.com] On Behalf Of Brian O'Neill Sent: Wednesday, February 26, 2014 3:39 PM To: d...@storm.incubator.apache.org Cc: user@storm.incubator.apache.org Subject: Re: [

Re: [DISCUSS] Pulling "Contrib" Modules into Apache

2014-02-26 Thread Brian O'Neill
Bobby, FWIW, I'd love to see storm-yarn inside. I think we could definitely make things easier on the end-user if they were more cohesive. e.g. Imagine if we had "storm launch yarn" inside of $storm/bin that would kickoff a storm-yarn launch, with whatever version was built. It would likely si

Re: [DISCUSS] Pulling "Contrib" Modules into Apache

2014-02-26 Thread P. Taylor Goetz
Thanks for the feedback Bobby. To clarify, I’m mainly talking about spout/bolt/trident state implementations that integrate storm with *Technology X*, where *Technology X* is not a fundamental part of storm. Examples would be technologies that are part of or related to the Hadoop/Big Data eco

Storm cannot run in combination with a recent Hadoop/HBase version.

2014-02-26 Thread Niels Basjes
Hi, I'm trying to write some storm bolts and I want them to output the information they produce into HBase. Now the HBase we have running here is based on CDH 4.5.0 which is fully based on the zookeeper versions in the 3.4.x range. The problem I have is that Storm currently still uses zookeeper 3

Re: Unexpected behavior on message resend

2014-02-26 Thread Adam Lewis
In my case it was the state objects created as part of trident aggregation. Here is the final message in the thread (i.e. read bottom up): http://mail-archives.apache.org/mod_mbox/storm-user/201312.mbox/%3CCAAYLz+p4YhF+i3LAkFoyU3nvngZXOusZWXj=0+bynrx0+tg...@mail.gmail.com%3E On Wed, Feb 26, 2

Re: Unexpected behavior on message resend

2014-02-26 Thread Harald Kirsch
Hi Adam, ok, good to know. I resolved to create the tuple from scratch in case it needs to be resent. I don't see where else in-place modification could hurt in a linear process. Am I missing something? Thanks, Harald. On 26.02.2014 15:48, Adam Lewis wrote: I've already gotten slapped around on

Re: Unexpected behavior on message resend

2014-02-26 Thread Adam Lewis
I've already gotten slapped around on the list for doing in place modifications, so let me pass it on :) Don't modify tuple objects in place. You shouldn't rely on serialization happening or not happening for correctness. On Mon, Feb 24, 2014 at 11:18 AM, Harald Kirsch wrote: > Hi all, > > my
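
As a minimal sketch of the "don't modify in place" advice: derive a new value list instead of changing the one you received. This is plain Java with no Storm dependency; `Enricher` and `withField` are hypothetical names, and a `List<Object>` stands in for a tuple's values. The point is that correctness should not depend on whether a copy happened to be made through serialization along the way.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper that derives new value lists instead of mutating received ones. */
final class Enricher {
    private Enricher() {
    }

    /** Returns a fresh, read-only list with one extra field; the input is left untouched. */
    static List<Object> withField(List<Object> values, Object extra) {
        List<Object> copy = new ArrayList<>(values);
        copy.add(extra);
        return Collections.unmodifiableList(copy);
    }
}
```

A spout that has to resend a message can build its values the same way, from the original source data each time, rather than reusing an object that downstream code may already have seen.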

Re: [RELEASE] Apache Storm 0.9.1-incubating released (defaults.yaml)

2014-02-26 Thread Lajos
Hi Derek, Ah! I accidentally unpacked source on top of binary, when I meant to put it in a separate directory. That's the problem, thanks. Cheers, L On 26/02/2014 15:32, Derek Dagit wrote: The defaults.yaml file is part of the source distribution and is packaged into storm's jar when deplo

Re: [RELEASE] Apache Storm 0.9.1-incubating released

2014-02-26 Thread Spico Florin
Hello, Padma! You can create a storm cluster on Windows with one node as described here: http://ptgoetz.github.io/blog/2013/12/18/running-apache-storm-on-windows/ I was able to set one up by following the instructions from this article. I hope that will help you also. Regards, Florin On Wed, Feb 26, 2014

Re: Storm Message Size

2014-02-26 Thread Adam Lewis
Hi Klaus, I've been dealing with similar use cases. I do a couple of things (which may not be a final solution, but it is interesting to discuss alternate approaches): I have passed trained models in the 200MB range through storm, but I try to avoid it. The model gets dropped into persistence and

Re: [RELEASE] Apache Storm 0.9.1-incubating released (defaults.yaml)

2014-02-26 Thread Derek Dagit
The defaults.yaml file is part of the source distribution and is packaged into storm's jar when deployed. In a storm cluster deployment, it is not meant to be on the file system in ${storm.home}/conf. Perhaps you are pointing to your source working tree as storm home? -- Derek On 2/26/14, 5:5

Re: STORM with MYSQL optimizations

2014-02-26 Thread Sean Allen
Sharding is a pain in the ass and should be avoided when possible. If it's possible, I'd look for another data store that can handle a higher load as a cluster so you don't have to worry about the details of sharding. On Wed, Feb 26, 2014 at 8:54 AM, masoom alam wrote: > @Sean: You are right,

Re: STORM with MYSQL optimizations

2014-02-26 Thread masoom alam
@Sean: You are right, MYSQL is not configured to handle 1000 events per second. I will post the results of Batch, which is also slow in our case. I think we should investigate thoroughly why a Batch of, for example, 1000 is also slow in our case. BTW, how easy is it to configure/implement Shards in MYS

Re: STORM with MYSQL optimizations

2014-02-26 Thread Sean Allen
Is your mysql set up to handle 1000 writes a second? I'm going to guess no. If that is the case then Klaus' suggestions are good ones. Batch or Shard. On Wed, Feb 26, 2014 at 8:45 AM, masoom alam wrote: > 1000 Events per second. > > > > > On Wed, Feb 26, 2014 at 6:40 PM, Sean Allen > wrote: >

Re: STORM with MYSQL optimizations

2014-02-26 Thread masoom alam
1000 Events per second. On Wed, Feb 26, 2014 at 6:40 PM, Sean Allen wrote: > How much traffic exactly are you pushing at mysql before the load gets too > high and it starts to buckle under the weight? > > > On Wed, Feb 26, 2014 at 8:38 AM, masoom alam wrote: > >> Dear All, >> >> Has anybody w

Re: STORM with MYSQL optimizations

2014-02-26 Thread Klausen Schaefersinho
Hi, if you want to stick to MySQL I see two options: 1) You could try to batch your writes. The DbWriterBolt would buffer some writes (e.g. a thousand) and write them in batch mode. This is usually much faster. However, your database is not always up to date. 2) You could try to shard your database
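
Option 1 above can be sketched as a small buffer that collects rows and hands them to a sink once a threshold is reached. This is a plain-Java illustration, not the poster's DbWriterBolt: `BatchBuffer`, its methods, and the sink callback are all hypothetical. In a real bolt the sink would presumably run a JDBC batch (`PreparedStatement.addBatch()` followed by `executeBatch()`), and `flush()` would also need to be called on a timer or at shutdown so the last partial batch is not left waiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: buffers rows and passes them to a sink in fixed-size batches. */
final class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one row and writes the batch out once the threshold is reached. */
    void add(T row) {
        pending.add(row);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Hands all buffered rows to the sink as one batch; does nothing when empty. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending)); // copy, so the sink never sees later changes
        pending.clear();
    }

    /** Number of rows buffered but not yet written. */
    int pendingCount() {
        return pending.size();
    }
}
```

The trade-off is the one the message names: rows sitting in the buffer are not yet in the database, so the batch size bounds how stale the database can be.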

Re: Setting up Storm Cluster

2014-02-26 Thread Sean Allen
There are good basic default configurations for each. There's nothing you should have to do. Older versions of storm 0.9.x defaulted to ZeroMQ, the latest defaults to Netty. I would advise not tuning any parameters of either until you need to and understand what you are doing. On Sat, Feb 22, 201

Re: STORM with MYSQL optimizations

2014-02-26 Thread Sean Allen
How much traffic exactly are you pushing at mysql before the load gets too high and it starts to buckle under the weight? On Wed, Feb 26, 2014 at 8:38 AM, masoom alam wrote: > Dear All, > > Has anybody worked on the configurations/optimizations needed generally > for using STORM with MYSQL. Ou

STORM with MYSQL optimizations

2014-02-26 Thread masoom alam
Dear All, Has anybody worked on the configurations/optimizations generally needed for using STORM with MYSQL? Our scenario stores data in MYSQL tables, but as the data rate increases MYSQL starts responding very slowly (in some cases with a connection refused error), resulting in DBWriterBolt to slowdown

Re: Storm Load Balancing

2014-02-26 Thread Sean Allen
Well 6700 isn't running at all. There's no uptime so they aren't ever starting. 6701 appears to have died 20 minutes before you took the screenshot, that is going to result in load being shuffled around. So you had 3 functional workers, 6701, 6702, 6703 and 6701 went down leaving 6702 and 6703

Re: Multiple input stream with different inputs

2014-02-26 Thread Sean Allen
In training bolt:

    @Override
    public void execute(Tuple tuple, BasicOutputCollector outputCollector) {
        // lots of stuff here
        outputCollector.emit("updated model", new Values(model));
    }

    public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {

Re: Storm Message Size

2014-02-26 Thread Klausen Schaefersinho
THX, the idea is good, I will keep that in mind. The only drawback is that it relies on polling, which I do not like too much in the PredictionBolt. Of course I could also pass S3 or File references around in the messages, to trigger an update. But for the sake of simplicity I was thinking of keeping

Re: [RELEASE] Apache Storm 0.9.1-incubating released

2014-02-26 Thread padma priya chitturi
Does version 0.9.1 have built-in support for running on Windows? On Wed, Feb 26, 2014 at 5:29 PM, Lajos wrote: > Quick question on this: defaults.yaml is in both conf and storm-core.jar, > so the first time you start nimbus 0.9.1 you get this message: > > java.lang.RuntimeException: Found multiple de

Re: [RELEASE] Apache Storm 0.9.1-incubating released

2014-02-26 Thread Lajos
Quick question on this: defaults.yaml is in both conf and storm-core.jar, so the first time you start nimbus 0.9.1 you get this message: java.lang.RuntimeException: Found multiple defaults.yaml resources. You're probably bundling the Storm jars with your topology jar. [file:/scratch/projects/

Re: Storm Message Size

2014-02-26 Thread Enno Shioji
I can't comment on how large tuples fare, but about the synchronization, would this not make more sense? InputSpout -> AggregationBolt -> PredictionBolt -> OutputBolt | | \/ | Agg. State

Storm Message Size

2014-02-26 Thread Klausen Schaefersinho
Hi, I have a topology which processes events and aggregates them in some form and performs some prediction based on a machine learning (ML) model. Every x events, one of the bolts involved in the normal processing emits a "trainModel" event, which is routed to a bolt which is just dedicated to the

RE: Storm Applications

2014-02-26 Thread Simon Cooper
Using normal storm, any bolt can output to anything at any time, as each bolt runs arbitrary code. So a bolt in the middle of a topology can write to a database, or file, or anything else you need. It will likely be the last bolt in the topology, but it doesn't have to be. If you use trident, t

Puppet module for deploying Storm released

2014-02-26 Thread Michael G. Noll
Hi everyone, I have released a Puppet module to deploy Storm 0.9 in case anyone is interested. The module uses Puppet parameterized classes and as such decouples code (Puppet manifests) from configuration data -- hence you can use Puppet Hiera to configure the way Storm is deployed without having