Re: Storm Message Size

2014-02-27 Thread Klausen Schaefersinho
THX Adam, having a separate topology for training is actually a very nice idea. I will give it a try! Cheers, Klaus On Wed, Feb 26, 2014 at 3:28 PM, Adam Lewis wrote: > Hi Klaus, > > I've been dealing with similar use cases. I do a couple of things (which > may not be a final solution, bu

Re: Storm Message Size

2014-02-26 Thread Adam Lewis
Hi Klaus, I've been dealing with similar use cases. I do a couple of things (which may not be a final solution, but it is interesting to discuss alternate approaches): I have passed trained models in the 200MB range through storm, but I try to avoid it. The model gets dropped into persistence and

Re: Storm Message Size

2014-02-26 Thread Klausen Schaefersinho
THX, the idea is good, I will keep that in mind. The only drawback is that it relies on polling, which I do not like too much in the PredictionBolt. Of course I could also pass S3 or file references around in the messages to trigger an update. But for the sake of simplicity I was thinking of keeping
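
For illustration, the reference-passing alternative mentioned in this message could look roughly like the sketch below. It is plain Java with no Storm dependency, and every name in it (`ModelHolder`, `onModelUpdate`, the `loader` function) is hypothetical, not something taken from the thread or from Storm. The assumption is that the update message carries only a storage reference (an S3 key or a file path) and that the prediction side loads the model itself when such a message arrives, so no polling is needed.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// Hypothetical holder, not a Storm API: keeps the current model and swaps it
// when an update message carrying a storage reference arrives.
final class ModelHolder<M> {
    // Assumed loader: turns a reference (e.g. an S3 key or file path) into a model.
    private final Function<String, M> loader;
    private final AtomicReference<M> current = new AtomicReference<>();
    private volatile String loadedRef;

    ModelHolder(Function<String, M> loader) {
        this.loader = loader;
    }

    /** Called when an update message arrives; the message holds only the reference. */
    void onModelUpdate(String ref) {
        if (ref.equals(loadedRef)) {
            return; // same reference as before, skip the reload
        }
        current.set(loader.apply(ref));
        loadedRef = ref;
    }

    /** Returns the model loaded most recently, or null if none has been loaded yet. */
    M model() {
        return current.get();
    }
}
```

In a real topology the prediction bolt would presumably call `onModelUpdate` from its tuple handler when it sees an update tuple, and `model()` for every ordinary event; how those tuples are routed is left open here.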

Re: Storm Message Size

2014-02-26 Thread Enno Shioji
I can't comment on how large tuples fare, but about the synchronization, would this not make more sense? InputSpout -> AggregationBolt -> PredictionBolt -> OutputBolt | | \/ | Agg. State

Storm Message Size

2014-02-26 Thread Klausen Schaefersinho
Hi, I have a topology which processes events, aggregates them in some form, and performs some prediction based on a machine learning (ML) model. Every x events, one of the bolts involved in the normal processing emits a "trainModel" event, which is routed to a bolt which is just dedicated to the
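
As a rough sketch of the "every x events" trigger described in this message: the class below counts processed events and reports when a "trainModel" event would be due. It is plain Java with no Storm dependency; the name `TrainTrigger` and the parameter `everyN` (standing in for the "x" of the post) are assumptions made for illustration only.

```java
// Hypothetical helper, not part of Storm: decides when a "trainModel" event is due.
final class TrainTrigger {
    private final int everyN; // stands in for the "x" in the post; the value is an assumption
    private long seen = 0;

    TrainTrigger(int everyN) {
        if (everyN <= 0) {
            throw new IllegalArgumentException("everyN must be positive");
        }
        this.everyN = everyN;
    }

    /** Records one processed event; returns true on every everyN-th call. */
    boolean onEvent() {
        seen++;
        return seen % everyN == 0;
    }
}
```

A bolt in the normal processing path could call `onEvent()` once per tuple and emit the "trainModel" event whenever it returns true. Because the trigger carries only a counter, the message it causes can stay small regardless of how large the model is.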