It might help to understand your use case a little bit better. Are the requests for a finite amount of data that you just happen to want to stream out, or are they more akin to a subscription for an unbounded amount of data? Also, does the request contain a specification of what needs to be computed? That is, do you imagine your bolts (or various trident components) each keeping a list of active requests and modifying their behaviour (or data source in the case of spouts) as the requests come in? You might want to check out the storm-signals project for ideas of how you could orchestrate all the components to do the right thing in face of your requests coming in: https://github.com/ptgoetz/storm-signals
On Sat, Feb 15, 2014 at 7:30 PM, Carl Lerche <m...@carllerche.com> wrote: > Hey Adam, > > Actually, that's quite a good idea. I'm glad you responded, this is a > better approach than what I was going to attempt (aka, mega hacks). I > understand how your approach could be done with non-DRPC trident. The > one drawback that I can think of would be that the request message > would need to be part of a batch, so if matches only happen every > 15~20 seconds, it would take a while for the response to start > arriving. > > Perhaps you could elaborate more on how one could use vanilla storm > could be used to solve this? Perhaps it could help with the batch > delay problem. > > Cheers, > Carl > > On Sat, Feb 15, 2014 at 4:42 AM, Adam Lewis <m...@adamlewis.com> wrote: > > Hi Carl, > > > > DRPC is inherently synchronous in the way it works so if I understand > what > > you are trying to do correctly then I suggest you stick to non-DRPC > trident > > or even vanilla storm. You can setup some messaging queues to handle the > > input (request) and output (streaming result). Include a field in the > input > > tuple that can be used to correlate any downstream results (where that > ID is > > client generated), then create a client which handles publishing to the > > input queue and subscribing to the output queue (filtering on messages > which > > have the input correlation id). You can partition your storm topology > (and > > the input and output queues) on that correlation ID to achieve some load > > balancing. Finally, if your output streams are infinite, you need some > > mechanism to stop them... > > > > As a side benefit, you can overcome some limitations of DRPC such as no > > control over serialization and even have multiple trident streams all > > writing to your output queue (whereas DRPC doesn't support that sort of > > branching in the topology). > > > > Adam > > > > > > On Fri, Feb 14, 2014 at 7:06 PM, Carl Lerche <m...@carllerche.com> wrote: > >> > >> Hello, > >> > >> I noticed that the DRPC client only allows a single response (and for > >> that response to be JSON encoded). I was hoping to implement some sort > >> of DRPC stream where I a constant stream of data based on a given > >> query. Also, is there a way to serialize the response using Kryo > >> instead of JSON? The specific format of my data is not very JSON > >> friendly. > >> > >> Cheers, > >> Carl > > > > >