Can you tell us more about this use case? I don't really understand, but given what you've said so far, I might create a trident topology something like this:
TridentTopology topology = new TridentTopology(); StormTopology = topology.newStream("spout1", spout) .each(new Fields("request_id"), new CsvReader(), new Fields("csv_field1", "csv_field2", "csv_fieldN")); .groupBy(new Fields("csv_field1")) .... do something on the GroupedStream .build(); public class CsvReader extends BaseFunction { public CsvReader() { } @Override public void execute(TridentTuple tuple, TridentCollector collector) { long requestId = tuple.getLong(0); // do something with this requestId to figure out which CSV file to read ??? /* PSEUDOCODE for (each line in the CSV) { // emit one tuple per line with all the fields collector.emit(new Values(line[0], line[1], line[N])); } */ } } (Trident makes working with batches a lot easier. :) In general though, I'm not sure where you're getting the CSV files. I don't think reading CSV files off of the worker nodes' disks directly would be a good practice in Storm. It'd probably be better if your spouts emitted the data themselves or something. -Cody On Tue, May 6, 2014 at 1:13 AM, Kiran Kumar <kirankumardas...@ovi.com>wrote: > Hi Padma, > > Firstly, thanks for responding. > > Here is how i am defining my topology conceptually.. > > - Spout waits for a request signal.. > - once spout got a signal, it generates a request_id and broadcasts that > request_id to 10 csv reader bolts.. > - 10 csv reader bolts reads csv files line-by-line and emits those tuples, > respectively.. > - Now (this is the place where i need suggestion in technical/syntactical) > i need to batch up those tuples from all the 10 csv reader bolts on > specified fields.. > - finally, batch-ed tuples will be processed by final bolts. > > What i need is a technical approach. > On Tuesday, 6 May 2014 11:10 AM, padma priya chitturi < > padmapriy...@gmail.com> wrote: > > Hi, > > You can define spouts and bolts in such a way that, input streams read > by spouts would be grouped on specified fields and these could be processed > by specific bolts. This way, you could make batches of input stream. > > > On Tue, May 6, 2014 at 11:02 AM, Kiran Kumar <kirankumardas...@ovi.com>wrote: > > Hi, > > Can anyone suggest me a topology that makes batches of the input stream > on specified fields. so that the batch will be forwarded to a function that > processes it. > > Regards, > Kiran Kumar Dasari. > > > > > -- Cody A. Ray, LEED AP cody.a....@gmail.com 215.501.7891