to clarify - I mean I think it's within the scope of the design intentions. I agree that it is currently disallowed (at least in documentation).
On Fri, Aug 10, 2012 at 5:14 PM, Patrick Wendell <[email protected]> wrote:
> Hey Jeremy,
>
> That comment has been in the code now for some time, but I don't think
> it is actually enforced anywhere programmatically. I think the idea was
> just that if you are writing something which is capable of generating
> new event data, it should be in a source - though I'm also curious to
> hear why this was put in there.
>
> IMHO, doing some type of event splitting seems within the scope of how
> interceptors are used.
>
> - Patrick
>
> On Fri, Aug 10, 2012 at 11:07 AM, Jeremy Custenborder
> <[email protected]> wrote:
>> Hello All,
>>
>> I'm wondering if you could provide some guidance for me. One of the
>> inputs I'm working with batches several entries into a single event.
>> This is a lot simpler than my data, but it provides an easy example.
>> For example:
>>
>> timestamp - 5,4,3,2,1
>> timestamp - 9,7,5,5,6
>>
>> If I tail the file, this results in 2 events being generated. This
>> example has the data for 10 events.
>>
>> Here is, at a high level, what I want to accomplish:
>>
>> (web server - agent 1)
>>   exec source: tail -f /<some file path>
>>   collector-client to (agent 2)
>>
>> (collector - agent 2)
>>   collector-server
>>   Custom Interceptor (input 1 event, output n events)
>>   Multiplex to
>>     hdfs
>>     hbase
>>
>> An interceptor looked like the most logical spot for me to add this.
>> Is there a better place to add this functionality? Has anyone run into
>> a similar case?
>>
>> Looking at the docs for Interceptor.intercept(List<Event> events), it
>> says "Output list of events. The size of output list MUST NOT BE
>> GREATER than the size of the input list (i.e. transformation and
>> removal ONLY)", which tells me not to emit more events than given.
>> intercept(Event event) only returns a single event, so I can't use it
>> there either. Why is there a requirement to only return 1 for 1?
>>
>> For now I'm implementing a custom source that will handle generating
>> multiple events from the events coming in on the web server. My
>> preference was to do this transformation on the collector agent before
>> I hand off to hdfs and hbase. I know another alternative would be to
>> implement custom RPC, but I would prefer not to do that. I would prefer
>> to rely on what is currently available.
>>
>> Thanks!
>> j
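[Editor's note] The splitting step Jeremy describes - turning one batched line such as "timestamp - 5,4,3,2,1" into one record per value - can be sketched in plain Java. This is a minimal sketch of the core logic only; the class and method names (`EventSplitter`, `splitBatchedEvent`) are hypothetical, and in a real agent this logic would sit inside the body-handling code of a custom source (as Jeremy chose) or an interceptor's intercept(List<Event>), not in a standalone class like this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a batched log line into per-value records.
public class EventSplitter {

    // Input:  "timestamp - 5,4,3,2,1"
    // Output: ["timestamp - 5", "timestamp - 4", ..., "timestamp - 1"]
    public static List<String> splitBatchedEvent(String body) {
        List<String> out = new ArrayList<>();
        int sep = body.indexOf(" - ");
        if (sep < 0) {
            // Not a batched line; pass it through unchanged.
            out.add(body);
            return out;
        }
        String timestamp = body.substring(0, sep);
        // Everything after " - " is a comma-separated list of values.
        for (String value : body.substring(sep + 3).split(",")) {
            out.add(timestamp + " - " + value.trim());
        }
        return out;
    }
}
```

Each returned string would then become the body of its own event before being multiplexed to the hdfs and hbase sinks.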
