On Nov 11, 2013, at 9:09 PM, Otis Gospodnetic wrote:

> Hi,
> 
> While poking around MorphlineSolrSink I got intrigued by
> MorphlineInterceptor in ...solr.morphline package.  A few Qs:
> 
> 1) This is also not Solr-specific, right?

yep

> 
> 2) I couldn't find any code in ...solr.morphline package that actually
> uses this MorphlineInterceptor... is it not used?

In Flume an Interceptor is a separate concept from a Sink. You can use the 
Interceptor without the Sink, and vice versa.

> 
> 3) I see Morphline command's "process(...)" method being called from
> both MorphlineInterceptor AND from MorphlineHandlerImpl.  How come?  My
> impression is that MorphlineHandlerImpl code is what is actually meant
> to be used, while MorphlineInterceptor doesn't seem to be used....
> what am I missing? :)
> 
> 4) I found the following in the Flume Guide: "This interceptor is not
> intended for heavy duty ETL processing - if you need this consider
> moving ETL processing from the Flume Source to a Flume Sink".
> Why should one not use MorphlineInterceptor for heavy duty ETL processing?

Two reasons: 

1) Interceptors run in the thread of the Flume Source, and are thus 
tightly coupled to the Flume Source and the I/O handler of the Flume Source. 
It's safer not to block or fail in that thread - better to hand data off 
that thread as soon as possible into the Flume Channel (i.e., a queue from 
which sinks take events - sinks run in another thread and are thus more 
isolated). 

2) Flume Interceptors have the limitation that they can only generate zero or 
one output event for each input event. So generating N events for an input 
event isn't possible, like one might want to do when emitting one event per 
input line, or one event per input column, or one event per email 
attachment, etc. 
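
To make the second point concrete, here is a rough sketch of the cardinality 
difference. It does not use the real Flume API: SketchInterceptor is a 
hypothetical stand-in for the single-event contract (return the event, or 
null to drop it), events are plain strings, and splitIntoLines is just an 
illustration of the fan-out that a sink-side step is free to do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the single-event interceptor contract;
// NOT the real org.apache.flume.interceptor.Interceptor interface.
interface SketchInterceptor {
    // Returns the (possibly modified) event, or null to drop it.
    String intercept(String event);
}

class InterceptorCardinalitySketch {

    // Interceptor-style processing: each input yields zero or one output.
    static List<String> runInterceptor(SketchInterceptor interceptor,
                                       List<String> input) {
        List<String> output = new ArrayList<>();
        for (String event : input) {
            String result = interceptor.intercept(event);
            if (result != null) {
                output.add(result);
            }
        }
        return output;
    }

    // Sink-style processing: one input may fan out into N records,
    // here one record per non-empty line.
    static List<String> splitIntoLines(List<String> input) {
        List<String> output = new ArrayList<>();
        for (String event : input) {
            for (String line : event.split("\n")) {
                if (!line.isEmpty()) {
                    output.add(line);
                }
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a\nb\nc", "", "d");
        // Drops empty events and trims the rest.
        SketchInterceptor dropEmpty = e -> e.isEmpty() ? null : e.trim();
        System.out.println("interceptor out: "
                + runInterceptor(dropEmpty, input).size());
        System.out.println("fan-out out:     "
                + splitIntoLines(input).size());
    }
}
```

The interceptor-style path can never return more events than it was given, 
while the fan-out path can.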

To summarize, the reasons aren't specific to morphlines, they are rooted in the 
way Flume has designed the concept of Interceptors. 
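
The first point (thread coupling) can be pictured with a small model as 
well. Again this is only a sketch under assumptions, not Flume code: the 
"channel" is a plain BlockingQueue, the thread names "source" and "sink" and 
the END marker are made up for the example, and upper-casing stands in for 
the expensive ETL step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class SourceSinkThreadSketch {
    // Hypothetical end-of-stream marker, used only by this sketch.
    static final String END = "__end__";

    // Runs a toy source and sink on separate threads and returns what the
    // sink produced, each entry prefixed with the name of the worker thread.
    static List<String> run(List<String> events) {
        BlockingQueue<String> channel = new LinkedBlockingQueue<>();
        List<String> processed =
                Collections.synchronizedList(new ArrayList<>());

        Thread sink = new Thread(() -> {
            try {
                for (String e = channel.take(); !e.equals(END);
                     e = channel.take()) {
                    // Heavy work goes here: if it is slow or fails, only
                    // the sink thread is affected, not the source's I/O.
                    processed.add(Thread.currentThread().getName()
                            + ":" + e.toUpperCase());
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }, "sink");

        Thread source = new Thread(() -> {
            for (String e : events) {
                // Interceptor-style work runs inline on the source thread,
                // so keep it cheap, then hand off to the channel.
                channel.add(e.trim());
            }
            channel.add(END);
        }, "source");

        sink.start();
        source.start();
        try {
            source.join();
            sink.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(" a ", "b")));
    }
}
```

Anything placed in the source loop delays the next read from the source, 
whereas work in the sink loop only delays draining the queue.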

Wolfgang.

> 
> Thanks,
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
