[ 
https://issues.apache.org/jira/browse/FLUME-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273814#comment-13273814
 ] 

Mike Percy commented on FLUME-1157:
-----------------------------------

Hey Inder,
Let me explain my motivations for doing it this way.

Some goals I had for adding general decorator/interceptor support were the 
following:
# Works consistently regardless of which components are in the system, e.g. 
custom Sources/Sinks
# No counter-intuitive behavior, i.e. it should be straightforward to write an 
Interceptor and it shouldn't be easy to inadvertently create bad side effects 
in the system
# Don't require changes to existing interfaces

Based on the above goals, the only place where these plugins really fit was the 
ChannelProcessor. The way it works, as you can see from the patch, is that a 
Source generates an Event and calls the ChannelProcessor.processEvent(Event e) 
method (or processEventBatch). The ChannelProcessor is responsible for opening 
a Transaction, putting the Event in the Channel, and closing the Transaction. 
For correctness reasons, the interception / transformation of the Events really 
has to take place inside of a Transaction. Also, from a fan-out perspective, if 
you are doing an operation on an Event that you want to replicate downstream, 
it makes sense to do that operation only once for the sake of efficiency. For 
all of these reasons, the ChannelProcessor is a good place to put Interceptor 
handling.
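
To make that flow concrete, here is a minimal sketch in Java. It is not the 
code from the patch: every type below (SketchEvent, SketchInterceptor, 
SketchChannel, SketchChannelProcessor) is a hypothetical, simplified stand-in, 
and the toy transaction model (begin / put / commit / rollback directly on the 
channel) is an assumption made only to keep the example self-contained. It 
shows the Source handing an Event to the processor, the interceptor chain 
running once inside the open transactions, and the single result being 
replicated to every channel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins; NOT Flume's real classes or signatures.
final class SketchEvent {
    final Map<String, String> headers = new HashMap<>();
    final String body;
    SketchEvent(String body) { this.body = body; }
}

interface SketchInterceptor {
    SketchEvent intercept(SketchEvent event);
}

// Toy channel: puts are staged and become visible only on commit().
final class SketchChannel {
    final List<SketchEvent> committed = new ArrayList<>();
    private final List<SketchEvent> staged = new ArrayList<>();
    void begin() { staged.clear(); }
    void put(SketchEvent e) { staged.add(e); }
    void commit() { committed.addAll(staged); staged.clear(); }
    void rollback() { staged.clear(); }
}

final class SketchChannelProcessor {
    private final List<SketchInterceptor> interceptors;
    private final List<SketchChannel> channels;

    SketchChannelProcessor(List<SketchInterceptor> interceptors,
                           List<SketchChannel> channels) {
        this.interceptors = interceptors;
        this.channels = channels;
    }

    // The Source only calls this method; transaction handling lives here.
    void processEvent(SketchEvent event) {
        for (SketchChannel ch : channels) ch.begin();
        try {
            // Run the interceptor chain once, in configured order ...
            SketchEvent out = event;
            for (SketchInterceptor i : interceptors) out = i.intercept(out);
            // ... then replicate the single result to every channel.
            for (SketchChannel ch : channels) ch.put(out);
            for (SketchChannel ch : channels) ch.commit();
        } catch (RuntimeException ex) {
            // Discard anything staged but not yet committed.
            for (SketchChannel ch : channels) ch.rollback();
            throw ex;
        }
    }
}

final class ChannelProcessorSketch {
    public static void main(String[] args) {
        // A timestamp-style interceptor, written as a lambda for brevity.
        SketchInterceptor timestamp = e -> {
            e.headers.put("timestamp", String.valueOf(System.currentTimeMillis()));
            return e;
        };
        SketchChannel a = new SketchChannel();
        SketchChannel b = new SketchChannel();
        SketchChannelProcessor processor =
            new SketchChannelProcessor(Arrays.asList(timestamp), Arrays.asList(a, b));
        processor.processEvent(new SketchEvent("hello"));
        System.out.println(a.committed.size() + " " + b.committed.size());
    }
}
```

Note that the Source never touches a transaction or an interceptor here, which 
is the Inversion of Control that makes this a natural plug-in point. The sketch 
does not attempt atomic commit across channels; a real implementation would 
have to decide how to handle a commit that fails partway through the fan-out.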

As you have rightly observed, there is no corresponding interceptor on the Sink 
side of the Channel for cases where we may only want to apply some 
transformation to the Events flowing to a given Sink. I did not add this yet 
for the following reasons:
# There is no counterpart to ChannelProcessor on the Sink side. There is 
something called SinkProcessor, but that is basically a misnomer; it should 
really be called SinkPolicy or something. Each Sink is actually in charge of 
opening a Transaction, take()ing from the Channel, and committing the 
Transaction. Because the Inversion of Control that exists on the Source side 
does not exist on the Sink side, there is no clear place to plug in a driver 
for the Interceptor interface without requiring each Sink to implement it.
# Particularly in regard to MemoryChannel, the Interceptor concept is a little 
messy. Modifying Events in place in the Interceptors is safe on the Source side 
(Sources are not allowed to buffer Events), but on the Sink side we would be 
modifying Events that may get returned to the Channel in a Transaction 
rollback(). If that happens, the same Event could be processed multiple times 
after a failed Transaction. Obviously that would be wrong, so the alternative 
is to defensively copy the Events before giving them to an Interceptor. That 
represents a potential performance concern that has to be considered carefully.
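
The rollback hazard in the second point can be reproduced with a small Java 
sketch. Again, nothing here is Flume code: DemoEvent, DemoMemoryChannel, 
HopCounter and SinkSideSketch are hypothetical names, and the channel is a 
bare in-memory queue whose rollback() is assumed to hand back the very same 
objects it gave out. The example runs one failed delivery followed by a retry, 
once mutating in place and once on a defensive copy.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins; NOT Flume's real classes or signatures.
final class DemoEvent {
    final Map<String, String> headers;
    final String body;
    DemoEvent(Map<String, String> headers, String body) {
        this.headers = new HashMap<>(headers);
        this.body = body;
    }
    DemoEvent copy() { return new DemoEvent(headers, body); }
}

// In-memory queue whose rollback() returns the same objects it handed out.
final class DemoMemoryChannel {
    private final Deque<DemoEvent> queue = new ArrayDeque<>();
    private final List<DemoEvent> taken = new ArrayList<>();
    void put(DemoEvent e) { queue.addLast(e); }
    DemoEvent take() {
        DemoEvent e = queue.pollFirst();
        if (e != null) taken.add(e);
        return e;
    }
    void commit() { taken.clear(); }
    void rollback() {
        // Put taken events back at the head of the queue in original order.
        for (int i = taken.size() - 1; i >= 0; i--) queue.addFirst(taken.get(i));
        taken.clear();
    }
}

final class HopCounter {
    // An interceptor-like step that mutates the event it is given.
    static DemoEvent intercept(DemoEvent e) {
        int hops = Integer.parseInt(e.headers.getOrDefault("hops", "0"));
        e.headers.put("hops", String.valueOf(hops + 1));
        return e;
    }
}

final class SinkSideSketch {
    // Simulates one failed delivery followed by one successful retry and
    // returns the "hops" header seen by the successful delivery.
    static String deliverWithOneFailure(boolean defensiveCopy) {
        DemoMemoryChannel channel = new DemoMemoryChannel();
        channel.put(new DemoEvent(new HashMap<>(), "payload"));

        DemoEvent first = channel.take();
        HopCounter.intercept(defensiveCopy ? first.copy() : first);
        channel.rollback(); // pretend the downstream write failed

        DemoEvent second = channel.take();
        DemoEvent out = HopCounter.intercept(defensiveCopy ? second.copy() : second);
        channel.commit();
        return out.headers.get("hops");
    }

    public static void main(String[] args) {
        System.out.println("in place:       hops=" + deliverWithOneFailure(false));
        System.out.println("defensive copy: hops=" + deliverWithOneFailure(true));
    }
}
```

With in-place mutation the retried Event carries the effect of both attempts; 
with the copy it is processed from a clean state each time, at the cost of one 
extra allocation per Event per attempt.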

At this point in time, I believe that Source-side interceptors are a step in 
the right direction, and are more straightforward and less risky to implement 
than Sink-side decorators. With just these, it's possible to create another 
tier (logical or physical) to apply different Event processing to events going 
to different Sinks, if required, so there is a workaround.

Anyway, I share your concerns and I have been thinking a lot about how to make 
Sink-side decorators work. I'd like to come up with a solution that creates a 
basically symmetric system without breaking existing interfaces. Some of the 
other folks working on Flume are thinking about this use case also and I think 
we will come up with a workable solution soon.

Best,
Mike
                
> Implement Decorators for Flume 1.x
> ----------------------------------
>
>                 Key: FLUME-1157
>                 URL: https://issues.apache.org/jira/browse/FLUME-1157
>             Project: Flume
>          Issue Type: New Feature
>            Reporter: Arvind Prabhakar
>            Assignee: Mike Percy
>             Fix For: v1.2.0
>
>
> Some nice to have built in decorators could be:
> * checksum decorator
> * checksum validation decorator 
> * timestamp decorator
> * GUID decorator
> The implementation should support the following:
> * support multiple decorators in predefined order via configuration
> * support custom decorators.

