Author: mpercy
Date: Mon Jul  9 17:49:26 2012
New Revision: 1359312

URL: http://svn.apache.org/viewvc?rev=1359312&view=rev
Log:
FLUME-1356. Document interceptors.

(Hari Shreedharan via Jarek Jarcec Cecho)

Modified:
    flume/branches/branch-1.2.0/flume-ng-doc/sphinx/FlumeUserGuide.rst

Modified: flume/branches/branch-1.2.0/flume-ng-doc/sphinx/FlumeUserGuide.rst
URL: http://svn.apache.org/viewvc/flume/branches/branch-1.2.0/flume-ng-doc/sphinx/FlumeUserGuide.rst?rev=1359312&r1=1359311&r2=1359312&view=diff
==============================================================================
--- flume/branches/branch-1.2.0/flume-ng-doc/sphinx/FlumeUserGuide.rst (original)
+++ flume/branches/branch-1.2.0/flume-ng-doc/sphinx/FlumeUserGuide.rst Mon Jul  9 17:49:26 2012
@@ -579,15 +579,17 @@ When paired with the built-in AvroSink o
 it can create tiered collection topologies.
 Required properties are in **bold**.
 
-=============  ===========  ===================================================
-Property Name  Default      Description
-=============  ===========  ===================================================
-**channels**   --
-**type**       --           The component type name, needs to be ``avro``
-**bind**       --           hostname or IP address to listen on
-**port**       --           Port # to bind to
-threads        --           Maximum number of worker threads to spawn
-=============  ===========  ===================================================
+==============  ===========  ===================================================
+Property Name   Default      Description
+==============  ===========  ===================================================
+**channels**    --
+**type**        --           The component type name, needs to be ``avro``
+**bind**        --           hostname or IP address to listen on
+**port**        --           Port # to bind to
+threads         --           Maximum number of worker threads to spawn
+interceptors    --           Space separated list of interceptors
+interceptors.*
+==============  ===========  ===================================================
 
 Example for agent named **agent_foo**:
 
@@ -624,6 +626,8 @@ restart          false        Whether th
 logStdErr        false        Whether the command's stderr should be logged
 selector.type    replicating  replicating or multiplexing
 selector.*                    Depends on the selector.type value
+interceptors     --           Space separated list of interceptors
+interceptors.*
 ===============  ===========  ==============================================================
 
 
@@ -678,6 +682,8 @@ Property Name    Default      Descriptio
 max-line-length  512          Max line length per event body (in bytes)
 selector.type    replicating  replicating or multiplexing
 selector.*                    Depends on the selector.type value
+interceptors     --           Space separated list of interceptors
+interceptors.*
 ===============  ===========  ===========================================
 
 Example for agent named **agent_foo**:
@@ -698,14 +704,16 @@ A simple sequence generator that continu
 that starts from 0 and increments by 1. Useful mainly for testing.
 Required properties are in **bold**.
 
-=============  ===========  ========================================
-Property Name  Default      Description
-=============  ===========  ========================================
-**channels**   --
-**type**       --           The component type name, needs to be ``seq``
-selector.type               replicating or multiplexing
-selector.*     replicating  Depends on the selector.type value
-=============  ===========  ========================================
+==============  ===========  ========================================
+Property Name   Default      Description
+==============  ===========  ========================================
+**channels**    --
+**type**        --           The component type name, needs to be ``seq``
+selector.type                replicating or multiplexing
+selector.*      replicating  Depends on the selector.type value
+interceptors    --           Space separated list of interceptors
+interceptors.*
+==============  ===========  ========================================
 
 Example for agent named **agent_foo**:
 
@@ -728,17 +736,19 @@ Required properties are in **bold**.
 Syslog TCP Source
 '''''''''''''''''
 
-=============  ===========  ==============================================
-Property Name  Default      Description
-=============  ===========  ==============================================
-**channels**   --
-**type**       --           The component type name, needs to be ``syslogtcp``
-**host**       --           Host name or IP address to bind to
-**port**       --           Port # to bind to
-eventSize      2500
-selector.type               replicating or multiplexing
-selector.*     replicating  Depends on the selector.type value
-=============  ===========  ==============================================
+==============   ===========  ==============================================
+Property Name    Default      Description
+==============   ===========  ==============================================
+**channels**     --
+**type**         --           The component type name, needs to be ``syslogtcp``
+**host**         --           Host name or IP address to bind to
+**port**         --           Port # to bind to
+eventSize        2500
+selector.type                 replicating or multiplexing
+selector.*       replicating  Depends on the selector.type value
+interceptors     --           Space separated list of interceptors
+interceptors.*
+==============   ===========  ==============================================
 
 
 For example, a syslog TCP source for agent named **agent_foo**:
@@ -755,16 +765,18 @@ For example, a syslog TCP source for age
 Syslog UDP Source
 '''''''''''''''''
 
-=============  ===========  ==============================================
-Property Name  Default      Description
-=============  ===========  ==============================================
-**channels**   --
-**type**       --           The component type name, needs to be ``syslogudp``
-**host**       --           Host name or IP address to bind to
-**port**       --           Port # to bind to
-selector.type               replicating or multiplexing
-selector.*     replicating  Depends on the selector.type value
-=============  ===========  ==============================================
+==============  ===========  ==============================================
+Property Name   Default      Description
+==============  ===========  ==============================================
+**channels**    --
+**type**        --           The component type name, needs to be ``syslogudp``
+**host**        --           Host name or IP address to bind to
+**port**        --           Port # to bind to
+selector.type                replicating or multiplexing
+selector.*      replicating  Depends on the selector.type value
+interceptors    --           Space separated list of interceptors
+interceptors.*
+==============  ===========  ==============================================
 
 
 For example, a syslog UDP source for agent named **agent_foo**:
@@ -804,16 +816,18 @@ Required properties are in **bold**.
 Avro Legacy Source
 ''''''''''''''''''
 
-=============  ===========  ========================================================================================
-Property Name  Default      Description
-=============  ===========  ========================================================================================
-**channels**   --
-**type**       --           The component type name, needs to be ``org.apache.flume.source.avroLegacy.AvroLegacySource``
-**host**       --           The hostname or IP address to bind to
-**port**       --           The port # to listen on
-selector.type               replicating or multiplexing
-selector.*     replicating  Depends on the selector.type value
-=============  ===========  ========================================================================================
+==============  ===========  ========================================================================================
+Property Name   Default      Description
+==============  ===========  ========================================================================================
+**channels**    --
+**type**        --           The component type name, needs to be ``org.apache.flume.source.avroLegacy.AvroLegacySource``
+**host**        --           The hostname or IP address to bind to
+**port**        --           The port # to listen on
+selector.type                replicating or multiplexing
+selector.*      replicating  Depends on the selector.type value
+interceptors    --           Space separated list of interceptors
+interceptors.*
+==============  ===========  ========================================================================================
 
 Example for agent named **agent_foo**:
 
@@ -829,16 +843,18 @@ Example for agent named **agent_foo**:
 Thrift Legacy Source
 ''''''''''''''''''''
 
-=============  ===========  ======================================================================================
-Property Name  Default      Description
-=============  ===========  ======================================================================================
-**channels**   --
-**type**       --           The component type name, needs to be ``org.apache.source.thriftLegacy.ThriftLegacySource``
-**host**       --           The hostname or IP address to bind to
-**port**       --           The port # to listen on
-selector.type               replicating or multiplexing
-selector.*     replicating  Depends on the selector.type value
-=============  ===========  ======================================================================================
+==============  ===========  ======================================================================================
+Property Name   Default      Description
+==============  ===========  ======================================================================================
+**channels**    --
+**type**        --           The component type name, needs to be ``org.apache.source.thriftLegacy.ThriftLegacySource``
+**host**        --           The hostname or IP address to bind to
+**port**        --           The port # to listen on
+selector.type                replicating or multiplexing
+selector.*      replicating  Depends on the selector.type value
+interceptors    --           Space separated list of interceptors
+interceptors.*
+==============  ===========  ======================================================================================
 
 Example for agent named **agent_foo**:
 
@@ -858,14 +874,16 @@ A custom source is your own implementati
 source's class and its dependencies must be included in the agent's classpath
 when starting the Flume agent. The type of the custom source is its FQCN.
 
-=============  ===========  ==============================================
-Property Name  Default      Description
-=============  ===========  ==============================================
-**channels**   --
-**type**       --           The component type name, needs to be your FQCN
-selector.type               replicating or multiplexing
-selector.*     replicating  Depends on the selector.type value
-=============  ===========  ==============================================
+==============  ===========  ==============================================
+Property Name   Default      Description
+==============  ===========  ==============================================
+**channels**    --
+**type**        --           The component type name, needs to be your FQCN
+selector.type                replicating or multiplexing
+selector.*      replicating  Depends on the selector.type value
+interceptors    --           Space separated list of interceptors
+interceptors.*
+==============  ===========  ==============================================
 
 Example for agent named **agent_foo**:
 
@@ -1502,6 +1520,67 @@ Custom Sink Processor
 
 Custom sink processors are not implemented at this time.
 
+Flume Interceptors
+------------------
+
+Flume has the capability to modify/drop events in-flight. This is done with the help of interceptors. Interceptors
+are classes that implement ``org.apache.flume.interceptor.Interceptor`` interface. An interceptor can
+modify or even drop events based on any criteria chosen by the developer of the interceptor. Flume supports
+chaining of interceptors. This is made possible through by specifying the list of interceptor builder class names
+in the configuration. Interceptors are specified as a whitespace separated list in the source configuration.
+The order in which the interceptors are specified is the order in which they are invoked.
+The list of events returned by one interceptor is passed to the next interceptor in the chain. Interceptors
+can modify or drop events. If an interceptor needs to drop events, it just does not return that event in
+the list that it returns. If it is to drop all events, then it simply returns an empty list. Interceptors
+are named components, here is an example of how they are created through configuration:
+
+.. code-block:: properties
+
+  agent_foo.sources = source_foo
+  agent_foo.channels = channel-1
+  agent_foo.sources.source_foo.interceptors = a b
+  agent_foo.sources.source_foo.interceptors.a.type = org.apache.flume.interceptor.HostInterceptor$Builder
+  agent_foo.sources.source_foo.interceptors.a.preserveExisting = false
+  agent_foo.sources.source_foo.interceptors.a.hostHeader = hostname
+  agent_foo.sources.source_foo.interceptors.b.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
+
+Note that the interceptor builders are passed to the type config parameter. The interceptors are themselves
+configurable and can be passed configuration values just like they are passed to any other configurable component.
+In the above example, events are passed to the HostInterceptor first and the events returned by the HostInterceptor
+are then passed along to the TimestampInterceptor.
+
+Timestamp Interceptor
+~~~~~~~~~~~~~~~~~~~~~
+
+This interceptor inserts into the event headers, the time in millis at which it processes the event. This interceptor
+inserts a header with key ``timestamp`` whose value is the relevant timestamp. This interceptor
+can preserve an existing timestamp if it is already present in the configuration.
+
+================  =======  ========================================================================
+Property Name     Default  Description
+================  =======  ========================================================================
+type              --       The component type name, has to be ``TIMESTAMP``
+preserveExisting  false    If the timestamp already exists, should it be preserved - true or false
+================  =======  ========================================================================
+
+Host Interceptor
+~~~~~~~~~~~~~~~~
+
+This interceptor inserts the hostname or IP address of the host that this agent is running on. It inserts a header
+with key ``host`` or a configured key whose value is the hostname or IP address of the host, based on configuration.
+
+================  =======  ========================================================================
+Property Name     Default  Description
+================  =======  ========================================================================
+type              --       The component type name, has to be ``HOST``
+preserveExisting  false    If the host header already exists, should it be preserved - true or false
+useIP             true     Use the IP Address if true, else use hostname.
+hostHeader        host     The header key to be used.
+================  =======  ========================================================================
+
+In the example above, the key used in the event headers is "hostname"
+
+
 Flume Properties
 ----------------
 

