Updated Branches:
  refs/heads/trunk 30da63bd6 -> 3f4298cd5

FLUME-1910. Add thrift RPC documentation.

(Hari Shreedharan via Mike Percy)


Project: http://git-wip-us.apache.org/repos/asf/flume/repo
Commit: http://git-wip-us.apache.org/repos/asf/flume/commit/3f4298cd
Tree: http://git-wip-us.apache.org/repos/asf/flume/tree/3f4298cd
Diff: http://git-wip-us.apache.org/repos/asf/flume/diff/3f4298cd

Branch: refs/heads/trunk
Commit: 3f4298cd5ffeb645c73585d0fc346ac9eb73deb7
Parents: 30da63b
Author: Mike Percy <[email protected]>
Authored: Wed Apr 3 12:57:39 2013 -0700
Committer: Mike Percy <[email protected]>
Committed: Wed Apr 3 12:58:22 2013 -0700

----------------------------------------------------------------------
 flume-ng-doc/sphinx/FlumeDeveloperGuide.rst |   46 +++++++-----
 flume-ng-doc/sphinx/FlumeUserGuide.rst      |   86 ++++++++++++++++++++--
 2 files changed, 107 insertions(+), 25 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flume/blob/3f4298cd/flume-ng-doc/sphinx/FlumeDeveloperGuide.rst
----------------------------------------------------------------------
diff --git a/flume-ng-doc/sphinx/FlumeDeveloperGuide.rst 
b/flume-ng-doc/sphinx/FlumeDeveloperGuide.rst
index 02b0088..71afa4e 100644
--- a/flume-ng-doc/sphinx/FlumeDeveloperGuide.rst
+++ b/flume-ng-doc/sphinx/FlumeDeveloperGuide.rst
@@ -160,15 +160,15 @@ by using a convenience implementation such as the 
SimpleEvent class, or by using
 ``EventBuilder``\ 's overloaded ``withBody()`` static helper methods.
 
 
-Avro RPC default client
-'''''''''''''''''''''''
+RPC clients - Avro and Thrift
+'''''''''''''''''''''''''''''
 
-As of Flume 1.1.0, Avro is the only supported RPC protocol.  The
-``NettyAvroRpcClient`` implements the ``RpcClient`` interface. The client needs
-to create this object with the host and port of the target Flume agent, and can
-then use the ``RpcClient`` to send data into the agent. The following example
-shows how to use the Flume Client SDK API within a user's data-generating
-application:
+As of Flume 1.4.0, Avro is the default RPC protocol.  The
+``NettyAvroRpcClient`` and ``ThriftRpcClient`` implement the ``RpcClient``
+interface. The client needs to create this object with the host and port of
+the target Flume agent, and canthen use the ``RpcClient`` to send data into
+the agent. The following example shows how to use the Flume Client SDK API
+within a user's data-generating application:
 
 .. code-block:: java
 
@@ -206,6 +206,8 @@ application:
       this.hostname = hostname;
       this.port = port;
       this.client = RpcClientFactory.getDefaultInstance(hostname, port);
+      // Use the following method to create a thrift client (instead of the 
above line):
+      // this.client = RpcClientFactory.getThriftInstance(hostname, port);
     }
 
     public void sendDataToFlume(String data) {
@@ -220,6 +222,8 @@ application:
         client.close();
         client = null;
         client = RpcClientFactory.getDefaultInstance(hostname, port);
+        // Use the following method to create a thrift client (instead of the 
above line):
+        // this.client = RpcClientFactory.getThriftInstance(hostname, port);
       }
     }
 
@@ -230,7 +234,8 @@ application:
 
   }
 
-The remote Flume agent needs to have an ``AvroSource`` listening on some port.
+The remote Flume agent needs to have an ``AvroSource`` (or a
+``ThriftSource`` if you are using a Thrift client) listening on some port.
 Below is an example Flume agent configuration that's waiting for a connection
 from MyApp:
 
@@ -244,18 +249,21 @@ from MyApp:
 
   a1.sources.r1.channels = c1
   a1.sources.r1.type = avro
+  # For using a thrift source set the following instead of the above line.
+  # a1.source.r1.type = thrift
   a1.sources.r1.bind = 0.0.0.0
   a1.sources.r1.port = 41414
 
   a1.sinks.k1.channel = c1
   a1.sinks.k1.type = logger
 
-For more flexibility, the default Flume client implementation
-(``NettyAvroRpcClient``) can be configured with these properties:
+For more flexibility, the default Flume client implementations
+(``NettyAvroRpcClient`` and ``ThriftRpcClient``) can be configured with these
+properties:
 
 .. code-block:: properties
 
-  client.type = default
+  client.type = default (for avro) or thrift (for thrift)
 
   hosts = h1                           # default client accepts only 1 host
                                        # (additional hosts will be ignored)
@@ -274,7 +282,8 @@ Failover Client
 
 This class wraps the default Avro RPC client to provide failover handling
 capability to clients. This takes a whitespace-separated list of <host>:<port>
-representing the Flume agents that make-up a failover group. If there’s a
+representing the Flume agents that make-up a failover group. The Failover RPC
+Client currently does not support thrift. If there’s a
 communication error with the currently selected host (i.e. agent) agent,
 then the failover client automatically fails-over to the next host in the list.
 For example:
@@ -306,7 +315,7 @@ For more flexibility, the failover Flume client 
implementation
 
   client.type = default_failover
 
-  hosts = h1 h2 h3                     # at least one is required, but 2 or 
+  hosts = h1 h2 h3                     # at least one is required, but 2 or
                                        # more makes better sense
 
   hosts.h1 = host1.example.org:41414
@@ -324,7 +333,7 @@ For more flexibility, the failover Flume client 
implementation
                                        # once to send the Event, and if it
                                        # fails then there will be no failover
                                        # to a second client, so this value
-                                       # causes the failover client to 
+                                       # causes the failover client to
                                        # degenerate into just a default client.
                                        # It makes sense to set this value to at
                                        # least the number of hosts that you
@@ -339,7 +348,7 @@ For more flexibility, the failover Flume client 
implementation
 LoadBalancing RPC client
 ''''''''''''''''''''''''
 
-The Flume Client SDK also supports an RpcClient which load-balances among 
+The Flume Client SDK also supports an RpcClient which load-balances among
 multiple hosts. This type of client takes a whitespace-separated list of
 <host>:<port> representing the Flume agents that make-up a load-balancing 
group.
 This client can be configured with a load balancing strategy that either
@@ -347,7 +356,8 @@ randomly selects one of the configured hosts, or selects a 
host in a round-robin
 fashion. You can also specify your own custom class that implements the
 ``LoadBalancingRpcClient$HostSelector`` interface so that a custom selection
 order is used. In that case, the FQCN of the custom class needs to be specified
-as the value of the ``host-selector`` property.
+as the value of the ``host-selector`` property. The LoadBalancing RPC Client
+currently does not support thrift.
 
 If ``backoff`` is enabled then the client will temporarily blacklist
 hosts that fail, causing them to be excluded from being selected as a failover
@@ -578,7 +588,7 @@ processing its own configuration settings. For example:
 
       // Process the myProp value (e.g. validation)
 
-      // Store myProp for later retrieval by process() method 
+      // Store myProp for later retrieval by process() method
       this.myProp = myProp;
     }
 

http://git-wip-us.apache.org/repos/asf/flume/blob/3f4298cd/flume-ng-doc/sphinx/FlumeUserGuide.rst
----------------------------------------------------------------------
diff --git a/flume-ng-doc/sphinx/FlumeUserGuide.rst 
b/flume-ng-doc/sphinx/FlumeUserGuide.rst
index 600a360..060c0ac 100644
--- a/flume-ng-doc/sphinx/FlumeUserGuide.rst
+++ b/flume-ng-doc/sphinx/FlumeUserGuide.rst
@@ -58,7 +58,10 @@ A Flume source consumes events delivered to it by an 
external source like a web
 server. The external source sends events to Flume in a format that is
 recognized by the target Flume source. For example, an Avro Flume source can be
 used to receive Avro events from Avro clients or other Flume agents in the flow
-that send events from an Avro sink. When a Flume source receives an event, it
+that send events from an Avro sink. A similar flow can be defined using
+a Thrift Flume Source to receive events from a Thrift Sink or a Flume
+Thrift Rpc Client or Thrift clients written in any language generated from
+the Flume thrift protocol.When a Flume source receives an event, it
 stores it into one or more channels. The channel is a passive store that keeps
 the event until it's consumed by a Flume sink. The file channel is one example
 -- it is backed by the local filesystem. The sink removes the event
@@ -292,6 +295,7 @@ Flume supports the following mechanisms to read data from 
popular log stream
 types, such as:
 
 #. Avro
+#. Thrift
 #. Syslog
 #. Netcat
 
@@ -319,9 +323,10 @@ dozen of agents that write to HDFS cluster.
    :alt: A fan-in flow using Avro RPC to consolidate events in one place
 
 This can be achieved in Flume by configuring a number of first tier agents with
-an avro sink, all pointing to an avro source of single agent. This source on
-the second tier agent consolidates the received events into a single channel
-which is consumed by a sink to its final destination.
+an avro sink, all pointing to an avro source of single agent (Again you could
+use the thrift sources/sinks/clients in such a scenario). This source
+on the second tier agent consolidates the received events into a single
+channel which is consumed by a sink to its final destination.
 
 Multiplexing the flow
 ---------------------
@@ -481,9 +486,9 @@ config to do that:
 Configuring a multi agent flow
 ------------------------------
 
-To setup a multi-tier flow, you need to have an avro sink of first hop pointing
-to avro source of the next hop. This will result in the first Flume agent
-forwarding events to the next Flume agent. For example, if you are
+To setup a multi-tier flow, you need to have an avro/thrift sink of first hop
+pointing to avro/thrift source of the next hop. This will result in the first
+Flume agent forwarding events to the next Flume agent. For example, if you are
 periodically sending files (1 file per event) using avro client to a local
 Flume agent, then this local agent can forward it to another agent that has the
 mounted for storage.
@@ -693,6 +698,39 @@ Example for agent named a1:
   a1.sources.r1.bind = 0.0.0.0
   a1.sources.r1.port = 4141
 
+Thrift Source
+~~~~~~~~~~~~~
+
+Listens on Thrift port and receives events from external Thrift client streams.
+When paired with the built-in ThriftSink on another (previous hop) Flume agent,
+it can create tiered collection topologies.
+Required properties are in **bold**.
+
+==================   ===========  
===================================================
+Property Name        Default      Description
+==================   ===========  
===================================================
+**channels**         --
+**type**             --           The component type name, needs to be 
``thrift``
+**bind**             --           hostname or IP address to listen on
+**port**             --           Port # to bind to
+threads              --           Maximum number of worker threads to spawn
+selector.type
+selector.*
+interceptors         --           Space separated list of interceptors
+interceptors.*
+==================   ===========  
===================================================
+
+Example for agent named a1:
+
+.. code-block:: properties
+
+  a1.sources = r1
+  a1.channels = c1
+  a1.sources.r1.type = thrift
+  a1.sources.r1.channels = c1
+  a1.sources.r1.bind = 0.0.0.0
+  a1.sources.r1.port = 4141
+
 Exec Source
 ~~~~~~~~~~~
 
@@ -762,6 +800,7 @@ allows the 'command' to use features from the shell such as 
wildcards, back tick
 loops, conditionals etc. In the absence of the 'shell' config, the 'command' 
will be
 invoked directly.  Common values for 'shell' :  '/bin/sh -c', '/bin/ksh -c',
 'cmd /c',  'powershell -Command', etc.
+
 .. code-block:: properties
   agent_foo.sources.tailsource-1.type = exec
   agent_foo.sources.tailsource-1.shell = /bin/bash -c
@@ -1479,6 +1518,39 @@ Example for agent named a1:
   a1.sinks.k1.hostname = 10.10.10.10
   a1.sinks.k1.port = 4545
 
+Thrift Sink
+~~~~~~~~~~~
+
+This sink forms one half of Flume's tiered collection support. Flume events
+sent to this sink are turned into Thrift events and sent to the configured
+hostname / port pair. The events are taken from the configured Channel in
+batches of the configured batch size.
+Required properties are in **bold**.
+
+==========================   =======  
==============================================
+Property Name                Default  Description
+==========================   =======  
==============================================
+**channel**                  --
+**type**                     --       The component type name, needs to be 
``thrift``.
+**hostname**                 --       The hostname or IP address to bind to.
+**port**                     --       The port # to listen on.
+batch-size                   100      number of event to batch together for 
send.
+connect-timeout              20000    Amount of time (ms) to allow for the 
first (handshake) request.
+request-timeout              20000    Amount of time (ms) to allow for 
requests after the first.
+connection-reset-interval    none     Amount of time (s) before the connection 
to the next hop is reset. This will force the Thrift Sink to reconnect to the 
next hop. This will allow the sink to connect to hosts behind a hardware 
load-balancer when news hosts are added without having to restart the agent.
+==========================   =======  
==============================================
+
+Example for agent named a1:
+
+.. code-block:: properties
+
+  a1.channels = c1
+  a1.sinks = k1
+  a1.sinks.k1.type = thrift
+  a1.sinks.k1.channel = c1
+  a1.sinks.k1.hostname = 10.10.10.10
+  a1.sinks.k1.port = 4545
+
 IRC Sink
 ~~~~~~~~
 

Reply via email to