I'm trying to understand the correct way to use the Flume RpcClient in a 
multithreaded application. Information I have found so far indicates that the 
components are thread safe, but the example in the Flume documentation clouds 
the issue when it comes to error handling. This code:
public void sendDataToFlume(String data) {
    // Create a Flume Event object that encapsulates the sample data
    Event event = EventBuilder.withBody(data, Charset.forName("UTF-8"));

    // Send the event
    try {
      client.append(event);
    } catch (EventDeliveryException e) {
      // clean up and recreate the client
      client.close();
      client = null;
      client = RpcClientFactory.getDefaultInstance(hostname, port);
      // Use the following method to create a thrift client (instead of the 
above line):
      // this.client = RpcClientFactory.getThriftInstance(hostname, port);
    }
  }
If more than one thread calls this method, and the exception is thrown, then 
there will be a problem as multiple threads try to recreate the client in the 
exception handler.
Is the intent of the SDK that it should only be used by a single thread? Should 
this method be synchronized, as it appears to be in the log4jappender that is 
part of the Flume source? Should I put this code in its own worker and pass it 
events via a queue?
Does anyone have an example of RpcClient being used by more then one thread 
(included the error condition)?
Would I be better off using the "embedded agent"? Is that multithread friendly?
TIA,
--skatz

Reply via email to