Thanks for the quick reply, Hari. When you say to send data to Flume using the RPC Client API, do you mean send it to the Avro Source? If not, which source? That is what I am currently trying to do. I wasn't sure whether encoding Avro data as byte[] and sending it to the Avro Source was a valid approach, but from what you are saying there is a way for sinks (at least the HDFS sink, via the AVRO_EVENT serializer) to recognize the encoded Avro data. I hope the Solr sink can be made similarly aware.
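For reference, here is roughly what I have in mind: a minimal sketch that serializes an Avro GenericRecord to a byte[] with the binary encoder and hands it to the agent's Avro Source through the RPC client API. The schema, record fields, host name, and port below are just placeholders for illustration, not my real setup.

import java.io.ByteArrayOutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.flume.Event;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;

public class AvroToFlumeSketch {
  public static void main(String[] args) throws Exception {
    // Placeholder schema; in practice this would be the schema derived from the JSON documents.
    Schema schema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Doc\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"string\"},"
      + "{\"name\":\"body\",\"type\":\"string\"}]}");

    GenericRecord record = new GenericData.Record(schema);
    record.put("id", "doc-1");
    record.put("body", "hello flume");

    // Encode the Avro record to a byte[] using the binary encoder.
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
    writer.write(record, encoder);
    encoder.flush();
    byte[] body = out.toByteArray();

    // Send the encoded bytes to the agent's Avro Source via the RPC client API.
    // "flume-host" and 41414 are hypothetical; use the agent's actual host and port.
    RpcClient client = RpcClientFactory.getDefaultInstance("flume-host", 41414);
    try {
      Event event = EventBuilder.withBody(body);
      client.append(event);
    } finally {
      client.close();
    }
  }
}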
Would encoding the Avro data as byte[] and sending it to Flume via the HTTP interface also work? I was actually having trouble converting an Avro object to a byte[] to start with, but I will try that again.

On Thu, Sep 18, 2014 at 10:16 AM, Hari Shreedharan <[email protected]> wrote:

> No, the Avro Source is an RPC source. To send data to Flume, use the RPC
> client API (https://flume.apache.org/FlumeDeveloperGuide.html#client).
> Just encode your Avro data as byte[] and use the AVRO_EVENT serializer
> while writing to HDFS.
>
> Thanks,
> Hari
>
>
> On Wed, Sep 17, 2014 at 5:13 PM, zzz <[email protected]> wrote:
>
>> I am using Cloudera CDH 5.1 and running a Flume agent configured by
>> Cloudera Manager.
>>
>> I would like to send Avro data to Flume, and I was assuming the Avro
>> Source would be the appropriate way to send data.
>>
>> However, the examples of Java clients that send data via the Avro Source
>> send simple strings, not Avro objects to be serialized, e.g. the example
>> here: https://flume.apache.org/FlumeDeveloperGuide.html
>>
>> And the examples of Avro serialization all seem to be about serializing
>> to disk.
>>
>> In my use case, I am receiving a real-time stream of JSON documents,
>> which I am able to convert to Avro objects, and I would like to put them
>> into Flume. I would then like to index this Avro data in Solr via the
>> Solr sink, and convert it to Parquet format in HDFS using the HDFS sink.
>>
>> Is this possible, or am I going about this the wrong way?
>>
>
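On the HDFS side, this is a rough sketch of what I understand the sink configuration would look like with the AVRO_EVENT serializer Hari mentioned above. The agent name, sink name, and HDFS path are hypothetical; they would need to be adapted to the Cloudera Manager configuration.

# Hypothetical agent/sink names and path; adjust to the actual agent config.
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.hdfs.path = /flume/avro-events
agent.sinks.hdfsSink.hdfs.fileType = DataStream
# Write each Flume event (including its byte[] body) into an Avro container file on HDFS.
agent.sinks.hdfsSink.serializer = avro_event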
