I am using Cloudera CDH 5.1 and running a Flume agent configured by Cloudera manager.
I would like to send Avro data to Flume, and I was assuming the Avro Source would be the appropriate way to do this. However, the examples of Java clients that send data via the Avro Source send simple strings, not Avro objects to be serialized — for example, the client in the Flume Developer Guide: https://flume.apache.org/FlumeDeveloperGuide.html. Conversely, the Avro serialization examples I have found all seem to involve serializing to disk.

In my use case, I receive a real-time stream of JSON documents, which I am able to convert to Avro objects, and I would like to put them into Flume. I would then like to index this Avro data in Solr via the Solr sink, and convert it to Parquet format in HDFS using the HDFS sink.

Is this possible, or am I going about this the wrong way?
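To make the intended flow concrete, here is a minimal sketch of what I have in mind on the client side: serialize an Avro record to bytes in memory (rather than to disk) and ship those bytes as a Flume event body through the RPC client. The schema, host, and port are placeholders for illustration, not my real configuration.

```java
import java.io.ByteArrayOutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.flume.Event;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;

public class AvroFlumeClient {
    public static void main(String[] args) throws Exception {
        // Toy schema for illustration only; my real records come from JSON.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Doc\",\"fields\":"
          + "[{\"name\":\"id\",\"type\":\"string\"}]}");

        GenericRecord record = new GenericData.Record(schema);
        record.put("id", "doc-1");

        // Serialize the record to Avro binary in memory, not to a file.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
        encoder.flush();

        // Send the serialized bytes as the event body to the agent's
        // Avro source. Host and port are placeholders; they would need
        // to match the agent configuration in Cloudera Manager.
        RpcClient client = RpcClientFactory.getDefaultInstance("localhost", 41414);
        try {
            Event event = EventBuilder.withBody(out.toByteArray());
            client.append(event);
        } finally {
            client.close();
        }
    }
}
```

My uncertainty is whether downstream sinks (MorphlineSolrSink, HDFS sink with a Parquet serializer) can interpret an event body that is raw Avro binary like this, or whether they expect something else.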
