Hello, I am new to Flume and I am trying it out by moving binary files between two agents.
- The first agent runs on machine A and uses a spooldir source and a Thrift sink.
- The second agent runs on machine B, which is part of a Hadoop cluster. It has a Thrift source and an HDFS sink.

I have two questions about this configuration:

- I know I have to use BlobDeserializer$Builder for the source on A, but what is the correct value for the maxBlobLength parameter? Should it be smaller or larger than the expected size of the binary file?
- In my tests I found that the transmitted file was corrupted on HDFS. I think this is caused by the HDFS sink, which uses TEXT as its default serializer (I assume it writes a \n character between one event and the next). How can I fix this?

Thank you very much in advance.

Best regards,
Riccardo
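
P.S. For reference, here is a minimal sketch of the two agent configurations I am working with. The spool directory, hostname, port, HDFS path and channel sizes are placeholders rather than my exact values, and I have left the HDFS sink's fileType/serializer at their defaults, which is where I suspect the corruption comes from.

# Agent A (machine A): spooldir source -> memory channel -> Thrift sink
agentA.sources = spool-src
agentA.channels = mem-ch
agentA.sinks = thrift-sink

agentA.sources.spool-src.type = spooldir
agentA.sources.spool-src.spoolDir = /data/spool
agentA.sources.spool-src.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
# This is the parameter I am unsure about; the value below is just a placeholder
agentA.sources.spool-src.deserializer.maxBlobLength = 100000000
agentA.sources.spool-src.channels = mem-ch

agentA.channels.mem-ch.type = memory
agentA.channels.mem-ch.capacity = 1000

agentA.sinks.thrift-sink.type = thrift
agentA.sinks.thrift-sink.hostname = machineB
agentA.sinks.thrift-sink.port = 4545
agentA.sinks.thrift-sink.channel = mem-ch

# Agent B (machine B, on the Hadoop cluster): Thrift source -> memory channel -> HDFS sink
agentB.sources = thrift-src
agentB.channels = mem-ch
agentB.sinks = hdfs-sink

agentB.sources.thrift-src.type = thrift
agentB.sources.thrift-src.bind = 0.0.0.0
agentB.sources.thrift-src.port = 4545
agentB.sources.thrift-src.channels = mem-ch

agentB.channels.mem-ch.type = memory
agentB.channels.mem-ch.capacity = 1000

agentB.sinks.hdfs-sink.type = hdfs
agentB.sinks.hdfs-sink.hdfs.path = hdfs://namenode/flume/blobs
# No hdfs.fileType or serializer set here, so the defaults apply
agentB.sinks.hdfs-sink.channel = mem-ch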
