Thanks Hari. Using a Spool Dir Source, I could have the remote Flume agents write events to files in a remote dir, run rsync locally to sync a local dir with that remote dir, and have the local Flume agent pick up events from the local dir.
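
A rough sketch of the two config fragments I have in mind is below. The agent names (remote, local), component names (c1, k1, r1), and paths are all made up, the channel definitions are left out, and I have not tested this. It assumes a File Roll Sink on the remote side and a Spooling Directory Source on the local side, with rsync pulling /var/flume/out on the remote host into /var/flume/in locally. The ignorePattern is there because, as far as I know, rsync writes dot-prefixed temporary files before renaming them, and the spooling source expects files to be complete and unchanged once it sees them.

```properties
# Remote (online) agent: write events to local files instead of sending Avro.
# All names and paths here are hypothetical.
remote.sinks.k1.type = file_roll
remote.sinks.k1.channel = c1
remote.sinks.k1.sink.directory = /var/flume/out
remote.sinks.k1.sink.rollInterval = 60

# Local agent: pick up files that rsync has copied into the local dir.
local.sources.r1.type = spooldir
local.sources.r1.channels = c1
local.sources.r1.spoolDir = /var/flume/in
# Skip dot-prefixed files, e.g. rsync's in-progress temporary files.
local.sources.r1.ignorePattern = ^[.].*$
```

One thing I would still have to handle is making sure rsync never copies the file the remote sink is currently writing to, since the local source would then see a file that changes later.
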
But this way I am breaking the flume pipeline with rsync in the middle. I don't know how this will affect flume features like reliability, scalability, etc.

-Majid

On Tuesday, December 2, 2014, Hari Shreedharan <[email protected]> wrote:

> Not sure how that would be possible. You could use a Spool Dir Source if
> you want to write the data to files and then read it from there.
>
> Thanks,
> Hari
>
>
> On Tue, Nov 25, 2014 at 11:00 AM, Majid Alfifi <[email protected]>
> wrote:
>
>> I have a typical flume pipeline that collects logs from online servers
>> and aggregate them and push them down to HDFS. The typical configuration is
>> to open a port on the local cluster so the online flume agent can send Avro
>> events to.
>>
>> Is it possible to have a flume agent on the local cluster basically
>> "pulling" events from the online agent without the need to open a local
>> port?
>>
>> Best Regards,
>> Majid
>>
