Bob,

You may want to have a look at Apache NiFi.
http://ingest.tips/2014/12/22/getting-started-with-apache-nifi/

Regards,
Jeff

On Mon, Feb 2, 2015 at 3:49 PM, Bob Metelsky <[email protected]> wrote:

> Steve - I appreciate your time on this...
>
> Yes, I want to use flume to copy .xml or .whatever files from a server
> outside the cluster to hdfs. That server does have flume installed on it.
>
> I'd like the same behavior as "spooling directory" but from a remote
> machine --> to hdfs
>
> So, from all my reading, flume looks like it is completely designed for
> streaming "live" logs and program outputs...
>
> It doesn't seem to be known for being a filewatcher and grabbing files as
> they show up, then shipping and writing to hdfs
>
> Or can it?
>
> Ok, I can see fragmentation with individual "small" files, but doesn't
> "spool directory" behaviour face the same issue?
>
> I've done quite a bit of reading but one can easily get into the weeds :)
> - All I need to do is this simple task.
>
> Thanks
>
> On Mon, Feb 2, 2015 at 5:17 PM, Steve Morin <[email protected]> wrote:
>
>> So you want 1 to 1 replication of the logs to HDFS?
>>
>> As a footnote, people usually don't do this because the log files are
>> often too small (think fragmentation), which causes performance problems
>> when used on Hadoop
>>
>> On Feb 2, 2015, at 13:30, Bob Metelsky <[email protected]> wrote:
>>
>> Hi, I have a simple requirement
>>
>> on server1 (NOT in the cluster, but has flume installed)
>> I have a process that constantly generates xml files in a known directory
>>
>> I need to transfer them to server2 (IN the hadoop cluster)
>> and into hdfs as xml files
>>
>> from what I'm reading, avro, thrift rpc, et al. are designed for other
>> uses
>>
>> Is there a way to have flume just copy over plain files? txt, xml...
>> I'm thinking there should be but I can't find it
>>
>> The closest I see is the "spooling directory", but that seems to assume
>> the files are already inside the cluster.
>>
>> Can flume do this? Is there an example? I've read the flume documentation
>> and nothing is jumping out
>>
>> Thanks!
