luster.
From: Marcelo Vanzin <van...@cloudera.com>
To: "Varadhan, Jawahar" <varad...@yahoo.com>
Cc: "d...@spark.apache.org" <d...@spark.apache.org>
Sent: Friday, August 14, 2015 3:23 PM
Subject: Re
On Fri, Aug 14, 2015 at 2:11 PM, Varadhan, Jawahar <
varad...@yahoo.com.invalid> wrote:
> And hence, I was planning to use Spark Streaming with Kafka or Flume with
> Kafka. But Flume runs on a JVM and may not be the best option as the huge
> file will create memory issues. Please suggest some way t
To: "Varadhan, Jawahar"
Cc: "d...@spark.apache.org"
Sent: Friday, August 14, 2015 3:23 PM
Subject: Re: Setting up Spark/flume/? to Ingest 10TB from FTP
Why do you need to use Spark or Flume for this?
You can just use curl and hdfs:
curl ftp://blah | hdfs dfs -put - /bl
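
A slightly fuller sketch of the same curl-to-HDFS pipe, assuming bash with curl and the hdfs client on the PATH. The function name stream_to_hdfs and the URL and path in the usage comment are made up for illustration; they are not from the thread.

```shell
#!/usr/bin/env bash
# Fail the whole pipeline if either curl or hdfs fails.
set -euo pipefail

# Stream a remote file straight into HDFS without staging it on local disk.
#   $1: source URL (e.g. an ftp:// URL)
#   $2: destination path in HDFS
# Note: if the transfer dies midway, a partial file may be left at the
# destination; this sketch does not clean it up.
stream_to_hdfs() {
  local src_url="$1"
  local dest_path="$2"
  # "hdfs dfs -put -" reads the file contents from stdin.
  curl --fail --silent --show-error "$src_url" | hdfs dfs -put - "$dest_path"
}

# Hypothetical usage (placeholder host and paths):
#   stream_to_hdfs "ftp://ftp.example.com/big/file.dat" "/data/incoming/file.dat"
```

Since the data only passes through a pipe, memory use should stay small regardless of file size. For many files, the function could be called in a loop or in parallel, one URL per call.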