I want it to be available on all machines in the cluster.
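
For reference, here is a rough sketch of one way to do the per-worker copy without hitting the Path serialization issue: build the Path and FileSystem inside the task rather than in the driver, so the closure only captures plain strings. All paths below are placeholders, and Spark only guarantees one task per partition, not one per machine.

import java.io.File

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileUtil, Path}

// Placeholders: an HDFS source file and a local destination directory.
val srcOnHdfs = "/path/on/hdfs/somefile"
val localDir  = "/tmp/local-copy"

// One task per partition; constructing the Path and FileSystem inside the
// closure means nothing non-serializable is captured from the driver.
sc.parallelize(1 to 100, sc.defaultParallelism).foreachPartition { _ =>
  val conf = new Configuration()
  val src  = new Path(srcOnHdfs)
  val fs   = src.getFileSystem(conf)
  val dstDir = new File(localDir)
  dstDir.mkdirs()
  // FileUtil.copy(srcFS, src, dst: java.io.File, deleteSource, conf)
  FileUtil.copy(fs, src, new File(dstDir, src.getName), false, conf)
}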

On Tue, Feb 11, 2014 at 10:35 AM, Andrew Ash <and...@andrewash.com> wrote:

> Do you want the files scattered across the local temp directories of all
> your machines or just one of them?  If just one, I'd recommend having your
> driver program execute hadoop fs -getmerge /path/to/files...  using Scala's
> external process libraries.
>
>
> On Tue, Feb 11, 2014 at 9:18 AM, David Thomas <dt5434...@gmail.com> wrote:
>
>> I'm trying to copy a file from HDFS to a temp local directory within a
>> map function using a static method of FileUtil, and I get the error below.
>> Is there a way to get around this?
>>
>> org.apache.spark.SparkException: Job aborted: Task not serializable:
>> java.io.NotSerializableException: org.apache.hadoop.fs.Path
>>     at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
>>
>
>

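For reference, the getmerge-from-the-driver approach described in the quoted message could be expressed with scala.sys.process along these lines (paths are placeholders); note it only lands the merged file on the machine running the driver:

import scala.sys.process._

// Placeholders for the HDFS source and the local destination.
val hdfsFiles = "/path/to/files"
val localOut  = "/tmp/merged-output"

// `!` runs the command synchronously and returns its exit code.
val exitCode = Seq("hadoop", "fs", "-getmerge", hdfsFiles, localOut).!
if (exitCode != 0) sys.error(s"hadoop fs -getmerge exited with code $exitCode")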