I want it to be available on all machines in the cluster.
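One way to get a single HDFS file onto every worker is SparkContext.addFile together with SparkFiles.get; below is a minimal sketch, assuming a live SparkContext named sc and a purely hypothetical file path:

    import org.apache.spark.SparkFiles

    // Ship the HDFS file to every node once; Spark copies it into each
    // executor's work directory.
    sc.addFile("hdfs://namenode/path/to/lookup.txt")   // hypothetical path

    val results = rdd.map { record =>
      // Resolve the node-local copy of the shipped file inside the task.
      val localPath = SparkFiles.get("lookup.txt")
      // ... read localPath as needed ...
      record
    }

Each task then reads the node-local copy rather than reaching back to HDFS itself.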
On Tue, Feb 11, 2014 at 10:35 AM, Andrew Ash <and...@andrewash.com> wrote:

> Do you want the files scattered across the local temp directories of all
> your machines or just one of them? If just one, I'd recommend having your
> driver program execute hadoop fs -getmerge /path/to/files... using Scala's
> external process libraries.
>
>
> On Tue, Feb 11, 2014 at 9:18 AM, David Thomas <dt5434...@gmail.com> wrote:
>
>> I'm trying to copy a file from hdfs to a temp local directory within a
>> map function using a static method of FileUtil and I get the below error.
>> Is there a way to get around this?
>>
>> org.apache.spark.SparkException: Job aborted: Task not serializable:
>> java.io.NotSerializableException: org.apache.hadoop.fs.Path
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
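As a side note on the quoted exception: it typically comes from a Hadoop Path (or FileSystem) created in the driver being captured by the map closure. A minimal sketch of one way to avoid that, with the Hadoop objects constructed inside the task and all paths hypothetical:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

    rdd.foreachPartition { records =>
      // Build Configuration, FileSystem and Path inside the task, so nothing
      // non-serializable is captured from the driver closure.
      val conf = new Configuration()
      val srcFs = FileSystem.get(conf)
      val localFs = FileSystem.getLocal(conf)
      val src = new Path("/path/to/files/part-00000")          // hypothetical
      val dst = new Path("/tmp/spark-local-copy/part-00000")   // hypothetical
      FileUtil.copy(srcFs, src, localFs, dst, false, conf)
      records.foreach { _ => /* use the local copy here */ }
    }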