You can add the classpath info in the Hadoop env file.

Add the following line to your $HADOOP_HOME/etc/hadoop/hadoop-env.sh:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_HOME/share/hadoop/tools/lib/*

Add the following line to $SPARK_HOME/conf/spark-env.sh:

export
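The export line for spark-env.sh is cut off above. A hypothetical completion, assuming the intent was the variable that "Hadoop free" Spark builds read to pick up an externally provided Hadoop classpath:

```shell
# Hypothetical completion of the truncated spark-env.sh line above.
# SPARK_DIST_CLASSPATH is an assumption on my part: it is the variable the
# "without Hadoop" Spark packages read to locate an external Hadoop install,
# here extended with the tools/lib jars that contain the s3a filesystem.
export SPARK_DIST_CLASSPATH="$(hadoop classpath):$HADOOP_HOME/share/hadoop/tools/lib/*"
```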
> On 15 Oct 2015, at 19:04, Scott Reynolds wrote:
Hmm, I tried using --jars, but that gets passed to MasterArguments, and that
doesn't work :-(
https://github.com/apache/spark/blob/branch-1.5/core/src/main/scala/org/apache/spark/deploy/master/MasterArguments.scala
Same with Worker:
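For what it's worth, --jars only ships jars for a submitted application; judging by MasterArguments, the standalone Master and Worker daemons take no such flag, so their classpath has to come from the environment before they start. A sketch under that assumption (SPARK_DIST_CLASSPATH, the tools/lib path, and master-host are all illustrative):

```shell
# Sketch: put the extra jars on the standalone daemons' classpath via
# spark-env.sh instead of --jars, then start the daemons so they inherit it.
# The tools/lib path and master-host below are illustrative.
echo 'export SPARK_DIST_CLASSPATH="$HADOOP_HOME/share/hadoop/tools/lib/*"' \
  >> "$SPARK_HOME/conf/spark-env.sh"
"$SPARK_HOME/sbin/start-master.sh"
"$SPARK_HOME/sbin/start-slave.sh" spark://master-host:7077
```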
You can use the Spark 1.5.1 build with no bundled Hadoop together with Hadoop
2.7.1. Hadoop 2.7.1 is more mature for s3a access. You also need to add the
Hadoop tools dir to the Hadoop classpath.
Raghav
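Once the tools dir is on the Hadoop classpath, s3a access can be checked from the Hadoop CLI before involving Spark at all. A sketch (the bucket name is hypothetical; with IAM instance credentials, s3a's credential chain should pick up the keys without any configuration):

```shell
# Sketch: verify s3a works at the Hadoop level first.
# my-bucket is a hypothetical bucket name; IAM instance-profile credentials
# are resolved automatically by s3a's credential provider chain.
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$HADOOP_HOME/share/hadoop/tools/lib/*"
hadoop fs -ls s3a://my-bucket/
```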
On Oct 16, 2015 1:09 AM, "Scott Reynolds" wrote:
> We do not use EMR. This is deployed on Amazon VMs
Are you using EMR?
You can install Hadoop-2.6.0 along with Spark-1.5.1 in your EMR cluster.
That brings the s3a jars to the worker nodes and makes them available to
your application.
On Thu, Oct 15, 2015 at 11:04 AM, Scott Reynolds
wrote:
We do not use EMR. This is deployed on Amazon VMs.
We build Spark with Hadoop-2.6.0, but that does not include the s3a
filesystem or the Amazon AWS SDK.
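One way around rebuilding Spark is to resolve those jars at submit time. A sketch (the versions, master URL, class, and jar names are assumptions; hadoop-aws 2.6.0 on Maven Central declares aws-java-sdk 1.7.4 as its matching SDK):

```shell
# Sketch: pull the s3a filesystem and the AWS SDK in at submit time via
# --packages instead of baking them into the Spark build.
# Versions are assumptions for a Hadoop 2.6 era setup; master-host,
# com.example.MyJob, and my-job-assembly.jar are hypothetical.
spark-submit \
  --master spark://master-host:7077 \
  --packages org.apache.hadoop:hadoop-aws:2.6.0,com.amazonaws:aws-java-sdk:1.7.4 \
  --class com.example.MyJob \
  my-job-assembly.jar
```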
On Thu, Oct 15, 2015 at 12:26 PM, Spark Newbie
wrote:
List,
Right now we build our spark jobs with the s3a hadoop client. We do this
because our machines are only allowed to use IAM access to the s3 store. We
can build our jars with the s3a filesystem and the aws sdk just fine and
these jars run great in *client mode*.
We would like to move from