Re: Cluster mode dependent jars not working

2015-12-17 Thread vimal dinakaran
--driver-class-path needs to include the required jars as well. This is not
mentioned in the Spark documentation.
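
For reference, a minimal sketch of the same submit command with the driver
class path added (paths are illustrative; --driver-class-path takes a
colon-separated classpath, unlike the comma-separated --jars list, and the
entries must exist at those paths on whichever node launches the driver):

dse spark-submit -v --conf
"spark.serializer=org.apache.spark.serializer.KryoSerializer" \
 --executor-memory 512M \
 --total-executor-cores 3 \
 --deploy-mode "cluster" \
 --master spark://$MASTER:7077 \
 --jars=$JARS \
 --driver-class-path "$JARS_HOME/spark-streaming-kafka_2.10-1.4.1.jar:$JARS_HOME/kafka_2.10-0.8.2.1.jar:$JARS_HOME/kafka-clients-0.8.2.1.jar" \
 --supervise \
 --class "com.testclass" $APP_JAR input.json \
 --files "/home/test/input.json"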

On Tue, Dec 15, 2015 at 9:13 PM, Ted Yu  wrote:

> Please use --conf spark.executor.extraClassPath=XXX to specify dependent
> jars.
>
> On Tue, Dec 15, 2015 at 3:57 AM, vimal dinakaran 
> wrote:
>
>> I am running Spark in cluster deploy mode. Below is the
>> command:
>>
>>
>> JARS=$JARS_HOME/amqp-client-3.5.3.jar,$JARS_HOME/nscala-time_2.10-2.0.0.jar,\
>> $JARS_HOME/kafka_2.10-0.8.2.1.jar,$JARS_HOME/kafka-clients-0.8.2.1.jar,\
>> $JARS_HOME/spark-streaming-kafka_2.10-1.4.1.jar,\
>> $JARS_HOME/zkclient-0.3.jar,$JARS_HOME/protobuf-java-2.4.0a.jar
>>
>> dse spark-submit -v --conf
>> "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
>>  --executor-memory 512M \
>>  --total-executor-cores 3 \
>>  --deploy-mode "cluster" \
>>  --master spark://$MASTER:7077 \
>>  --jars=$JARS \
>>  --supervise \
>>  --class "com.testclass" $APP_JAR  input.json \
>>  --files "/home/test/input.json"
>>
>> The above command works fine in client mode, but when I use it in
>> cluster mode I get a class-not-found exception:
>>
>> Exception in thread "main" java.lang.reflect.InvocationTargetException
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at
>> org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
>> at
>> org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
>> Caused by: java.lang.NoClassDefFoundError:
>> org/apache/spark/streaming/kafka/KafkaUtils$
>>
>> In client mode the dependent jars get copied to the
>> /var/lib/spark/work directory, whereas in cluster mode they do not.
>> I am using NFS and have mounted the same directory on all the Spark
>> nodes under the same path. Still I get the error.
>>
>> From the verbose logs of dse spark-submit, I see the classpath elements
>> are missing here:
>>
>> --
>> spark.hadoop.cassandra.input.native.ssl.trust.store.password -> cassandra
>> spark.cassandra.connection.ssl.trustStore.password -> cassandra
>> spark.ssl.keyStorePassword -> cassandra
>> spark.cassandra.auth.username -> cassandra
>> spark.hadoop.fs.har.impl -> org.apache.hadoop.fs.HarFileSystem
>> Classpath elements:
>>
>>
>> WARN  2015-12-15 17:08:48 org.apache.spark.util.Utils: Your hostname,
>> demeter-dev-node2 resolves to a loopback address: 127.0.1.1; using
>> 10.29.23.170 instead (on interface eth0)
>> WARN  2015-12-15 17:08:48 org.apache.spark.util.Utils: Set SPARK_LOCAL_IP
>> if you need to bind to another address
>>
>> How is it able to pick up the application jar, which is in the same
>> directory, but not the dependent jars?
>> Please help me get this solved.
>>
>>
>


Cluster mode dependent jars not working

2015-12-15 Thread vimal dinakaran
I am running Spark in cluster deploy mode. Below is the command:

JARS=$JARS_HOME/amqp-client-3.5.3.jar,$JARS_HOME/nscala-time_2.10-2.0.0.jar,\
$JARS_HOME/kafka_2.10-0.8.2.1.jar,$JARS_HOME/kafka-clients-0.8.2.1.jar,\
$JARS_HOME/spark-streaming-kafka_2.10-1.4.1.jar,\
$JARS_HOME/zkclient-0.3.jar,$JARS_HOME/protobuf-java-2.4.0a.jar

dse spark-submit -v --conf
"spark.serializer=org.apache.spark.serializer.KryoSerializer" \
 --executor-memory 512M \
 --total-executor-cores 3 \
 --deploy-mode "cluster" \
 --master spark://$MASTER:7077 \
 --jars=$JARS \
 --supervise \
 --class "com.testclass" $APP_JAR  input.json \
 --files "/home/test/input.json"

The above command works fine in client mode, but when I use it in
cluster mode I get a class-not-found exception:

Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
at
org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.NoClassDefFoundError:
org/apache/spark/streaming/kafka/KafkaUtils$

In client mode the dependent jars get copied to the
/var/lib/spark/work directory, whereas in cluster mode they do not.
I am using NFS and have mounted the same directory on all the Spark nodes
under the same path. Still I get the error.

From the verbose logs of dse spark-submit, I see the classpath elements are
missing here:
--
spark.hadoop.cassandra.input.native.ssl.trust.store.password -> cassandra
spark.cassandra.connection.ssl.trustStore.password -> cassandra
spark.ssl.keyStorePassword -> cassandra
spark.cassandra.auth.username -> cassandra
spark.hadoop.fs.har.impl -> org.apache.hadoop.fs.HarFileSystem
Classpath elements:


WARN  2015-12-15 17:08:48 org.apache.spark.util.Utils: Your hostname,
demeter-dev-node2 resolves to a loopback address: 127.0.1.1; using
10.29.23.170 instead (on interface eth0)
WARN  2015-12-15 17:08:48 org.apache.spark.util.Utils: Set SPARK_LOCAL_IP
if you need to bind to another address

How is it able to pick up the application jar, which is in the same
directory, but not the dependent jars?
Please help me get this solved.


Re: Cluster mode dependent jars not working

2015-12-15 Thread Ted Yu
Please use --conf spark.executor.extraClassPath=XXX to specify dependent
jars.
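
For example, a sketch of how that could look for the command below (paths
are illustrative; the extraClassPath values are colon-separated local paths
that must exist on every node, and in cluster mode the driver-side
equivalent, spark.driver.extraClassPath, is usually needed as well because
the NoClassDefFoundError is thrown while the driver itself is starting):

dse spark-submit -v \
 --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
 --conf "spark.executor.extraClassPath=$JARS_HOME/spark-streaming-kafka_2.10-1.4.1.jar:$JARS_HOME/kafka_2.10-0.8.2.1.jar" \
 --conf "spark.driver.extraClassPath=$JARS_HOME/spark-streaming-kafka_2.10-1.4.1.jar:$JARS_HOME/kafka_2.10-0.8.2.1.jar" \
 --deploy-mode "cluster" --master spark://$MASTER:7077 \
 --jars=$JARS \
 --class "com.testclass" $APP_JAR input.json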

On Tue, Dec 15, 2015 at 3:57 AM, vimal dinakaran 
wrote:

> I am running Spark in cluster deploy mode. Below is the command:
>
>
> JARS=$JARS_HOME/amqp-client-3.5.3.jar,$JARS_HOME/nscala-time_2.10-2.0.0.jar,\
> $JARS_HOME/kafka_2.10-0.8.2.1.jar,$JARS_HOME/kafka-clients-0.8.2.1.jar,\
> $JARS_HOME/spark-streaming-kafka_2.10-1.4.1.jar,\
> $JARS_HOME/zkclient-0.3.jar,$JARS_HOME/protobuf-java-2.4.0a.jar
>
> dse spark-submit -v --conf
> "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
>  --executor-memory 512M \
>  --total-executor-cores 3 \
>  --deploy-mode "cluster" \
>  --master spark://$MASTER:7077 \
>  --jars=$JARS \
>  --supervise \
>  --class "com.testclass" $APP_JAR  input.json \
>  --files "/home/test/input.json"
>
> The above command works fine in client mode, but when I use it in
> cluster mode I get a class-not-found exception:
>
> Exception in thread "main" java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
> at
> org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> Caused by: java.lang.NoClassDefFoundError:
> org/apache/spark/streaming/kafka/KafkaUtils$
>
> In client mode the dependent jars get copied to the
> /var/lib/spark/work directory, whereas in cluster mode they do not.
> I am using NFS and have mounted the same directory on all the Spark
> nodes under the same path. Still I get the error.
>
> From the verbose logs of dse spark-submit, I see the classpath elements
> are missing here:
>
> --
> spark.hadoop.cassandra.input.native.ssl.trust.store.password -> cassandra
> spark.cassandra.connection.ssl.trustStore.password -> cassandra
> spark.ssl.keyStorePassword -> cassandra
> spark.cassandra.auth.username -> cassandra
> spark.hadoop.fs.har.impl -> org.apache.hadoop.fs.HarFileSystem
> Classpath elements:
>
>
> WARN  2015-12-15 17:08:48 org.apache.spark.util.Utils: Your hostname,
> demeter-dev-node2 resolves to a loopback address: 127.0.1.1; using
> 10.29.23.170 instead (on interface eth0)
> WARN  2015-12-15 17:08:48 org.apache.spark.util.Utils: Set SPARK_LOCAL_IP
> if you need to bind to another address
>
> How is it able to pick up the application jar, which is in the same
> directory, but not the dependent jars?
> Please help me get this solved.
>
>


Re: --jars not working?

2015-06-13 Thread Alex Nakos
Are you using a build for Scala 2.11? I’ve encountered the same behaviour
trying to run on YARN with Scala 2.11 and Spark 1.3.0, 1.3.1 and 1.4.0.RC3,
and raised a JIRA issue here: https://issues.apache.org/jira/browse/SPARK-7944.
Would be good to know if this is identical to what you’re seeing on Mesos.
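
In case it helps narrow this down, a quick way to check which Scala version
a Spark build targets (a sketch; the exact banner wording may differ
slightly between releases):

echo ':quit' | spark-shell 2>&1 | grep -i 'scala version'
# the spark-shell welcome banner prints a line like
# "Using Scala version 2.11.x (...)" for a 2.11 build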

Thanks
Alex

On Fri, Jun 12, 2015 at 8:45 PM, Akhil Das ak...@sigmoidanalytics.com
wrote:

 You can verify whether the jars were shipped properly by looking at the driver
 UI (running on port 4040), under the Environment tab.

 Thanks
 Best Regards

 On Sat, Jun 13, 2015 at 12:43 AM, Jonathan Coveney jcove...@gmail.com
 wrote:

 Spark version is 1.3.0 (will upgrade as soon as we upgrade past Mesos
 0.19.0)...

 Regardless, I'm running into a really weird situation where, when I pass
 --jars to bin/spark-shell, I can't reference those classes in the REPL. Is
 this expected? The logs even tell me that my jars have been added, and yet
 the classes inside of them are not available.

 Am I missing something obvious?





--jars not working?

2015-06-12 Thread Jonathan Coveney
Spark version is 1.3.0 (will upgrade as soon as we upgrade past Mesos
0.19.0)...

Regardless, I'm running into a really weird situation where, when I pass
--jars to bin/spark-shell, I can't reference those classes in the REPL. Is
this expected? The logs even tell me that my jars have been added, and yet
the classes inside of them are not available.

Am I missing something obvious?
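
For what it's worth, a minimal way to reproduce the check (a sketch using a
hypothetical jar and class name):

bin/spark-shell --jars /tmp/my-dep.jar
# then, at the scala> prompt, try:
#   Class.forName("com.example.util.Helper")
# it throws ClassNotFoundException whenever the jar is registered in
# spark.jars but not actually on the REPL classpath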


Re: --jars not working?

2015-06-12 Thread Akhil Das
You can verify whether the jars were shipped properly by looking at the driver UI
(running on port 4040), under the Environment tab.
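
As a rough command-line alternative (a sketch; the jar path is hypothetical),
the jars registered with the context are also recorded in the spark.jars
property:

echo 'println(sc.getConf.getOption("spark.jars"))' | \
  bin/spark-shell --jars /tmp/my-dep.jar 2>&1 | grep my-dep
# should print something like Some(file:/tmp/my-dep.jar) if the jar was
# shipped with the application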

Thanks
Best Regards

On Sat, Jun 13, 2015 at 12:43 AM, Jonathan Coveney jcove...@gmail.com
wrote:

 Spark version is 1.3.0 (will upgrade as soon as we upgrade past Mesos
 0.19.0)...

 Regardless, I'm running into a really weird situation where, when I pass
 --jars to bin/spark-shell, I can't reference those classes in the REPL. Is
 this expected? The logs even tell me that my jars have been added, and yet
 the classes inside of them are not available.

 Am I missing something obvious?