[ https://issues.apache.org/jira/browse/SPARK-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15566019#comment-15566019 ]

Jo Desmet commented on SPARK-15343:
-----------------------------------

By design we apparently have a very tight coupling between the scheduling and
the execution environment. Unless this gets decoupled further behind a
remote-call API, the classpath is as much YARN's as it is Spark's; isn't that
what Jersey is about? What I advocate is a more conservative approach: keep
new versions of Spark (2.0 and beyond) compatible with the Hadoop YARN
versions in mainstream use, until YARN environments with the more recent
libraries are widely adopted, or until other fixes or features land on either
the Spark or the YARN side. To me, the desire for a newer Jersey library is
simply not worth that much compared to this compatibility.
I am no YARN fan, but it feels like we are breaking the bond with YARN just
because we think that side is not moving fast enough.
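
For reference, the workarounds that have circulated for this class of error
are either to keep the YARN timeline client from being created at all, or to
put the Jersey 1.x jars back on the driver classpath. This is only a sketch,
not verified against the reporter's exact setup; the flags are standard Spark
2.0 options, but the jar paths and versions below are placeholders to adapt
to whatever your Hadoop/HDP installation ships:

{code}
# Option 1: disable the YARN Application Timeline client so YarnClientImpl
# never needs the Jersey 1.x classes. spark.hadoop.* entries are copied into
# the Hadoop Configuration used by Spark's YARN client.
bin/pyspark --master yarn --deploy-mode client \
  --conf spark.hadoop.yarn.timeline-service.enabled=false

# Option 2: add the Jersey 1.x client jars (as shipped with the Hadoop/HDP
# distribution) to the driver classpath. Paths and versions are placeholders.
bin/pyspark --master yarn --deploy-mode client \
  --conf spark.driver.extraClassPath=/path/to/jersey-core-1.9.jar:/path/to/jersey-client-1.9.jar
{code}

Either way this only papers over the classpath mismatch; the real question in
this ticket is which Hadoop/YARN versions Spark 2.x should stay compatible
with.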


> NoClassDefFoundError when initializing Spark with YARN
> ------------------------------------------------------
>
>                 Key: SPARK-15343
>                 URL: https://issues.apache.org/jira/browse/SPARK-15343
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 2.0.0
>            Reporter: Maciej BryƄski
>            Priority: Critical
>
> I'm trying to connect Spark 2.0 (compiled from branch-2.0) with Hadoop.
> Spark compiled with:
> {code}
> ./dev/make-distribution.sh -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0 -DskipTests
> {code}
> I'm getting the following error:
> {code}
> mbrynski@jupyter:~/spark$ bin/pyspark
> Python 3.4.0 (default, Apr 11 2014, 13:05:11)
> [GCC 4.8.2] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> Warning: Master yarn-client is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel).
> 16/05/16 11:54:41 WARN SparkConf: The configuration key 'spark.yarn.jar' has been deprecated as of Spark 2.0 and may be removed in the future. Please use the new key 'spark.yarn.jars' instead.
> 16/05/16 11:54:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 16/05/16 11:54:42 WARN AbstractHandler: No Server set for org.spark_project.jetty.server.handler.ErrorHandler@f7989f6
> 16/05/16 11:54:43 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
> Traceback (most recent call last):
>   File "/home/mbrynski/spark/python/pyspark/shell.py", line 38, in <module>
>     sc = SparkContext()
>   File "/home/mbrynski/spark/python/pyspark/context.py", line 115, in __init__
>     conf, jsc, profiler_cls)
>   File "/home/mbrynski/spark/python/pyspark/context.py", line 172, in _do_init
>     self._jsc = jsc or self._initialize_context(self._conf._jconf)
>   File "/home/mbrynski/spark/python/pyspark/context.py", line 235, in _initialize_context
>     return self._jvm.JavaSparkContext(jconf)
>   File "/home/mbrynski/spark/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", line 1183, in __call__
>   File "/home/mbrynski/spark/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py", line 312, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
> : java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig
>         at org.apache.hadoop.yarn.client.api.TimelineClient.createTimelineClient(TimelineClient.java:45)
>         at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:163)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:150)
>         at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
>         at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:148)
>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:502)
>         at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>         at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:240)
>         at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>         at py4j.Gateway.invoke(Gateway.java:236)
>         at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
>         at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
>         at py4j.GatewayConnection.run(GatewayConnection.java:211)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: com.sun.jersey.api.client.config.ClientConfig
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         ... 19 more
> {code}
> On 1.6 everything works fine. I'm using HDP 2.2 (Hadoop 2.6.0).
> I have the HADOOP_CONF_DIR and SPARK_HOME environment variables set.


