[ https://issues.apache.org/jira/browse/SYSTEMML-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906386#comment-15906386 ]
Matthias Boehm commented on SYSTEMML-1276:
------------------------------------------

There is indeed room for improvement. Currently, we always analyze the YARN cluster for the number of nodes, vcores, etc., and subsequently apply corrections when in Spark execution mode. We should make these calls only if they are really needed.

> Resolve jersey class not found error with Spark2 and YARN
> ---------------------------------------------------------
>
>                 Key: SYSTEMML-1276
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1276
>             Project: SystemML
>          Issue Type: Improvement
>          Components: Runtime
>    Affects Versions: SystemML 0.13
>         Environment: Spark 2.x, Hadoop 2.7.3
>            Reporter: Glenn Weidner
>            Assignee: Glenn Weidner
>
> This is a known issue as reported in [YARN-5271] and [SPARK-15343]. It was
> observed during 0.13 performance testing and can be reproduced with the
> following example:
>
> spark-submit --master yarn --deploy-mode client \
>   --class org.apache.sysml.api.DMLScript \
>   ./systemml-0.13.0-incubating-SNAPSHOT.jar \
>   -f ./scripts/utils/sample.dml -exec hybrid_spark \
>   -nvargs X=linRegData.csv sv=perc.csv O=linRegDataParts ofmt=csv
>
> Exception in thread "main" java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig
> 	at org.apache.hadoop.yarn.client.api.TimelineClient.createTimelineClient(TimelineClient.java:55)
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createTimelineClient(YarnClientImpl.java:182)
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:169)
> 	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> 	at org.apache.hadoop.mapred.ResourceMgrDelegate.serviceInit(ResourceMgrDelegate.java:103)
> 	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> 	at org.apache.hadoop.mapred.ResourceMgrDelegate.<init>(ResourceMgrDelegate.java:97)
> 	at org.apache.hadoop.mapred.YARNRunner.<init>(YARNRunner.java:122)
> 	at org.apache.hadoop.mapred.YarnClientProtocolProvider.create(YarnClientProtocolProvider.java:34)
> 	at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:95)
> 	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
> 	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
> 	at org.apache.hadoop.mapred.JobClient.init(JobClient.java:475)
> 	at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:454)
> 	at org.apache.sysml.runtime.controlprogram.parfor.stat.InfrastructureAnalyzer.analyzeHadoopCluster(InfrastructureAnalyzer.java:472)
> 	at org.apache.sysml.runtime.controlprogram.parfor.stat.InfrastructureAnalyzer.getRemoteParallelMapTasks(InfrastructureAnalyzer.java:114)
> 	at org.apache.sysml.runtime.controlprogram.parfor.stat.InfrastructureAnalyzer.getCkMaxMR(InfrastructureAnalyzer.java:298)
> 	at org.apache.sysml.runtime.controlprogram.parfor.opt.OptimizationWrapper.optimize(OptimizationWrapper.java:168)
> 	at org.apache.sysml.runtime.controlprogram.ParForProgramBlock.execute(ParForProgramBlock.java:550)
> 	at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:145)
> 	at org.apache.sysml.api.DMLScript.execute(DMLScript.java:674)
> 	at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:354)
> 	at org.apache.sysml.api.DMLScript.main(DMLScript.java:199)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
> 	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
> 	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.ClassNotFoundException: com.sun.jersey.api.client.config.ClientConfig
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> 	... 32 more

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
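The comment above suggests making the cluster-analysis calls only when really needed. A minimal way to sketch that idea is lazy initialization: the JobClient path that trips the missing jersey class is only entered on first actual use of a cluster value, so a pure Spark execution never reaches it. This is a hypothetical illustration, not the actual SystemML code; the class name, the value 8, and the stand-in method body are all invented for the example.

```java
// Hypothetical sketch of the "analyze only if really needed" suggestion.
// In the real InfrastructureAnalyzer, analyzeHadoopCluster() constructs a
// JobClient, which is the call that fails with NoClassDefFoundError when
// the jersey classes are absent; deferring it avoids the call entirely
// for executions that never ask for MapReduce cluster characteristics.
public class LazyClusterAnalyzer {

    private static volatile boolean analyzed = false;
    private static int remoteParallelMapTasks = -1;

    // Stand-in for the expensive cluster probe (would query YARN/MapReduce).
    private static synchronized void analyzeHadoopCluster() {
        if (analyzed)
            return; // another thread already paid the cost
        remoteParallelMapTasks = 8; // illustrative value, would come from the cluster
        analyzed = true;
    }

    // Callers trigger the analysis only on first real use of the value.
    public static int getRemoteParallelMapTasks() {
        if (!analyzed)
            analyzeHadoopCluster();
        return remoteParallelMapTasks;
    }

    public static void main(String[] args) {
        // No cluster call has happened yet; it fires here, on first request.
        System.out.println(getRemoteParallelMapTasks());
    }
}
```

The double check (unsynchronized read, synchronized probe) keeps the common path cheap after the first call while still guaranteeing the probe runs at most once.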