[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242608#comment-15242608 ]
Alexandre Linte commented on OOZIE-2482:
----------------------------------------

Hi [~satishsaley], thank you for the reply. My bad, the argument "yarn-master" was a mistake; I corrected it by setting "yarn-cluster" in my job configuration. I also checked the comments on SPARK-10795. The following command succeeds:
{noformat}
[toto@client pysparkpi]$ spark-submit -v --master yarn-client ./pi.py 100
Using properties file: /opt/application/Spark/current/conf/spark-defaults.conf
Adding default property: spark.serializer=org.apache.spark.serializer.KryoSerializer
Adding default property: spark.executor.extraJavaOptions=-Djava.library.path=/opt/application/Hadoop/current/lib/native/
Adding default property: spark.broadcast.compress=true
Adding default property: spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec
Adding default property: spark.eventLog.enabled=true
Adding default property: spark.driver.maxResultSize=1200m
Adding default property: spark.io.compression.snappy.blockSize=32k
Adding default property: spark.kryoserializer.buffer.max=1500m
Adding default property: spark.sql.hive.metastore.jars=builtin
Adding default property: spark.driver.memory=2g
Adding default property: spark.executor.instances=4
Adding default property: spark.kryo.referenceTracking=false
Adding default property: spark.default.parallelism=10
Adding default property: spark.kryo.classesToRegister=org.apache.hadoop.hive.ql.io.HiveKey,org.apache.hadoop.io.BytesWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
Adding default property: spark.kryoserializer.buffer=100m
Adding default property: spark.master=yarn-client
Adding default property: spark.broadcast.blockSize=4096
Adding default property: spark.executor.memory=4g
Adding default property: spark.eventLog.dir=hdfs:///Products/SPARK/logs/
Adding default property: spark.eventLog.compress=true
Adding default property: spark.executor.cores=2
Adding default property: spark.yarn.scheduler.heartbeat.interval-ms=3000
Adding default property: spark.akka.frameSize=100
Adding default property: spark.sql.hive.metastore.version=1.2.1
Parsed arguments:
  master                  yarn-client
  deployMode              null
  executorMemory          4g
  executorCores           2
  totalExecutorCores      null
  propertiesFile          /opt/application/Spark/current/conf/spark-defaults.conf
  driverMemory            2g
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            4
  files                   null
  pyFiles                 null
  archives                null
  mainClass               null
  primaryResource         file:/home/toto/workspace/oozie/pyspark/pysparkpi/./pi.py
  name                    pi.py
  childArgs               [100]
  jars                    null
  packages                null
  packagesExclusions      null
  repositories            null
  verbose                 true
Spark properties used, including those specified through
 --conf and those from the properties file /opt/application/Spark/current/conf/spark-defaults.conf:
  spark.io.compression.codec -> org.apache.spark.io.SnappyCompressionCodec
  spark.default.parallelism -> 10
  spark.executor.memory -> 4g
  spark.driver.memory -> 2g
  spark.kryo.referenceTracking -> false
  spark.broadcast.blockSize -> 4096
  spark.executor.instances -> 4
  spark.eventLog.compress -> true
  spark.eventLog.enabled -> true
  spark.kryo.classesToRegister -> org.apache.hadoop.hive.ql.io.HiveKey,org.apache.hadoop.io.BytesWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
  spark.kryoserializer.buffer -> 100m
  spark.serializer -> org.apache.spark.serializer.KryoSerializer
  spark.executor.extraJavaOptions -> -Djava.library.path=/opt/application/Hadoop/current/lib/native/
  spark.akka.frameSize -> 100
  spark.yarn.scheduler.heartbeat.interval-ms -> 3000
  spark.sql.hive.metastore.version -> 1.2.1
  spark.kryoserializer.buffer.max -> 1500m
  spark.broadcast.compress -> true
  spark.eventLog.dir -> hdfs:///Products/SPARK/logs/
  spark.driver.maxResultSize -> 1200m
  spark.master -> yarn-client
  spark.io.compression.snappy.blockSize -> 32k
  spark.executor.cores -> 2
  spark.sql.hive.metastore.jars -> builtin
Main class:
org.apache.spark.deploy.PythonRunner
Arguments:
file:/home/toto/workspace/oozie/pyspark/pysparkpi/./pi.py
null
100
System properties:
spark.io.compression.codec -> org.apache.spark.io.SnappyCompressionCodec
spark.default.parallelism -> 10
spark.kryo.referenceTracking -> false
spark.driver.memory -> 2g
spark.executor.memory -> 4g
spark.broadcast.blockSize -> 4096
spark.executor.instances -> 4
spark.eventLog.compress -> true
spark.eventLog.enabled -> true
spark.kryo.classesToRegister -> org.apache.hadoop.hive.ql.io.HiveKey,org.apache.hadoop.io.BytesWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
SPARK_SUBMIT -> true
spark.kryoserializer.buffer -> 100m
spark.serializer -> org.apache.spark.serializer.KryoSerializer
spark.akka.frameSize -> 100
spark.executor.extraJavaOptions -> -Djava.library.path=/opt/application/Hadoop/current/lib/native/
spark.app.name -> pi.py
spark.yarn.scheduler.heartbeat.interval-ms -> 3000
spark.sql.hive.metastore.version -> 1.2.1
spark.submit.deployMode -> client
spark.kryoserializer.buffer.max -> 1500m
spark.broadcast.compress -> true
spark.eventLog.dir -> hdfs:///Products/SPARK/logs/
spark.driver.maxResultSize -> 1200m
spark.yarn.isPython -> true
spark.master -> yarn-client
spark.io.compression.snappy.blockSize -> 32k
spark.executor.cores -> 2
spark.sql.hive.metastore.jars -> builtin
Classpath elements:

Pi is roughly 3.142274
{noformat}
But when I run the job with Oozie, it still fails. For reference, the pi script I'm using is the following:
{noformat}
from __future__ import print_function
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import sys
import os
from random import random
from operator import add

from pyspark import SparkContext


if __name__ == "__main__":
    """
        Usage: pi [partitions]
    """
    os.environ["SPARK_HOME"] = "/opt/application/Spark/current"
    sc = SparkContext(appName="PythonPi")
    partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
    n = 100000 * partitions

    def f(_):
        x = random() * 2 - 1
        y = random() * 2 - 1
        return 1 if x ** 2 + y ** 2 < 1 else 0

    count = sc.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
    print("Pi is roughly %f" % (4.0 * count / n))

    sc.stop()
{noformat}

> Pyspark job fails with Oozie
> ----------------------------
>
>                 Key: OOZIE-2482
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2482
>             Project: Oozie
>          Issue Type: Bug
>          Components: core, workflow
>    Affects Versions: 4.2.0
>         Environment: Hadoop 2.7.2, Spark 1.6.0 on Yarn, Oozie 4.2.0
>                      Cluster secured with Kerberos
>            Reporter: Alexandre Linte
>            Assignee: Satish Subhashrao Saley
>
> Hello,
> I'm trying to run the pi.py example in a pyspark job with Oozie. Every try I made failed for the same reason: key not found: SPARK_HOME.
> Note: a Scala job works well in the same environment with Oozie.
> The logs on the executors are:
> {noformat}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/mnt/hd4/hadoop/yarn/local/filecache/145/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/mnt/hd2/hadoop/yarn/local/filecache/155/spark-assembly-1.6.0-hadoop2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/application/Hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> log4j:ERROR setFile(null,true) call failed.
> java.io.FileNotFoundException: /mnt/hd7/hadoop/yarn/log/application_1454673025841_13136/container_1454673025841_13136_01_000001 (Is a directory)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:142)
>         at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
>         at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
>         at org.apache.hadoop.yarn.ContainerLogAppender.activateOptions(ContainerLogAppender.java:55)
>         at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
>         at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
>         at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
>         at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809)
>         at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735)
>         at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615)
>         at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502)
>         at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547)
>         at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
>         at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
>         at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:64)
>         at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:285)
>         at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155)
>         at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132)
>         at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:275)
>         at org.apache.hadoop.service.AbstractService.<clinit>(AbstractService.java:43)
> Using properties file: null
> Parsed arguments:
>   master                  yarn-master
>   deployMode              cluster
>   executorMemory          null
>   executorCores           null
>   totalExecutorCores      null
>   propertiesFile          null
>   driverMemory            null
>   driverCores             null
>   driverExtraClassPath    null
>   driverExtraLibraryPath  null
>   driverExtraJavaOptions  null
>   supervise               false
>   queue                   null
>   numExecutors            null
>   files                   null
>   pyFiles                 null
>   archives                null
>   mainClass               null
>   primaryResource         hdfs://hadoopsandbox/User/toto/WORK/Oozie/pyspark/lib/pi.py
>   name                    Pysparkpi example
>   childArgs               [100]
>   jars                    null
>   packages                null
>   packagesExclusions      null
>   repositories            null
>   verbose                 true
> Spark properties used, including those specified through --conf and those from the properties file null:
>   spark.executorEnv.SPARK_HOME -> /opt/application/Spark/current
>   spark.executorEnv.PYTHONPATH -> /opt/application/Spark/current/python
>   spark.yarn.appMasterEnv.SPARK_HOME -> /opt/application/Spark/current
> Main class:
> org.apache.spark.deploy.yarn.Client
> Arguments:
> --name
> Pysparkpi example
> --primary-py-file
> hdfs://hadoopsandbox/User/toto/WORK/Oozie/pyspark/lib/pi.py
> --class
> org.apache.spark.deploy.PythonRunner
> --arg
> 100
> System properties:
> spark.executorEnv.SPARK_HOME -> /opt/application/Spark/current
> spark.executorEnv.PYTHONPATH -> /opt/application/Spark/current/python
> SPARK_SUBMIT -> true
> spark.app.name -> Pysparkpi example
> spark.submit.deployMode -> cluster
> spark.yarn.appMasterEnv.SPARK_HOME -> /opt/application/Spark/current
> spark.yarn.isPython -> true
> spark.master -> yarn-cluster
> Classpath elements:
> Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, key not found: SPARK_HOME
> java.util.NoSuchElementException: key not found: SPARK_HOME
>         at scala.collection.MapLike$class.default(MapLike.scala:228)
>         at scala.collection.AbstractMap.default(Map.scala:58)
>         at scala.collection.MapLike$class.apply(MapLike.scala:141)
>         at scala.collection.AbstractMap.apply(Map.scala:58)
>         at org.apache.spark.deploy.yarn.Client$$anonfun$findPySparkArchives$2.apply(Client.scala:1045)
>         at org.apache.spark.deploy.yarn.Client$$anonfun$findPySparkArchives$2.apply(Client.scala:1044)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.deploy.yarn.Client.findPySparkArchives(Client.scala:1044)
>         at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:717)
>         at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
>         at org.apache.spark.deploy.yarn.Client.run(Client.scala:1016)
>         at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1076)
>         at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>         at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>         at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:104)
>         at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:95)
>         at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
>         at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:38)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:236)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>         at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:380)
>         at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:301)
>         at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:187)
>         at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:230)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> {noformat}
> The workflow used for Oozie is the following:
> {noformat}
> <workflow-app xmlns='uri:oozie:workflow:0.5' name='PysparkPi-test'>
>     <start to='spark-node' />
>     <action name='spark-node'>
>         <spark xmlns="uri:oozie:spark-action:0.1">
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <master>${master}</master>
>             <mode>${mode}</mode>
>             <name>Pysparkpi example</name>
>             <class></class>
>             <jar>${nameNode}/User/toto/WORK/Oozie/pyspark/lib/pi.py</jar>
>             <spark-opts>--conf spark.yarn.appMasterEnv.SPARK_HOME=/opt/application/Spark/current --conf spark.executorEnv.SPARK_HOME=/opt/application/Spark/current --conf spark.executorEnv.PYTHONPATH=/opt/application/Spark/current/python</spark-opts>
>             <arg>100</arg>
>         </spark>
>         <ok to="end" />
>         <error to="fail" />
>     </action>
>     <kill name="fail">
>         <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>     </kill>
>     <end name='end' />
> </workflow-app>
> {noformat}
> I also created a JIRA for Spark: [https://issues.apache.org/jira/browse/SPARK-13679]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
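A note on the failure mode reported above: the stack trace points at Spark 1.6's Client.findPySparkArchives, which reads the environment of the JVM that submits the job (here, the Oozie launcher container). Settings such as spark.yarn.appMasterEnv.* and spark.executorEnv.* in <spark-opts> only affect the YARN containers, so they never reach that lookup. Below is a rough Python sketch of the lookup order as it appears from the trace and the SPARK-10795 discussion; the helper name and archive file names are illustrative, not Spark's actual code:

```python
import os


def find_pyspark_archives(env):
    """Sketch of the lookup Spark 1.6's Client.findPySparkArchives performs.

    PYSPARK_ARCHIVES_PATH, if present, wins; otherwise Spark falls back to
    SPARK_HOME. When neither is set -- as in the Oozie launcher JVM's
    environment -- the SPARK_HOME lookup fails, mirroring Scala's
    "NoSuchElementException: key not found: SPARK_HOME" seen above.
    """
    archives = env.get("PYSPARK_ARCHIVES_PATH")
    if archives is not None:
        return archives.split(",")
    # Raises KeyError here when SPARK_HOME is absent from the process env.
    lib_dir = os.path.join(env["SPARK_HOME"], "python", "lib")
    return [os.path.join(lib_dir, name)
            for name in ("pyspark.zip", "py4j-0.9-src.zip")]
```

Under this reading, a workaround has to place SPARK_HOME (or PYSPARK_ARCHIVES_PATH) into the launcher process's own environment, rather than only into the application master's or executors' container environments.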