*I've installed an HDP cluster with HBase and Spark on YARN. As part of that installation I created some HDP (Ambari) managed clients. I installed PIO on one of these clients and configured PIO to use the HDP-installed Hadoop, HBase, and Spark. When I run the command 'pio eventserver &', I get the following error.*
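*For reference, the relevant part of my conf/pio-env.sh points PIO at the HDP-installed components roughly like this (an illustrative excerpt of my setup, not a verbatim copy; the metadata and model store settings are omitted):*

####
# conf/pio-env.sh (excerpt) -- paths follow the HDP 2.6.2.14-5 layout
SPARK_HOME=/usr/hdp/2.6.2.14-5/spark2
HADOOP_CONF_DIR=/etc/hadoop/conf
HBASE_CONF_DIR=/etc/hbase/conf

# Event data is kept in the HDP-managed HBase
PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=/usr/hdp/2.6.2.14-5/hbase
####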
####
/home/centos/PredictionIO-0.12.1/bin/semver.sh: line 89: [: 2.2.6.2.14-5: integer expression expected
/home/centos/PredictionIO-0.12.1/bin/semver.sh: line 93: [[: 2.2.6.2.14-5: syntax error: invalid arithmetic operator (error token is ".2.6.2.14-5")
/home/centos/PredictionIO-0.12.1/bin/semver.sh: line 97: [[: 2.2.6.2.14-5: syntax error: invalid arithmetic operator (error token is ".2.6.2.14-5")
You have Apache Spark 2.1.1.2.6.2.14-5 at /usr/hdp/2.6.2.14-5/spark2/ which does not meet the minimum version requirement of 1.3.0. Aborting.
####

*If I then go to /usr/hdp/2.6.2.14-5/spark2/ and replace the RELEASE file with an empty file, I can then start the Eventserver, which gives me the following messages:*

####
/usr/hdp/2.6.2.14-5/spark2/ contains an empty RELEASE file. This is a known problem with certain vendors (e.g. Cloudera). Please make sure you are using at least 1.3.0.
[INFO] [Management$] Creating Event Server at 0.0.0.0:7070
[WARN] [DomainSocketFactory] The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
[INFO] [HttpListener] Bound to /0.0.0.0:7070
[INFO] [EventServerActor] Bound received. EventServer is ready.
####

*I can then send events to the Eventserver.*
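*For reference, I'm sending events the way the SimilarProduct quickstart describes; a single hand-posted event looks roughly like this (the access key is a placeholder for the one reported by 'pio app new'):*

####
# Illustrative "view" event from the SimilarProduct example;
# $ACCESS_KEY stands in for my real app access key.
curl -i -X POST "http://localhost:7070/events.json?accessKey=$ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event": "view",
    "entityType": "user",
    "entityId": "u1",
    "targetEntityType": "item",
    "targetEntityId": "i1"
  }'
####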
*After sending the events listed in the SimilarProduct Recommender example, I am unable to train using the cluster. If I use 'pio train', it successfully trains locally. If I attempt to use the command "pio train -- --master yarn", then I get the following:*

####
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$setEnvFromInputString$1.apply(YarnSparkHadoopUtil.scala:154)
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$setEnvFromInputString$1.apply(YarnSparkHadoopUtil.scala:152)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$.setEnvFromInputString(YarnSparkHadoopUtil.scala:152)
        at org.apache.spark.deploy.yarn.Client$$anonfun$setupLaunchEnv$6.apply(Client.scala:819)
        at org.apache.spark.deploy.yarn.Client$$anonfun$setupLaunchEnv$6.apply(Client.scala:817)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.deploy.yarn.Client.setupLaunchEnv(Client.scala:817)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:911)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:172)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
        at org.apache.predictionio.workflow.WorkflowContext$.apply(WorkflowContext.scala:45)
        at org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:59)
        at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:251)
        at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
####

*What is the correct way to get PIO to use the YARN-based Spark for training?*

*Thanks,*

*--Cliff.*
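*P.S. For completeness, these are the two invocations I am comparing; as I understand it, everything after the '--' separator is handed through to spark-submit:*

####
# trains successfully, but only on the local machine
pio train

# attempts to submit the training job to YARN; fails with the
# ArrayIndexOutOfBoundsException shown above
pio train -- --master yarn
####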