[ https://issues.apache.org/jira/browse/SPARK-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201364#comment-16201364 ]
ASF GitHub Bot commented on SPARK-2243:
---------------------------------------

Github user 561152 commented on the issue:

https://github.com/apache/incubator-predictionio/pull/441

@mars thank you. I think this is a bug. I did the following:

Test one:
Versions tested: 0.12 and master
Template: Recommendation
Local environment: HDP, Spark 2.1, Hadoop 2.7.3
pio batchpredict fails with:

Exception in thread "main" org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. The currently running SparkContext was created at: ...

Test two:
Modified core/src/main/scala/org/apache/predictionio/workflow/WorkflowContext.scala, changing

    val conf = new SparkConf()

to

    val conf = new SparkConf().set("spark.driver.allowMultipleContexts", "true")

pio batchpredict then fails with:

[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2474df51{/metrics/json,null,AVAILABLE}
[WARN] [SparkContext] Multiple running SparkContexts detected in the same JVM!
[ERROR] [Utils] Aborting task
[ERROR] [Executor] Exception in task 0 in stage 0 (TID 0)
[WARN] [TaskSetManager] Lost task 0 in stage 0 (TID 0, localhost, executor driver): org.apache.spark.SparkException: This RDD lacks a SparkContext. It could happen in the following cases:
(1) RDD transformations and actions are NOT invoked by the driver, but inside of other transformations; for example, rdd1.map(x => rdd2.values.count() * x) is invalid because the values transformation and count action cannot be performed inside of the rdd1.map transformation. For more information, see SPARK-5063.
(2) When a Spark Streaming job recovers from checkpoint, this exception will be hit if a reference to an RDD not defined by the streaming job is used in DStream operations. For more information, see SPARK-13758.
    at org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$sc(RDD.scala:89)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
    at org.apache.spark.rdd.PairRDDFunctions.lookup(PairRDDFunctions.scala:939)
    at org.apache.spark.mllib.recommendation.MatrixFactorizationModel.recommendProducts(MatrixFactorizationModel.scala:169)
    at org.example.recommendation.ALSAlgorithm$$anonfun$predict$1.apply(ALSAlgorithm.scala:85)
    at org.example.recommendation.ALSAlgorithm$$anonfun$predict$1.apply(ALSAlgorithm.scala:80)
    at scala.Option.map(Option.scala:146)
    at org.example.recommendation.ALSAlgorithm.predict(ALSAlgorithm.scala:80)
    at org.example.recommendation.ALSAlgorithm.predict(ALSAlgorithm.scala:22)
    at org.apache.predictionio.controller.PAlgorithm.predictBase(PAlgorithm.scala:76)
    at org.apache.predictionio.workflow.BatchPredict$$anonfun$15$$anonfun$16.apply(BatchPredict.scala:212)
    at org.apache.predictionio.workflow.BatchPredict$$anonfun$15$$anonfun$16.apply(BatchPredict.scala:211)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.immutable.List.map(List.scala:285)
    at org.apache.predictionio.workflow.BatchPredict$$anonfun$15.apply(BatchPredict.scala:211)
    at org.apache.predictionio.workflow.BatchPredict$$anonfun$15.apply(BatchPredict.scala:197)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply$mcV$sp(PairRDDFunctions.scala:1211)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply(PairRDDFunctions.scala:1210)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply(PairRDDFunctions.scala:1210)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1341)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1218)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1197)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
[ERROR] [TaskSetManager] Task 0 in stage 0 failed 1 times; aborting job

Test three:
pio deploy works, and querying the deployed engine through the REST API succeeds.
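For readers hitting the same thing: the trace above shows MatrixFactorizationModel.recommendProducts performing an RDD lookup inside a task spawned by BatchPredict's saveAsHadoopDataset, which is exactly the SPARK-5063 pattern the error message describes, so allowing multiple contexts cannot fix it. Below is a minimal, self-contained sketch of that pattern and one common workaround; it is illustrative Spark code only, not PredictionIO or template source, and the names (NestedRddSketch, queries, features) are made up for the example.

{code}
// Sketch only: reproduces the SPARK-5063 pattern and one workaround.
import org.apache.spark.{SparkConf, SparkContext}

object NestedRddSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("nested-rdd-sketch").setMaster("local[2]"))

    val queries  = sc.parallelize(Seq(1, 2, 3))             // stands in for the batch of queries
    val features = sc.parallelize(Seq(1 -> 0.5, 2 -> 0.7))  // stands in for the model's feature RDD

    // Invalid: referencing one RDD inside a transformation of another RDD.
    // The executor task has no SparkContext, so Spark fails with
    // "This RDD lacks a SparkContext" (SPARK-5063):
    // val broken = queries.map(q => features.lookup(q).headOption)

    // Common workaround: materialize the needed data on the driver and ship it
    // to the tasks as a broadcast value.
    val localFeatures = sc.broadcast(features.collectAsMap())
    val predictions = queries.map(q => q -> localFeatures.value.get(q))
    predictions.collect().foreach(println)

    sc.stop()
  }
}
{code}

For a model too large to collect and broadcast, the usual alternative is to restructure the work as a join between the query RDD and the feature RDD instead of per-query lookups.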
> Support multiple SparkContexts in the same JVM
> ----------------------------------------------
>
> Key: SPARK-2243
> URL: https://issues.apache.org/jira/browse/SPARK-2243
> Project: Spark
> Issue Type: New Feature
> Components: Block Manager, Spark Core
> Affects Versions: 0.7.0, 1.0.0, 1.1.0
> Reporter: Miguel Angel Fernandez Diaz
>
> We're developing a platform where we create several Spark contexts for carrying out different calculations. Is there any restriction when using several Spark contexts? We have two contexts, one for Spark calculations and another one for Spark Streaming jobs. The next error arises when we first execute a Spark calculation and, once the execution is finished, a Spark Streaming job is launched:
> {code}
> 14/06/23 16:40:08 ERROR executor.Executor: Exception in task ID 0
> java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0
>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624)
>     at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156)
>     at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
>     at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
>     at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
>     at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62)
>     at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193)
>     at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 14/06/23 16:40:08 WARN scheduler.TaskSetManager: Lost TID 0 (task 0.0:0)
> 14/06/23 16:40:08 WARN scheduler.TaskSetManager: Loss was due to java.io.FileNotFoundException
> java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0
>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1624)
>     at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:156)
>     at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
>     at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
>     at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
>     at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62)
>     at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193)
>     at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 14/06/23 16:40:08 ERROR scheduler.TaskSetManager: Task 0.0:0 failed 1 times; aborting job
> 14/06/23 16:40:08 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
> 14/06/23 16:40:08 INFO scheduler.DAGScheduler: Failed to run runJob at NetworkInputTracker.scala:182
> [WARNING]
> org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed 1 times (most recent failure: Exception failure: java.io.FileNotFoundException: http://172.19.0.215:47530/broadcast_0)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
>     at scala.Option.foreach(Option.scala:236)
>     at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:385)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 14/06/23 16:40:09 INFO dstream.ForEachDStream: metadataCleanupDelay = 3600
> {code}
> So far, we are working on localhost. Any clue about where this error is coming from? Any workaround to solve the issue?
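For the scenario described above (a batch calculation followed by a Spark Streaming job), the usual way to avoid a second SparkContext is to build the StreamingContext on top of the one already-running context, so both workloads share a single context per JVM. The sketch below is illustrative only and not code from this ticket; the socket source, port, and timeouts are made-up placeholders.

{code}
// Sketch only: one SparkContext shared by a batch job and a streaming job.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SharedContextSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("shared-context-sketch").setMaster("local[2]")
    val sc   = new SparkContext(conf)

    // Batch calculation on the shared context.
    val total = sc.parallelize(1 to 100).sum()
    println(s"batch result: $total")

    // Streaming job built on the same SparkContext instead of a second one.
    val ssc = new StreamingContext(sc, Seconds(5))
    val lines = ssc.socketTextStream("localhost", 9999)  // placeholder test source
    lines.count().print()

    ssc.start()
    ssc.awaitTerminationOrTimeout(30000)  // run briefly for the sketch
    ssc.stop(stopSparkContext = true)
  }
}
{code}

If two separate contexts are genuinely required, stopping the first one with sc.stop() before constructing the second is the other commonly suggested route.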