Jian Wu created SPARK-21161:
-------------------------------

             Summary: SparkContext stopped when execute a query on Solr
                 Key: SPARK-21161
                 URL: https://issues.apache.org/jira/browse/SPARK-21161
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.1.1
         Environment: Hadoop2.7.3, Spark 2.1.1, spark-solr-3.0.1.jar, solr-solrj-6.5.1.jar
            Reporter: Jian Wu

The SparkContext stopped because DAGSchedulerEventProcessLoop failed when I queried Solr data in Spark.

{code:none}
17/06/21 12:40:53 INFO ContextLauncher: 17/06/21 12:40:53 ERROR scheduler.DAGSchedulerEventProcessLoop: DAGSchedulerEventProcessLoop failed; shutting down SparkContext
17/06/21 12:40:53 INFO ContextLauncher: java.lang.NumberFormatException: For input string: "8983_solr"
17/06/21 12:40:53 INFO ContextLauncher: at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
17/06/21 12:40:53 INFO ContextLauncher: at java.lang.Integer.parseInt(Integer.java:580)
17/06/21 12:40:53 INFO ContextLauncher: at java.lang.Integer.parseInt(Integer.java:615)
17/06/21 12:40:53 INFO ContextLauncher: at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
17/06/21 12:40:53 INFO ContextLauncher: at scala.collection.immutable.StringOps.toInt(StringOps.scala:29)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.util.Utils$.parseHostPort(Utils.scala:959)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.cluster.YarnScheduler.getRackForHost(YarnScheduler.scala:36)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:200)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:181)
17/06/21 12:40:53 INFO ContextLauncher: at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
17/06/21 12:40:53 INFO ContextLauncher: at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.TaskSetManager.org$apache$spark$scheduler$TaskSetManager$$addPendingTask(TaskSetManager.scala:181)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.TaskSetManager$$anonfun$1.apply$mcVI$sp(TaskSetManager.scala:160)
17/06/21 12:40:53 INFO ContextLauncher: at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:159)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:212)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:176)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1043)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:918)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:921)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:920)
17/06/21 12:40:53 INFO ContextLauncher: at scala.collection.immutable.List.foreach(List.scala:381)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:920)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:862)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1613)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)
17/06/21 12:40:53 INFO ContextLauncher: at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
{code}

The failure is caused by Solr's special node-name format, for example "idx5.oi.dev:8983_solr", which carries "_solr" along with the port number. When YarnScheduler.getRackForHost passes such a name to Utils.parseHostPort, the port substring "8983_solr" cannot be converted to an integer, so a "java.lang.NumberFormatException" is thrown and the SparkContext shuts down.
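
The parsing failure can be reproduced outside Spark with a few lines of Java. This is only a minimal sketch: the class HostPortSketch and its parseHostPort method are hypothetical stand-ins, written on the assumption that the real parser splits on the last ':' and converts the remainder strictly to an integer, as the toInt and Integer.parseInt frames in the trace suggest. It is not Spark's actual implementation.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical stand-in for a strict "host:port" parser; NOT Spark's code.
final class HostPortSketch {

    // Split on the last ':' and parse everything after it as a decimal port.
    // A string without ':' is treated as a bare host with port 0 (assumption).
    static Map.Entry<String, Integer> parseHostPort(String hostPort) {
        int idx = hostPort.lastIndexOf(':');
        if (idx == -1) {
            return new SimpleImmutableEntry<>(hostPort, 0);
        }
        String host = hostPort.substring(0, idx).trim();
        // Integer.parseInt rejects any non-digit suffix such as "_solr".
        int port = Integer.parseInt(hostPort.substring(idx + 1).trim());
        return new SimpleImmutableEntry<>(host, port);
    }

    public static void main(String[] args) {
        // A plain host:port pair parses without error.
        System.out.println(parseHostPort("idx5.oi.dev:8983"));

        // The Solr node name from this report fails in Integer.parseInt.
        try {
            parseHostPort("idx5.oi.dev:8983_solr");
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException: " + e.getMessage());
        }
    }
}
```

In this sketch the second call fails for the same reason as the trace above: the text after the last ':' is "8983_solr", which is not a valid integer.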