[ https://issues.apache.org/jira/browse/SPARK-19835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
bruce xu updated SPARK-19835:
-----------------------------

Description:
When I run a CTAS (CREATE TABLE AS SELECT) SQL statement such as:
---------------------
create table A as
select a.id, b.id, c.id from
(select * from B) a
left join
(select * from C) b
left join
(select momo_id from D) c
on a.id = b.id and a.id = c.id;
---------------------
the job hangs and makes no further progress; no error is returned. The last few INFO messages are as follows:
17/03/06 18:57:47 INFO spark.SparkContext: Starting job: processCmd at CliDriver.java:376
17/03/06 18:57:47 INFO scheduler.DAGScheduler: Got job 2 (processCmd at CliDriver.java:376) with 2 output partitions
17/03/06 18:57:47 INFO scheduler.DAGScheduler: Final stage: ResultStage 3 (processCmd at CliDriver.java:376)
17/03/06 18:57:47 INFO scheduler.DAGScheduler: Parents of final stage: List()
17/03/06 18:57:47 INFO scheduler.DAGScheduler: Missing parents: List()
17/03/06 18:57:47 INFO scheduler.DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[20] at processCmd at CliDriver.java:376), which has no missing parents
17/03/06 18:57:47 INFO memory.MemoryStore: Block broadcast_8 stored as values in memory (estimated size 150.1 KB, free 3.7 GB)
17/03/06 18:57:47 INFO memory.MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 55.0 KB, free 3.7 GB)
17/03/06 18:57:47 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on 10.87.101.151:64500 (size: 55.0 KB, free: 3.8 GB)
17/03/06 18:57:47 INFO spark.SparkContext: Created broadcast 8 from broadcast at DAGScheduler.scala:1012
17/03/06 18:57:47 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 3 (MapPartitionsRDD[20] at processCmd at CliDriver.java:376)
17/03/06 18:57:47 INFO cluster.YarnScheduler: Adding task set 3.0 with 2 tasks
17/03/06 18:57:47 INFO scheduler.FairSchedulableBuilder: Added task set TaskSet_3 tasks to pool default
17/03/06 18:57:47 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 3.0 (TID 204, hadoop491.dx.momo.com, partition 0, RACK_LOCAL, 5824 bytes)
17/03/06 18:57:47 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 204 on executor id: 50 hostname: hadoop491.dx.momo.com.
17/03/06 18:57:48 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on hadoop491.dx.momo.com:33300 (size: 55.0 KB, free: 2.8 GB)
17/03/06 18:57:48 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on hadoop491.dx.momo.com:33300 (size: 42.7 KB, free: 2.8 GB)
17/03/06 18:57:50 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on hadoop491.dx.momo.com:33300 (size: 1678.6 KB, free: 2.8 GB)
17/03/06 18:57:50 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on hadoop491.dx.momo.com:33300 (size: 3.8 MB, free: 2.8 GB)
17/03/06 18:57:51 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 3.0 (TID 205, hadoop605.dx.momo.com, partition 1, ANY, 5824 bytes)
17/03/06 18:57:51 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 205 on executor id: 9 hostname: hadoop605.dx.momo.com.
17/03/06 18:57:51 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on hadoop605.dx.momo.com:37394 (size: 55.0 KB, free: 2.8 GB)
17/03/06 18:57:52 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on hadoop605.dx.momo.com:37394 (size: 42.7 KB, free: 2.8 GB)
17/03/06 18:57:54 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on hadoop605.dx.momo.com:37394 (size: 1678.6 KB, free: 2.8 GB)
17/03/06 18:57:54 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on hadoop605.dx.momo.com:37394 (size: 3.8 MB, free: 2.8 GB)
17/03/06 19:05:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on 10.87.101.151:64500 in memory (size: 10.0 KB, free: 3.8 GB)
17/03/06 19:05:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on hadoop572.dx.momo.com:22730 in memory (size: 10.0 KB, free: 2.8 GB)
-------------------
When I press CTRL+C to cancel the job, the following stack trace is printed:
org.apache.spark.SparkException: Job 2 cancelled as part of cancellation of all jobs
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1454)
	at org.apache.spark.scheduler.DAGScheduler.handleJobCancellation(DAGScheduler.scala:1393)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply$mcVI$sp(DAGScheduler.scala:725)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply(DAGScheduler.scala:725)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply(DAGScheduler.scala:725)
	at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
	at org.apache.spark.scheduler.DAGScheduler.doCancelAllJobs(DAGScheduler.scala:725)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1645)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1622)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1611)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1890)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1903)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1923)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.saveAsHiveFile(InsertIntoHiveTable.scala:132)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:234)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:143)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:310)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
	at org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand.run(CreateHiveTableAsSelectCommand.scala:93)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:687)
	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:682)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:331)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:247)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:740)
	at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:167)
	at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:165)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:165)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
---------------------------------------------------------------
Part of the thread dump is:
91 Executor task launch worker-0 RUNNABLE
96 DataStreamer for file /tmp/spark/stage-datapm_hive_2017-03-06_18-57-40_470_3468784292858131323-1/-ext-10000/_temporary/0/_temporary/attempt_201703061857_0003_m_000000_0/part-00000 block BP-454299492-10.84.100.171-1416301904728:blk_3772186281_2703877735 TIMED_WAITING
48 dispatcher-event-loop-0 WAITING
49 dispatcher-event-loop-1 WAITING
-----------------------------------------------------------------------------------
The detail of thread ID 91 is:
org.apache.spark.unsafe.Platform.copyMemory(Platform.java:194)
org.apache.spark.unsafe.types.UTF8String.getBytes(UTF8String.java:230)
org.apache.spark.unsafe.types.UTF8String.toString(UTF8String.java:821)
org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaStringObjectInspector.getPrimitiveWritableObject(JavaStringObjectInspector.java:36)
org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:222)
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:307)
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:262)
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:246)
org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:50)
org.apache.spark.sql.hive.SparkHiveWriterContainer$$anonfun$writeToFile$1.apply(hiveWriterContainers.scala:191)
org.apache.spark.sql.hive.SparkHiveWriterContainer$$anonfun$writeToFile$1.apply(hiveWriterContainers.scala:185)
scala.collection.Iterator$class.foreach(Iterator.scala:893)
org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.foreach(WholeStageCodegenExec.scala:368)
org.apache.spark.sql.hive.SparkHiveWriterContainer.writeToFile(hiveWriterContainers.scala:185)
org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:132)
org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:132)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
org.apache.spark.scheduler.Task.run(Task.scala:86)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
------------------------------------------------------------------------------------------------------------
I wonder whether anyone else can reproduce this issue; if so, it may be a bug.
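The join shape in the repro can be exercised outside Spark. Below is a minimal sketch using Python's stdlib sqlite3; the rows are invented, and the per-join ON clauses and the a.id = c.momo_id key are my assumptions (the original statement selects c.id and joins on a.id = c.id even though subquery c only projects momo_id, which looks like a transcription slip):

```python
import sqlite3

# In-memory database standing in for the warehouse; schema and rows are
# made up purely to illustrate the CTAS + double LEFT JOIN shape.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE B (id INTEGER);
CREATE TABLE C (id INTEGER);
CREATE TABLE D (momo_id INTEGER);
INSERT INTO B VALUES (1), (2), (3);
INSERT INTO C VALUES (1), (3);
INSERT INTO D VALUES (2), (3);

-- CTAS with one explicit ON clause per join; the report's version attaches
-- a single combined ON to the final join, which some dialects reject.
CREATE TABLE A AS
SELECT a.id AS a_id, b.id AS b_id, c.momo_id AS c_id
FROM (SELECT * FROM B) a
LEFT JOIN (SELECT * FROM C) b ON a.id = b.id
LEFT JOIN (SELECT momo_id FROM D) c ON a.id = c.momo_id;
""")
rows = cur.execute("SELECT * FROM A ORDER BY a_id").fetchall()
print(rows)  # -> [(1, 1, None), (2, None, 2), (3, 3, 3)]
conn.close()
```

Unmatched rows keep NULLs from each LEFT JOIN, which is the expected result the hung Spark job never produced.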
> Running CTAS SQL on Spark 2.0.1 hangs
> -------------------------------------
>
>                 Key: SPARK-19835
>                 URL: https://issues.apache.org/jira/browse/SPARK-19835
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.1
>        Environment: spark2.0.1, hadoop2.6.0, jdk1.7
>            Reporter: bruce xu

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)