[ https://issues.apache.org/jira/browse/SPARK-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Armbrust resolved SPARK-5060.
-------------------------------------
Resolution: Cannot Reproduce
This code has changed a lot in Spark 1.5, so I'm going to close this ticket.
Please reopen if you can still reproduce.
> Spark driver main thread hanging after SQL insert in Parquet file
> -----------------------------------------------------------------
>
> Key: SPARK-5060
> URL: https://issues.apache.org/jira/browse/SPARK-5060
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Alex Baretta
>
> Here's what the console shows:
> 15/01/01 01:12:29 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 58.0, whose tasks have all completed, from pool
> 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Stage 58 (runJob at ParquetTableOperations.scala:326) finished in 5493.549 s
> 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Job 41 finished: runJob at ParquetTableOperations.scala:326, took 5493.747061 s
> It is now 01:40:03, so the driver has been hanging for the last 28 minutes.
> The web UI on the other hand shows that all tasks completed successfully, and
> the output directory has been populated--although the _SUCCESS file is
> missing.
> It is worth noting that my code started this job as its own thread. The
> actual code looks like the following snippet, modulo some simplifications.
>   def save_to_parquet(allowExisting : Boolean = false) = {
>     val threads = tables.map(table => {
>       val thread = new Thread {
>         override def run {
>           table.insertInto(t.table_name)
>         }
>       }
>       thread.start
>       thread
>     })
>     threads.foreach(_.join)
>   }
> As far as I can see the insertInto call never returns.
> The version of Spark I'm using is built from master, off of this commit:
> commit 815de54002f9c1cfedc398e95896fa207b4a5305
> Author: YanTangZhai <[email protected]>
> Date: Mon Dec 29 11:30:54 2014 -0800
>     [SPARK-4946] [CORE] Using AkkaUtils.askWithReply in MapOutputTracker.askTracker to reduce the chance of the communicating problem
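
For anyone trying to reproduce this, a minimal sketch of how the thread-per-table pattern quoted above could be instrumented to show where a worker is stuck. It uses only plain JVM threads and is not taken from the ticket: the names JoinWithTimeout and runAll, the task names, and the timeout values are hypothetical, and Thread.sleep stands in for the insertInto call. The idea is to replace the unbounded join with a timed join and print the stack trace of any thread that is still alive afterwards.

```scala
object JoinWithTimeout {
  // Hypothetical helper, not part of Spark: runs each task on its own thread,
  // waits up to timeoutMs for each one in turn, and returns the names of the
  // threads that are still running after their wait expired.
  def runAll(tasks: Seq[(String, () => Unit)], timeoutMs: Long): Seq[String] = {
    val threads = tasks.map { case (name, body) =>
      val thread = new Thread(new Runnable {
        override def run(): Unit = body()
      }, name)
      thread.setDaemon(true) // a stuck worker should not keep the JVM alive
      thread.start()
      thread
    }
    threads.flatMap { thread =>
      thread.join(timeoutMs) // bounded wait instead of an indefinite join
      if (thread.isAlive) {
        // Show where the thread is currently blocked.
        System.err.println(s"Thread ${thread.getName} still running after $timeoutMs ms:")
        thread.getStackTrace.foreach(frame => System.err.println(s"    at $frame"))
        Some(thread.getName)
      } else None
    }
  }

  def main(args: Array[String]): Unit = {
    // Thread.sleep is a stand-in for the real work (an assumption for illustration).
    val tasks: Seq[(String, () => Unit)] = Seq(
      ("fast-insert", () => Thread.sleep(10)),
      ("stuck-insert", () => Thread.sleep(60000))
    )
    val stuck = runAll(tasks, timeoutMs = 500)
    println(s"stuck threads: ${stuck.mkString(", ")}")
  }
}
```

The joins run one after another, so the total wait can be up to the timeout multiplied by the number of threads. Running jstack against the driver process gives the same information without code changes.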