Patrick,

Sure. I was interested in knowing whether anyone had experienced a similar issue and whether there was a known workaround. Anyway, I will report it on JIRA.
Alex

On Jan 2, 2015 9:13 AM, "Patrick Wendell" <pwend...@gmail.com> wrote:

> Hi Alessandro,
>
> Can you create a JIRA for this rather than reporting it on the dev
> list? That's where we track issues like this. Thanks!
>
> - Patrick
>
> On Wed, Dec 31, 2014 at 8:48 PM, Alessandro Baretta
> <alexbare...@gmail.com> wrote:
> > Here's what the console shows:
> >
> > 15/01/01 01:12:29 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 58.0,
> > whose tasks have all completed, from pool
> > 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Stage 58 (runJob at
> > ParquetTableOperations.scala:326) finished in 5493.549 s
> > 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Job 41 finished: runJob at
> > ParquetTableOperations.scala:326, took 5493.747061 s
> >
> > It is now 01:40:03, so the driver has been hanging for the last 28 minutes.
> > The web UI, on the other hand, shows that all tasks completed successfully,
> > and the output directory has been populated, although the _SUCCESS file is
> > missing.
> >
> > It is worth noting that my code starts this job in its own thread. The
> > actual code looks like the following snippet, modulo some simplifications:
> >
> > def save_to_parquet(allowExisting: Boolean = false) = {
> >   val threads = tables.map(table => {
> >     val thread = new Thread {
> >       override def run {
> >         table.insertInto(table.table_name)
> >       }
> >     }
> >     thread.start
> >     thread
> >   })
> >   threads.foreach(_.join)
> > }
> >
> > As far as I can see, the insertInto call never returns. Any idea why?
> >
> > Alex
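Since the hang shows up only at the unbounded `_.join` at the end of the quoted snippet, one defensive variant is to join each worker thread with a timeout and report any thread still alive afterwards, so a stuck `insertInto` surfaces as a diagnostic instead of blocking the driver indefinitely. The sketch below is not the poster's real code: `Table`, its `insertInto`, and the timeout value are stand-ins for illustration only, with the actual Spark call replaced by a sleep.

```scala
// Sketch of the save_to_parquet pattern with a bounded join.
// `Table`, `insertInto`, and the timeout are placeholders, not the real API:
// the point is Thread.join(millis) plus an isAlive check afterwards.
object JoinWithTimeout {
  case class Table(table_name: String) {
    // Stand-in for the real insert; the sleep simulates work in progress.
    def insertInto(name: String): Unit = Thread.sleep(100)
  }

  def save_to_parquet(tables: Seq[Table], timeoutMs: Long): Seq[Thread] = {
    val threads = tables.map { table =>
      val thread = new Thread {
        override def run(): Unit = table.insertInto(table.table_name)
      }
      thread.start()
      thread
    }
    // Join with a timeout instead of blocking forever.
    threads.foreach(_.join(timeoutMs))
    // Any thread still alive after the timeout is presumed stuck.
    threads.filter(_.isAlive).foreach { t =>
      System.err.println(s"Thread ${t.getName} did not finish within $timeoutMs ms")
    }
    threads
  }

  def main(args: Array[String]): Unit = {
    val threads = save_to_parquet(Seq(Table("a"), Table("b")), timeoutMs = 5000L)
    println(s"alive after join: ${threads.count(_.isAlive)}")
  }
}
```

With this shape, a never-returning insert would show up as a "did not finish" line (and a thread dump of the live thread would point at where it is blocked), rather than as a silently hung driver.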