Patrick,

Sure. I was interested in knowing whether anyone had experienced a similar
issue and whether there was a known workaround. Anyway, I will report it on JIRA.

Alex
On Jan 2, 2015 9:13 AM, "Patrick Wendell" <pwend...@gmail.com> wrote:

> Hi Alessandro,
>
> Can you create a JIRA for this rather than reporting it on the dev
> list? That's where we track issues like this. Thanks!
>
> - Patrick
>
> On Wed, Dec 31, 2014 at 8:48 PM, Alessandro Baretta
> <alexbare...@gmail.com> wrote:
> > Here's what the console shows:
> >
> > 15/01/01 01:12:29 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 58.0,
> > whose tasks have all completed, from pool
> > 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Stage 58 (runJob at
> > ParquetTableOperations.scala:326) finished in 5493.549 s
> > 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Job 41 finished: runJob at
> > ParquetTableOperations.scala:326, took 5493.747061 s
> >
> > It is now 01:40:03, so the driver has been hanging for the last 28 minutes.
> > The web UI on the other hand shows that all tasks completed successfully,
> > and the output directory has been populated--although the _SUCCESS file is
> > missing.
> >
> > It is worth noting that my code started this job as its own thread. The
> > actual code looks like the following snippet, modulo some simplifications.
> >
> >   def save_to_parquet(allowExisting: Boolean = false): Unit = {
> >     val threads = tables.map(table => {
> >       val thread = new Thread {
> >         override def run(): Unit = {
> >           table.insertInto(t.table_name)
> >         }
> >       }
> >       thread.start()
> >       thread
> >     })
> >     threads.foreach(_.join())
> >   }
> >
> > As far as I can see the insertInto call never returns. Any idea why?
> >
> > Alex
>
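
One way to narrow down a hang like the one quoted above is to join each
worker thread with a timeout and print the stack of any thread that is still
alive, which shows the call it is blocked in. The following is only a minimal
sketch under stated assumptions: JoinWithTimeout, runAll, jobs and timeoutMs
are hypothetical names that do not appear in the original code or in Spark,
and each job body would wrap one insertInto call.

```scala
// Hypothetical helper, not part of Spark or of the original code.
// Runs each job on its own named thread, waits up to timeoutMs per thread,
// and returns the names of the threads that were still running.
object JoinWithTimeout {
  def runAll(jobs: Seq[(String, () => Unit)], timeoutMs: Long): Seq[String] = {
    // toList forces eager evaluation so every thread starts before any join.
    val threads = jobs.toList.map { case (name, body) =>
      val thread = new Thread(new Runnable {
        override def run(): Unit = body()
      }, name)
      thread.setDaemon(true) // a stuck job will not keep the JVM alive
      thread.start()
      thread
    }
    threads.flatMap { thread =>
      // join(millis) returns once the timeout elapses, even if the thread
      // has not finished; the waits are sequential, one per thread.
      thread.join(timeoutMs)
      if (thread.isAlive) {
        println(s"${thread.getName} still running after $timeoutMs ms:")
        // The stack trace shows where the thread is currently blocked.
        thread.getStackTrace.foreach(frame => println(s"    at $frame"))
        Some(thread.getName)
      } else None
    }
  }
}
```

Running jstack against the driver process gives the same stack information
without any code change; the timeout-based join only keeps the calling thread
from blocking indefinitely while that information is gathered.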
