The closing issue is related to using Crail for input/output. The changes
Adrian made earlier today are to the shuffle plugin. Are you using the
Crail shuffle plugin at all? If not, Adrian's changes are not relevant
to you.
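
For illustration, here is a minimal sketch of the general pattern behind
the closing issue. It rests on an assumption that is not confirmed in this
thread: that the wait comes from client-side background threads that stay
alive until the filesystem is closed. StoreClient is a hypothetical
stand-in, not the Crail or Hadoop API. In a real Spark job the analogous
step would presumably be closing the Hadoop FileSystem (for example via
FileSystem.closeAll()) before the driver exits, but I have not verified
that this removes the wait with Crail.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ExplicitCloseSketch {

    // Hypothetical stand-in for a storage client; NOT the Crail or Hadoop API.
    static final class StoreClient {
        // The default thread factory creates non-daemon threads, so once the
        // worker has started it keeps the JVM alive until shutdown() is called.
        private final ExecutorService worker = Executors.newSingleThreadExecutor();

        void write(int records) {
            try {
                // Run the "I/O" on the background thread and wait for it to finish.
                worker.submit(() ->
                        System.out.println("Number of records written: " + records)).get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException("write failed", e);
            }
        }

        void close() {
            worker.shutdown();
            try {
                worker.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        boolean isClosed() {
            return worker.isTerminated();
        }
    }

    // Runs the "job" and closes the client in a finally block.
    // Returns true if the worker thread has terminated afterwards.
    static boolean runJob(int records) {
        StoreClient client = new StoreClient();
        try {
            client.write(records);
        } finally {
            client.close(); // without this call the JVM would not exit on its own
        }
        return client.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("closed: " + runJob(10000));
    }
}
```

If the close() call in the finally block is removed, this program prints
its output and then sits there until it is interrupted, which at least
resembles the symptom described below.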

-Patrick

On Wed, Jun 19, 2019 at 3:22 PM David Crespi <
[email protected]> wrote:

> Thanks for the description.  It seemed odd that it behaved this way, as
> HDFS does close as expected, so I wasn’t sure. Wouldn’t this change the
> Terasort benchmark numbers?
>
> Regards,
>
>            David
>
> C: 714-476-2692
>
> ________________________________
> From: Jonas Pfefferle <[email protected]>
> Sent: Wednesday, June 19, 2019 12:17:30 AM
> To: [email protected]; David Crespi; [email protected]
> Subject: Re: Crail used as type 2 storage for TeraSort does not catch the
> "finished" signal
>
> Hi David,
>
>
> Unfortunately, if you use Crail for input/output with Spark, this is
> expected. The problem is that Spark never closes the filesystem correctly.
> I haven't looked into this lately, but if I remember correctly there was no
> easy way to otherwise determine that Spark is about to close.
>
> Regards,
> Jonas
>
>   On Tue, 18 Jun 2019 22:17:16 +0000
>   David Crespi <[email protected]> wrote:
> > Hi,
> > I’m running Crail as the temporary backend storage for Terasort.
> > After each section (TeraGen, TeraSort, TeraVerify) the program waits
> > until a Ctrl-C is given, then moves on to the next section.  Is this
> > the expected behavior, or is this a bug?
> >
> > Here’s a small snippet of the output.  Terasort waits where the bolded
> > “Number of records” is listed, until the ^C is given.  Each of the three
> > programs does the same, but the program does finish without errors.
> >
> >
> > 19/06/18 15:13:19 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1.0, runningTasks: 1
> > 19/06/18 15:13:19 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 142 ms on 192.168.3.10 (executor 4) (1/2)
> > 19/06/18 15:13:19 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.3.12:34011 (size: 1825.0 B, free: 366.3 MB)
> > 19/06/18 15:13:19 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1.0, runningTasks: 0
> > 19/06/18 15:13:19 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 977 ms on 192.168.3.12 (executor 3) (2/2)
> > 19/06/18 15:13:19 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
> > 19/06/18 15:13:19 INFO DAGScheduler: ResultStage 1 (count at TeraGen.scala:94) finished in 0.995 s
> > 19/06/18 15:13:19 DEBUG DAGScheduler: After removal of stage 1, remaining stages = 0
> > 19/06/18 15:13:19 INFO DAGScheduler: Job 1 finished: count at TeraGen.scala:94, took 1.003537 s
> > Number of records written: 10000
> > ^C19/06/18 15:13:36 INFO SparkContext: Invoking stop() from shutdown hook
> >
> > Regards,
> >
> >           David
> >
> >
>
>
