Re: PySpark + executor lost

Sandy Ryza Fri, 08 Aug 2014 15:49:33 -0700

What exactly do you mean by "YARN cluster"? Do you mean running Spark
against a YARN cluster in general, or particularly in "yarn-cluster" mode,
where the driver runs inside a Spark application master?


Also, what error are you seeing in your executors?

-Sandy


On Fri, Aug 8, 2014 at 2:00 PM, Avishek Saha <[email protected]> wrote:

> Btw, I get this for Spark-1.0.2
> I guess YARN cluster is still not supported for PySpark.
>
>
> -------------------------------------------------------------------------------------
> Error: Cluster deploy mode is currently not supported for python.
> Run with --help for usage help or --verbose for debug output
>
> On 8 August 2014 13:28, Avishek Saha <[email protected]> wrote:
> > You mean YARN cluster, right?
> >
> > Also, my jobs run through all their stages just fine. But the entire
> > code crashes when I do a "saveAsTextFile".
> >
> > On 8 August 2014 13:24, Sandy Ryza <[email protected]> wrote:
> >> Hi Avishek,
> >>
> >> As of Spark 1.0, PySpark does in fact run on YARN.
> >>
> >> -Sandy
> >>
> >>
> >> On Fri, Aug 8, 2014 at 12:47 PM, Avishek Saha <[email protected]>
> >> wrote:
> >>>
> >>> So I think I have a better idea of the problem now.
> >>>
> >>> The environment is YARN client and IIRC PySpark doesn't run on YARN
> >>> cluster.
> >>>
> >>> So my client is heavily loaded, which causes it to lose a lot of
> >>> executors, which might be part of the problem.
> >>>
> >>> Btw, any plans to support PySpark in YARN cluster mode?
> >>>
> >>> On Aug 7, 2014 3:04 PM, "Davies Liu" <[email protected]> wrote:
> >>>>
> >>>> What is the environment? YARN, Mesos, or Standalone?
> >>>>
> >>>> It would be more helpful if you could show more of the logs.
> >>>>
> >>>> On Wed, Aug 6, 2014 at 7:25 PM, Avishek Saha <[email protected]>
> >>>> wrote:
> >>>> > Hi,
> >>>> >
> >>>> > I get a lot of "executor lost" errors for "saveAsTextFile" with
> >>>> > PySpark and Hadoop 2.4.
> >>>> >
> >>>> > For small datasets this error occurs, but since the dataset is small
> >>>> > it eventually gets written to the file.
> >>>> > For large datasets, it takes forever to write the final output.
> >>>> >
> >>>> > Any help is appreciated.
> >>>> > Avishek
> >>>> >
> >>>> >
> >>>> >
> >>
> >>
>
