Yes, it can be large. But that still does not answer the question of why it works in smaller environments, e.g. local[32], or in cluster mode when using SQLContext instead of HiveContext.
The process in general is a RowNumber() HiveQL operation; that is why I need HiveContext. I have the feeling there is something wrong with HiveContext. I don't have a Hive/Hadoop database; I only enabled HiveContext to use its functions on my JSON-loaded DataFrame.

I am new to Spark, so please don't hesitate to ask for more information, as I am still not sure what would be relevant.

Saif

-----Original Message-----
From: Sean Owen [mailto:so...@cloudera.com]
Sent: Wednesday, October 07, 2015 2:38 PM
To: Ellafi, Saif A.
Cc: user
Subject: Re: Spark standalone hangup during shuffle flatMap or explode in cluster

-dev

Is r.getInt(ind) very large in some cases? I think there's not quite enough info here.

On Wed, Oct 7, 2015 at 6:23 PM, <saif.a.ell...@wellsfargo.com> wrote:
> When running a stand-alone cluster-mode job, the process hangs up
> randomly during a DataFrame flatMap or explode operation, in HiveContext:
>
> -->> df.flatMap(r => for (n <- 1 to r.getInt(ind)) yield r)
>
> This does not happen with SQLContext in cluster mode, or with Hive/SQL in
> local mode, where it works fine.
>
> A couple of minutes after the hangup, executors start dropping. I am
> attaching the logs.
>
> Saif
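For context on Sean's question: the quoted flatMap emits each row r.getInt(ind) times, so a single row with a very large count makes one task produce an enormous partition, which can look like a hang during the shuffle. A minimal plain-Scala sketch of that amplification and one way to bound it (the Row class, the count field, and the maxRepeat cap are all hypothetical illustrations, not from the thread):

```scala
object FlatMapBlowup {
  // Hypothetical stand-in for a DataFrame row; `count` plays the role
  // of r.getInt(ind) in the original snippet.
  case class Row(id: Int, count: Int)

  // Same shape as df.flatMap(r => for (n <- 1 to r.getInt(ind)) yield r),
  // but with the repeat count clamped to [0, maxRepeat] so one skewed
  // value cannot explode a single task's output.
  def boundedCopies(r: Row, maxRepeat: Int): Seq[Row] = {
    val n = math.min(math.max(r.count, 0), maxRepeat)
    for (_ <- 1 to n) yield r
  }

  def main(args: Array[String]): Unit = {
    // Unbounded, the second row alone would yield a billion copies.
    val rows = Seq(Row(1, 2), Row(2, 1000000000))
    val safe = rows.flatMap(boundedCopies(_, maxRepeat = 10000))
    println(safe.size) // 2 copies of the first row + 10000 of the second
  }
}
```

Logging or capping the maximum value of that column before the flatMap would quickly confirm or rule out skew as the cause of the hangup.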