Hi,

It looks like this is not related to Alluxio. Have you tried running the
same job with different storage?

Maybe you could increase the Spark JVM heap size to see if that helps?
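
For example, something along these lines when launching the shell (the sizes
below are only illustrative; pick values that fit your machines):

  # larger driver and executor JVM heaps
  spark-shell --driver-memory 8g --executor-memory 8g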

Hope that helps,
Gene

On Wed, Jun 15, 2016 at 8:52 PM, Chanh Le <giaosu...@gmail.com> wrote:

> Hi everyone,
> I added more logs for my use case:
>
> When I cached all my data (500 million records) and ran a count,
> I received this:
> 16/06/16 10:09:25 ERROR TaskSetManager: Total size of serialized results
> of 27 tasks (1876.7 MB) is bigger than spark.driver.maxResultSize (1024.0
> MB)
> >>> That is weird, because I am only running a count.
> After increasing maxResultSize to 10g,
> I still wait a long time for the result, and then it fails:
> 16/06/16 10:09:25 INFO BlockManagerInfo: Removed taskresult_94 on
> slave1:27743 in memory (size: 69.5 MB, free: 6.2 GB)
> org.apache.spark.SparkException: Job aborted due to stage failure: Total
> size of serialized results of 15 tasks (1042.6 MB) is bigger than
> spark.driver.maxResultSize (1024.0 MB)
>   at org.apache.spark.scheduler.DAGScheduler.org
> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1450)
>   at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1438)
>   at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1437)
>   at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1437)
>   at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
>   at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
>   at scala.Option.foreach(Option.scala:257)
>   at
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
>   at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1659)
>   at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618)
>   at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607)
>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>   at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:1863)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:1876)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:1889)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:1903)
>   at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:883)
>   at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>   at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>   at org.apache.spark.rdd.RDD.withScope(RDD.scala:357)
>   at org.apache.spark.rdd.RDD.collect(RDD.scala:882)
>   at
> org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:290)
>   at
> org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2122)
>   at
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
>   at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2436)
>   at org.apache.spark.sql.Dataset.org
> $apache$spark$sql$Dataset$$execute$1(Dataset.scala:2121)
>   at org.apache.spark.sql.Dataset.org
> $apache$spark$sql$Dataset$$collect(Dataset.scala:2128)
>   at
> org.apache.spark.sql.Dataset$$anonfun$count$1.apply(Dataset.scala:2156)
>   at
> org.apache.spark.sql.Dataset$$anonfun$count$1.apply(Dataset.scala:2155)
>   at org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2449)
>   at org.apache.spark.sql.Dataset.count(Dataset.scala:2155)
>   ... 48 elided
>
> I lost all my executors.
>
>
>
> On Jun 15, 2016, at 8:44 PM, Chanh Le <giaosu...@gmail.com> wrote:
>
> Hi Gene,
> I am using Alluxio 1.1.0.
> Spark 2.0 Preview version.
> I load from Alluxio, cache the data, and query it a second time; Spark gets stuck.
>
>
>
> On Jun 15, 2016, at 8:42 PM, Gene Pang <gene.p...@gmail.com> wrote:
>
> Hi,
>
> Which version of Alluxio are you using?
>
> Thanks,
> Gene
>
> On Tue, Jun 14, 2016 at 3:45 AM, Chanh Le <giaosu...@gmail.com> wrote:
>
>> I am testing Spark 2.0.
>> I load data from Alluxio and cache it, then I query it. The first query is
>> fine because it kicks off the cache action, but when I run the same query
>> again it gets stuck.
>> I ran this from spark-shell on a 5-node cluster.
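>>
>> Roughly the sequence I run (the path and data format below are just
>> placeholders for my real setup):
>>
>>   val df = spark.read.parquet("alluxio://<master>:19998/path/to/data")
>>   df.cache()
>>   df.count()   // first query: kicks off the caching, completes fine
>>   df.count()   // second query: this one gets stuck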
>>
>> Has anyone else had this issue?
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
>
>
>
