[ https://issues.apache.org/jira/browse/SPARK-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Piotr Kołaczkowski updated SPARK-1712:
--------------------------------------

Description:

{noformat}
scala> val collection = (1 to 1000000).map(i => ("foo" + i, i)).toVector
collection: Vector[(String, Int)] = Vector((foo1,1), (foo2,2), (foo3,3), (foo4,4), (foo5,5), (foo6,6), (foo7,7), (foo8,8), (foo9,9), (foo10,10), (foo11,11), (foo12,12), (foo13,13), (foo14,14), (foo15,15), (foo16,16), (foo17,17), (foo18,18), (foo19,19), (foo20,20), (foo21,21), (foo22,22), (foo23,23), (foo24,24), (foo25,25), (foo26,26), (foo27,27), (foo28,28), (foo29,29), (foo30,30), (foo31,31), (foo32,32), (foo33,33), (foo34,34), (foo35,35), (foo36,36), (foo37,37), (foo38,38), (foo39,39), (foo40,40), (foo41,41), (foo42,42), (foo43,43), (foo44,44), (foo45,45), (foo46,46), (foo47,47), (foo48,48), (foo49,49), (foo50,50), (foo51,51), (foo52,52), (foo53,53), (foo54,54), (foo55,55), (foo56,56), (foo57,57), (foo58,58), (foo59,59), (foo60,60), (foo61,61), (foo62,62), (foo63,63), (foo64,64), (foo...

scala> val rdd = sc.parallelize(collection)
rdd: org.apache.spark.rdd.RDD[(String, Int)] = ParallelCollectionRDD[0] at parallelize at <console>:24

scala> rdd.first
res4: (String, Int) = (foo1,1)

scala> rdd.map(_._2).sum
// nothing happens
{noformat}

CPU and I/O are idle. Memory usage reported by the JVM, after a manually triggered GC:

repl: 216 MB / 2 GB
executor: 67 MB / 2 GB
worker: 6 MB / 128 MB
master: 6 MB / 128 MB

No errors were found in the worker's stderr/stdout.

With 700,000 elements it works fine, taking about 1 second to process the request and calculate the sum, and the Spark executor memory doesn't even exceed 300 MB of the 2 GB available. It fails with 800,000 items. Multiple parallelized collections of 700,000 items each, used at the same time in the same session, work fine.
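Given the symptoms above ({{first}} succeeds, {{sum}} hangs, and the failure threshold sits between 700,000 and 800,000 elements), one way to probe from the same spark-shell session whether the hang tracks the serialized size of the data each task carries is sketched below. This is a diagnostic sketch, not part of the original report: {{serializedSize}} is an ad-hoc helper defined here, {{sc}} is the SparkContext provided by spark-shell, and {{sc.parallelize(seq, numSlices)}} is the standard RDD API.

{code:scala}
// Diagnostic sketch (not from the report). Run inside spark-shell, where
// `sc` is the provided SparkContext. `serializedSize` is an ad-hoc helper
// that measures the Java-serialized size of an object.
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

def serializedSize(obj: AnyRef): Long = {
  val buf = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(buf)
  out.writeObject(obj)   // Vector[(String, Int)] is java.io.Serializable
  out.close()
  buf.size.toLong
}

val collection = (1 to 1000000).map(i => ("foo" + i, i)).toVector
println(s"serialized size: ${serializedSize(collection)} bytes")

// Spreading the same data over more partitions shrinks each task's payload.
// If the sum completes with more slices, a per-task size limit is implicated.
val rdd = sc.parallelize(collection, 100)
println(rdd.map(_._2).sum)
{code}

If smaller per-task payloads make the job complete, a message-size limit such as {{spark.akka.frameSize}} (an assumption here, not a cause confirmed by this report) would be the next thing to check.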
> ParallelCollectionRDD operations hanging forever without any error messages
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-1712
>                 URL: https://issues.apache.org/jira/browse/SPARK-1712
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 0.9.0
>         Environment: Linux Ubuntu 14.04, a single Spark node; standalone mode.
>            Reporter: Piotr Kołaczkowski
>         Attachments: executor.jstack.txt, master.jstack.txt, repl.jstack.txt, spark-hang.png, worker.jstack.txt