Re: spark-submit memory too large

2014-10-25 Thread marylucy
~60GB of memory right away, or does it start out smaller and slowly build up to that high? If so, how long does it take to get that high? Also, which version of Spark are you using? SameerF. On Fri, Oct 24, 2014 at 8:07 AM, marylucy qaz163wsx_...@hotmail.com wrote: I used standalone Spark

spark-submit memory too large

2014-10-24 Thread marylucy
I used standalone Spark and set spark.driver.memory=5g, but the spark-submit process uses 57g of memory. Is this normal? How can I decrease it?
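
For reference, a minimal sketch of how driver memory is normally capped when submitting to a standalone master; the master URL, class name, and jar below are placeholders, not taken from this thread:

    ./bin/spark-submit \
      --master spark://<master-host>:7077 \
      --driver-memory 5g \
      --class com.example.MyApp \
      myapp.jar

--driver-memory (equivalently spark.driver.memory in conf/spark-defaults.conf) has to be set before the driver JVM starts; setting it from inside a running application has no effect, and a resident size far above the configured heap usually means the setting was not applied to the driver JVM or that memory is being used outside the heap.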

Re: why fetch failed

2014-10-21 Thread marylucy
, marylucy qaz163wsx_...@hotmail.com wrote: When doing a groupBy on big data, maybe 500 GB, some partition tasks succeed and some fail with a fetch failed error. Spark retries the previous stage but always fails. 6 computers with 384 GB each; workers: 40 GB * 7 per computer. Can anyone tell me why fetch

Re: why fetch failed

2014-10-21 Thread marylucy
--- My Blog: https://www.dbtsai.com LinkedIn: https://www.linkedin.com/in/dbtsai On Sat, Oct 18, 2014 at 6:22 PM, marylucy qaz163wsx_...@hotmail.com wrote: When doing a groupBy on big data, maybe 500 GB, some partition tasks succeed and some partition

How to show RDD size

2014-10-20 Thread marylucy
In spark-shell I do the following: val input = sc.textFile("hdfs://192.168.1.10/people/testinput/"); input.cache(). In the web UI I cannot see any RDD in the Storage tab. Can anyone tell me how to show the RDD size? Thank you

Re: How to show RDD size

2014-10-20 Thread marylucy
Remember that certain operations in Spark are lazy, and caching is one of them. Nick. On Mon, Oct 20, 2014 at 9:19 AM, marylucy qaz163wsx_...@hotmail.com wrote: In spark-shell I do the following: val input = sc.textFile("hdfs://192.168.1.10/people/testinput/"); input.cache(). In the web UI I
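
A minimal sketch of that point, reusing the path from the question: cache() only marks the RDD, and an action has to run before anything shows up under the Storage tab.

    val input = sc.textFile("hdfs://192.168.1.10/people/testinput/")
    input.cache()   // only marks the RDD for caching; nothing is stored yet
    input.count()   // an action forces evaluation, so the blocks actually get cached

After the action completes, the RDD and its in-memory size should appear in the Storage tab of the web UI.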

why fetch failed

2014-10-18 Thread marylucy
When doing a groupBy on big data, maybe 500 GB, some partition tasks succeed and some fail with a fetch failed error. Spark retries the previous stage but always fails. 6 computers with 384 GB each; workers: 40 GB * 7 per computer. Can anyone tell me why the fetch failed?
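
Not an answer from this thread, but a commonly suggested workaround for shuffle-heavy jobs: if the goal is a per-key aggregate, reduceByKey combines values on the map side and ships far less data than groupByKey, which lowers the shuffle pressure that tends to trigger fetch failures. A sketch, assuming hypothetical tab-separated (key, value) records and an illustrative partition count:

    // Path and record format are assumptions for illustration only.
    val records = sc.textFile("hdfs://namenode/path/to/input").map { line =>
      val fields = line.split("\t")
      (fields(0), fields(1).toLong)
    }

    // reduceByKey aggregates map-side before the shuffle, unlike groupByKey,
    // which shuffles every value; 1000 partitions keeps each shuffle block smaller.
    val sums = records.reduceByKey(_ + _, 1000)
    sums.count()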

spark 1.1.0 requested array size exceeds VM limits

2014-09-05 Thread marylucy
I am building spark-1.0-rc4 with Maven, following http://spark.apache.org/docs/latest/building-with-maven.html. But when running the GraphX edgeListFile loader, some tasks fail with "requested array size exceeds VM limits" and "executor lost" errors. Does anyone know how to fix it?

Re: spark 1.1.0 requested array size exceeds VM limits

2014-09-05 Thread marylucy
I set 200, but it still fails in the second step (map and mapPartitions in the web UI). In the Spark 1.0.2 stable version it works well in the first step, with the same configuration as 1.1.0.
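
For readers hitting the same error: "Requested array size exceeds VM limit" means a single allocation (often one partition's buffer) grew past the JVM's maximum array size, so the usual advice is to load the edge list with more, smaller partitions. A sketch of how the partition count is passed to GraphLoader.edgeListFile; the path is a placeholder, and 200 mirrors the value mentioned above:

    import org.apache.spark.graphx.GraphLoader

    // More, smaller edge partitions keep each partition's buffers under the JVM array cap.
    val graph = GraphLoader.edgeListFile(sc, "hdfs://namenode/path/to/edges.txt",
      minEdgePartitions = 200)
    println(graph.edges.count())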

Re: how to filter value in spark

2014-08-29 Thread marylucy
I see it works well, thank you!!! But in the following situation, how do I do it?
var a = sc.textFile("/sparktest/1/").map((_, "a"))
var b = sc.textFile("/sparktest/2/").map((_, "b"))
How do I get (3,a) and (4,a)? On Aug 28, 2014, at 19:54, Matthew Farrellee m...@redhat.com wrote: On 08/28/2014 07:20 AM, marylucy wrote
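
One way to get the keys common to both files while keeping the value from a is a join; calling b.lookup inside a transformation of a does not work because RDD operations cannot be nested. A sketch using the same paths, with the string literals quoted as assumed:

    val a = sc.textFile("/sparktest/1/").map((_, "a"))
    val b = sc.textFile("/sparktest/2/").map((_, "b"))

    // join keeps only keys that appear in both RDDs; keep the value that came from a.
    val common = a.join(b).map { case (k, (va, _)) => (k, va) }
    common.foreach(println)   // prints (3,a) and (4,a), in some order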

how to filter value in spark

2014-08-28 Thread marylucy
fileA = 1 2 3 4, one number per line, saved in /sparktest/1/. fileB = 3 4 5 6, one number per line, saved in /sparktest/2/. I want to get 3 and 4.
var a = sc.textFile("/sparktest/1/").map((_, 1))
var b = sc.textFile("/sparktest/2/").map((_, 1))
a.filter(param => b.lookup(param._1).length > 0).map(_._1).foreach(println)
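
The filter above fails as written because b.lookup runs inside a closure executing on the workers, and one RDD cannot be used inside an operation on another. A join-based sketch that yields just the common keys, under the same assumption that paths and literals are quoted:

    val a = sc.textFile("/sparktest/1/").map((_, 1))
    val b = sc.textFile("/sparktest/2/").map((_, 1))

    // Keys present in both inputs; distinct guards against duplicate lines in either file.
    a.join(b).map(_._1).distinct().foreach(println)   // prints 3 and 4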