Spark heap issues

2013-12-05 Thread learner1014 all
Hi, I am trying to do a join operation on an RDD. My input is pipe-delimited data in 2 files: one file is 24MB and the other is 285MB. The setup being used is a single-node (server) setup, with SPARK_MEM set to 512m. Master /pkg/java/jdk1.7.0_11/bin/java -cp :/spark-0.8.0-incubating-bin-cdh4/c
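For context, a minimal sketch of the kind of job described above: joining two pipe-delimited files on a shared key with the Spark 0.8 Scala API. The file paths, key position, and master URL are assumptions, not details from the original post.

  import org.apache.spark.SparkContext
  import org.apache.spark.SparkContext._  // pair-RDD implicits such as join()

  object PipeJoin {
    def main(args: Array[String]) {
      // Single-node setup from the post; 512m matches the SPARK_MEM value above.
      System.setProperty("spark.executor.memory", "512m")
      val sc = new SparkContext("local[4]", "PipeJoin")

      // Parse each pipe-delimited line into (key, rest-of-record).
      // Treating field 0 as the join key is an assumption.
      def parse(line: String) = {
        val f = line.split('|')
        (f(0), f.drop(1).mkString("|"))
      }

      val small = sc.textFile("hdfs:///data/file_24mb.txt").map(parse)
      val large = sc.textFile("hdfs:///data/file_285mb.txt").map(parse)

      // join() shuffles both sides; with only 512m of heap this is the step
      // most likely to hit the heap errors discussed in this thread.
      small.join(large).saveAsTextFile("hdfs:///data/joined")

      sc.stop()
    }
  }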

Re: Spark heap issues

2013-12-06 Thread learner1014 all
") // Will allocate more > memory > System.setProperty("spark.akka.frameSize","2000") > System.setProperty("spark.akka.threads","16") // Dependent upon > number of cores with your worker machine > > > On Fri, Dec 6, 20

Re: Spark heap issues

2013-12-06 Thread learner1014 all
Btw, the node only has 4GB of memory, so does the spark.executor.memory setting make sense? Should I instead make it around 2-3GB? Also, how different is this parameter from SPARK_MEM? Thanks, Saurabh On Fri, Dec 6, 2013 at 8:26 AM, learner1014 all wrote: > Still see a whole lot of following er
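Roughly, SPARK_MEM is an environment variable picked up by the launch scripts as a blanket default, while spark.executor.memory is a per-application setting that, if set, is what that application's executors use. A minimal sketch; the 2g value is only an illustration for a 4GB node, not a recommendation from this thread:

  import org.apache.spark.SparkContext

  // Per-application executor heap; leaves headroom on a 4GB node for the
  // OS and other daemons. Takes effect for this app regardless of SPARK_MEM.
  System.setProperty("spark.executor.memory", "2g")
  val sc = new SparkContext("spark://master:7077", "JoinWith2gExecutors")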

Constant out of memory issues

2013-12-10 Thread learner1014 all
Data is in HDFS, running 2 workers with 1 GB memory. datafile1 is ~9KB and datafile2 is ~216MB. Can't get it to run at all... Tried various settings for the number of tasks, all the way from 2 to 1024. Has anyone else seen similar issues? import org.apache.spark.SparkContext import org.apache
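For reference, a sketch of the two places where the "number of tasks" can be set in this kind of job: the minSplits argument to textFile and the partition count passed to join. The HDFS paths, field layout, and the value 64 are placeholders, not details from the post.

  import org.apache.spark.SparkContext
  import org.apache.spark.SparkContext._

  val sc = new SparkContext("spark://master:7077", "HdfsJoin")

  // minSplits controls how many map tasks read each input file.
  val a = sc.textFile("hdfs:///data/datafile1", 64).map { line =>
    val f = line.split('|'); (f(0), f.drop(1).mkString("|"))
  }
  val b = sc.textFile("hdfs:///data/datafile2", 64).map { line =>
    val f = line.split('|'); (f(0), f.drop(1).mkString("|"))
  }

  // The second argument sets the number of reduce tasks for the shuffle;
  // more partitions mean smaller per-task state during the join.
  a.join(b, 64).saveAsTextFile("hdfs:///out/joined")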

Re: Constant out of memory issues

2013-12-11 Thread learner1014 all
needs more than 1GB of heap space to function correctly. What happens if you give the workers more memory? > - Patrick > On Tue, Dec 10, 2013 at 2:42 PM, learner1014 all wrote: > > Data is in hdfs, running 2 workers with 1 GB memory > > datafile1 is ~9
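Following the suggestion to give the workers more memory, the two relevant knobs in a standalone deployment would be the worker's total memory and the per-application executor heap. The values and master URL below are illustrative only.

  // conf/spark-env.sh on each worker node (illustrative value):
  //   export SPARK_WORKER_MEMORY=3g   # total memory a worker may hand to executors
  //
  // and in the application, before the SparkContext is created:
  import org.apache.spark.SparkContext

  System.setProperty("spark.executor.memory", "2g")
  val sc = new SparkContext("spark://master:7077", "JoinWithMoreMemory")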