Okay, I upped SPARK_MEM to 2048m for the workers but left the master at 512m.
I am still seeing similar issues.
Here is what I see in the stdout logs:

792.187: [Full GC 792.187: [CMS: 963391K->963391K(963392K), 2.5459950 secs]
1039003K->1025065K(1040064K), [CMS Perm : 46331K->46331K(77368K)],
2.5460730 secs] [Times: user=2.54 sys=0.00, real=2.55 secs]
794.735: [GC [1 CMS-initial-mark: 963391K(963392K)] 1027917K(1040064K),
0.0553670 secs] [Times: user=0.06 sys=0.00, real=0.05 secs]
794.791: [CMS-concurrent-mark-start]
794.799: [Full GC 794.799: [CMS796.060: [CMS-concurrent-mark: 1.269/1.269
secs] [Times: user=1.27 sys=0.00, real=1.27 secs]
 (concurrent mode failure): 963391K->963373K(963392K), 3.8255090 secs]
1040063K->1025223K(1040064K), [CMS Perm : 46331K->46065K(77368K)],
3.8255710 secs] [Times: user=3.80 sys=0.00, real=3.83 secs]
798.642: [Full GC 798.642: [CMS: 963373K->963391K(963392K), 2.5939450 secs]
1039469K->1028596K(1040064K), [CMS Perm : 46066K->46066K(77368K)],
2.5940410 secs] [Times: user=2.59 sys=0.00, real=2.60 secs]
801.236: [GC [1 CMS-initial-mark: 963391K(963392K)] 1030753K(1040064K),
0.0603550 secs] [Times: user=0.06 sys=0.00, real=0.06 secs]
801.297: [CMS-concurrent-mark-start]
801.310: [Full GC 801.310: [CMS802.543: [CMS-concurrent-mark: 1.244/1.246
secs] [Times: user=1.26 sys=0.00, real=1.24 secs]
 (concurrent mode failure): 963391K->963391K(963392K), 3.8241010 secs]
1039208K->1030163K(1040064K), [CMS Perm : 46066K->46066K(77368K)],
3.8241680 secs] [Times: user=3.80 sys=0.00, real=3.83 secs]
805.143: [Full GC 805.143: [CMS: 963391K->963391K(963392K), 2.6232410 secs]
1038565K->1033504K(1040064K), [CMS Perm : 46066K->46066K(77368K)],
2.6233060 secs] [Times: user=2.61 sys=0.00, real=2.63 secs]
807.767: [GC [1 CMS-initial-mark: 963391K(963392K)] 1035552K(1040064K),
0.0616310 secs] [Times: user=0.06 sys=0.00, real=0.06 secs]
807.829: [CMS-concurrent-mark-start]
807.833: [Full GC 807.833: [CMS809.079: [CMS-concurrent-mark: 1.250/1.250
secs] [Times: user=1.25 sys=0.00, real=1.25 secs]
 (concurrent mode failure): 963391K->963392K(963392K), 3.8759250 secs]
1040063K->1035642K(1040064K), [CMS Perm : 46066K->46066K(77368K)],
3.8759870 secs] [Times: user=3.86 sys=0.00, real=3.88 secs]

Any pointers would be helpful.
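For what it's worth, the Full GC lines above show the CMS old generation at capacity (e.g. `963391K->963391K(963392K)`) with essentially nothing reclaimed, which is what triggers the repeated concurrent mode failures: the live data no longer fits in the heap. A rough sketch (a hypothetical helper, not part of Spark) for pulling those before/after/capacity figures out of a HotSpot CMS log line:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extract before/after/capacity (in K) from a HotSpot
// CMS Full GC fragment such as "[CMS: 963391K->963391K(963392K), ...]".
public class GcLineCheck {
    private static final Pattern CMS =
        Pattern.compile("CMS: (\\d+)K->(\\d+)K\\((\\d+)K\\)");

    public static long[] parse(String line) {
        Matcher m = CMS.matcher(line);
        if (!m.find()) throw new IllegalArgumentException("no CMS figures: " + line);
        return new long[] { Long.parseLong(m.group(1)),   // occupied before GC
                            Long.parseLong(m.group(2)),   // occupied after GC
                            Long.parseLong(m.group(3)) }; // old gen capacity
    }

    public static void main(String[] args) {
        long[] f = parse("792.187: [Full GC 792.187: [CMS: 963391K->963391K(963392K), 2.5459950 secs]");
        long reclaimedK = f[0] - f[1];
        double fullness = (double) f[1] / f[2];
        System.out.println("reclaimed=" + reclaimedK + "K, old gen "
            + Math.round(fullness * 100) + "% full");
        // An old gen that stays ~100% full across consecutive Full GCs means
        // the live set does not fit: the workers need more heap, or the job
        // needs to hold less data per worker.
    }
}
```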


On Tue, Dec 10, 2013 at 6:08 PM, Patrick Wendell <pwend...@gmail.com> wrote:

> Spark probably needs more than 1GB of heap space to function
> correctly. What happens if you give the workers more memory?
>
> - Patrick
>
> On Tue, Dec 10, 2013 at 2:42 PM, learner1014 all <learner1...@gmail.com>
> wrote:
> >
> > Data is in hdfs; running 2 workers with 1 GB of memory each.
> > datafile1 is ~9KB and datafile2 is ~216MB. Can't get it to run at all.
> > Tried various settings for the number of tasks, all the way from 2 to
> > 1024.
> > Has anyone else seen similar issues?
> >
> >
> > import org.apache.spark.SparkContext
> > import org.apache.spark.SparkContext._
> > import org.apache.spark.storage.StorageLevel
> >
> > object SimpleApp {
> >
> >       def main (args: Array[String]) {
> >
> >
> >
> > System.setProperty("spark.local.dir","/spark-0.8.0-incubating-bin-cdh4/tmp");
> >       System.setProperty("spark.serializer",
> > "org.apache.spark.serializer.KryoSerializer")
> >       System.setProperty("spark.akka.timeout", "30")  //in seconds
> >
> >       System.setProperty("spark.executor.memory","1024m")
> >       System.setProperty("spark.akka.frameSize", "2000")  //in MB
> >       System.setProperty("spark.akka.threads","8")
> >
> >       val dataFile1 = "hdfs://dev01:8020/user/sa/datafile1"
> >       val dataFile2 = "hdfs://dev01:8020/user/sa/datafile2"
> >       val sc = new SparkContext("spark://dev01:7077", "Simple App",
> > "/spark-0.8.0-incubating-bin-cdh4",
> >       List("target/scala-2.9.3/simple-project_2.9.3-1.0.jar"))
> >
> >       val data10 = sc.textFile(dataFile1, 1024)
> >       val data11 = data10.map(x => x.split('|'))  // split("|") is a regex and splits every char
> >       val data12 = data11.map( x  =>  (x(1).toInt -> x) )
> >
> >       val data20 = sc.textFile(dataFile2, 1024)
> >       val data21 = data20.map(x => x.split('|'))  // split("|") is a regex and splits every char
> >       val data22 = data21.map(x => (x(1).toInt -> x))
> >
> >       val data3 = data12.join(data22, 1024)
> >       //val data4 = data3.distinct(4)
> >       //val numAs = data10.count()
> >       //val numBs = data20.count()
> >       //val numCs = data3.count()
> >       //val numDs = data4.count()
> > println("Total records after join is %s".format( data3.count()))
> >
> > Thanks,
>
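One thing worth flagging in the quoted code: `String.split` (the Java method that Scala's `split("|")` ultimately calls) treats its argument as a regular expression, and `|` is the alternation metacharacter, so `split("|")` splits between every character instead of on the pipe. That inflates the per-record arrays and feeds bogus keys into the join. Using Scala's `Char` overload `split('|')`, or escaping as `split("\\|")`, gives the intended fields. A minimal Java sketch of the difference:

```java
public class SplitPipe {
    public static void main(String[] args) {
        String line = "101|alice|42";

        // "|" is a regex alternation operator that matches the empty string,
        // so this splits between every single character.
        String[] wrong = line.split("|");
        System.out.println("unescaped: " + wrong.length + " fields");

        // Escaping the pipe makes it a literal delimiter.
        String[] right = line.split("\\|");
        System.out.println("escaped:   " + right.length + " fields");
        // right is ["101", "alice", "42"]; right[1] is the intended key field.
    }
}
```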
