Are you calling .cache() after the sc.textFile? .cache() uses the MEMORY_ONLY storage level, so partitions that don't fit are dropped. If you persist with StorageLevel.MEMORY_AND_DISK instead, the overflow spills to disk and you avoid that warning.
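A minimal sketch of that change, assuming the Spark 1.x RDD API from the thread ("foo.gz" is the file name from the original question):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.storage.StorageLevel

// Instead of .cache() (which is MEMORY_ONLY), persist with MEMORY_AND_DISK
// so partitions that don't fit in memory spill to disk rather than being
// dropped and recomputed.
val sc = SparkContext.getOrCreate()
val rdd = sc.textFile("foo.gz")
  .persist(StorageLevel.MEMORY_AND_DISK)

// The RDD is materialized and persisted on the first action.
println(rdd.count())
```

Note also that a gzip file is not splittable, so sc.textFile("foo.gz") produces a single partition; that is why the whole 216 MB block is being cached by one thread. Calling .repartition(n) after reading can spread the data out before persisting.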
Thanks
Best Regards

On Thu, Sep 3, 2015 at 10:11 AM, Spark Enthusiast <sparkenthusi...@yahoo.in> wrote:
> Folks,
>
> I have an input file which is gzipped. I use sc.textFile("foo.gz") and I
> see the following problem. Can someone help me fix this?
>
> 15/09/03 10:05:32 INFO deprecation: mapred.job.id is deprecated. Instead,
> use mapreduce.job.id
> 15/09/03 10:05:32 INFO CodecPool: Got brand-new decompressor [.gz]
> 15/09/03 10:06:15 WARN MemoryStore: Not enough space to cache rdd_2_0 in
> memory! (computed 216.3 MB so far)
> 15/09/03 10:06:15 INFO MemoryStore: Memory use = 156.2 KB (blocks) + 213.1
> MB (scratch space shared across 1 thread(s)) = 213.3 MB. Storage limit =
> 265.1 MB.