Re: Strange Error: java.lang.OutOfMemoryError: GC overhead limit exceeded

2015-07-15 Thread Saeed Shahrivari
On Wed, Jul 15, 2015 at 8:06 AM, Saeed Shahrivari saeed.shahriv...@gmail.com wrote: I use a simple map/reduce step in a Java/Spark program to remove duplicate documents from a large (10 TB compressed) sequence file containing some HTML pages. Here is the partial code: JavaPairRDD&lt;BytesWritable

Strange Error: java.lang.OutOfMemoryError: GC overhead limit exceeded

2015-07-15 Thread Saeed Shahrivari
I use a simple map/reduce step in a Java/Spark program to remove duplicate documents from a large (10 TB compressed) sequence file containing some HTML pages. Here is the partial code: JavaPairRDD&lt;BytesWritable, NullWritable&gt; inputRecords = sc.sequenceFile(args[0], BytesWritable.class,
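The job described above deduplicates pages by keying each record on its content and keeping one record per key. A minimal single-machine sketch of that logic in plain Java (the SHA-256 content key and sample pages are illustrative assumptions, not code from the thread; the actual Spark job would express the same idea with mapToPair plus reduceByKey over the sequence file):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DedupSketch {
    // Hash each page's bytes so identical contents map to the same key,
    // mirroring the mapToPair step of the Spark job.
    static String contentKey(byte[] page) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(page);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Keep the first record seen per key, mirroring reduceByKey picking
    // one representative of each duplicate group.
    static List<byte[]> dedup(List<byte[]> pages) {
        Map<String, byte[]> byKey = new LinkedHashMap<>();
        for (byte[] page : pages) {
            byKey.putIfAbsent(contentKey(page), page);
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<byte[]> pages = Arrays.asList(
            "<html>a</html>".getBytes(StandardCharsets.UTF_8),
            "<html>b</html>".getBytes(StandardCharsets.UTF_8),
            "<html>a</html>".getBytes(StandardCharsets.UTF_8));
        System.out.println(dedup(pages).size()); // prints 2
    }
}
```

On a 10 TB input the hash key matters: shuffling full page bodies as keys would be far more expensive than shuffling fixed-size digests.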

Re: spark.akka.frameSize limit error

2015-01-05 Thread Saeed Shahrivari
up with a proper fix. In the meantime, I recommend that you increase your Akka frame size. On Sat, Jan 3, 2015 at 8:51 PM, Saeed Shahrivari saeed.shahriv...@gmail.com wrote: I use the 1.2 version. On Sun, Jan 4, 2015 at 3:01 AM, Josh Rosen rosenvi...@gmail.com wrote: Which version
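The workaround recommended above is to raise Akka's maximum message size, which in Spark 1.x is controlled by the spark.akka.frameSize setting (value in MB; the default was 10). A configuration sketch (the 128 MB value and app name are illustrative choices, not from the thread):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Raise the Akka frame size before creating the context; the setting
// cannot be changed on a running SparkContext.
SparkConf conf = new SparkConf()
    .setAppName("char-frequency")
    .set("spark.akka.frameSize", "128"); // MB
JavaSparkContext sc = new JavaSparkContext(conf);
```

The same setting can be passed at submit time with --conf spark.akka.frameSize=128 instead of hard-coding it.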

Re: spark.akka.frameSize limit error

2015-01-03 Thread Saeed Shahrivari
the Akka frame size (via the spark.akka.frameSize configuration option). On Sat, Jan 3, 2015 at 10:40 AM, Saeed Shahrivari saeed.shahriv...@gmail.com wrote: Hi, I am trying to get the frequency of each Unicode char in a document collection using Spark. Here is the code snippet that does
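The question in this thread is how to count the frequency of each Unicode character across a document collection; in Spark that is naturally a flatMap over characters followed by reduceByKey. A minimal single-machine sketch of the counting step in plain Java (the sample text is illustrative, not from the thread):

```java
import java.util.HashMap;
import java.util.Map;

public class CharFrequency {
    // Count each Unicode code point, mirroring flatMap + reduceByKey:
    // every code point becomes a (codePoint, 1) pair and counts are summed.
    // Iterating code points (not chars) keeps surrogate pairs intact.
    static Map<Integer, Long> countCodePoints(String text) {
        Map<Integer, Long> counts = new HashMap<>();
        text.codePoints().forEach(cp -> counts.merge(cp, 1L, Long::sum));
        return counts;
    }

    public static void main(String[] args) {
        Map<Integer, Long> counts = countCodePoints("héllo");
        System.out.println(counts.get((int) 'l')); // prints 2
    }
}
```

If the per-character histogram of a huge collection is collected back to the driver in one message, its serialized size can exceed the Akka frame limit, which is how this thread connects to the frame-size fix above.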