Hello all. Does anyone else have any suggestions? Even understanding where
this error comes from would help a lot.
On Oct 11, 2014 12:56 AM, Ilya Ganelin ilgan...@gmail.com wrote:
Hi Akhil - I tried your suggestions and tried varying my partition sizes.
Reducing the number of partitions led to memory errors.
You could be hitting this issue
https://issues.apache.org/jira/browse/SPARK-3633 (or similar). You can
try the following workarounds (set on the SparkConf before creating the
SparkContext):
conf.set("spark.core.connection.ack.wait.timeout", "600")
conf.set("spark.akka.frameSize", "50")
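The same settings can also be passed at submit time rather than in code; a
sketch, where your-app.jar is only a placeholder for the actual application:

```shell
# Apply the workaround settings on the command line at submit time.
# "your-app.jar" is a placeholder, not a real artifact name.
spark-submit \
  --conf spark.core.connection.ack.wait.timeout=600 \
  --conf spark.akka.frameSize=50 \
  your-app.jar
```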
Also try reducing the number of partitions; you could be hitting the
kernel's ulimit on open file descriptors.
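One quick way to check whether the file-descriptor limit is in play (the
65536 below is only an illustrative value; shuffle-heavy jobs can open a
file per partition pair):

```shell
# Show the current soft limit on open file descriptors for this shell.
ulimit -n

# Show the hard limit (the ceiling the soft limit may be raised to).
ulimit -Hn

# To raise the soft limit for the current session (only works up to the
# hard limit; 65536 is an illustrative value):
#   ulimit -n 65536
```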
Thank you - I will try this. If I drop the partition count am I not more
likely to hit memory issues? Especially if the dataset is rather large?
On Oct 10, 2014 3:19 AM, Akhil Das ak...@sigmoidanalytics.com wrote:
You could be hitting this issue
https://issues.apache.org/jira/browse/SPARK-3633
Hi Akhil - I tried your suggestions and tried varying my partition sizes.
Reducing the number of partitions led to memory errors (presumably - I saw
IOExceptions much sooner).
With the settings you provided, the program ran for longer but ultimately
crashed in the same way. I would like to
On Oct 9, 2014 10:18 AM, Ilya Ganelin ilgan...@gmail.com wrote:
Hi all – I could use some help figuring out a couple of exceptions I’ve
been getting regularly.
I have been running on a fairly large dataset (150 gigs). With smaller
datasets I don't have any issues.
My sequence of operations is as follows – unless otherwise specified, I am
not caching:
Map a