Has anyone experienced this before? Any help would be appreciated.
Hey all
I am doing a groupBy on nearly 2TB of data and I am getting this error:
2014-11-13 00:25:30 ERROR org.apache.spark.MapOutputTrackerMasterActor - Map output statuses were 32163619 bytes which exceeds spark.akka.frameSize (10485760 bytes).
org.apache.spark.SparkException: Map output
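
One workaround that might help: on Spark 1.x, spark.akka.frameSize is set in MB (default 10), so raising it above the ~31 MB of statuses reported in the error should clear this. A rough sketch; the app name is just a placeholder:

    import org.apache.spark.{SparkConf, SparkContext}

    // Raise the Akka frame size above the ~31 MB of map output statuses
    // reported in the error. The value is in MB on Spark 1.x; 10 is the default.
    val conf = new SparkConf()
      .setAppName("GroupByJob") // hypothetical name
      .set("spark.akka.frameSize", "64")
    val sc = new SparkContext(conf)

Cutting the number of partitions on either side of the shuffle should also shrink the statuses, since their size grows with the number of map and reduce tasks.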
I would like to know a way to stop those $_folder$ files from being added to S3 as well. I can go ahead and delete them, but it would be nice if Spark handled this for you.
I had a similar problem writing to Cassandra using the Cassandra connector. I am not sure whether this will work for you, but I reduced the number of cores to 1 per machine and my job was stable. More explanation of my issue...
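
For reference, here is roughly how that cap could be expressed on a standalone cluster; the property name is from Spark 1.x, and whether it fits your deployment is an assumption:

    import org.apache.spark.SparkConf

    // One core per machine keeps concurrent Cassandra writes low. On a
    // standalone cluster with default spread-out scheduling, capping the
    // application at one core per worker machine has that effect; setting
    // SPARK_WORKER_CORES=1 on each worker is an alternative.
    val conf = new SparkConf()
      .setAppName("CassandraWriter") // hypothetical name
      .set("spark.cores.max", "5")   // e.g. 5 machines x 1 core each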
Hey all
I tried the Spark connector with Cassandra and ran into a problem that I was blocked on for a couple of weeks. I managed to find a solution to the problem, but I am not sure whether it was a bug in the connector/Spark or not.
I had three tables in Cassandra (running Cassandra on a 5-node
Hi there
What is an optimal cluster setup for Spark? Given X amount of resources, would you favour more worker nodes with fewer resources each, or fewer worker nodes with more resources? Is this application-dependent? If so, what are the things to consider, and what are good practices?
Cheers
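
For concreteness, the two shapes in the question could look like this in Spark configuration terms (YARN property names; all numbers are made up for illustration):

    import org.apache.spark.SparkConf

    // Same 160 cores / 640g total either way; which performs better depends
    // on the workload (GC pauses, shuffle traffic, per-task memory needs).
    val manySmall = new SparkConf()
      .set("spark.executor.instances", "40")
      .set("spark.executor.cores", "4")
      .set("spark.executor.memory", "16g")

    val fewLarge = new SparkConf()
      .set("spark.executor.instances", "10")
      .set("spark.executor.cores", "16")
      .set("spark.executor.memory", "64g")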