Hi,
My team is using Spark 1.0.1, and the project we're working on needs to
compute exact numbers, which are then saved to S3 to be reused later in
other Spark jobs to compute other numbers. The problem we noticed yesterday:
one of the output partition files in S3 was missing :/ (part-00218).
Hi guys,
My current project is using Spark 0.9.1, and after increasing the level of
parallelism and the number of partitions in our RDDs, stages and tasks seem
to complete much faster. However, it also seems that our cluster becomes
more "unstable" after some time:
- stalled stages still showing under "active stages"
Hi,
One of the executors in my Spark cluster shows "CANNOT FIND ADDRESS" as its
address for one of the stages, which failed. After that stage, I got
cascading failures across all my stages :/ (stages that seem complete but
still appear as active stages in the dashboard; incomplete or failed stages that
ar
Thanks, this is what I needed :) I should have searched more...
Something I noticed, though: after the SparkContext is initialized, I had to
wait a few seconds until sc.getExecutorStorageStatus.length returned the
correct number of workers in my cluster (before that it returned 1, counting
only the driver).
Hi,
Is there a way to get the number of slaves/workers during runtime?
I searched online but didn't find anything :/ The application I'm working on
will run on different clusters corresponding to different deployment stages
(beta -> prod). It would be great to get the number of slaves currently in
use