RE: Shuffle files

2014-10-07 Thread Lisonbee, Todd
Are you sure the new ulimit has taken effect? How many cores are you using? How many reducers? In general, if a node in your cluster has C assigned cores and you run a job with X reducers, then Spark will open C*X files in parallel and start writing. Shuffle
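
As a rough back-of-the-envelope check of the C*X rule of thumb above, the sketch below multiplies cores by reducers and compares the result with an open-file limit. The function names and the example numbers are illustrative assumptions; they do not come from the thread or from Spark itself.

```python
def concurrent_shuffle_files(cores: int, reducers: int) -> int:
    """Estimate of files one node opens in parallel: C * X."""
    return cores * reducers


def exceeds_limit(cores: int, reducers: int, open_file_limit: int) -> bool:
    """True if the C * X estimate alone is larger than the open-file limit."""
    return concurrent_shuffle_files(cores, reducers) > open_file_limit


# Hypothetical numbers: 8 cores, 500 reducers, a limit of 1024 open files.
print(concurrent_shuffle_files(8, 500))  # 4000
print(exceeds_limit(8, 500, 1024))       # True
```

With these made-up values the estimate is well above the limit, which is the situation the questions about ulimit, cores, and reducers are probing for.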

RE: Unit test failure: Address already in use

2014-06-18 Thread Lisonbee, Todd
Disabling parallelExecution has worked for me. Other alternatives I’ve tried that also work include: 1. Using a lock – this will let tests execute in parallel except for those using a SparkContext. If you have a large number of tests that could execute in parallel, this can shave off some
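
The lock idea can be sketched in a language-neutral way. The example below is plain Python threading, not Spark or ScalaTest API: `shared_context_lock`, `context_dependent_test`, and `run_in_parallel` are hypothetical names, and the locked section only stands in for a test that creates and stops a SparkContext.

```python
import threading

# Hypothetical lock shared by every test that needs the single shared
# resource (standing in for a SparkContext in this sketch).
shared_context_lock = threading.Lock()

active = 0       # tests currently inside the locked section
max_active = 0   # highest value of `active` observed so far
counter_guard = threading.Lock()  # protects the two counters above


def context_dependent_test() -> None:
    """Stand-in test body; only one thread may hold the shared lock at a time."""
    global active, max_active
    with shared_context_lock:
        with counter_guard:
            active += 1
            max_active = max(max_active, active)
        # A real test would create, use, and stop its context here.
        with counter_guard:
            active -= 1


def run_in_parallel(n: int) -> int:
    """Run n stand-in tests on separate threads and return the peak overlap."""
    threads = [threading.Thread(target=context_dependent_test) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_active


print(run_in_parallel(8))  # 1
```

Tests that never touch the lock are free to run concurrently, while those that take it are serialized, so the peak overlap inside the locked section stays at one.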