Understanding disk usage with Accumulators

2014-12-16 Thread Ganelin, Ilya
Hi all – I’m running a long-running batch-processing job with Spark through YARN. I am doing the following:

Batch Process
val resultsArr = sc.accumulableCollection(mutable.ArrayBuffer[ListenableFuture[Result]]())
InMemoryArray.forEach{
1) Using a thread pool, generate callable jobs that

Re: Understanding disk usage with Accumulators

2014-12-16 Thread Ganelin, Ilya
...@capitalone.com
Date: Tuesday, December 16, 2014 at 10:23 AM
To: user@spark.apache.org
Subject: Understanding disk usage with Accumulators

Hi all – I’m running a long-running batch-processing job with Spark through YARN. I am