Re: Spark 2.x OFF_HEAP persistence

2017-01-09 Thread Gene Pang
Yes, as far as I can tell, your description is accurate. Thanks, Gene

Re: Spark 2.x OFF_HEAP persistence

2017-01-04 Thread Vin J
Thanks for the reply, Gene. Looks like this means that, with Spark 2.x, one has to change from rdd.persist(StorageLevel.OFF_HEAP) to rdd.saveAsTextFile(alluxioPath) / rdd.saveAsObjectFile(alluxioPath) for guarantees like a persisted RDD surviving a Spark JVM crash, as well as the other benefits you described.
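
A minimal sketch of that change in Scala against Spark 2.x, assuming the Alluxio client jar is on the classpath; the Alluxio master address and path are hypothetical placeholders:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("alluxio-persist-sketch").getOrCreate()
    val sc = spark.sparkContext
    val rdd = sc.parallelize(1 to 1000)

    // Spark 1.x style: rdd.persist(StorageLevel.OFF_HEAP) pushed blocks to an
    // external store. In 2.x, write to Alluxio explicitly instead, so the data
    // survives a Spark JVM crash.
    val alluxioPath = "alluxio://alluxio-master:19998/tmp/my-rdd"  // hypothetical URI
    rdd.saveAsObjectFile(alluxioPath)

    // Later, possibly from a fresh SparkContext, read the data back:
    val restored = sc.objectFile[Int](alluxioPath)
    restored.count()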

Re: Spark 2.x OFF_HEAP persistence

2017-01-04 Thread Gene Pang
Hi Vin, From Spark 2.x, OFF_HEAP was changed to no longer directly interface with an external block store. The previous tight dependency was restrictive and reduced flexibility. It looks like the new version uses the executor's off-heap memory to allocate direct byte buffers, and does not interact with any external block store.
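
A minimal sketch of what that looks like in Spark 2.x (Scala); spark.memory.offHeap.enabled and spark.memory.offHeap.size are the standard settings that size the direct-buffer pool, and the 2g figure is just an example:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.storage.StorageLevel

    val spark = SparkSession.builder()
      .appName("offheap-sketch")
      // Let executors allocate direct byte buffers outside the JVM heap.
      .config("spark.memory.offHeap.enabled", "true")
      .config("spark.memory.offHeap.size", "2g")
      .getOrCreate()

    val rdd = spark.sparkContext.parallelize(1 to 1000000)

    // Blocks now live in executor off-heap memory, not an external store,
    // so they do not outlive an executor JVM crash.
    rdd.persist(StorageLevel.OFF_HEAP)
    rdd.count()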

Spark 2.x OFF_HEAP persistence

2017-01-04 Thread Vin J
Until Spark 1.6, I see there were specific properties to configure, such as the external block store master URL (spark.externalBlockStore.url), to use the OFF_HEAP storage level, which made it clear that an external Tachyon-type block store was required/used for OFF_HEAP storage. Can someone clarify how OFF_HEAP storage works in Spark 2.x?
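
For reference, a sketch of the Spark 1.6-era setup being described; spark.externalBlockStore.url appears above, while spark.externalBlockStore.baseDir and the Tachyon master address are recalled/assumed details:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    // Spark 1.6: OFF_HEAP delegated to an external block store (Tachyon).
    val conf = new SparkConf()
      .setAppName("offheap-1.6-sketch")
      .set("spark.externalBlockStore.url", "tachyon://tachyon-master:19998") // hypothetical master
      .set("spark.externalBlockStore.baseDir", "/spark") // companion setting, as I recall it

    val sc = new SparkContext(conf)
    sc.parallelize(1 to 1000).persist(StorageLevel.OFF_HEAP)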