On Mon, May 12, 2014 at 12:14 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
That API is something the HDFS administrator uses outside of any application
to tell HDFS to cache certain files or directories. But once you’ve done
that, any existing HDFS client accesses them directly from the cache.
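For reference, the administrator-side workflow Matei describes uses the `hdfs cacheadmin` CLI that ships with HDFS 2.3's centralized cache management. A minimal sketch follows; the pool name and path are illustrative, and it assumes a running HDFS 2.3+ cluster:

```shell
# Illustrative pool/path names; run as the HDFS administrator.
# Centralized caching also requires dfs.datanode.max.locked.memory
# to be set in hdfs-site.xml so DataNodes can pin blocks in memory.

# 1. Create a cache pool to group cache directives.
hdfs cacheadmin -addPool hot-data

# 2. Tell HDFS to cache a directory (or file) into that pool.
hdfs cacheadmin -addDirective -path /data/hot -pool hot-data -replication 1

# 3. Inspect what is cached and how many bytes are pinned.
hdfs cacheadmin -listDirectives -stats
```

Once a directive is in place, any HDFS client, Spark included, reads the cached replicas without application changes.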
Great to know that! Thank you, Matei.
Best regards,
-chanwit
--
Chanwit Kaewkasi
linkedin.com/in/chanwit
Yes, Spark goes through the standard HDFS client and will automatically benefit
from this.
Matei
On May 8, 2014, at 4:43 AM, Chanwit Kaewkasi chan...@gmail.com wrote:
Hi all,
Can Spark (0.9.x) utilize the caching feature in HDFS 2.3 via
sc.textFile() and other HDFS-related APIs?
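On the Spark side nothing special is needed: reading a cached path is the ordinary `sc.textFile()` call, since Spark goes through the standard HDFS client. A hypothetical sketch, assuming Spark 0.9 and an HDFS path that an administrator has already cached:

```shell
# Illustrative path; assumes an HDFS 2.3 cluster where an administrator
# has already added a cache directive covering /data/hot.
spark-shell <<'EOF'
// Plain sc.textFile — the standard HDFS client serves cached blocks transparently.
val lines = sc.textFile("hdfs:///data/hot/events.log")
println(lines.count())
EOF
```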
Is that true? I believe the API Chanwit is talking about requires
explicitly asking for files to be cached in HDFS.
Spark automatically benefits from the kernel's page cache (i.e. if
some block is in the kernel's page cache, it will be read more
quickly). But the explicit HDFS cache is a different mechanism, which
requires files to be cached explicitly.