Have you read this thread?

http://search-hadoop.com/m/uOzYttXZcg1M6oKf2/HDFS+cache&subj=RE+hadoop+hdfs+cache+question+do+client+processes+share+cache+
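
In particular, before measuring Spark it's worth confirming that the
directive is actually pulling blocks into memory. A quick check, using the
pool name from your commands below (a sketch using the standard cacheadmin
flags):

  # Compare BYTES_NEEDED vs BYTES_CACHED; if BYTES_CACHED stays at 0,
  # the DataNodes are not locking any blocks into memory yet.
  hdfs cacheadmin -listDirectives -stats -pool hibench
  hdfs cacheadmin -listPools -stats hibench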

Cheers

On Mon, Jan 25, 2016 at 1:23 PM, Jia Zou <jacqueline...@gmail.com> wrote:

> I configured HDFS to cache a file in its cache, as follows:
>
> hdfs cacheadmin -addPool hibench
>
> hdfs cacheadmin -addDirective -path /HiBench/Kmeans/Input -pool hibench
>
>
> But I didn't see much of a performance impact, no matter how I configured
> dfs.datanode.max.locked.memory.
>
>
> Is it possible that Spark doesn't know the data is in the HDFS cache, and
> still reads it from disk instead?
>
>
> Thanks!
>
> Jia
>
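
One more note on dfs.datanode.max.locked.memory above: the DataNode can only
cache up to its memlock (RLIMIT_MEMLOCK) ulimit, and it will refuse to start
if the configured value exceeds that limit, so a too-small memlock limit is
a common reason nothing ends up cached. A rough sanity check on each
DataNode host (run as the user that launches the DataNode):

  # The configured value is in bytes...
  hdfs getconf -confKey dfs.datanode.max.locked.memory
  # ...while ulimit -l reports the memlock limit, in KB on Linux.
  ulimit -l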
