[ https://issues.apache.org/jira/browse/HADOOP-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484724#comment-13484724 ]
Weiming Shi commented on HADOOP-8705:
-------------------------------------

Can we leverage the work of Guava?

> Add JSR 107 Caching support
> ----------------------------
>
> Key: HADOOP-8705
> URL: https://issues.apache.org/jira/browse/HADOOP-8705
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Dhruv Kumar
>
> Having a cache on mappers and reducers could be very useful for some use
> cases, including but not limited to:
> 1. Iterative MapReduce programs: Some machine learning algorithms frequently
> need access to invariant data (see Mahout) over each iteration of MapReduce
> until convergence. A cache on such nodes could allow easy access to the
> hot set of data without going all the way to the distributed cache. This
> optimization has been described by Jimmy Lin et al. in the paper
> "Low-Latency, High-Throughput Access to Static Global Resources within the
> Hadoop Framework" (http://hcil2.cs.umd.edu/trs/2009-01/2009-01.pdf)
> 2. Storing intermediate map outputs in memory to reduce shuffling time.
> This optimization has been discussed at length in HaLoop
> (http://www.ics.uci.edu/~yingyib/papers/HaLoop_camera_ready.pdf), and by
> Shubin Zhang in "Accelerating MapReduce with Distributed Memory Cache"
> presented at ICPADS 2009.
> There are other scenarios as well where having a cache could come in
> handy.
> JSR 107 aims to standardize caching interfaces for Java applications, and
> popular caching solutions such as Ehcache and Memcached have JSR 107 wrappers.
> Hence, it would be nice to have some sort of pluggable support for JSR 107
> compliant caches on Hadoop.
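
To make the "pluggable support" idea in the description concrete, here is a minimal Java sketch of a task-local cache behind a small interface. It is an illustration under stated assumptions, not a proposed patch: it uses neither the real JSR 107 (javax.cache) API nor Guava, and the names TaskCache, LruTaskCache, CacheSketch and the capacity parameter are hypothetical. A bounded LRU map stands in for whatever provider would actually be plugged in.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical minimal cache contract. A real integration would more likely
// target the JSR 107 interfaces than define its own.
interface TaskCache<K, V> {
    V get(K key);
    void put(K key, V value);
    int size();
}

// One pluggable implementation: a bounded LRU cache built on LinkedHashMap.
class LruTaskCache<K, V> implements TaskCache<K, V> {
    private final Map<K, V> entries;

    LruTaskCache(final int capacity) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once capacity is exceeded.
                return size() > capacity;
            }
        };
    }

    @Override
    public synchronized V get(K key) {
        return entries.get(key);
    }

    @Override
    public synchronized void put(K key, V value) {
        entries.put(key, value);
    }

    @Override
    public synchronized int size() {
        return entries.size();
    }
}

class CacheSketch {
    public static void main(String[] args) {
        // Invariant data reused across iterations (illustrative keys and values).
        TaskCache<String, double[]> cache = new LruTaskCache<>(2);
        cache.put("centroids", new double[] {0.5, 1.5});
        cache.put("weights", new double[] {1.0});
        cache.get("centroids");                 // touch: "weights" becomes eldest
        cache.put("bias", new double[] {0.1});  // exceeds capacity, evicts "weights"
        System.out.println(cache.get("weights") == null);
        System.out.println(cache.get("centroids") != null);
    }
}
```

Because callers depend only on the interface, a Guava-backed or JSR 107-backed implementation could be substituted without changing mapper or reducer code, which is the kind of pluggability the issue asks for.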