Caching per input stream means the cache can only be used within the scope of one open file. I think it would be better if there were a cache in DFSClient.
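To make the idea concrete, here is a rough sketch of a cache that lives at the client level and is shared by every stream opened through that client. This is not existing Hadoop code: the class name ClientBlockLocationCache, the namenodeLookup function, and the use of plain strings for block locations are all assumptions for illustration. It also ignores staleness, which a real cache would have to handle with expiry or invalidation.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical client-wide cache of block locations, keyed by file path.
 * Not part of Hadoop; the names and types here are illustrative only.
 */
class ClientBlockLocationCache {
    // One map shared by all streams opened through the same client.
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    // Stand-in for the RPC to the namenode that returns block locations.
    private final Function<String, List<String>> namenodeLookup;

    ClientBlockLocationCache(Function<String, List<String>> namenodeLookup) {
        this.namenodeLookup = namenodeLookup;
    }

    /** Returns cached locations, asking the namenode only on a miss. */
    List<String> getBlockLocations(String path) {
        return cache.computeIfAbsent(path, namenodeLookup);
    }

    /** Drops an entry, e.g. after a read fails because the locations are stale. */
    void invalidate(String path) {
        cache.remove(path);
    }
}
```

With something like this, several tasks on one node that open the same job.xml would hit the namenode once instead of once per stream, at the cost of having to decide when a cached entry is no longer valid.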

On Fri, Jun 11, 2010 at 5:02 PM, Todd Lipcon <t...@cloudera.com> wrote:
> It is cached per input stream - see DFSInputStream.locatedBlocks,
> prefetchSize, etc.
>
> -Todd
>
> On Thu, Jun 10, 2010 at 11:43 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>> Hi all,
>>
>> According to the GFS paper, GFS caches metadata in the client.
>> But when I check the source code of Hadoop, it seems that Hadoop doesn't
>> cache it on the client side. I just want to make sure whether I am right.
>> I am also wondering whether anyone is working on it. One advantage of
>> caching metadata on the client side that I can think of: the tasktracker
>> fetches job.xml from HDFS, and most of the time we run multiple tasks
>> on one node, so if the tasktracker cached the metadata, it could reduce
>> the communication with the namenode.
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

--
Best Regards

Jeff Zhang