2011/9/13 kang hua <kanghua...@msn.com>

> Hi Master:
> can you explain more detail --- "The only way to avoid this is to
> make the data much more cacheable and to have a viable cache coherency
> strategy. Cache coherency at the meta-data level is difficult. Cache
> coherency at the block level is also difficult (but not as difficult)
> because many blocks get moved for balance purposes"
> why "Cache coherency at the meta-data level is difficult" ?
>
I said this because meta-data is updated often. Caching in the presence of a high update rate requires some sort of coherency model. For meta-data, it is difficult to detect stale information at the point of use, and using stale information can be disastrous. Thus, caching is difficult.

> why "Cache coherency at the block level is also difficult (but not as
> difficult) because many blocks get moved for balance purposes"
>

The basic problem here is again the update rate. Late detection of stale information is much easier, however, since you can just note that the block isn't where you thought it was and update your cache. There are still problems, and the fact that race conditions are still being found in the HDFS lease management code indicates that this isn't a completely trivial problem.
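
To make the "late detection" idea concrete, here is a minimal sketch in Python. It is not HDFS code and does not use any HDFS API: `MetadataService`, `StorageNode`, `CachingClient`, and `BlockNotHere` are hypothetical names invented for this illustration. The client trusts its cached block location, notices staleness only when the read fails, and then refreshes the entry from the authoritative map.

```python
class BlockNotHere(Exception):
    """Raised by a storage node that no longer holds the requested block."""


class StorageNode:
    """Hypothetical storage node holding block payloads in memory."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def read(self, block_id):
        if block_id not in self.blocks:
            raise BlockNotHere(block_id)
        return self.blocks[block_id]


class MetadataService:
    """Hypothetical authoritative block -> node map; counts lookups."""

    def __init__(self):
        self.locations = {}
        self.lookups = 0

    def locate(self, block_id):
        self.lookups += 1
        return self.locations[block_id]

    def move(self, block_id, dst):
        # Simulates a balancer relocating a block to another node.
        src = self.locations[block_id]
        dst.blocks[block_id] = src.blocks.pop(block_id)
        self.locations[block_id] = dst


class CachingClient:
    """Caches block locations and repairs stale entries when a read fails."""

    def __init__(self, meta):
        self.meta = meta
        self.cache = {}

    def read(self, block_id):
        node = self.cache.get(block_id)
        if node is not None:
            try:
                return node.read(block_id)
            except BlockNotHere:
                # Late detection: the block is not where we thought it was.
                del self.cache[block_id]
        # Cache miss or stale entry: ask the authoritative service once.
        node = self.meta.locate(block_id)
        self.cache[block_id] = node
        return node.read(block_id)


def demo():
    meta = MetadataService()
    a, b = StorageNode("a"), StorageNode("b")
    a.blocks["blk_1"] = b"payload"
    meta.locations["blk_1"] = a

    client = CachingClient(meta)
    first = client.read("blk_1")    # cache miss: one lookup
    second = client.read("blk_1")   # cache hit: no lookup
    meta.move("blk_1", b)           # block is rebalanced to node b
    third = client.read("blk_1")    # stale entry detected: one more lookup
    return first, second, third, meta.lookups, client.cache["blk_1"].name
```

Note what the sketch leaves out: the read after the refresh can itself fail if the block moves again in between, and nothing here handles concurrent writers. Those are the kinds of races I mean above. The same trick does not carry over to meta-data, because a stale meta-data entry usually does not fail loudly when you use it.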