P.S. I should mention that I'm also building OmniGraffle diagrams that show the processes and message flow during job execution (as well as the myriad configuration parameters that affect it), which we'll use to train folks here at Facebook on optimizing Hadoop. I plan to run these training documents by the Hadoop community before making them public, so any insight will help get us there more quickly!

Thanks,
Jeff

On 9/9/07, Jeff Hammerbacher <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> What's the DistributedCache for, in words? I'm combing through the
> JobClient/JobTracker/TaskTracker code right now and slowly getting a view of
> the whole system, starting from "bin/hadoop jar ...". I've almost made it
> down the stack to TaskRunner, where DistributedCache seems to get used most
> heavily, and started looking at DistributedCache.java but things are a bit
> less penetrable than the rest of the codebase (including the first
> single-letter variable name I've seen, other than "r" for Random()). Any
> wise Hadoop dev care to clear it up a bit for me?
>
> Thanks,
> Jeff
>
