Hello,

We are using Hadoop here at Stony Brook University to power the next-generation text analytics backend for www.textmap.com. We also have an NFS partition mounted on all machines of our 100-node cluster. I have found it much more convenient to store manually created files (e.g., configuration files) on the NFS partition and read them directly from my mappers and reducers, rather than copying them into HDFS every time they change, as DistributedCache requires. Is there a way to do the same for jars?
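To make the pattern concrete, here is a minimal sketch of what I do today for plain files (the property name nfs.config.path and the path /nfs/shared/textmap.properties are just illustrative examples, not real names from our setup):

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class NfsConfigMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {

        private final Properties config = new Properties();

        public void configure(JobConf job) {
            // The NFS partition is mounted at the same path on every node,
            // so ordinary local-file I/O works inside the task JVM; no
            // DistributedCache round trip through HDFS is needed.
            String path = job.get("nfs.config.path",
                                  "/nfs/shared/textmap.properties");
            try {
                FileInputStream in = new FileInputStream(path);
                try {
                    config.load(in);
                } finally {
                    in.close();
                }
            } catch (IOException e) {
                throw new RuntimeException("Cannot read NFS config: " + path, e);
            }
        }

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, Text> out, Reporter reporter)
                throws IOException {
            // Trivial body: tag each record with a value from the NFS config.
            out.collect(new Text(config.getProperty("tag", "untagged")), value);
        }
    }

The only assumption is that the NFS mount point is identical on every node, which is the case for us.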
Specifically, I need a way to alter the child JVM's classpath via JobConf, without having the framework copy anything into or out of HDFS, because all my files are already accessible from every node. I can see how to do that by adding a couple of lines to TaskRunner's run() method, e.g.:

    classPath.append(sep);
    classPath.append(conf.get("mapred.additional.classpath"));

or something similar. Is there already such a feature, or should I just go ahead and implement it?

Thanks,
Mikhail Bautin
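P.S. For concreteness, here is a self-contained sketch of the behavior I am proposing. Note that mapred.additional.classpath is not an existing Hadoop property, just the key I would introduce, and the jar paths are examples:

    import org.apache.hadoop.mapred.JobConf;

    /**
     * Sketch only: "mapred.additional.classpath" is the key I would add,
     * not an existing Hadoop setting, and the jar paths are examples.
     */
    public class AdditionalClasspath {

        // Mirrors the two lines proposed above, with a null check so
        // jobs that do not set the key are unaffected.
        static void append(StringBuffer classPath, String sep, JobConf conf) {
            String extra = conf.get("mapred.additional.classpath");
            if (extra != null && extra.length() > 0) {
                classPath.append(sep);
                classPath.append(extra);
            }
        }

        public static void main(String[] args) {
            // Job-submission side: point the key at jars every node can
            // already see on the NFS mount.
            JobConf conf = new JobConf();
            conf.set("mapred.additional.classpath",
                     "/nfs/shared/lib/analytics.jar:/nfs/shared/lib/util.jar");

            StringBuffer classPath =
                new StringBuffer(System.getProperty("java.class.path"));
            append(classPath, System.getProperty("path.separator"), conf);
            System.out.println(classPath);
        }
    }

In the real patch the two append calls would live in TaskRunner.run(), where classPath and sep are already defined.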