@Adam I am currently interested in the latter half of your second question. My main interest lies in determining how to optimize data processing. If I have two data centers that are geographically far apart, and I am working on a local machine but need data from the second data center, how do I have the processing occur in the second data center? One constraint is that I do not know which HDFS node holds the data, only that it is within the network topology I currently reside in. Furthermore, the problem pertains to Map/Reduce jobs that use the AccumuloInputFormat. Is it possible to have the distant data center run my Mapper and send me the resulting data set, instead of running the Mapper locally and making numerous network queries?
----- -- View this message in context: http://apache-accumulo.1065345.n5.nabble.com/Rack-and-Datacenter-Awareness-tp7193p7225.html Sent from the Developers mailing list archive at Nabble.com.
