> The issue is that MR2 doesn't have a JobTracker address. Neither does
> it have TaskTrackers. So there is no real way to expose this.

Yes, I know that; then pick other names that represent the same functional components. For a MapReduce job, even in V2, there is a job tracker, and there are task trackers. This would not be an interface for system testing exactly; it would be for the client to simulate failures that might happen from its perspective.

> I don't see any reason that HBase should need to get these things --

This isn't just about HBase. What about Pig or Hive or Giraph or ... any other project layered above MR?

[ Similar discussion about HDFS skipped. ]

> The above should only be useful for system-testing MR itself. But for
> dependent projects (eg HBase/Hive/etc) what's the use case?

Full stack tests, unit tests.

Regarding MR miniclusters, do you think we could transition all of the MR-based HBase tests to MiniMRClientCluster (or even MRUnit)? If so, then I would agree I'm raising something without a clear use case and we could do that migration.

Regarding HDFS miniclusters, the interface is already limited-private and there is no pressing need, but we do have test cases where we need to simulate DataNode failures. Also, I can conceive of an application unit test where I would want to set replication to 1 on some file, then corrupt blocks, then check that repair (at the application level) was successful. Would some limited public interface for that be plausible?

We have Bigtop for end-to-end testing on real clusters, but it's still in incubation, and spinning up a real cluster to run unit tests is not really practical.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)