I had suggested something like this in one of the original "remove MRUNIT from hadoop contrib" threads. There was some pushback about community fragmentation (the argument being that tests should live in hadoop), but I personally don't see why not; we could course-correct as things mature.
On Thu, May 26, 2011 at 3:35 AM, Steve Loughran <[email protected]> wrote:
> I'm thinking, could MRUnit be the place to put in other hadoop-testing code.
>
> specifically
>
> == Junit on multiple hosts ==
>
> I have some prototype code to exec junit test cases as MR jobs, collect the
> results (including serialized throwables). It runs one test per line of text
> (the name of the package). It could be better to support lines of tests and
> config options, or other ways to explore the config space. And I'd really
> like to be able to deploy the junit tests to all the workers in the cluster,
> the reduction would be to identify which boxes are playing up.
>
> == Sampling for testing ==
>
> Good desktop tests need real data, which means sampling from the live
> datasets. Some standard MR jobs to do the sampling (which themselves use MR
> Unit to self-test) could make it easier to sample.
>
> thoughts?
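
To make the sampling idea a bit more concrete, here is a rough sketch of the per-record logic such a job might apply on the map side. It is plain Python rather than Hadoop or MRUnit code, and it is not taken from any existing prototype: the function name `sample_records`, the `fraction` parameter, and the fixed `seed` are all assumptions made for illustration. Each record is kept independently with probability `fraction`, which is only one of several ways the sampling could be done.

```python
import random
from typing import Iterable, Iterator


def sample_records(records: Iterable[str], fraction: float, seed: int = 0) -> Iterator[str]:
    """Yield each record independently with probability `fraction`.

    Hypothetical helper: stands in for the body of a map-only sampling job.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # A fixed seed keeps the sample repeatable, which helps when the
    # sampling code itself is under test.
    rng = random.Random(seed)
    for record in records:
        # random() returns a float in [0.0, 1.0), so fraction=1.0 keeps
        # every record and fraction=0.0 keeps none.
        if rng.random() < fraction:
            yield record


if __name__ == "__main__":
    live_data = [f"record-{i}" for i in range(20)]
    for kept in sample_records(live_data, fraction=0.25, seed=1):
        print(kept)
```

In a real job each map task would presumably need its own seed (for example, derived from the task id) so that the splits do not all make the same keep/drop decisions; that detail is left out here.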
