And I think that leads into the conversation about where MRUnit goes when we graduate. The original purpose of the breakout was mostly to allow separate release cycles and to let us support multiple versions of Hadoop (if we wanted to do such things) without having circular dependencies across versions. If we graduate to a standalone subproject of Hadoop (which may be an option, subject to the Hadoop PMC's approval), we could "reunite" the communities while still remaining independent. Just a thought.
On Thu, May 26, 2011 at 9:58 AM, Patrick Hunt <[email protected]> wrote:
> I had suggested something like this in one of the original "remove
> MRUNIT from hadoop contrib" threads... There was some push back about
> community fragmentation (tests should live in hadoop), but I
> personally don't see why not, we could course correct as things
> mature.
>
> On Thu, May 26, 2011 at 3:35 AM, Steve Loughran <[email protected]> wrote:
> > I'm thinking, could MRUnit be the place to put in other hadoop-testing code.
> >
> > specifically
> >
> > == Junit on multiple hosts ==
> >
> > I have some prototype code to exec junit test cases as MR jobs, collect the
> > results (including serialized throwables). It runs one test per line of text
> > (the name of the package). It could be better to support lines of tests and
> > config options, or other ways to explore the config space. And I'd really
> > like to be able to deploy the junit tests to all the workers in the cluster,
> > the reduction would be to identify which boxes are playing up.
> >
> > == Sampling for testing ==
> >
> > Good desktop tests need real data, which means sampling from the live
> > datasets. Some standard MR jobs to do the sampling (which themselves use MR
> > Unit to self-test) could make it easier to sample.
> >
> > thoughts?
> >
>

--
Eric Sammer
twitter: esammer
data: www.cloudera.com
