[
https://issues.apache.org/jira/browse/MRUNIT-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434963#comment-13434963
]
Bertrand Dechoux commented on MRUNIT-69:
----------------------------------------
I do have questions about the annotation-based API (assuming there is no need
to extend a provided parent test class).
1) How do you intend to check input/output types? I don't see how that can
be done without exposing a context (like the typed driver or a parent class).
Would it no longer be enforced?
2) How do you intend to check for methods that should not be called, e.g.
withKeyGroupingComparator during a reducer test? I don't see how that can be
done without exposing a context. Would it no longer be enforced?
3) I like the idea of an in-memory FileSystem implementation. I know Cascading 2
now has a local mode, which sounds a bit similar. But it might be too abstract
or too strongly tied to Cascading concepts to be of any use to MRUnit. This
feature would also be a must for Pig/Hive when you want to run the same
query locally, without the cluster latency. So it would be interesting to
see if something like this already exists around Hadoop.
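
To make questions 1 and 2 concrete, here is a minimal sketch of what "exposing
a context" through a typed driver buys. Everything in it (TypedDriverSketch,
SimpleMapper, MapOnlyDriver, matchesExpected) is a hypothetical stand-in and
not an MRUnit or Hadoop class. The generic parameters let the compiler check
input/output types, and because the map-only driver simply has no
withKeyGroupingComparator method, calling it there cannot compile.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are MRUnit or Hadoop classes.
final class TypedDriverSketch {

    /** Stand-in for a mapper: one input pair yields zero or more output pairs. */
    interface SimpleMapper<K1, V1, K2, V2> {
        List<Map.Entry<K2, V2>> map(K1 key, V1 value);
    }

    /** Map-only driver: its type parameters pin the input and output types. */
    static final class MapOnlyDriver<K1, V1, K2, V2> {
        private final SimpleMapper<K1, V1, K2, V2> mapper;
        private final List<Map.Entry<K2, V2>> expected = new ArrayList<>();
        private K1 key;
        private V1 value;

        MapOnlyDriver(SimpleMapper<K1, V1, K2, V2> mapper) {
            this.mapper = mapper;
        }

        MapOnlyDriver<K1, V1, K2, V2> withInput(K1 key, V1 value) {
            this.key = key;
            this.value = value;
            return this;
        }

        MapOnlyDriver<K1, V1, K2, V2> withOutput(K2 key, V2 value) {
            expected.add(new SimpleImmutableEntry<>(key, value));
            return this;
        }

        // Deliberately no withKeyGroupingComparator(...) here: on a map-only
        // driver the call is a compile error instead of a runtime surprise.

        /** Returns true when the mapper output equals the expected pairs, in order. */
        boolean matchesExpected() {
            return mapper.map(key, value).equals(expected);
        }
    }

    public static void main(String[] args) {
        // Emits (line, length of line) for each input line.
        SimpleMapper<Integer, String, String, Integer> lengthMapper =
                (offset, line) -> List.of(Map.entry(line, Integer.valueOf(line.length())));

        boolean ok = new MapOnlyDriver<Integer, String, String, Integer>(lengthMapper)
                .withInput(0, "hello")
                .withOutput("hello", 5)   // withOutput(5, "hello") would not compile
                .matchesExpected();
        System.out.println(ok);
    }
}
```

With annotations alone and no typed driver or parent class, both checks would
presumably have to move to runtime (reflection) or to an annotation processor.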
> new mrunit api
> --------------
>
> Key: MRUNIT-69
> URL: https://issues.apache.org/jira/browse/MRUNIT-69
> Project: MRUnit
> Issue Type: Umbrella
> Affects Versions: 0.8.1
> Reporter: Jim Donofrio
> Assignee: Jim Donofrio
>
> So I am curious what the plan is for the long-term future of MRUNIT.
> I think MRUNIT is currently useful for unit testing a single mapper or
> reducer, but there is a void for testing more complicated features
> such as MultipleInputs, MultipleOutputs, a driver class, and counters, among
> other things. I wonder if, instead of adding support for these extra features
> to the current MRUNIT framework, it would be more useful to add hooks to
> the existing LocalJobRunner and MiniMRCluster classes to provide methods to
> more easily verify file output from text files, sequence files, etc. This
> would allow MRUNIT to test driver classes, MultipleInputs, MultipleOutputs,
> etc. MRUNIT would also then test against the real Hadoop code instead of an
> implementation that mimics Hadoop, which can miss bugs such as the
> ReduceDriver that did not reuse the same object until 0.8.0. MRUNIT would
> also keep up with new MapReduce features instead of us having to implement
> fake versions of them.
> I understand that performance would be an issue due to the file I/O, but I
> wonder how fast the LocalJobRunner would be if we wrote a new class
> extending FileSystem to allow users to write fake files to memory and
> make the LocalJobRunner read from them.
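
As a rough illustration of the in-memory idea in the description above, the
sketch below keeps "files" in a map on the heap. It is only a toy under stated
assumptions: InMemoryFileStore and its create/open/exists methods are
hypothetical names, and the class does NOT extend Hadoop's FileSystem or plug
into the LocalJobRunner. A real implementation would also need directories,
listing, renames, and the rest of the FileSystem contract.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a heap-backed file store, not a Hadoop FileSystem subclass.
final class InMemoryFileStore {
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    /** Returns a stream whose bytes become visible under 'path' once it is closed. */
    OutputStream create(final String path) {
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                files.put(path, toByteArray());
            }
        };
    }

    /** Opens a previously written file, or fails if nothing was stored at 'path'. */
    InputStream open(String path) throws FileNotFoundException {
        byte[] data = files.get(path);
        if (data == null) {
            throw new FileNotFoundException(path);
        }
        return new ByteArrayInputStream(data);
    }

    boolean exists(String path) {
        return files.containsKey(path);
    }

    public static void main(String[] args) throws IOException {
        InMemoryFileStore store = new InMemoryFileStore();
        // Write a fake output file without touching the disk.
        try (OutputStream out = store.create("/tmp/out.txt")) {
            out.write("hello\t1\n".getBytes(StandardCharsets.UTF_8));
        }
        // Read it back, as a test would when verifying job output.
        try (InputStream in = store.open("/tmp/out.txt")) {
            System.out.print(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

Whether this would make the LocalJobRunner noticeably faster is exactly the
open question raised in the description; the sketch only shows that the
storage side can be small.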