hudi-bot opened a new issue, #14608: URL: https://github.com/apache/hudi/issues/14608
There are some tests like ITTestRepairsCommand vs TestRepairsCommand, ITTestCleanerCommand vs TestCleanerCommand. Please consolidate if they are redundant. ## JIRA info - Link: https://issues.apache.org/jira/browse/HUDI-1033 - Type: Improvement - Epic: https://issues.apache.org/jira/browse/HUDI-3303 - Affects version(s): - 0.9.0 --- ## Comments 21/Jun/20 14:52;yanghua;Hi [~vbalaji] There is a difference between {{ITTestRepairsCommand}} and {{TestRepairsCommand}}. The {{ITXXX}} test cases need to specify {{--sparkMaster}} config option. So [~hongdongdong] distinguished them. I also have the same concern before. After he explained, I think it is fine to separate them. WDYT?;;; --- 22/Jun/20 01:24;hongdongdong;Hi [~vbalaji] : as [~yanghua] said there is difference between them. {{ITXXX}} test cases need run on spark which need load the jars generated during the package phase. I'm trying to move {{ITXXX to hudi-integ-test module.}};;; --- 22/Jun/20 07:20;vbalaji;[~hongdongdong] [~yanghua]: I strongly feel that we should be having only one test class for these. Its mostly duplicated testing otherwise and may not add value. I am not sure why we need to specifically test for loading the jars. But, it is ok to have these run only as integration tests. For that, it would be better to actually perform live actions (like clean/commit) instead of mocking. Otherwise, we can just have unit tests. ;;; --- 22/Jun/20 08:57;hongdongdong;[~vbalaji]: Somethings that I had not explained clearly. There is no duplicated testing, TestXXX for unit test and ITTestXXX for integration testing, they test different command. For example, CleansCommand TestCleansCommand test for commands: 'cleans show' 'clean showpartitions'. ITTestCleansCommand test for command: 'cleans run'. Why need integration testing? Some commands(like cleans run) run on spark, and SparkLauncher needs to load the jars(like hudi-common/hudi-client) under lib(generated during the package phase) when init (not means test for loading the jars). And, as you think, all tests actually perform live actions, not use mocking.;;; --- 04/Sep/21 17:42;xushiyan;[~yanghua] [~hongdongdong] [~vbalaji] [~vinoth] We're defining some boundaries for different tests * Unit tests: pure logical tests, with mocking when needed * Functional tests: need to start local server like spark hive, etc. to test a higher level function * Integration tests: full-fledged end-to-end test scenario Each of the categories has a maven profile defined, for the purpose of split run in CI. Though many testcases in the existing codebase are functional tests (like subclasses of HoodieClientHarness), they are ran as if they are unit tests. We want to address this as we move forward by pushing out new test harness class and refactoring. As for CLI module, we should make all commands' testcases as functional tests, and one for each command, for reasons * they do not reach the level of full-fledged end-to-end scenario as the setup in hoodie-integ-test module * we should keep integration tests only within hoodie-integ-test module for better logical separation Currently TestRepairsCommand and ITTestRepairsCommand both (and like other commands pairs) extend AbstractShellIntegrationTest which are confusing. And obviously local Spark instances are created for each test method, which is not ideal. Therefore I'm planning on taking over the CLI tests consolidation and conversion to functional tests. Feel free to chime in. Thanks.;;; -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
