[ https://issues.apache.org/jira/browse/KAFKA-440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433272#comment-13433272 ]

Neha Narkhede commented on KAFKA-440:
-------------------------------------

Thanks for patch v2! Overall, it looks like a good start. Here are some review comments:

1. Standardize utils into system_test/utils. Remove lib. The libraries that go in here should be available to all other scripts through an 'import library' statement.
1.1. Rename li_reg_test_helper to KafkaSystemTestUtils. This has Kafka-specific helper APIs. Other util APIs can go in system_test/utils/SystemTestUtils.
1.2. Can we include the logger in there as well? Every script should be able to import that.
1.3. Rename RtLogging to Logger.
1.4. We shouldn't have to pass the rtLogger object around everywhere. I hope this can be made available as a module-level static variable. Once I import Logger, I should be able to log statements through a variable named logger.
1.5. Rename RegTestEnv to SystemTestEnv. Let's add all useful environment variables here, like: kafkaBaseDir, systemTestBaseDir, testSuiteBaseDir, testCaseBaseDir, testCaseLogDir etc.
1.6. replica_basic_test.py currently doesn't have access to the above environment variables, which makes it awkward to use.

2. Rename reg_test* to SystemTestUtils or something like that.

3. Rename suite_replication to replication_test_suite. You might want to name every test suite *_test_suite so that the scripts can detect it as a test suite.

4. Rename reg_test_driver to test_driver.

5. It would be good to follow a Python style guide like this one - http://www.python.org/dev/peps/pep-0008/#descriptive-naming-styles
5.1. Package names should be all lower case and *preferably* without underscores, although sometimes this is difficult to avoid. So something like reg_test_helper could just be utils/testutils.
5.2. Class names should be CamelCase, with the first letter capitalized.
5.3. Function names should be all lower case with words separated by underscores.
5.4. Constants should be defined at module level with all letters capitalized and separated with underscores.

6. A lot of commands need to prepend 'ssh host' to them. How about having a helper API run_at_host_command(host, command) that will do this and return the command string?

7. Also, all commands need to specify the host to run on, the path to the script, some arguments to that script and an output file. It would be nice to add a helper API to do that.

8. It would be good to structure the testcase/logs directory by role-entityid. The reason is that we want to collect metrics and plot graphs for every entity. These will involve generating several csv/svg/png files per entity. Instead of putting them all in one testcase/logs directory, how about having the following structure -

testcase/logs
|__ zookeeper-0
|   |__ metrics
|   |__ dashboards
|__ kafka-1
    |__ metrics
    |__ dashboards

9. To make it easier to use the directory structure above, I think we should have helper APIs that, given the entity id, return the path to the metrics/dashboards directories. These will be used by the APIs that collect metrics and plot graphs.

10. cluster_config.json describes each entity with a set of properties. It might be easier to define a class called TestEntity that has all these properties. On startup, the test will read cluster_config.json and create a map/list of TestEntity objects. The test scripts should have access to these.

11. How about having lifecycle management for all test cases in a test suite, similar to JUnit? For example, all test cases in one test suite can have a setup and teardown method, where common tasks can be performed. However, you can probably do this as part of another JIRA.
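
To make comments 6, 7 and 9 concrete, here is a rough sketch of what such helpers might look like. Only run_at_host_command(host, command) is named in the review; build_remote_command, entity_log_dir, entity_metrics_dir and entity_dashboards_dir are hypothetical names, and the quoting and output-redirection choices are assumptions rather than anything taken from the patch. The helpers only build strings and paths; they do not execute anything.

```python
import os
import shlex


def run_at_host_command(host, command):
    """Return `command` prefixed with 'ssh <host>' (comment 6). Nothing is executed."""
    # Quoting the command as a single argument is an assumption of this sketch.
    return "ssh {} {}".format(host, shlex.quote(command))


def build_remote_command(host, script_path, args, output_file):
    """Hypothetical helper for comment 7: host + script + arguments + output file."""
    parts = [script_path] + list(args)
    command = " ".join(shlex.quote(part) for part in parts)
    # Redirecting both stdout and stderr to the output file is an assumption.
    command += " > {} 2>&1".format(shlex.quote(output_file))
    return run_at_host_command(host, command)


def entity_log_dir(testcase_log_dir, role, entity_id):
    """Hypothetical helper for comments 8/9: <testcase/logs>/<role>-<entity_id>."""
    return os.path.join(testcase_log_dir, "{}-{}".format(role, entity_id))


def entity_metrics_dir(testcase_log_dir, role, entity_id):
    """Path to the per-entity metrics directory."""
    return os.path.join(entity_log_dir(testcase_log_dir, role, entity_id), "metrics")


def entity_dashboards_dir(testcase_log_dir, role, entity_id):
    """Path to the per-entity dashboards directory."""
    return os.path.join(entity_log_dir(testcase_log_dir, role, entity_id), "dashboards")
```

For example, run_at_host_command("host1", "ls -l /tmp") would return the string "ssh host1 'ls -l /tmp'", which the caller can then hand to whatever mechanism the framework uses to launch processes.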

> Create a regression test framework for distributed environment testing
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-440
>                 URL: https://issues.apache.org/jira/browse/KAFKA-440
>             Project: Kafka
>          Issue Type: Task
>            Reporter: John Fung
>            Assignee: John Fung
>         Attachments: kafka-440-v1.patch, kafka-440-v2.patch
>
>
> Initial requirements:
> 1. The whole test framework is preferably coded in Python (a common scripting language which has well supported features)
> 2. Test framework driver should be generic (distributed environment can be local host)
> 3. Test framework related configurations are defined in JSON format
> 4. Test environment, suite, case definitions may be defined in the following levels:
> 4-a entity_id is used as a key for looking up related config from different levels
> 4-b Cluster level defines: entity_id, hostname, kafka_home, java_home, ...
> 4-c Test suite / case level defines:
> 4-c-1 zookeeper: entity_id, clientPort, dataDir, log_filename, config_filename
> 4-c-2 broker: entity_id, port, log.file.size, log.dir, log_filename, config_filename
> 4-c-3 producer: entity_id, topic, threads, compression-codec, message-size, log_filename, config_filename
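
Requirement 4-a (entity_id as the key for looking up config from different levels) could be sketched roughly as below. The JSON layout (a list of objects per level), the sample values, and the function names index_by_entity_id and lookup_entity_config are all assumptions made for illustration; the issue only names the fields, not the file schema.

```python
import json

# Hypothetical cluster-level config using the fields listed in 4-b.
CLUSTER_JSON = """
[
  {"entity_id": "0", "hostname": "localhost", "kafka_home": "/opt/kafka", "java_home": "/usr/java"},
  {"entity_id": "1", "hostname": "localhost", "kafka_home": "/opt/kafka", "java_home": "/usr/java"}
]
"""

# Hypothetical test-case-level config using a few of the fields listed in 4-c.
TESTCASE_JSON = """
[
  {"entity_id": "0", "clientPort": "2181", "dataDir": "/tmp/zookeeper_0"},
  {"entity_id": "1", "port": "9091", "log.dir": "/tmp/kafka_server_1"}
]
"""


def index_by_entity_id(entries):
    """Turn a list of config objects into a dict keyed by entity_id."""
    return {entry["entity_id"]: entry for entry in entries}


def lookup_entity_config(entity_id, *levels):
    """Merge the config for one entity across levels; later levels override earlier ones."""
    merged = {}
    for level in levels:
        merged.update(level.get(entity_id, {}))
    return merged


cluster_level = index_by_entity_id(json.loads(CLUSTER_JSON))
testcase_level = index_by_entity_id(json.loads(TESTCASE_JSON))
broker_config = lookup_entity_config("1", cluster_level, testcase_level)
```

Here broker_config would hold the cluster-level hostname, kafka_home and java_home together with the test-case-level port and log.dir for entity 1. Letting later levels override earlier ones is a design choice of this sketch, not something the requirements specify.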