[
https://issues.apache.org/jira/browse/KAFKA-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093658#comment-14093658
]
Joe Stein commented on KAFKA-1589:
----------------------------------
This would be a great contribution. I would also ask, please, for documentation
for these scripts. There are a lot of nooks and crannies in the system_test
directory, and a bunch of gems, but once folks go in there they get errors and
turn away (so I have seen), e.g.:
vagrant@precise64:~$ python kafka-0.8.2-SNAPSHOT-src/system_test/utils/metrics.py
Traceback (most recent call last):
  File "kafka-0.8.2-SNAPSHOT-src/system_test/utils/metrics.py", line 34, in <module>
    import matplotlib as mpl
ImportError: No module named matplotlib
Not that items like this can't be corrected with some steps and effort, but it
isn't a good "out of the box" experience, and we could do it differently.
I think that with more communicable "how to" docs and better ease of use, more
folks in the community will latch on to /system_test/ and make the tests part
of their cycles in their environments. I think this also goes to the heart /
root cause of the pain you describe here.
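One way to soften the failure shown above, sketched here under assumptions:
guard the optional import so that a missing dependency produces an actionable
message instead of a bare traceback. The helper name require_module and the
install hint text are hypothetical and are not part of system_test.

```python
import importlib


def require_module(name, install_hint):
    """Import `name`, or raise ImportError with an actionable message.

    Hypothetical helper for illustration only; not part of system_test.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        # Re-raise with the script's requirement and a hint spelled out.
        raise ImportError(
            "%s is required by this script but is not installed. %s"
            % (name, install_hint)
        )
```

In a script like metrics.py, the bare import could then become something like
mpl = require_module("matplotlib", "Try: pip install matplotlib"), where the
hint wording is only a suggestion.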
> Strengthen System Tests
> -----------------------
>
> Key: KAFKA-1589
> URL: https://issues.apache.org/jira/browse/KAFKA-1589
> Project: Kafka
> Issue Type: Bug
> Reporter: Guozhang Wang
> Fix For: 0.9.0
>
>
> Although the system test code is also part of the open source repository, not
> much attention is paid to this module today. The result is that we keep
> breaking the system tests with either changes to the admin tools or library
> upgrades that change APIs, such as ZooKeeper's. And when the system tests
> break / hang / etc., it is also hard to debug the issue. We need to treat the
> system test suite just like the rest of the open source code.
> Based on my personal experience troubleshooting system tests, I would
> propose at least the following enhancements to the system tests.
> 1. Add unit tests for all system util test tools, for example:
> kafka_system_test_utils.get_controller_attributes
> kafka_system_test_utils.get_leader_for
> 2. Add exception handling logic in the python test framework to clean up the
> testbed upon failures, so that subsequent test cases are not affected.
> 3. Remove timing-based mechanisms such as "sleep(5000) to wait for metadata to
> be propagated" as much as possible to avoid transient failures.
> After those enhancements, we should probably also add a very small subset
> (say one from each suite) of the system test cases to the patch review
> process, along with the unit tests.
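
Items 2 and 3 in the quoted description could be sketched roughly as follows.
This is a minimal illustration under assumptions, not the framework's actual
code: wait_until and run_test_case are hypothetical names, and the setup,
test_body, cleanup, and condition callables stand in for whatever the real
test cases would provide.

```python
import time


def wait_until(condition, timeout_sec=30.0, interval_sec=0.5):
    """Poll `condition` until it is truthy or the timeout expires.

    Returns True if the condition was met, False on timeout. Intended as a
    replacement for a fixed sleep that merely hopes the state has settled.
    """
    deadline = time.time() + timeout_sec
    while True:
        if condition():
            return True
        if time.time() >= deadline:
            return False
        time.sleep(interval_sec)


def run_test_case(setup, test_body, cleanup):
    """Run one test case; always clean up the testbed, even on failure.

    The exception (if any) still propagates to the caller after cleanup,
    so the failure is reported but later test cases start from a clean state.
    """
    try:
        setup()
        test_body()
    finally:
        cleanup()
```

A test body might then call wait_until(lambda: metadata_is_propagated()),
where metadata_is_propagated is a placeholder for a real check, and fail
explicitly when it returns False rather than sleeping for a fixed interval.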