[ https://issues.apache.org/jira/browse/CASSANDRA-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476666#comment-16476666 ]
Patrick Bannister edited comment on CASSANDRA-14298 at 5/16/18 3:05 AM:
------------------------------------------------------------------------

I've made a lot of progress porting cqlshlib to Python 3. Along the way I've been taking notes on all the areas that I think would require extra effort for cross compatibility with Python 2. I don't have a complete plan yet, but I have some observations.

In terms of level of effort and complexity, this is not going to be as simple as running 2to3 and then adding a few imports from future and six. However, we won't need to rearchitect the library either. So far I've found that existing classes and functions work with just a few tweaks to their implementation, mostly around IO and strings vs. bytes.

The biggest challenge, regardless of whether we go straight Python 3 or cross-compatible, is going to be adequately testing the result. The cqlshlib unittests and the cqlsh_tests have been useful to help find bugs, but I'm not confident that our tests have enough code coverage to exercise everything. We would need a strategy for more comprehensive testing.

Some specifics:
* The SaferScanner class in saferscanner.py requires a slightly different implementation in Python 2 vs. Python 3, because of changes in the internals of the re module for regular expressions.
* copyutil.py, formatting.py, and displaying.py have needed the most work so far, since they have a lot of IO and serialization.
* The formatter for blobs in formatting.py needs a different implementation in Python 2 vs. Python 3, because of changes in the behavior of binascii.hexlify.
* On the dtests side, there are several tests that fail intermittently due to different sorting between expected results and observed results. The result of these tests is flaky depending on what randomly occurring sort happens to come out of the test. I've been able to get these tests to pass consistently by sorting results just before asserting equality.
* Another notable dtest issue: in the cqlsh_copy_tests, the bulk_round_trip tests that use the blogposts profile are failing because of a limitation of the Python csv.reader, which is used in cqlshlib3 and in the bulk_round_trip tests. Python's csv.reader chokes on newlines and null characters, but the cassandra-stress tool's Strings Generator subclass generates both of these things in text fields. (Edit: this may be a combination of misuse on my part of csv.reader, plus a failure to properly port formatting of text data.)


was (Author: ptbannister):
I've made a lot of progress porting cqlshlib to Python 3. Along the way I've been taking notes on all the areas that I think would require extra effort for cross compatibility with Python 2. I don't have a complete plan yet, but I have some observations.

In terms of level of effort and complexity, this is not going to be as simple as running 2to3 and then adding a few imports from future and six. However, we won't need to rearchitect the library either. So far I've found that existing classes and functions work with just a few tweaks to their implementation, mostly around IO and strings vs. bytes.

The biggest challenge, regardless of whether we go straight Python 3 or cross-compatible, is going to be adequately testing the result. The cqlshlib unittests and the cqlsh_tests have been useful to help find bugs, but I'm not confident that our tests have enough code coverage to exercise everything. We would need a strategy for more comprehensive testing.

Some specifics:
* The SaferScanner class in saferscanner.py requires a slightly different implementation in Python 2 vs. Python 3, because of changes in the internals of the re module for regular expressions.
* copyutil.py, formatting.py, and displaying.py have needed the most work so far, since they have a lot of IO and serialization.
* The formatter for blobs in formatting.py needs a different implementation in Python 2 vs. Python 3, because of changes in the behavior of binascii.hexlify.
* On the dtests side, there are several tests that fail intermittently due to different sorting between expected results and observed results. The result of these tests is flaky depending on what randomly occurring sort happens to come out of the test. I've been able to get these tests to pass consistently by sorting results just before asserting equality.
* Another notable dtest issue: in the cqlsh_copy_tests, the bulk_round_trip tests that use the blogposts profile are failing because of a limitation of the Python csv.reader, which is used in cqlshlib3 and in the bulk_round_trip tests. Python's csv.reader chokes on newlines and null characters, but the cassandra-stress tool's Strings Generator subclass generates both of these things in text fields.

> cqlshlib tests broken on b.a.o
> ------------------------------
>
>                 Key: CASSANDRA-14298
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14298
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Build, Testing
>            Reporter: Stefan Podkowinski
>            Assignee: Patrick Bannister
>            Priority: Major
>              Labels: cqlsh, dtest
>         Attachments: CASSANDRA-14298-old.txt, CASSANDRA-14298.txt, cqlsh_tests_notes.md
>
>
> It appears that cqlsh-tests on builds.apache.org on all branches stopped
> working since we removed nosetests from the system environment. See e.g.
> [here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-cqlsh-tests/458/cython=no,jdk=JDK%201.8%20(latest),label=cassandra/console].
> Looks like we either have to make nosetests available again or migrate to
> pytest as we did with dtests. Giving pytest a quick try resulted in many
> errors locally, but I haven't inspected them in detail yet.
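
For illustration only, here is a rough sketch of two of the specifics from the comment above: the binascii.hexlify difference and sorting results before asserting equality. The names format_blob and assert_same_rows are hypothetical and are not taken from cqlshlib or the dtests; the actual formatter and test helpers may look quite different. The sketch relies only on the fact that hexlify returns a str on Python 2 and bytes on Python 3.

```python
import binascii


def format_blob(value):
    """Render a bytes-like value as a 0x-prefixed hex string (hypothetical helper)."""
    # hexlify returns str on Python 2 and bytes on Python 3; decoding the
    # result as ASCII gives a text string on both versions.
    return '0x' + binascii.hexlify(value).decode('ascii')


def assert_same_rows(expected, observed):
    """Compare two row lists while ignoring row order (hypothetical helper)."""
    # Sorting both sides just before the assertion removes the dependence
    # on whatever order the rows happened to come back in.
    assert sorted(expected) == sorted(observed)


print(format_blob(b'\x00\xca\xfe'))  # prints 0x00cafe
assert_same_rows([['b', '2'], ['a', '1']], [['a', '1'], ['b', '2']])
```

Sorting both sides only helps when row order is not part of what the test is checking, and it assumes the rows are mutually comparable (for example, lists of strings).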