[ https://issues.apache.org/jira/browse/IMPALA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18015799#comment-18015799 ]
Csaba Ringhofer commented on IMPALA-13125:
------------------------------------------

Looked into this a bit, and my impression is that the library we use to generate the tests (allpairspy) can miss a lot of parameter pairs when a filter function is used. The result can also depend on the order of dimensions: changing the order by using an OrderedDict instead of a dict changed the number of test vectors returned from 11 to 123 (test_queries.py / test_partitioned_top_n). There is a similar open issue for the library: https://github.com/thombashi/allpairspy/issues/2 (a small reproduction sketch is included after the quoted issue text below).

In this case using OrderedDict (or Python 3, where plain dicts behave similarly) seems more correct, as the 11 combinations provide very poor coverage for file formats, which seems to be against the intention of the test. On the other hand, 123 combinations look like too many; in particular, testing all compression types for text/json/seq/rc files looks unnecessary to me, or at least I don't see a reason to combine them with other parameters like disable_codegen.

I think this is a major issue: we practically can't reason about how test combinations are generated, and switching to Python 3 / OrderedDict won't help much, as the new ordering can make useful combinations disappear in other tests. Unless we find a reliable way to generate pairwise test vectors when filters are used, it may be better to generate all combinations (like __generate_exhaustive_combinations()) and sample them randomly based on some seed to keep the number of test runs manageable (a rough sketch of that idea also follows below).

cc [~rizaon]

> Set of tests for exploration_strategy=exhaustive varies between python 2 and 3
> -------------------------------------------------------------------------------
>
>                 Key: IMPALA-13125
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13125
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Infrastructure
>    Affects Versions: Impala 4.5.0
>            Reporter: Joe McDonnell
>            Priority: Major
>
> TLDR: Python 3 runs a different set of exhaustive tests than Python 2.
>
> Longer version:
> When looking into running Python 3 tests, I noticed that the set of tests running for the exhaustive tests is different for Python 2 vs Python 3. This was surprising.
> It turns out there is a distinction between run-tests.py's --exploration_strategy=exhaustive option and the --workload_exploration_strategy="functional-query:exhaustive" option. The exhaustive job is actually doing the latter. This means that individual functional-query workload classes see cls.exploration_strategy() == "exhaustive", but the logic that generates the test vector still sees exploration_strategy=core and still uses pairwise generation. Code:
> {noformat}
> if exploration_strategy == 'exhaustive':
>   return self.__generate_exhaustive_combinations()
> elif exploration_strategy in ['core', 'pairwise']:
>   return self.__generate_pairwise_combinations(){noformat}
> [https://github.com/apache/impala/blob/master/tests/common/test_vector.py#L165-L168]
> Python 2 vs 3 changes the way dictionaries work, which impacts the order of test dimensions and how tests are picked. So the Python 3 exhaustive tests are different. This may expose latent bugs, because some combinations that meet the constraints are never actually run (e.g. some json encodings don't have the decimal_tiny table).
> We can work to make them behave similarly, using pytest's --collect-only option to look at the differences (and compare them to actual existing runs).
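For reference, here is a minimal sketch (not Impala's real test dimensions, filters, or the 11 vs 123 case above) of how allpairspy's pairwise generation with a filter_func can be sensitive to the order of the parameter lists, which is the behaviour the linked allpairspy issue describes. The dimension values and the "parquet only with 'none' compression" constraint are purely illustrative.

{noformat}
# Minimal sketch: same dimensions and logically identical filter, two orders.
from allpairspy import AllPairs

file_formats = ["text", "seq", "rc", "avro", "parquet", "json"]
compressions = ["none", "snappy", "gzip", "bzip2"]
codegen = ["codegen_on", "codegen_off"]

def valid_format_first(row):
    # Hypothetical constraint standing in for the real vector constraints:
    # parquet is only combined with 'none' compression.
    if len(row) >= 2 and row[0] == "parquet" and row[1] != "none":
        return False
    return True

def valid_format_last(row):
    # Same constraint, adjusted for the reordered columns below.
    if len(row) >= 3 and row[2] == "parquet" and row[1] != "none":
        return False
    return True

# Two orders of the same dimensions, mimicking what different dict iteration
# orders (Python 2 dict vs OrderedDict / Python 3 dict) can feed into AllPairs.
order_a = [file_formats, compressions, codegen]
order_b = [codegen, compressions, file_formats]

vectors_a = list(AllPairs(order_a, filter_func=valid_format_first))
vectors_b = list(AllPairs(order_b, filter_func=valid_format_last))

# The counts (and the set of parameter pairs actually covered) can differ
# between the two runs even though the constraint is the same.
print(len(vectors_a), len(vectors_b)){noformat}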
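And a rough sketch of the seeded-sampling alternative mentioned above: generate every combination that passes the filters (like __generate_exhaustive_combinations() does) and take a deterministic random sample. The names here (dimensions, filters, max_vectors) are illustrative, not the actual test_vector.py helpers.

{noformat}
import itertools
import random

def sample_exhaustive_combinations(dimensions, filters, seed, max_vectors):
    """dimensions: dict of name -> list of values; filters: callables that
    take a dict of name -> value and return False to reject a combination."""
    names = list(dimensions.keys())
    all_combos = [
        dict(zip(names, values))
        for values in itertools.product(*(dimensions[n] for n in names))
    ]
    valid = [combo for combo in all_combos if all(f(combo) for f in filters)]
    if len(valid) <= max_vectors:
        return valid
    # A fixed seed keeps the chosen subset stable across runs and Python versions.
    return random.Random(seed).sample(valid, max_vectors){noformat}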