Yida Wu has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21708 )


Change subject: IMPALA-13317: Enhance tpc_sort_key for wider name support
......................................................................

IMPALA-13317: Enhance tpc_sort_key for wider name support

The tpc_sort_key function sorts TPCH and TPCDS query files when
running the TPCH or TPCDS tests; it is currently used only by
test_tuple_cache_tpc_queries. It is designed to handle filenames
in formats like "tpch-qx-y", "tpch-qx", or "tpch-qxX". However,
it doesn't support filenames in the "tpch-qx-yY" format, and
attempting to sort such files results in an error.

This patch improves the robustness of the tpc_sort_key function
by adding more checks to prevent errors and extending support
for filenames in the "tpch-qxX-yY" format.

Tests:
Reran the tests with file names in the "tpch-qxX-yY" format and
they passed. There appear to be no existing tests for the test util
functions, so I ran the following unit test locally and it passed:
test_cases = {
    'tpcds-q1': (1, 0, '', ''),
    'tpcds-q1X': (1, 0, 'X', ''),
    'tpcds-q1-2Y': (1, 2, '', 'Y'),
    'tpcds-q1X-2Y': (1, 2, 'X', 'Y'),
    'tpcds-q2-3': (2, 3, '', ''),
    'tpcds-q10': (10, 0, '', ''),
    'tpcds-q10-20': (10, 20, '', ''),
    'tpcds-q10a-20': (10, 20, 'a', ''),
    'tpcds-q10-20b': (10, 20, '', 'b'),
    'tpcds-q10a-20b': (10, 20, 'a', 'b'),
    'tpcds-q0': (0, 0, '', ''),
    'tpcds-': (0, 0, '', ''),
    'tpcds--': (0, 0, '', ''),
    'tpcds-xx-xx': (0, 0, '', ''),
    'tpcds-x1-x1': (0, 0, '', ''),
    'tpcds-x1-x': (0, 0, '', ''),
    'tpcds-x-x1': (0, 0, '', ''),
    'tpcds': (0, 0, '', ''),
}
for input_str, expected in test_cases.items():
    result = tpc_sort_key(input_str)
    assert result == expected
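
For illustration only, here is a minimal sketch of a key function that
produces the tuples listed above. It is not the code in this patch: the
name tpc_sort_key_sketch, the regular expression, and the assumption
that names look like "<prefix>-q<num><suffix>[-<num><suffix>]" are all
hypothetical. Names that don't match fall back to (0, 0, '', '')
instead of raising an error.

```python
import re

# Hypothetical pattern (not taken from the patch): "<prefix>-q<num><suffix>",
# optionally followed by "-<num><suffix>".
_TPC_NAME = re.compile(r"[^-]*-q([0-9]+)([A-Za-z]*)(?:-([0-9]+)([A-Za-z]*))?")


def tpc_sort_key_sketch(name):
    """Return (query_num, sub_num, query_suffix, sub_suffix) for sorting."""
    match = _TPC_NAME.match(name)
    if match is None:
        # Unrecognized names get a neutral key rather than an exception.
        return (0, 0, '', '')
    query_num, query_suffix, sub_num, sub_suffix = match.groups()
    # The "-<num><suffix>" part is optional; its groups are None if absent.
    return (int(query_num), int(sub_num or 0), query_suffix, sub_suffix or '')


names = ['tpcds-q10', 'tpcds-q2', 'tpcds-q1a', 'tpcds-q1']
print(sorted(names, key=tpc_sort_key_sketch))
# prints ['tpcds-q1', 'tpcds-q1a', 'tpcds-q2', 'tpcds-q10']
```

Putting the numeric parts first in the tuple makes q2 sort before q10,
which a plain string sort would not do; the suffixes only break ties
between queries that share the same numbers.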

Change-Id: Ib238ff09d5a2278c593f2759cf35f136b0ff1344
---
M tests/util/test_file_parser.py
1 file changed, 21 insertions(+), 9 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/21708/2
--
To view, visit http://gerrit.cloudera.org:8080/21708
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib238ff09d5a2278c593f2759cf35f136b0ff1344
Gerrit-Change-Number: 21708
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Reviewer: Jason Fehr <jf...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: gaurav singh <gsi...@cloudera.com>
