[ https://issues.apache.org/jira/browse/IMPALA-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510949#comment-17510949 ]

ASF subversion and git services commented on IMPALA-11192:
----------------------------------------------------------

Commit 7273c9f8a9698609816b680a79f21f9389bb265f in impala's branch 
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=7273c9f ]

IMPALA-11192: Batch uploading files in test_scanner_fuzz.py

test_scanner_fuzz.py runs much slower on ORC than on other formats. Most
of the time is spent uploading local files one by one to the HDFS table
folder.

The local files are copied from HDFS and randomly corrupted by the test.
The local directory layout stays the same as that of the table folder,
and there are no staging dirs that we need to skip. So we can upload the
whole local folder at once, which saves a lot of test time.
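
As a rough illustration of the difference, the sketch below builds the
commands for both approaches without running them. It is not the code from
the commit: the helper names per_file_put_commands and batch_put_command
are hypothetical, and it assumes the upload goes through the `hdfs dfs -put`
command-line client, which accepts several local sources (files or
directories) followed by one destination.

```python
import os
from typing import List


def per_file_put_commands(local_dir: str, hdfs_dir: str) -> List[List[str]]:
    """Hypothetical helper: one `hdfs dfs -put` command per local file."""
    commands = []
    for root, _dirs, files in os.walk(local_dir):
        # Mirror the local sub-directory (e.g. a partition dir) on the HDFS side.
        rel = os.path.relpath(root, local_dir)
        target_dir = hdfs_dir if rel == "." else hdfs_dir + "/" + rel.replace(os.sep, "/")
        for name in sorted(files):
            commands.append(
                ["hdfs", "dfs", "-put", "-f",
                 os.path.join(root, name), target_dir + "/" + name])
    return commands


def batch_put_command(local_dir: str, hdfs_dir: str) -> List[str]:
    """Hypothetical helper: a single command uploading every top-level entry.

    This is only valid under the condition stated above: the local layout
    matches the table folder and nothing in it needs to be skipped.
    """
    entries = sorted(os.path.join(local_dir, e) for e in os.listdir(local_dir))
    return ["hdfs", "dfs", "-put", "-f"] + entries + [hdfs_dir]
```

Each returned command could be passed to subprocess.check_call. If the
uploads do go through the command-line client, every invocation starts a
new JVM, so replacing one command per file with a single command is where
most of the saving would be expected to come from.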

Tested locally and verified the profiles of the queries that succeeded.
They all scan the expected number of rows.

Change-Id: I504e160b84b3cc01d3be0b4e242d3c372692d181
Reviewed-on: http://gerrit.cloudera.org:8080/18329
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> test_scanner_fuzz.py runs super slow on ORC format
> --------------------------------------------------
>
>                 Key: IMPALA-11192
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11192
>             Project: IMPALA
>          Issue Type: Test
>          Components: Infrastructure
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>
> I recently needed to run test_scanner_fuzz.py multiple times and found that
> a single run takes more than half an hour (for ORC alone).
> {code:bash}
> $ time -p impala-py.test --skip_hbase --table_formats=orc/def/block tests/query_test/test_scanners_fuzz.py
> real 2155.47
> user 2779.64
> sys 193.76
> {code}
> A Jenkins job shows that the ORC tests are much slower than the tests for
> other formats:
> ||Test name||Duration||Status||
> |test_fuzz_alltypes[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: avro/snap/block]|31 sec|Passed|
> |test_fuzz_alltypes[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: orc/def/block]|{color:#FF0000}2 min 1 sec{color}|Passed|
> |test_fuzz_alltypes[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: parquet/none]|35 sec|Passed|
> |test_fuzz_alltypes[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: text/none]|48 sec|Passed|
> |test_fuzz_alltypes[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: avro/snap/block]|40 sec|Passed|
> |test_fuzz_alltypes[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: orc/def/block]|{color:#FF0000}2 min 55 sec{color}|Passed|
> |test_fuzz_alltypes[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: parquet/none]|22 sec|Passed|
> |test_fuzz_alltypes[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: text/none]|29 sec|Passed|
> |test_fuzz_alltypes[... 'debug_action': None, \\| table_format: avro/snap/block]|32 sec|Passed|
> |test_fuzz_alltypes[... 'debug_action': None, \\| table_format: orc/def/block]|{color:#FF0000}3 min 25 sec{color}|Passed|
> |test_fuzz_alltypes[... 'debug_action': None, \\| table_format: parquet/none]|29 sec|Passed|
> |test_fuzz_alltypes[... 'debug_action': None, \\| table_format: text/none]|20 sec|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: avro/snap/block]|20 sec|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: orc/def/block]|{color:#FF0000}1 min 35 sec{color}|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: parquet/none]|22 sec|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: text/none]|18 sec|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: avro/snap/block]|20 sec|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: orc/def/block]|{color:#FF0000}1 min 16 sec{color}|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: parquet/none]|17 sec|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: text/none]|16 sec|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': None \\| table_format: avro/snap/block]|19 sec|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': None \\| table_format: orc/def/block]|{color:#FF0000}1 min 4 sec{color}|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': None \\| table_format: parquet/none]|22 sec|Passed|
> |test_fuzz_decimal_tbl[... 'debug_action': None \\| table_format: text/none]|29 sec|Passed|
> |test_fuzz_nested_types[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: avro/snap/block]|4 sec|Skipped|
> |test_fuzz_nested_types[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: orc/def/block]|{color:#FF0000}1 min 35 sec{color}|Passed|
> |test_fuzz_nested_types[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: parquet/none]|18 sec|Passed|
> |test_fuzz_nested_types[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: text/none]|4.7 sec|Skipped|
> |test_fuzz_nested_types[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: avro/snap/block]|5.2 sec|Skipped|
> |test_fuzz_nested_types[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: orc/def/block]|{color:#FF0000}1 min 59 sec{color}|Passed|
> |test_fuzz_nested_types[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: parquet/none]|17 sec|Passed|
> |test_fuzz_nested_types[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: text/none]|4.3 sec|Skipped|
> |test_fuzz_nested_types[... 'debug_action': None \\| table_format: avro/snap/block]|4.1 sec|Skipped|
> |test_fuzz_nested_types[... 'debug_action': None \\| table_format: orc/def/block]|{color:#FF0000}2 min 30 sec{color}|Passed|
> |test_fuzz_nested_types[... 'debug_action': None \\| table_format: parquet/none]|18 sec|Passed|
> |test_fuzz_nested_types[... 'debug_action': None \\| table_format: text/none]|3.5 sec|Skipped|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: avro/snap/block]|4 sec|Skipped|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: orc/def/block]|{color:#FF0000}17 min{color}|Passed|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: parquet/none]|20 sec|Passed|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: text/none]|3.1 sec|Skipped|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: avro/snap/block]|3.8 sec|Skipped|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: orc/def/block]|{color:#FF0000}9 min 51 sec{color}|Passed|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: parquet/none]|19 sec|Passed|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': '-1:OPEN:[email protected]' \\| table_format: text/none]|4.4 sec|Skipped|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': None \\| table_format: avro/snap/block]|3.9 sec|Skipped|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': None \\| table_format: orc/def/block]|{color:#FF0000}9 min 27 sec{color}|Passed|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': None \\| table_format: parquet/none]|19 sec|Passed|
> |test_fuzz_uncompressed_parquet_orc[... 'debug_action': None \\| table_format: text/none]|4.3 sec|Skipped|
> Tests on other formats take less than 1 min each, while ORC tests usually
> take several minutes.
> CC [~boroknagyz]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

