Joe McDonnell created IMPALA-11453:
--------------------------------------

             Summary: Add option to run-workload.py to have "warm-up" runs of query
                 Key: IMPALA-11453
                 URL: https://issues.apache.org/jira/browse/IMPALA-11453
             Project: IMPALA
          Issue Type: Improvement
          Components: Infrastructure
    Affects Versions: Impala 4.2.0
            Reporter: Joe McDonnell


bin/run-workload.py has an option to explain the query before running it the 
first time. This gets the metadata loading out of the way so that it doesn't 
impact the first query time.

It would be useful to add another option that runs the query a couple of times 
to warm up any caches before measurement starts. This would reduce variation 
caused by, for example, the data not yet being in the OS buffer cache.
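
A minimal sketch of the idea, not the actual run-workload.py code: the 
function name timed_runs and the parameters run_query, warmup_iterations and 
iterations are hypothetical, chosen only for illustration. The warm-up runs 
are executed and discarded, and only the later runs are timed.

```python
import time
from typing import Callable, List


def timed_runs(run_query: Callable[[], None],
               warmup_iterations: int,
               iterations: int) -> List[float]:
    """Run the query unmeasured `warmup_iterations` times, then time `iterations` runs.

    `run_query` is a hypothetical callable that executes the query once.
    """
    # Warm-up phase: results and timings are thrown away on purpose.
    for _ in range(warmup_iterations):
        run_query()

    # Measurement phase: only these runs contribute to the reported timings.
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings
```

With warmup_iterations=0 this behaves like a plain timing loop, so such an 
option could default to the current behavior.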

In my runs of perf-AB-test, the first run of a query is sometimes noticeably 
slower than the runs that follow (for either A or B):
{noformat}
Run 1-3:
22:34:36 | TPCH-Q1  | 2022-07-20 04:16:12 | 7.52           | 1         |
22:34:36 | TPCH-Q1  | 2022-07-20 04:16:20 | 4.82           | 1         |
22:34:36 | TPCH-Q1  | 2022-07-20 04:16:25 | 5.04           | 1         |

Run 1-3:
22:34:36 | TPCH-Q11 | 2022-07-20 04:23:21 | 1.12           | 1         |
22:34:36 | TPCH-Q11 | 2022-07-20 04:23:23 | 0.93           | 1         |
22:34:36 | TPCH-Q11 | 2022-07-20 04:23:23 | 0.97           | 1         |

Run 1-3:
22:34:36 | TPCH-Q12 | 2022-07-20 04:24:13 | 2.23           | 1         |
22:34:36 | TPCH-Q12 | 2022-07-20 04:24:15 | 1.88           | 1         |
22:34:36 | TPCH-Q12 | 2022-07-20 04:24:17 | 1.78           | 1         |
{noformat}
If we ran the query a couple of times before starting to record timings, the 
benchmark would be more consistent. This seems like a useful setting for 
single_node_perf_run.py.
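
To put a number on the gap, here is a small sketch that compares the first 
run with the mean of the later runs, using the timings quoted above. It 
assumes the fourth column is the query time in seconds; the helper name 
first_run_ratio is hypothetical.

```python
from statistics import mean

# Timings copied from the runs quoted above; the fourth column is assumed
# to be the query time in seconds.
runs = {
    "TPCH-Q1": [7.52, 4.82, 5.04],
    "TPCH-Q11": [1.12, 0.93, 0.97],
    "TPCH-Q12": [2.23, 1.88, 1.78],
}


def first_run_ratio(timings):
    """Return the first timing divided by the mean of the remaining timings."""
    return timings[0] / mean(timings[1:])


for query, timings in runs.items():
    print(f"{query}: first run is {first_run_ratio(timings):.2f}x the later runs")
```

For these three queries the first run comes out at roughly 1.5x, 1.2x and 
1.2x the later runs.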



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
