yjshen commented on PR #2146:
URL:
https://github.com/apache/arrow-datafusion/pull/2146#issuecomment-1087134690
# TPC-H SF=1
`master`:
```
Running benchmarks with the following options: DataFusionBenchmarkOpt {
query: 1, debug: false, iterations: 3, partitions: 2, batch_size: 4096, path:
"../../tpch-parquet/", file_format: "parquet", mem_table: false, output_path:
None }
Query 1 iteration 0 took 2851.7 ms and returned 6001214 rows
Query 1 iteration 1 took 2817.7 ms and returned 6001214 rows
Query 1 iteration 2 took 2735.9 ms and returned 6001214 rows
Query 1 avg time: 2801.75 ms
```
This PR:
```
Running benchmarks with the following options: DataFusionBenchmarkOpt {
query: 1, debug: false, iterations: 3, partitions: 2, batch_size: 4096, path:
"/home/yijie/sort_test/tpch-parquet", file_format: "parquet", mem_table: false,
output_path: None }
Query 1 iteration 0 took 3174.9 ms and returned 6001214 rows
Query 1 iteration 1 took 3130.8 ms and returned 6001214 rows
Query 1 iteration 2 took 3058.3 ms and returned 6001214 rows
Query 1 avg time: 3121.35 ms
```
The row format comes at the price of more computation: Query 1 is ~11% slower
on average (3121.35 ms vs. 2801.75 ms). Although this PR shows better cache
locality, the computation cost outweighs the cache benefits:
```
sudo perf stat -a \
  -e cache-misses,cache-references,l3_cache_accesses,l3_misses,dTLB-load-misses,dTLB-loads \
  target/release/tpch benchmark datafusion --iterations 3 \
  --path /home/yijie/sort_test/tpch-parquet --format parquet --query 1 --batch-size 4096
```
`master`:
```
Performance counter stats for 'system wide':
756,702,553 cache-misses # 34.256 % of all cache refs
2,208,936,269 cache-references
1,156,898,644 l3_cache_accesses
362,860,081 l3_misses
215,166,268 dTLB-load-misses # 45.27% of all dTLB cache accesses
475,312,480 dTLB-loads
8.774750150 seconds time elapsed
```
This PR:
```
Performance counter stats for 'system wide':
593,785,538 cache-misses # 25.841 % of all cache refs
2,297,807,480 cache-references
835,622,737 l3_cache_accesses
227,838,803 l3_misses
146,442,785 dTLB-load-misses # 55.76% of all dTLB cache accesses
262,616,456 dTLB-loads
10.556249442 seconds time elapsed
```
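Cache access behavior is much better with the row format: the cache miss rate
drops from 34.256% to 25.841%, and L3 misses fall from 362,860,081 to
227,838,803. The dTLB miss ratio is higher (55.76% vs. 45.27%), but the
absolute number of dTLB load misses is lower (146,442,785 vs. 215,166,268).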
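
For anyone who wants to re-derive the ratios, here is a minimal sketch that recomputes the slowdown and miss rates from the figures reported above. The variable and function names are only illustrative; the numbers are copied from the benchmark and `perf stat` output in this comment, nothing is measured here.

```python
# Recompute the ratios quoted in this comment from the raw figures.
# All numbers are copied from the benchmark and perf stat output above;
# the names below are illustrative only.

avg_ms = {"master": 2801.75, "pr": 3121.35}

counters = {
    "master": {
        "cache_misses": 756_702_553,
        "cache_references": 2_208_936_269,
        "l3_misses": 362_860_081,
        "l3_cache_accesses": 1_156_898_644,
        "dtlb_load_misses": 215_166_268,
        "dtlb_loads": 475_312_480,
    },
    "pr": {
        "cache_misses": 593_785_538,
        "cache_references": 2_297_807_480,
        "l3_misses": 227_838_803,
        "l3_cache_accesses": 835_622_737,
        "dtlb_load_misses": 146_442_785,
        "dtlb_loads": 262_616_456,
    },
}


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def miss_rates(c):
    """Miss ratios (in percent) for one perf stat run."""
    return {
        "cache": pct(c["cache_misses"], c["cache_references"]),
        "l3": pct(c["l3_misses"], c["l3_cache_accesses"]),
        "dtlb": pct(c["dtlb_load_misses"], c["dtlb_loads"]),
    }


# Relative slowdown of this PR versus master on Query 1 average time.
slowdown_pct = pct(avg_ms["pr"] - avg_ms["master"], avg_ms["master"])
rates = {branch: miss_rates(c) for branch, c in counters.items()}

print(f"slowdown: {slowdown_pct:.1f}%")
for branch, r in rates.items():
    print(branch, {name: round(value, 2) for name, value in r.items()})
```

Comparing both the ratios and the absolute counts matters here: the dTLB miss ratio goes up with this PR even though the number of dTLB load misses goes down, because the number of dTLB loads shrinks even more.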