[
https://issues.apache.org/jira/browse/SPARK-27586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
WoudyGao updated SPARK-27586:
-
Description:
I found that the CPU cost of TypeUtils.compareBinary is noticeable when handling some
big Parquet files.
After some perf work, I found that
the "for-comprehension if statements" version executes ≈15X the instructions of a
while loop.
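For illustration only, the two loop styles compared above might look roughly like the sketch below. It is not copied from TypeUtils or from any patch; the object and method names (BinaryCompareSketch, compareFor, compareWhile) are hypothetical. Both methods compare byte arrays as unsigned values and fall back to the length difference when one array is a prefix of the other.

```scala
object BinaryCompareSketch {
  // For-comprehension with an `if` guard. In Scala 2 this desugars to
  // (0 until x.length).withFilter(i => i < y.length).foreach { i => ... },
  // so each iteration goes through closure calls, and the `return` inside
  // the body is a non-local return implemented with an exception.
  def compareFor(x: Array[Byte], y: Array[Byte]): Int = {
    for (i <- 0 until x.length; if i < y.length) {
      val res = (x(i) & 0xff) - (y(i) & 0xff) // compare as unsigned bytes
      if (res != 0) return res
    }
    x.length - y.length
  }

  // Plain while loop over the common prefix: no Range, no closures.
  def compareWhile(x: Array[Byte], y: Array[Byte]): Int = {
    val limit = math.min(x.length, y.length)
    var i = 0
    while (i < limit) {
      val res = (x(i) & 0xff) - (y(i) & 0xff) // compare as unsigned bytes
      if (res != 0) return res
      i += 1
    }
    x.length - y.length
  }
}
```

The desugaring overhead is one plausible contributor to the difference in instruction counts; the sketch does not by itself reproduce the measurements.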
*'while-loop' version perf:*
{{886.687949 task-clock (msec) # 1.257 CPUs utilized}}
{{ 3,089 context-switches # 0.003 M/sec}}
{{ 265 cpu-migrations # 0.299 K/sec}}
{{12,227 page-faults # 0.014 M/sec}}
{{ 2,209,183,920 cycles # 2.492 GHz}}
{{ stalled-cycles-frontend}}
{{ stalled-cycles-backend}}
{{ 6,865,836,114 instructions # 3.11 insns per cycle}}
{{ 1,568,910,228 branches # 1769.405 M/sec}}
{{ 9,172,613 branch-misses # 0.58% of all branches}}
{{ 0.705671157 seconds time elapsed}}
*TypeUtils.compareBinary perf:*
{{ 16347.242313 task-clock (msec) # 1.233 CPUs utilized}}
{{ 8,370 context-switches # 0.512 K/sec}}
{{ 481 cpu-migrations # 0.029 K/sec}}
{{ 536,671 page-faults # 0.033 M/sec}}
{{40,857,347,119 cycles # 2.499 GHz}}
{{ stalled-cycles-frontend}}
{{ stalled-cycles-backend}}
{{90,606,381,612 instructions # 2.22 insns per cycle}}
{{18,107,867,151 branches # 1107.702 M/sec}}
{{12,880,296 branch-misses # 0.07% of all branches}}
{{ 13.257617118 seconds time elapsed}}
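As a cross-check, the ratios implied by the two perf stat runs above can be computed directly from the quoted counters. The sketch below uses a hypothetical object name (PerfRatios). With these numbers the instruction count differs by roughly 13x, and cycles and elapsed time by roughly 18x to 19x, which is in line with the rounded ≈15X estimate.

```scala
object PerfRatios {
  // Counters copied from the two perf stat outputs quoted above.
  val whileInstructions = 6865836114L
  val forInstructions   = 90606381612L
  val whileCycles       = 2209183920L
  val forCycles         = 40857347119L
  val whileElapsedSec   = 0.705671157
  val forElapsedSec     = 13.257617118

  // How many times larger the for-comprehension figure is.
  val instructionRatio: Double = forInstructions.toDouble / whileInstructions
  val cycleRatio: Double       = forCycles.toDouble / whileCycles
  val elapsedRatio: Double     = forElapsedSec / whileElapsedSec

  def main(args: Array[String]): Unit = {
    println(f"instructions: $instructionRatio%.1fx")
    println(f"cycles:       $cycleRatio%.1fx")
    println(f"elapsed:      $elapsedRatio%.1fx")
  }
}
```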
was:
I found the cpu cost of TypeUtils.compareBinary is noticeable when handle some
big parquet files;
After some perf work, I found:
In the " for-comprehension if statements" will execute ≈15X instructions than
while loop
*'while-loop' version perf:*
{{886.687949 task-clock (msec) # 1.257 CPUs utilized}}
{{ 3,089 context-switches # 0.003 M/sec}}
{{ 265 cpu-migrations # 0.299 K/sec}}
{{12,227 page-faults # 0.014 M/sec}}
{{ 2,209,183,920 cycles # 2.492 GHz}}
{{ stalled-cycles-frontend}}
{{ stalled-cycles-backend}}
{{ 6,865,836,114 instructions # 3.11 insns per cycle}}
{{ 1,568,910,228 branches # 1769.405 M/sec}}
{{ 9,172,613 branch-misses # 0.58% of all branches}}
{{ 0.705671157 seconds time elapsed}}
*TypeUtils.compareBinary perf:*
{{ 16347.242313 task-clock (msec) # 1.233 CPUs utilized}}
{{ 8,370 context-switches # 0.512 K/sec}}
{{ 481 cpu-migrations # 0.029 K/sec}}
{{ 536,671 page-faults # 0.033 M/sec}}
{{40,857,347,119 cycles # 2.499 GHz}}
{{ stalled-cycles-frontend}}
{{ stalled-cycles-backend}}
{{90,606,381,612 instructions # 2.22 insns per cycle}}
{{18,107,867,151 branches # 1107.702 M/sec}}
{{12,880,296 branch-misses # 0.07% of all branches}}
{{ 13.257617118 seconds time elapsed}}
> Improve binary comparison: replace Scala's for-comprehension if statements
> with while loop
> --
>
> Key: SPARK-27586
> URL: https://issues.apache.org/jira/browse/SPARK-27586
> Project: Spark
> Issue Type: Improvement
> Components: SQL
>Affects Versions: 2.4.2
> Environment: benchmark env:
> * Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
> * Linux 4.4.0-33.bm.1-amd64
> * java version "1.8.0_131"
> * Scala 2.11.8
> * perf version 4.4.0
> Run:
> 40,000,000 comparisons of 32-byte binary values
>
>Reporter: WoudyGao
>Priority: Minor
>
> I found that the CPU cost of TypeUtils.compareBinary is noticeable when handling
> some big Parquet files.
> After some perf work, I found that
> the "for-comprehension if statements" version executes ≈15X the instructions of a
> while loop.
>
> *'while-loop' version perf:*
>
> {{886.687949 task-clock (msec) # 1.257 CPUs utilized}}
> {{ 3,089 context-switches # 0.003 M/sec}}
> {{ 265 cpu-migrations # 0.299 K/sec}}
> {{12,227 page-faults # 0.014 M/sec}}
> {{