[jira] [Updated] (SPARK-27586) Improve binary comparison: replace Scala's for-comprehension if statements with while loop

2019-05-02 Thread Dongjoon Hyun (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-27586:
--
Affects Version/s: (was: 2.4.2)
   3.0.0

> Improve binary comparison: replace Scala's for-comprehension if statements 
> with while loop
> --
>
> Key: SPARK-27586
> URL: https://issues.apache.org/jira/browse/SPARK-27586
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
> Environment: benchmark env:
>  * Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
>  * Linux 4.4.0-33.bm.1-amd64
>  * java version "1.8.0_131"
>  * Scala 2.11.8
>  * perf version 4.4.0
> Run:
> 40,000,000 comparisons of 32-byte binary values
>  
>Reporter: WoudyGao
>Assignee: WoudyGao
>Priority: Minor
> Fix For: 3.0.0
>
>
> I found that the CPU cost of TypeUtils.compareBinary is noticeable when 
> handling some big Parquet files.
> After some perf profiling, I found that the for-comprehension with an if 
> guard executes ≈15X as many instructions as an equivalent while loop.
>  
> *'while-loop' version perf:*
>
>  {{      886.687949  task-clock (msec)         #    1.257 CPUs utilized}}
>  {{           3,089  context-switches          #    0.003 M/sec}}
>  {{             265  cpu-migrations            #    0.299 K/sec}}
>  {{          12,227  page-faults               #    0.014 M/sec}}
>  {{   2,209,183,920  cycles                    #    2.492 GHz}}
>  {{                  stalled-cycles-frontend}}
>  {{                  stalled-cycles-backend}}
>  {{   6,865,836,114  instructions              #    3.11  insns per cycle}}
>  {{   1,568,910,228  branches                  # 1769.405 M/sec}}
>  {{       9,172,613  branch-misses             #    0.58% of all branches}}
>
>  {{     0.705671157 seconds time elapsed}}
>
> *TypeUtils.compareBinary perf:*
>  {{    16347.242313  task-clock (msec)         #    1.233 CPUs utilized}}
>  {{           8,370  context-switches          #    0.512 K/sec}}
>  {{             481  cpu-migrations            #    0.029 K/sec}}
>  {{         536,671  page-faults               #    0.033 M/sec}}
>  {{  40,857,347,119  cycles                    #    2.499 GHz}}
>  {{                  stalled-cycles-frontend}}
>  {{                  stalled-cycles-backend}}
>  {{  90,606,381,612  instructions              #    2.22  insns per cycle}}
>  {{  18,107,867,151  branches                  # 1107.702 M/sec}}
>  {{      12,880,296  branch-misses             #    0.07% of all branches}}
>
>  {{    13.257617118 seconds time elapsed}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27586) Improve binary comparison: replace Scala's for-comprehension if statements with while loop

2019-04-28 Thread WoudyGao (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WoudyGao updated SPARK-27586:
-
Description: 
I found that the CPU cost of TypeUtils.compareBinary is noticeable when handling 
some big Parquet files.

After some perf profiling, I found that the for-comprehension with an if guard 
executes ≈15X as many instructions as an equivalent while loop.
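
To make the comparison concrete, here is a minimal sketch of the two loop styles. It is an illustration only and is not copied from Spark: the object name CompareBinarySketch and the method names compareWithFor and compareWithWhile are hypothetical, and the sketch assumes unsigned byte-wise comparison with the shorter array ordered first when one array is a prefix of the other.

```scala
// Hypothetical sketch, not the Spark source: two ways to compare byte arrays
// lexicographically, treating each byte as an unsigned value.
object CompareBinarySketch {

  // For-comprehension with an `if` guard. In Scala 2 this desugars to
  // (0 until x.length).withFilter(i => i < y.length).foreach(i => ...),
  // so every iteration goes through closure calls, and the `return` inside
  // the closure is a non-local return implemented with a control exception.
  def compareWithFor(x: Array[Byte], y: Array[Byte]): Int = {
    for (i <- 0 until x.length; if i < y.length) {
      val res = (x(i) & 0xff) - (y(i) & 0xff)
      if (res != 0) return res
    }
    x.length - y.length
  }

  // Plain while loop: a local counter, direct array access, and an ordinary
  // local return, with no Range, filter, or closure objects involved.
  def compareWithWhile(x: Array[Byte], y: Array[Byte]): Int = {
    val limit = math.min(x.length, y.length)
    var i = 0
    while (i < limit) {
      val res = (x(i) & 0xff) - (y(i) & 0xff)
      if (res != 0) return res
      i += 1
    }
    x.length - y.length
  }

  def main(args: Array[String]): Unit = {
    val a = Array[Byte](1, 2, 3)
    val b = Array[Byte](1, 2, 4)
    // Both variants should agree on every input.
    println(compareWithFor(a, b) == compareWithWhile(a, b))
  }
}
```

Both methods are meant to return the same result for any pair of inputs; only the amount of work per iteration differs, which is the kind of overhead the perf counters in this issue measure.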

 

*'while-loop' version perf:*

 {{      886.687949  task-clock (msec)         #    1.257 CPUs utilized}}
 {{           3,089  context-switches          #    0.003 M/sec}}
 {{             265  cpu-migrations            #    0.299 K/sec}}
 {{          12,227  page-faults               #    0.014 M/sec}}
 {{   2,209,183,920  cycles                    #    2.492 GHz}}
 {{                  stalled-cycles-frontend}}
 {{                  stalled-cycles-backend}}
 {{   6,865,836,114  instructions              #    3.11  insns per cycle}}
 {{   1,568,910,228  branches                  # 1769.405 M/sec}}
 {{       9,172,613  branch-misses             #    0.58% of all branches}}

 {{     0.705671157 seconds time elapsed}}

*TypeUtils.compareBinary perf:*
 {{    16347.242313  task-clock (msec)         #    1.233 CPUs utilized}}
 {{           8,370  context-switches          #    0.512 K/sec}}
 {{             481  cpu-migrations            #    0.029 K/sec}}
 {{         536,671  page-faults               #    0.033 M/sec}}
 {{  40,857,347,119  cycles                    #    2.499 GHz}}
 {{                  stalled-cycles-frontend}}
 {{                  stalled-cycles-backend}}
 {{  90,606,381,612  instructions              #    2.22  insns per cycle}}
 {{  18,107,867,151  branches                  # 1107.702 M/sec}}
 {{      12,880,296  branch-misses             #    0.07% of all branches}}

 {{    13.257617118 seconds time elapsed}}
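
As a quick cross-check of the two perf runs, the ratios can be computed directly from the counters reported in this issue. The sketch below only divides those numbers; the object name PerfRatios and its field names are hypothetical.

```scala
// Hypothetical helper: ratios between the TypeUtils.compareBinary run and the
// while-loop run, using the perf counters quoted in this issue.
object PerfRatios {
  // slow = TypeUtils.compareBinary run, fast = while-loop run
  def ratio(slow: Double, fast: Double): Double = slow / fast

  val instructions: Double = ratio(90606381612L.toDouble, 6865836114L.toDouble)
  val cycles: Double       = ratio(40857347119L.toDouble, 2209183920L.toDouble)
  val elapsed: Double      = ratio(13.257617118, 0.705671157)

  def main(args: Array[String]): Unit = {
    println(f"instructions ratio: $instructions%.1f")
    println(f"cycles ratio:       $cycles%.1f")
    println(f"elapsed ratio:      $elapsed%.1f")
  }
}
```

With these counters the instruction ratio works out to about 13.2, and the cycle and elapsed-time ratios to about 18.5 and 18.8, so the wall-clock gap in this benchmark is somewhat larger than the instruction-count gap.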

  was:
I found the cpu cost of TypeUtils.compareBinary is noticeable when handle some 
big parquet files;

After some perf work, I found:

In  the " for-comprehension if statements" will execute ≈15X instructions than 
while loop

 

*'while-loop' version perf:*

{{      886.687949  task-clock (msec)         #    1.257 CPUs utilized}}
{{           3,089  context-switches          #    0.003 M/sec}}
{{             265  cpu-migrations            #    0.299 K/sec}}
{{          12,227  page-faults               #    0.014 M/sec}}
{{   2,209,183,920  cycles                    #    2.492 GHz}}
{{                  stalled-cycles-frontend}}
{{                  stalled-cycles-backend}}
{{   6,865,836,114  instructions              #    3.11  insns per cycle}}
{{   1,568,910,228  branches                  # 1769.405 M/sec}}
{{       9,172,613  branch-misses             #    0.58% of all branches}}

{{     0.705671157 seconds time elapsed}}

*TypeUtils.compareBinary perf:*
{{    16347.242313  task-clock (msec)         #    1.233 CPUs utilized}}
{{           8,370  context-switches          #    0.512 K/sec}}
{{             481  cpu-migrations            #    0.029 K/sec}}
{{         536,671  page-faults               #    0.033 M/sec}}
{{  40,857,347,119  cycles                    #    2.499 GHz}}
{{                  stalled-cycles-frontend}}
{{                  stalled-cycles-backend}}
{{  90,606,381,612  instructions              #    2.22  insns per cycle}}
{{  18,107,867,151  branches                  # 1107.702 M/sec}}
{{      12,880,296  branch-misses             #    0.07% of all branches}}

{{    13.257617118 seconds time elapsed}}


> Improve binary comparison: replace Scala's for-comprehension if statements 
> with while loop
> --
>
> Key: SPARK-27586
> URL: https://issues.apache.org/jira/browse/SPARK-27586
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.2
> Environment: benchmark env:
>  * Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
>  * Linux 4.4.0-33.bm.1-amd64
>  * java version "1.8.0_131"
>  * Scala 2.11.8
>  * perf version 4.4.0
> Run:
> 40,000,000 comparisons of 32-byte binary values
>  
>Reporter: WoudyGao
>Priority: Minor
>
> I found that the CPU cost of TypeUtils.compareBinary is noticeable when 
> handling some big Parquet files.
> After some perf profiling, I found that the for-comprehension with an if 
> guard executes ≈15X as many instructions as an equivalent while loop.
>  
> *'while-loop' version perf:*
>
>  {{      886.687949  task-clock (msec)         #    1.257 CPUs utilized}}
>  {{           3,089  context-switches          #    0.003 M/sec}}
>  {{             265  cpu-migrations            #    0.299 K/sec}}
>  {{          12,227  page-faults               #    0.014 M/sec}}
>  {{