[jira] [Updated] (SPARK-42789) Rewrite multiple GetJsonObjects to a JsonTuple if their json expression is the same

2023-12-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-42789:
---
Labels: pull-request-available  (was: )

> Rewrite multiple GetJsonObjects to a JsonTuple if their json expression is 
> the same
> ---
>
>                 Key: SPARK-42789
>                 URL: https://issues.apache.org/jira/browse/SPARK-42789
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.5.0
>            Reporter: Yuming Wang
>            Priority: Major
>              Labels: pull-request-available
>
> Benchmark result:
> {noformat}
> Running benchmark: Benchmark rewrite GetJsonObjects
>   Running case: Default: 2
>   Stopped after 2 iterations, 77193 ms
>   Running case: Rewrite: 2
>   Stopped after 2 iterations, 51699 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
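> Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ----------------------------------------------------------------------------------------------------------------
> Default: 2                                 37914          38597         966         0.2        5244.0       1.0X
> Rewrite: 2                                 24887          25850        1361         0.3        3442.2       1.5X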
> Running benchmark: Benchmark rewrite GetJsonObjects
>   Running case: Default: 3
>   Stopped after 2 iterations, 110890 ms
>   Running case: Rewrite: 3
>   Stopped after 2 iterations, 56102 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
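> Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ----------------------------------------------------------------------------------------------------------------
> Default: 3                                 52862          55445         NaN         0.1        7311.6       1.0X
> Rewrite: 3                                 26752          28051        1837         0.3        3700.2       2.0X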
> Running benchmark: Benchmark rewrite GetJsonObjects
>   Running case: Default: 4
>   Stopped after 2 iterations, 150828 ms
>   Running case: Rewrite: 4
>   Stopped after 2 iterations, 57110 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
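> Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ----------------------------------------------------------------------------------------------------------------
> Default: 4                                 71680          75414         NaN         0.1        9914.4       1.0X
> Rewrite: 4                                 28452          28555         145         0.3        3935.4       2.5X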
> Running benchmark: Benchmark rewrite GetJsonObjects
>   Running case: Default: 5
>   Stopped after 2 iterations, 223367 ms
>   Running case: Rewrite: 5
>   Stopped after 2 iterations, 78193 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
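> Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ----------------------------------------------------------------------------------------------------------------
> Default: 5                                108479         111684        1447         0.1       15004.2       1.0X
> Rewrite: 5                                 36830          39097         NaN         0.2        5094.0       2.9X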
> Running benchmark: Benchmark rewrite GetJsonObjects
>   Running case: Default: 10
>   Stopped after 2 iterations, 311453 ms
>   Running case: Rewrite: 10
>   Stopped after 2 iterations, 65873 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
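> Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ----------------------------------------------------------------------------------------------------------------
> Default: 10                               153952         155727        2510         0.0       21293.7       1.0X
> Rewrite: 10                                32436          32937         708         0.2        4486.3       4.7X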
> Running benchmark: Benchmark rewrite GetJsonObjects
>   Running case: Default: 15
>   Stopped after 2 iterations, 451911 ms
>   Running 

[jira] [Updated] (SPARK-42789) Rewrite multiple GetJsonObjects to a JsonTuple if their json expression is the same

2023-03-14 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang updated SPARK-42789:

Description: 
Benchmark result:
{noformat}

Running benchmark: Benchmark rewrite GetJsonObjects
  Running case: Default: 2
  Stopped after 2 iterations, 77193 ms
  Running case: Rewrite: 2
  Stopped after 2 iterations, 51699 ms

Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
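Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
----------------------------------------------------------------------------------------------------------------
Default: 2                                 37914          38597         966         0.2        5244.0       1.0X
Rewrite: 2                                 24887          25850        1361         0.3        3442.2       1.5X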

Running benchmark: Benchmark rewrite GetJsonObjects
  Running case: Default: 3
  Stopped after 2 iterations, 110890 ms
  Running case: Rewrite: 3
  Stopped after 2 iterations, 56102 ms

Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
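Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
----------------------------------------------------------------------------------------------------------------
Default: 3                                 52862          55445         NaN         0.1        7311.6       1.0X
Rewrite: 3                                 26752          28051        1837         0.3        3700.2       2.0X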

Running benchmark: Benchmark rewrite GetJsonObjects
  Running case: Default: 4
  Stopped after 2 iterations, 150828 ms
  Running case: Rewrite: 4
  Stopped after 2 iterations, 57110 ms

Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
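Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
----------------------------------------------------------------------------------------------------------------
Default: 4                                 71680          75414         NaN         0.1        9914.4       1.0X
Rewrite: 4                                 28452          28555         145         0.3        3935.4       2.5X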

Running benchmark: Benchmark rewrite GetJsonObjects
  Running case: Default: 5
  Stopped after 2 iterations, 223367 ms
  Running case: Rewrite: 5
  Stopped after 2 iterations, 78193 ms

Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
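Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
----------------------------------------------------------------------------------------------------------------
Default: 5                                108479         111684        1447         0.1       15004.2       1.0X
Rewrite: 5                                 36830          39097         NaN         0.2        5094.0       2.9X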

Running benchmark: Benchmark rewrite GetJsonObjects
  Running case: Default: 10
  Stopped after 2 iterations, 311453 ms
  Running case: Rewrite: 10
  Stopped after 2 iterations, 65873 ms

Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
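Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
----------------------------------------------------------------------------------------------------------------
Default: 10                               153952         155727        2510         0.0       21293.7       1.0X
Rewrite: 10                                32436          32937         708         0.2        4486.3       4.7X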

Running benchmark: Benchmark rewrite GetJsonObjects
  Running case: Default: 15
  Stopped after 2 iterations, 451911 ms
  Running case: Rewrite: 15
  Stopped after 2 iterations, 69790 ms

Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
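Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
----------------------------------------------------------------------------------------------------------------
Default: 15                               224950         225956        1423         0.0       31113.6       1.0X
Rewrite: 15                                34806          34895         126         0.2        4814.2       6.5X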

Running benchmark: Benchmark 

[jira] [Updated] (SPARK-42789) Rewrite multiple GetJsonObjects to a JsonTuple if their json expression is the same

2023-03-14 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang updated SPARK-42789:

Summary: Rewrite multiple GetJsonObjects to a JsonTuple if their json 
expression is the same  (was: rewrites multiple GetJsonObjects to a JsonTuple 
if their json expression is the same)
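
For readers unfamiliar with the idea, here is a minimal sketch of why such a
rewrite could help. It is plain Python, not Spark code. It assumes that each
GetJsonObject call parses the JSON string on its own, while a single JsonTuple
parses it once and extracts all requested fields. The functions parse,
get_json_object and json_tuple below are hypothetical stand-ins written for
this illustration. They only handle top-level fields and return Python values,
not strings as Spark does.

```python
import json

# Counts how many times the JSON text is parsed (illustration only).
parse_count = 0


def parse(doc):
    """Parse a JSON string; return {} for invalid or non-object input."""
    global parse_count
    parse_count += 1
    try:
        obj = json.loads(doc)
    except ValueError:
        return {}
    return obj if isinstance(obj, dict) else {}


def get_json_object(doc, field):
    """Hypothetical stand-in: one parse per extracted field."""
    return parse(doc).get(field)


def json_tuple(doc, *fields):
    """Hypothetical stand-in: one parse shared by all extracted fields."""
    obj = parse(doc)
    return tuple(obj.get(f) for f in fields)


doc = '{"a": 1, "b": "x", "c": null}'
fields = ("a", "b", "c")

# Three separate extractions from the same JSON expression.
parse_count = 0
separate = tuple(get_json_object(doc, f) for f in fields)
parses_separate = parse_count

# One combined extraction over the same JSON expression.
parse_count = 0
combined = json_tuple(doc, *fields)
parses_combined = parse_count

print(separate == combined, parses_separate, parses_combined)
```

Under these assumptions both forms return the same values, but the combined
form parses the document once instead of once per field.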

> Rewrite multiple GetJsonObjects to a JsonTuple if their json expression is 
> the same
> ---
>
>                 Key: SPARK-42789
>                 URL: https://issues.apache.org/jira/browse/SPARK-42789
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.5.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> Benchmark result:
> {noformat}
> Running benchmark: Benchmark rewrite GetJsonObjects
>   Running case: Default: 2
>   Stopped after 2 iterations, 80787 ms
>   Running case: Rewrite: 2
>   Stopped after 2 iterations, 48900 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
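> Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ----------------------------------------------------------------------------------------------------------------
> Default: 2                                 39026          40394        1935         0.2        5397.8       1.0X
> Rewrite: 2                                 24354          24450         137         0.3        3368.4       1.6X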
> Running benchmark: Benchmark rewrite GetJsonObjects
>   Running case: Default: 3
>   Stopped after 2 iterations, 115055 ms
>   Running case: Rewrite: 3
>   Stopped after 2 iterations, 62297 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
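> Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ----------------------------------------------------------------------------------------------------------------
> Default: 3                                 54652          57528         NaN         0.1        7559.1       1.0X
> Rewrite: 3                                 30702          31149         631         0.2        4246.6       1.8X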
> Running benchmark: Benchmark rewrite GetJsonObjects
>   Running case: Default: 4
>   Stopped after 2 iterations, 155392 ms
>   Running case: Rewrite: 4
>   Stopped after 2 iterations, 54776 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
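> Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ----------------------------------------------------------------------------------------------------------------
> Default: 4                                 75503          77696         NaN         0.1       10443.1       1.0X
> Rewrite: 4                                 26962          27388         602         0.3        3729.3       2.8X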
> Running benchmark: Benchmark rewrite GetJsonObjects
>   Running case: Default: 5
>   Stopped after 2 iterations, 192836 ms
>   Running case: Rewrite: 5
>   Stopped after 2 iterations, 51967 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
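> Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ----------------------------------------------------------------------------------------------------------------
> Default: 5                                 94923          96418        2115         0.1       13129.1       1.0X
> Rewrite: 5                                 25362          25984         880         0.3        3507.8       3.7X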
> Running benchmark: Benchmark rewrite GetJsonObjects
>   Running case: Default: 10
>   Stopped after 2 iterations, 317246 ms
>   Running case: Rewrite: 10
>   Stopped after 2 iterations, 56734 ms
> Java HotSpot(TM) 64-Bit Server VM 17.0.4.1+1-LTS-2 on Mac OS X 13.2.1
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
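> Benchmark rewrite GetJsonObjects:  Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
> ----------------------------------------------------------------------------------------------------------------
> Default: 10                               157458         158623        1648         0.0       21778.6       1.0X
> Rewrite: 10                                28296          28367         100         0.3        3913.8       5.6X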
> Running benchmark: Benchmark rewrite