dongjoon-hyun commented on code in PR #42667:
URL: https://github.com/apache/spark/pull/42667#discussion_r1315206808


##########
sql/core/benchmarks/JsonBenchmark-results.txt:
##########
@@ -3,121 +3,125 @@ Benchmark for performance of JSON parsing
 
================================================================================================
 
 Preparing data for benchmarking ...
-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
+Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
 JSON schema inferring:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                        3720           3843         121          1.3         743.9       1.0X
-UTF-8 is set                                       5412           5455          45          0.9        1082.4       0.7X
+No encoding                                        2084           2134          46          2.4         416.8       1.0X
+UTF-8 is set                                       3077           3093          14          1.6         615.3       0.7X
 
 Preparing data for benchmarking ...
-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
+Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
 count a short column:                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                        3234           3254          33          1.5         646.7       1.0X
-UTF-8 is set                                       4847           4868          21          1.0         969.5       0.7X
+No encoding                                        2854           2863           8          1.8         570.8       1.0X
+UTF-8 is set                                       4066           4066           1          1.2         813.1       0.7X
 
 Preparing data for benchmarking ...
-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
+Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
 count a wide column:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                        5702           5794         101          0.2        5702.1       1.0X
-UTF-8 is set                                       9526           9607          73          0.1        9526.1       0.6X
+No encoding                                        3348           3368          26          0.3        3347.8       1.0X
+UTF-8 is set                                       5215           5239          22          0.2        5214.7       0.6X
 
 Preparing data for benchmarking ...
-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
+Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
 select wide row:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-No encoding                                       18318          18448         199          0.0      366367.7       1.0X
-UTF-8 is set                                      19791          19887          99          0.0      395817.1       0.9X
+No encoding                                       11046          11102          54          0.0      220928.4       1.0X
+UTF-8 is set                                      12135          12181          54          0.0      242697.4       0.9X
 
 Preparing data for benchmarking ...
-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
+Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
 Select a subset of 10 columns:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Select 10 columns                                  2531           2570          51          0.4        2531.3       1.0X
-Select 1 column                                    1867           1882          16          0.5        1867.0       1.4X
+Select 10 columns                                  2486           2488           2          0.4        2486.5       1.0X
+Select 1 column                                    1505           1506           2          0.7        1504.6       1.7X
 
 Preparing data for benchmarking ...
-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
+Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
 creation of JSON parser per line:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Short column without encoding                       868            875           7          1.2         868.4       1.0X
-Short column with UTF-8                            1151           1163          11          0.9        1150.9       0.8X
-Wide column without encoding                      12063          12299         205          0.1       12063.0       0.1X
-Wide column with UTF-8                            16095          16136          51          0.1       16095.3       0.1X
+Short column without encoding                       888            889           3          1.1         887.6       1.0X
+Short column with UTF-8                            1134           1136           2          0.9        1134.3       0.8X
+Wide column without encoding                       8012           8056          51          0.1        8012.4       0.1X
+Wide column with UTF-8                             9830           9844          22          0.1        9829.7       0.1X
 
 Preparing data for benchmarking ...
-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
+Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
 JSON functions:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Text read                                           165            170           4          6.1         164.7       1.0X
-from_json                                          2339           2386          77          0.4        2338.9       0.1X
-json_tuple                                         2667           2730          55          0.4        2667.3       0.1X
-get_json_object                                    2627           2659          32          0.4        2627.1       0.1X
+Text read                                            85             87           2         11.7          85.4       1.0X
+from_json                                          1706           1711           4          0.6        1706.4       0.1X
+json_tuple                                         1528           1534           7          0.7        1528.2       0.1X
+get_json_object                                    1275           1286          17          0.8        1275.0       0.1X
 
 Preparing data for benchmarking ...
-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
+Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
 Dataset of json strings:                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Text read                                           700            715          20          7.1         140.1       1.0X
-schema inferring                                   3144           3166          20          1.6         628.7       0.2X
-parsing                                            3261           3271           9          1.5         652.1       0.2X
+Text read                                           369            370           1         13.6          73.8       1.0X
+schema inferring                                   1880           1883           4          2.7         376.0       0.2X
+parsing                                            3731           3737           8          1.3         746.1       0.1X
 
 Preparing data for benchmarking ...
-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1037-azure
-Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
+Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
 Json files in the per-line mode:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-Text read                                          1096           1105          12          4.6         219.1       1.0X
-Schema inferring                                   3818           3830          16          1.3         763.6       0.3X
-Parsing without charset                            4107           4137          32          1.2         821.4       0.3X
-Parsing with UTF-8                                 5717           5763          41          0.9        1143.3       0.2X
+Text read                                           553            579          32          9.0         110.6       1.0X
+Schema inferring                                   2195           2196           2          2.3         439.0       0.3X
+Parsing without charset                            4272           4274           3          1.2         854.3       0.1X

Review Comment:
   Given the ratio between `Text read` and `Parsing without charset` (the
Relative value dropped from 0.3X to 0.1X), is there a chance of a regression
in this PR?
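
To make the concern concrete, here is a rough check that uses only the `Best Time(ms)` values from the "Json files in the per-line mode" hunk above. The dictionary and function names are illustrative and are not part of the Spark benchmark code. The old and new runs come from different machines (Azure vs. AWS), so only the ratios are comparable, not the absolute times.

```python
# Best Time(ms) values copied from the "Json files in the per-line mode" hunk.
old = {"Text read": 1096, "Parsing without charset": 4107}
new = {"Text read": 553, "Parsing without charset": 4272}


def slowdown_vs_text_read(results):
    """Return how many times slower parsing is than a plain text read."""
    return results["Parsing without charset"] / results["Text read"]


old_ratio = slowdown_vs_text_read(old)
new_ratio = slowdown_vs_text_read(new)
print(f"old: {old_ratio:.2f}x, new: {new_ratio:.2f}x")
```

By this measure parsing goes from roughly 3.7x to roughly 7.7x the cost of a plain text read, which is consistent with the Relative column moving from 0.3X to 0.1X. The numbers alone do not say whether the cause is this PR or the change of benchmark environment.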



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

