Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22844#discussion_r229213742

    --- Diff: sql/core/benchmarks/JSONBenchmarks-results.txt ---
    @@ -0,0 +1,33 @@
    +================================================================================================
    +Benchmark for performance of JSON parsing
    +================================================================================================
    +
    +OpenJDK 64-Bit Server VM 1.8.0_163-b01 on Windows 7 6.1
    +Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
    +JSON schema inferring:                   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    +------------------------------------------------------------------------------------------------
    +No encoding                                  48088 / 48180          2.1         480.9       1.0X
    +UTF-8 is set                                 71881 / 71992          1.4         718.8       0.7X
    +
    +OpenJDK 64-Bit Server VM 1.8.0_163-b01 on Windows 7 6.1
    +Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
    +JSON per-line parsing:                   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    +------------------------------------------------------------------------------------------------
    +No encoding                                  12107 / 12246          8.3         121.1       1.0X
    +UTF-8 is set                                 12375 / 12475          8.1         123.8       1.0X
    --- End diff --

    IIRC, this benchmark was added so that we can make sure that setting the encoding does not affect performance compared with the case without encoding (right, @MaxGekk?). We should fix this. @cloud-fan