GitHub user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733

@tdas I found some spare time to run performance tests, though I've run only one app for now: I couldn't run the tests concurrently. Please let me know if the results from one app are not convincing enough, and I'll find more time to go through all the test cases. I hope these numbers give enough confidence to accept the patch.

> Machine info.

MBP 15-inch Mid 2015
* i7 2.5 GHz (4 cores)
* 16GB 1600 MHz DDR3
* SSD 512G

> Test information

* base commit: c9914cf (latest master branch)
* patch internally rebased onto the base commit before testing
* spark-submit options: `--master local[3] --driver-memory 6g`
  * I didn't run the perf. test with all cores and memory: I left some spare resources for the OS and background apps.

> Performance test code

https://github.com/HeartSaVioR/iot-trucking-app-spark-structured-streaming/blob/master/src/main/scala/com/hortonworks/spark/benchmark/BenchmarkMovingAggregationsListener.scala

Please note that there are 4 more apps (big key size, big value size, many key columns, many value columns) in the same repository.

> Test result

Neither version kept up with the configured rate of 200000 rows per second, but since processed rows per second were around 188000 I felt I didn't need to tune the rate more tightly (like 185000, 190000, etc.).

The numbers for input rows per second and processed rows per second are averages over 3 batches (batches 38, 39, and 40). The numbers regarding state were taken when total state rows reached 60000.

version | input rows per second | processed rows per second | total state rows | used bytes of current state version
---- | ---- | ---- | ---- | ----
latest master (c9914cf) | 200492.065 | 188880.316 | 60000 | 17,755,895
patch (on top of c9914cf) | 199242.598 | 188160.833 | 60000 | 14,687,543

So while the processed rows per second of the two versions did not differ noticeably (under 1%), the patch reduced the memory usage of state (for the latest version) by about 17.3%.

One thing to note: in this performance test, state is saved to the local SSD. The patch may give a (small? trivial?) additional performance benefit when a remote checkpoint directory is set.
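For anyone who wants to re-check the percentages, here is a minimal sketch of the arithmetic in Python. It is not part of the benchmark code; the helper name `relative_change_pct` and the dictionary keys are made up for illustration, and the figures are copied from the table above.

```python
# Figures copied from the results table (base commit c9914cf vs. the patch).
master = {"processed_rows_per_sec": 188880.316, "state_bytes": 17_755_895}
patch = {"processed_rows_per_sec": 188160.833, "state_bytes": 14_687_543}


def relative_change_pct(base, new):
    """Percentage change of `new` relative to `base` (negative means a decrease)."""
    return (new - base) / base * 100.0


# Hypothetical helper applied to both metrics of interest.
throughput_change = relative_change_pct(
    master["processed_rows_per_sec"], patch["processed_rows_per_sec"]
)
state_change = relative_change_pct(master["state_bytes"], patch["state_bytes"])

print(f"processed rows per second change: {throughput_change:.2f}%")
print(f"state size change: {state_change:.2f}%")
```

Both changes are computed relative to master, so a negative value means the patch's number is lower than the baseline.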