I'm evaluating the Flight benchmark [1] on a single host and have run into a problem that I'd like some help with.

The Flight benchmark has a "num_threads" parameter [1] that sets the "number of concurrent gets". Counter-intuitively, larger values reduce performance: "arrow-flight-benchmark --num_threads=1" performs much better than "arrow-flight-benchmark --num_threads=2".

An earlier thread [2] discusses this issue and explains that it is better to spawn more servers on different ports than to have all threads go to a single server application.

I ran another test with a standalone server, and the result is different.

1. Spawn a standalone Flight server:

   $ ./arrow-flight-perf-server
   Server host: localhost
   Server port: 31337

2. Run one Flight benchmark to get baseline performance:

   $ ./arrow-flight-benchmark --num_threads 1 --server_host localhost --records_per_stream=123456789
   ....
   Speed: 4717.28 MB/s

3. Run two Flight benchmarks concurrently to check scalability:

   # run in one console
   $ ./arrow-flight-benchmark --num_threads 1 --server_host localhost --records_per_stream=123456789
   ....
   Speed: 4160.94 MB/s

   # run at the *same time* in another console
   $ ./arrow-flight-benchmark --num_threads 1 --server_host localhost --records_per_stream=123456789
   ....
   Speed: 4154.65 MB/s

From this result, the Flight server appears to scale well across multiple cores. I observed the same behaviour when testing across the network.

What is the difference between the two tests above, i.e. with and without a standalone server?

[1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc#L44
[2] https://lists.apache.org/thread.html/rd2aa01f460dd1092c60d1ba75087c2ce87c81ac543a246549b4713fb%40%3Cdev.arrow.apache.org%3E

Yibo