I'm evaluating the Flight benchmark [1] on a single host and have run into a
problem I'd like help with.

The Flight benchmark has a "num_threads" parameter [1] that sets the "number of
concurrent gets". Counter-intuitively, larger values reduce performance:
"arrow-flight-benchmark --num_threads=1" performs much better than
"arrow-flight-benchmark --num_threads=2". An earlier thread discussing this
issue [2] explains that it is better to spawn more servers on different ports
than to have all threads go to a single server application.

I ran another test with a standalone server, and the result is different.

1. spawn a standalone flight server
   $ ./arrow-flight-perf-server
   Server host: localhost
   Server port: 31337

2. test one flight benchmark to get baseline performance
   $ ./arrow-flight-benchmark --num_threads 1 --server_host localhost --records_per_stream=123456789
   ....
   Speed: 4717.28 MB/s

3. test two flight benchmarks concurrently, check scalability
   # run in one console
   $ ./arrow-flight-benchmark --num_threads 1 --server_host localhost --records_per_stream=123456789
   ....
   Speed: 4160.94 MB/s

   # run at *same time* in another console
   $ ./arrow-flight-benchmark --num_threads 1 --server_host localhost --records_per_stream=123456789
   ....
   Speed: 4154.65 MB/s
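
To put the numbers side by side, here is a rough sketch of the arithmetic using
only the three speeds reported above. The "scaling" helper is a hypothetical
name introduced for illustration, and "efficiency" here simply means aggregate
throughput divided by twice the single-client baseline.

```shell
#!/bin/sh
# Speeds in MB/s, copied from the benchmark runs above.
baseline=4717.28   # step 2: one client
client_a=4160.94   # step 3: first concurrent client
client_b=4154.65   # step 3: second concurrent client

# scaling (hypothetical helper): prints the aggregate throughput of two
# concurrent clients, the speedup over the single-client baseline, and
# the efficiency relative to perfect 2x scaling.
scaling() {
    awk -v base="$1" -v a="$2" -v b="$3" 'BEGIN {
        total = a + b
        printf "aggregate=%.2f MB/s speedup=%.2fx efficiency=%.1f%%\n",
               total, total / base, 100 * total / (2 * base)
    }'
}

result=$(scaling "$baseline" "$client_a" "$client_b")
echo "$result"
```

With these inputs the two clients together reach about 8315.59 MB/s, roughly
1.76x the single-client baseline.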

From this result, the Flight server appears to scale well across multiple
cores. The same behaviour is observed when testing across the network.
What is the difference between the two tests above, i.e. with and without a
standalone server?

[1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc#L44
[2] https://lists.apache.org/thread.html/rd2aa01f460dd1092c60d1ba75087c2ce87c81ac543a246549b4713fb%40%3Cdev.arrow.apache.org%3E

Yibo
