Flight benchmark question

Yibo Cai Mon, 15 Jun 2020 03:45:03 -0700

I'm evaluating the flight benchmark [1] on a single host and have run into a
problem that I would like some help with.

Flight benchmark has a "num_threads" parameter [1] that sets the "number of concurrent gets".
Counter-intuitively, larger values reduce performance: "arrow-flight-benchmark --num_threads=1"
performs much better than "arrow-flight-benchmark --num_threads=2". An earlier thread [2] discusses
this issue and explains that it's better to spawn more servers on different ports than to have all
threads connect to a single server.

I then ran another test against a standalone server, and the result was different.

1. spawn a standalone flight server
$ ./arrow-flight-perf-server
Server host: localhost
Server port: 31337

2. test one flight benchmark to get baseline performance
$ ./arrow-flight-benchmark --num_threads 1 --server_host localhost --records_per_stream=123456789
....
Speed: 4717.28 MB/s

3. test two flight benchmarks concurrently, check scalability
# run in one console
$ ./arrow-flight-benchmark --num_threads 1 --server_host localhost --records_per_stream=123456789
....
Speed: 4160.94 MB/s

# run at the *same time* in another console
$ ./arrow-flight-benchmark --num_threads 1 --server_host localhost --records_per_stream=123456789
....
Speed: 4154.65 MB/s

From this result, it looks like the flight server has good multi-core scalability. The same
behaviour is observed when testing across the network.
What is the difference between the two tests above, i.e. with and without a standalone server?
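
To put a number on the scalability in step 3, here is a minimal Python sketch of the arithmetic. Only the three speeds come from the runs above. The function and variable names are illustrative, and the calculation assumes the two clients overlapped for their whole run, which I did not measure precisely.

```python
# Speeds in MB/s copied from the benchmark runs above.
baseline = 4717.28                 # step 2: one client running alone
concurrent = [4160.94, 4154.65]    # step 3: two clients at the same time


def scaling(baseline_mbps, concurrent_mbps):
    """Return (aggregate MB/s, speedup over baseline, per-client efficiency).

    Assumes the concurrent runs fully overlapped in time, so their
    individual speeds can simply be summed.
    """
    aggregate = sum(concurrent_mbps)
    speedup = aggregate / baseline_mbps
    efficiency = speedup / len(concurrent_mbps)
    return aggregate, speedup, efficiency


aggregate, speedup, efficiency = scaling(baseline, concurrent)
print(f"aggregate: {aggregate:.2f} MB/s")
print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
```

Under that assumption the two clients together reach about 8315.59 MB/s, roughly 1.76x the single-client baseline, i.e. each client keeps about 88% of its standalone speed.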

[1]
https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/flight_benchmark.cc#L44
[2]
https://lists.apache.org/thread.html/rd2aa01f460dd1092c60d1ba75087c2ce87c81ac543a246549b4713fb%40%3Cdev.arrow.apache.org%3E

Yibo
