vichry2 opened a new issue, #6670: URL: https://github.com/apache/arrow-rs/issues/6670
**Which part is this question about**
Arrow Flight, `FlightDataEncoderBuilder`, `do_get`

**Describe your question**
Is it expected that Arrow's Python (C++) Flight implementation encodes data more efficiently than arrow-rs?

**Additional context**
Hello. After discussion with @alamb, I am filing an issue here. I am unsure whether this is a bug, expected behavior, or just a problem with my code, but after running some tests, it seems that encoding in Rust takes more time and resources than in Python.

I am running two servers, one in Python and the other in Rust, with the same simple design:

- Create a `Table`/`RecordBatch` before starting the Flight service, which the service holds in memory while running.
- When receiving a request (in `do_get`), simply provide a view of the data to `fl.RecordBatchStream` in Python / `FlightDataEncoderBuilder` in Rust.

Because nothing is really happening on the Python side (it just provides a view of a `Table`), and a single request does not hold the GIL for a significant amount of time, I imagine I'm ultimately measuring the C++ Arrow Flight implementation.

I have run two tests:

1. A Python script that sends *n* requests sequentially to each server, consumes the entire stream (`flightclient.do_get().read_all()`), and displays the average response time for each server.
2. A load test with the Locust framework, measuring the maximum RPS each server can sustain (I used `taskset -c` to separate the Locust users and the server onto different cores).

I observe the following from the tests:

1. As the size of the data sent to the client increases, the gap in average response time between the Python and Rust servers also increases (in favor of the Python server).
2. Similarly, as the amount of data increases, Python is able to achieve a higher RPS than Rust.

The Rust server's CPUs are fully utilized (using more CPU than the Python server in certain cases). After profiling with `perf`, I am seeing a lot of CPU usage related to memory movement.
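For readers who want to reproduce the first (sequential) test without opening the repository, here is a minimal sketch of what such a measurement could look like. It is not the code from the linked repository: the function names, the server addresses, and the ticket contents are all hypothetical, and it assumes `pyarrow` is installed on the client side.

```python
import time
from statistics import mean


def average_response_time(fetch, n):
    """Call fetch() n times sequentially; return the mean wall-clock seconds."""
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - start)
    return mean(durations)


def make_flight_fetch(location, ticket_bytes):
    """Build a zero-argument callable that performs one full do_get round trip.

    `location` and `ticket_bytes` are placeholders for whatever the
    servers under test actually expect (assumption, not from the issue).
    """
    # Imported lazily so the timing helper above works without pyarrow.
    import pyarrow.flight as fl

    client = fl.connect(location)
    ticket = fl.Ticket(ticket_bytes)

    def fetch():
        # read_all() drains the whole stream into a Table, so the timing
        # covers server-side encoding plus transfer and client decoding.
        return client.do_get(ticket).read_all()

    return fetch


def compare_servers(servers, ticket_bytes, n):
    """Return {name: mean seconds} for a dict of {name: location}."""
    return {
        name: average_response_time(make_flight_fetch(loc, ticket_bytes), n)
        for name, loc in servers.items()
    }
```

A call such as `compare_servers({"python": "grpc://localhost:8815", "rust": "grpc://localhost:8816"}, b"example", 100)` would then report one average per server (the ports and ticket here are made up). Note that this measures end-to-end latency from a single client, so it includes client-side decoding cost that is the same for both servers.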
You can access my code here: https://github.com/vichry2/flight-benchmark

Thank you for your help!
