sodonnel commented on PR #6613: URL: https://github.com/apache/ozone/pull/6613#issuecomment-3249780357
@chungen0126 Could you explain in more detail how this change works, and why it should perform better than the current approach? Has this code been run on any cluster to prove that it works and that it performs better than the current approach, e.g. for `ozone sh key get`, even in a local docker environment?

As I understand the current approach, provided a client asks for chunk-sized reads, we pull one chunk at a time from the server to the client. When the reader has consumed that chunk, it goes back and makes another read chunk call.

The approach in this PR talks about using a gRPC bi-directional streaming feature to pipeline the chunks to the client. I don't understand exactly how this works, but I immediately wonder:

1. Is more buffering memory needed on the client?
2. How many chunks could the client need to buffer at any time for this to be effective?
3. What triggers new chunks to come from the server to the client?
4. What if the client needs to seek and read only parts of the key, as with an ORC file?
5. Is a gRPC handler tied up on the server for the entire time the block is being read by the client? Or is a thread pulled from the thread pool only when a new chunk needs to be sent to the client?

Thanks.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
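
For reference, the two read patterns contrasted in the comment above (one round trip per chunk versus chunks pipelined into a client-side buffer) can be modelled with a toy sketch. This is not the PR's code and does not use gRPC: `CHUNKS`, `read_pull`, `read_pipelined`, and `window` are hypothetical names, and a bounded in-process queue stands in for whatever flow control the real stream would use. It only illustrates how questions 1 to 3 relate: the buffer size caps client memory, and the server can send a new chunk only when the reader frees a slot.

```python
import queue
import threading

# Hypothetical block contents; a real block would hold much larger chunks.
CHUNKS = [f"chunk-{i}".encode() for i in range(8)]


def read_pull(chunks):
    """One request/response round trip per chunk, as in the current approach."""
    out = []
    round_trips = 0
    for index in range(len(chunks)):
        round_trips += 1            # client asks, then waits for this one chunk
        out.append(chunks[index])
    return out, round_trips


def read_pipelined(chunks, window):
    """Server streams ahead; a bounded queue caps what the client buffers."""
    buf = queue.Queue(maxsize=window)   # at most `window` items held client-side
    done = object()                     # sentinel marking end of stream
    max_buffered = 0

    def server():
        for chunk in chunks:
            buf.put(chunk)          # blocks while the client buffer is full
        buf.put(done)

    sender = threading.Thread(target=server)
    sender.start()
    out = []
    while True:
        max_buffered = max(max_buffered, buf.qsize())
        item = buf.get()            # consuming frees a slot for the server
        if item is done:
            break
        out.append(item)
    sender.join()
    return out, max_buffered


if __name__ == "__main__":
    data, trips = read_pull(CHUNKS)
    print("pull: round trips =", trips)
    data, peak = read_pipelined(CHUNKS, window=2)
    print("pipelined: peak buffered items =", peak)
```

In this model the answer to "how many chunks are buffered" is simply `window`. A seek (question 4) would mean discarding whatever is already queued, which is one reason the window size matters in practice.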
