ianmcook commented on issue #40597:
URL: https://github.com/apache/arrow/issues/40597#issuecomment-2002661998

   @kou Do you mean that a client could send a range request like
`Range: batches=x-y` instead of `Range: bytes=x-y`? In that case: yes, the
server could retrieve the requested batches more efficiently if the data on
the server side were in the IPC file format, because the footer contains the
offsets and sizes of each record batch.
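
   To illustrate why the footer helps, here is a minimal sketch of mapping a range of batch indices to a byte span. The `Block` class and `byte_span` function are hypothetical names used only for this illustration, not part of any Arrow library, and the offsets are made-up numbers. The sketch also assumes the requested batches are stored contiguously and in order, which a real file is not guaranteed to satisfy (for example, dictionary batches can sit between record batches).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for one footer entry describing a record batch."""
    offset: int           # byte position where the batch's message starts
    metadata_length: int  # size of the message metadata, in bytes
    body_length: int      # size of the batch body, in bytes


def byte_span(blocks, first, last):
    """Return (start, end) bytes covering batches first..last inclusive.

    `end` is exclusive. Assumes the batches are contiguous and in order.
    """
    if not 0 <= first <= last < len(blocks):
        raise IndexError("batch range out of bounds")
    start = blocks[first].offset
    tail = blocks[last]
    end = tail.offset + tail.metadata_length + tail.body_length
    return start, end


# Made-up footer entries for three record batches.
blocks = [Block(8, 120, 1024), Block(1152, 120, 2048), Block(3320, 120, 512)]
```

   With an index like this, the server can seek directly to the requested batches instead of scanning the stream from the beginning.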
   
   But I am -1 on recommending the use of range requests with units that are 
not `bytes`. Although this is allowed by HTTP/1.1 (as described in [RFC 2616 
Section 3.12](https://datatracker.ietf.org/doc/html/rfc2616#section-3.12)) and 
also by HTTP/2 (as described in [RFC 7540 Section 
8](https://datatracker.ietf.org/doc/html/rfc7540#section-8)), HTTP clients and 
servers in general do not support this well. At best it would require 
overriding classes of the HTTP server libraries that are rarely overridden. At 
worst it would be altogether incompatible with some HTTP clients and servers.
   
   I think it is better if we recommend that HTTP APIs should handle requests 
for specific ranges of batches using whatever higher-level application-specific 
methods they choose (such as URL query parameters) and restrict the use of 
range requests to `bytes` units only.
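
   As one possible shape for such an application-specific method, here is a minimal sketch that reads a batch range from URL query parameters. The parameter names `first_batch` and `last_batch`, the function name `parse_batch_range`, and the example URL are all assumptions made for illustration; they are not a proposed convention.

```python
from urllib.parse import parse_qs, urlsplit


def parse_batch_range(url, num_batches):
    """Return (first, last) batch indices requested by the URL, inclusive.

    Missing parameters default to the first and last available batch.
    The parameter names are hypothetical.
    """
    query = parse_qs(urlsplit(url).query)
    first = int(query.get("first_batch", ["0"])[0])
    last = int(query.get("last_batch", [str(num_batches - 1)])[0])
    if not 0 <= first <= last < num_batches:
        raise ValueError("requested batch range is out of bounds")
    return first, last


# Hypothetical request for batches 2 through 5 of a 10-batch dataset.
requested = parse_batch_range(
    "https://example.com/data?first_batch=2&last_batch=5", 10
)
```

   This keeps the batch selection in the URL, where any HTTP client or server can handle it, and leaves the `Range` header free for ordinary `bytes` requests.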

