Ulimo commented on pull request #10906:
URL: https://github.com/apache/arrow/pull/10906#issuecomment-920094029


   @lidavidm I can give one real-world case that involves partitions:
   
   A user executes a query against Trino, which builds up a SQL query based on 
the pushed-down filters etc.
   We call getFlightInfo on the flight server to get the relevant partitions. 
Trino then takes those and sends them out to worker nodes, which fetch the data 
from the flight servers. Getting extra partitions here would potentially 
allocate more worker nodes.
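
   A rough sketch of that fan-out, to make the cost of extra partitions 
concrete. Everything here is hypothetical and only for illustration: the 
`Endpoint` class, the stubbed `get_flight_info`, the ticket layout, and the 
worker names are assumptions, not the Flight API or Trino's actual scheduler.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical stand-in for one Flight endpoint (one partition)."""
    ticket: bytes   # opaque to the client; only the server interprets it
    location: str   # URI of the flight server holding this partition


def get_flight_info(query: str) -> list[Endpoint]:
    """Hypothetical stub for the getFlightInfo call.

    A real server would decide which partitions the query touches;
    this stub always pretends the answer is three partitions.
    """
    return [
        Endpoint(
            ticket=f"{query}#part-{i}".encode(),
            location=f"grpc://flight-{i % 2}:8815",
        )
        for i in range(3)
    ]


def assign_to_workers(endpoints, workers):
    """Spread endpoints over workers round-robin (assumed policy)."""
    if not workers:
        raise ValueError("at least one worker is required")
    plan = {}
    for worker, endpoint in zip(cycle(workers), endpoints):
        plan.setdefault(worker, []).append(endpoint)
    return plan


# Each returned endpoint becomes a fetch task on some worker, so a server
# that returns unnecessary partitions directly inflates the work scheduled.
endpoints = get_flight_info("SELECT * FROM t WHERE x > 1")
plan = assign_to_workers(endpoints, ["worker-a", "worker-b"])
for worker, assigned in plan.items():
    print(worker, [e.ticket for e in assigned])
```

   With more workers than endpoints, only as many workers as there are 
endpoints receive work, which is why the partition count returned by the 
server drives how many worker nodes get involved.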
   
   At this point we trust that Trino does not create parameters that would 
cause SQL injection, so we build up the query as a raw string. The flight 
servers, however, are responsible for figuring out which partitions are 
involved for each query that comes in through getFlightInfo.
   
   Having partition knowledge at the client would of course give the best 
performance (if the ticket format is known to the client implementor, you could 
even call the server directly), but that would be quite implementation-specific 
and hard to generalize, I guess.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

