Xi Lyu created SPARK-53525:
------------------------------

             Summary: Spark Connect ArrowBatch Result Chunking
                 Key: SPARK-53525
                 URL: https://issues.apache.org/jira/browse/SPARK-53525
             Project: Spark
          Issue Type: Improvement
          Components: Connect
    Affects Versions: 4.1.0
            Reporter: Xi Lyu


Currently, we enforce gRPC message limits on both the client and the server. 
These limits are largely meant to protect both sides from potential OOMs by 
rejecting abnormally large messages. However, there are cases in which the 
server incorrectly sends oversized messages that exceed these limits and cause 
execution failures.

Specifically, the large message issue from the server to the client we’re 
solving here, comes from the Arrow batch data in ExecutePlanResponse being too 
large. It’s caused by a single arrow row exceeding the 128MB message limit, and 
Arrow cannot partition further and return the single large row in one gRPC 
message.

To improve Spark Connect stability, this PR implements chunking large Arrow 
batches when returning query results from the server to the client, ensuring 
each ExecutePlanResponse chunk remains within the size limit, and the chunks 
from a batch will be reassembled on the client when parsing as an arrow batch.





--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to