HeartSaVioR opened a new pull request, #51124:
URL: https://github.com/apache/spark/pull/51124

   ### What changes were proposed in this pull request?
   
   This PR is a follow-up to SPARK-50752 (PR #49397), which introduced 
PYTHON_UDF_MAX_RECORDS_PER_BATCH in SQLConf to control the batch size of 
regular Python UDFs. This PR makes the config take effect on the Python 
worker side as well.
   
   ### Why are the changes needed?
   
   The original PR made the batch size configurable on the JVM side, but the 
value was not propagated to the Python worker because we missed overriding it 
for Python UDFs. (Without the override, the worker uses a static default of 100.)
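
   To illustrate the mismatch described above, here is a minimal, 
self-contained sketch in plain Python. It does not use Spark's actual code: 
the names `DEFAULT_BATCH_SIZE`, `worker_batch_size`, `batch_sizes`, and the 
conf key `udf.maxRecordsPerBatch` are hypothetical and exist only for this 
example. It shows a worker that falls back to a static 100 unless a 
configured value is passed to it, and that the batching choice changes only 
how records are grouped, not the records themselves.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

# Static fallback, mirroring the "100" mentioned in the description.
DEFAULT_BATCH_SIZE = 100

# Hypothetical conf key, used only in this sketch.
CONF_KEY = "udf.maxRecordsPerBatch"


def batched(rows: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Group rows into lists of at most batch_size elements."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def worker_batch_size(runner_conf: Dict[str, str]) -> int:
    """Use the propagated value if present, otherwise the static default."""
    return int(runner_conf.get(CONF_KEY, DEFAULT_BATCH_SIZE))


def batch_sizes(num_rows: int, runner_conf: Dict[str, str]) -> List[int]:
    """Return the size of each batch the worker would produce."""
    size = worker_batch_size(runner_conf)
    return [len(b) for b in batched(range(num_rows), size)]


def flatten(num_rows: int, runner_conf: Dict[str, str]) -> List[int]:
    """Return all rows in order, regardless of how they were batched."""
    size = worker_batch_size(runner_conf)
    return [r for b in batched(range(num_rows), size) for r in b]


if __name__ == "__main__":
    # Conf not propagated: the worker silently uses the static default.
    print(batch_sizes(250, {}))
    # Conf propagated: the worker honors the configured value.
    print(batch_sizes(250, {CONF_KEY: "40"}))
```

   In this sketch the flattened rows are identical in both cases, which is 
why a missing override is hard to detect from query output alone.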
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, but it only affects tuning and does not change the output.
   
   ### How was this patch tested?
   
   No automated test was added: the missing override does not change the 
output, so there is no good way to test it automatically.
   
   This was manually tested with artificial debug logging.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]