[I] [VL] Limit the result of the Split function when combined with Slice [incubator-gluten]

via GitHub Tue, 29 Jul 2025 05:23:04 -0700


NEUpanning opened a new issue, #10277:
URL: https://github.com/apache/incubator-gluten/issues/10277


   ### Description
   
   In our production environment, we hit a slow query of the form 
`select SLICE(SPLIT(business_area_id, "\\044"), 1, 1000) from table`. Although 
`slice` keeps only the first 1000 values, `split` still computes every 
element. This incurs unnecessary cost when the column splits into more than 
1000 elements. 
   
   I think we can push the bounds of the slice function down into split as 
its limit argument. The following rewrite demonstrates the idea, where S is 
the slice start and N is the slice length.
   Original expression:
   `slice(split(column1, delimiter, -1), S, N)`
   Transformed expression:
   `slice(split(column1, delimiter, N+S-1), S, N)`
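
A minimal Python model of this rewrite may help when checking the bound. It is only a sketch: `spark_split` and `spark_slice` are hypothetical helpers, not Gluten or Velox code, and they assume Spark's documented behavior that a positive split limit caps the array length and leaves the unsplit remainder in the last element, plus a positive 1-based slice start. Under those assumptions, a limit of `N+S-1` can let the remainder leak into the last sliced element, while `N+S` reproduces the original result, so the exact bound is worth verifying against the engine.

```python
def spark_split(s, delim, limit=-1):
    """Model of split for a literal delimiter (hypothetical helper)."""
    if limit > 0:
        # At most `limit` elements; the last one keeps the unsplit remainder.
        return s.split(delim, limit - 1)
    # A non-positive limit means "split as many times as possible".
    return s.split(delim)


def spark_slice(arr, start, length):
    """Model of slice for a positive, 1-based start (hypothetical helper)."""
    return arr[start - 1:start - 1 + length]


def original(s, delim, start, length):
    # slice(split(column1, delimiter, -1), S, N)
    return spark_slice(spark_split(s, delim, -1), start, length)


def pushed_down(s, delim, start, length, limit):
    # slice(split(column1, delimiter, limit), S, N)
    return spark_slice(spark_split(s, delim, limit), start, length)


row = "a$b$c$d$e"
S, N = 1, 2
print(original(row, "$", S, N))                # full split, then slice
print(pushed_down(row, "$", S, N, N + S - 1))  # bound from the proposal
print(pushed_down(row, "$", S, N, N + S))      # one extra slot for the remainder
```

The model covers only a literal delimiter and a positive start; a negative start counts from the end of the array, so the full split would still be needed in that case.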
   
   
   ### Gluten version
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

