Re: [PR] [draft][Python] expose filter option to python for join [arrow]

2025-05-23 Thread via GitHub


xingyu-long commented on PR #46566:
URL: https://github.com/apache/arrow/pull/46566#issuecomment-2904998743

   @AlenkaF Thanks for taking a look!
   
   I just opened an issue to track this 
(https://github.com/apache/arrow/issues/46572). The failing tests are probably 
related to the corresponding Python callers / function definitions, but could 
you take a look first? Since the main part is enabling the join filter option 
in _acero.pyx, I'd like to get some feedback from the community on this part 
and see if it makes sense. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [draft][Python] expose filter option to python for join [arrow]

2025-05-23 Thread via GitHub


AlenkaF commented on PR #46566:
URL: https://github.com/apache/arrow/pull/46566#issuecomment-2903700256

   Hi @xingyu-long, thank you for opening a PR!
   Could you first open an issue to track the changes and check the failing CI 
builds? Some of the failing tests are connected.





Re: [PR] [draft][Python] expose filter option to python for join [arrow]

2025-05-22 Thread via GitHub


xingyu-long commented on PR #46566:
URL: https://github.com/apache/arrow/pull/46566#issuecomment-2903165397

   cc @richardliaw, since I discussed this with Richard and he suggested I 
give this a try. It may be helpful for the Ray project too. Thanks!





Re: [PR] [draft][Python] expose filter option to python for join [arrow]

2025-05-22 Thread via GitHub


github-actions[bot] commented on PR #46566:
URL: https://github.com/apache/arrow/pull/46566#issuecomment-2903163704

   
   
   Thanks for opening a pull request!
   
   If this is not a [minor 
PR](https://github.com/apache/arrow/blob/main/CONTRIBUTING.md#Minor-Fixes), 
could you open an issue for this pull request on GitHub? 
https://github.com/apache/arrow/issues/new/choose
   
   Opening GitHub issues ahead of time contributes to the 
[Openness](http://theapacheway.com/open/#:~:text=Openness%20allows%20new%20users%20the,must%20happen%20in%20the%20open.)
 of the Apache Arrow project.
   
   Then could you also rename the pull request title in the following format?
   
   GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
   
   or
   
   MINOR: [${COMPONENT}] ${SUMMARY}
   
   See also:
   
 * [Other pull requests](https://github.com/apache/arrow/pulls/)
 * [Contribution Guidelines - Contributing 
Overview](https://arrow.apache.org/docs/developers/overview.html)
   





[PR] [draft][Python] expose filter option to python for join [arrow]

2025-05-22 Thread via GitHub


xingyu-long opened a new pull request, #46566:
URL: https://github.com/apache/arrow/pull/46566

   ### Rationale for this change
   
   The C++ implementation supports a filter while performing a join; however, 
it is not exposed to Python. I think it would be good to have, so users can 
avoid an explicit additional filter operation on their side.
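
For illustration, here is a minimal sketch in plain Python (not Acero, and not the code in this PR) of what applying a filter during an inner hash join means. The function name `hash_join_with_filter`, the `residual_filter` callback, and the sample rows are hypothetical and exist only for this example. It assumes an inner join, where filtering during the join and filtering afterwards give the same rows.

```python
from collections import defaultdict


def hash_join_with_filter(left, right, key, residual_filter=None):
    """Inner hash join of two lists of dicts with an optional residual filter.

    Hypothetical helper for illustration; this is not the Arrow API.
    """
    # Build phase: index the right-hand rows by their join key.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    joined = []
    # Probe phase: look up each left-hand row and keep only the key matches
    # that also satisfy the residual filter (if one was given).
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            if residual_filter is None or residual_filter(lrow, rrow):
                joined.append({**rrow, **lrow})
    return joined


# Sample data, made up for the example.
orders = [{"id": 1, "amount": 50}, {"id": 2, "amount": 5}, {"id": 3, "amount": 70}]
limits = [{"id": 1, "limit": 40}, {"id": 2, "limit": 10}, {"id": 4, "limit": 0}]

# One step: the filter is evaluated while the join runs.
fused = hash_join_with_filter(
    orders, limits, "id", residual_filter=lambda l, r: l["amount"] > r["limit"]
)

# Two steps: join first, then filter the joined rows explicitly.
two_step = [
    row
    for row in hash_join_with_filter(orders, limits, "id")
    if row["amount"] > row["limit"]
]

print(fused == two_step)
```

The two-step form is the explicit extra filter operation that users currently have to write themselves; the one-step form avoids materializing joined rows that are discarded right away.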
   
   ### What changes are included in this PR?
   
   Support a filter expression in the Python binding.
   
   
   ### Are these changes tested?
   
   Yes, I added a new test, test_hash_join_with_filter.
   
   ### Are there any user-facing changes?
   
   It will expose one more argument to users, i.e., filter_expression, for 
Table.join and Dataset.join.
   
   
   
   Note: I added [draft] to this change since I'd like to get feedback from 
reviewers first; then we can change the frontend calls, i.e., the Table and 
Dataset pxi files.
   
   

