zanmato1984 commented on code in PR #46566: URL: https://github.com/apache/arrow/pull/46566#discussion_r2115109420
##########
python/pyarrow/acero.py:
##########
@@ -114,6 +114,8 @@ def _perform_join(join_type, left_operand, left_keys,
         in the join result.
     output_type: Table or InMemoryDataset
         The output type for the exec plan result.
+    filter_expression: Expression
+        Expression that will be used during join operation.

Review Comment:
   ```suggestion
           Residual filter which is applied to matching row.
   ```

##########
python/pyarrow/tests/test_acero.py:
##########
@@ -300,6 +300,37 @@ def test_order_by():
         _ = OrderByNodeOptions([("b", "ascending")], null_placement="start")
 
 
+def test_hash_join_with_filter():

Review Comment:
   ```suggestion
   def test_hash_join_with_residual_filter():
   ```

##########
python/pyarrow/tests/test_acero.py:
##########
@@ -300,6 +300,37 @@ def test_order_by():
         _ = OrderByNodeOptions([("b", "ascending")], null_placement="start")
 
 
+def test_hash_join_with_filter():

Review Comment:
   We can add cases with special filters like `true`, `false`, expression referencing columns from both sides.

##########
python/pyarrow/table.pxi:
##########
@@ -5665,6 +5665,8 @@ cdef class Table(_Tabular):
             in the join result.
         use_threads : bool, default True
             Whether to use multithreading or not.
+        filter_expression: Expression
+            Expression that will be used during join operation.

Review Comment:
   ```suggestion
               Residual filter which is applied to matching row.
   ```

##########
python/pyarrow/_acero.pyx:
##########
@@ -273,14 +273,15 @@ cdef class _HashJoinNodeOptions(ExecNodeOptions):
     def _set_options(
         self, join_type, left_keys, right_keys,
         left_output=None, right_output=None,
-        output_suffix_for_left="", output_suffix_for_right="",
+        output_suffix_for_left="", output_suffix_for_right="", Expression filter_expression=None,

Review Comment:
   I think we can simply call it (and other places passing through this parameter) `filter`.
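
For reference, here is a minimal pure-Python sketch of what "residual filter" means for an inner join, covering the three cases suggested for the test (`true`, `false`, and a predicate that reads columns from both sides). It deliberately does not use pyarrow, since the final name of the new parameter is still under discussion in this PR. The function name `inner_join_with_residual_filter` and the sample rows are made up for illustration only.

```python
from collections import defaultdict


def inner_join_with_residual_filter(left, right, key, residual):
    """Inner hash join on `key` (hypothetical helper, not pyarrow API).

    `residual(l, r)` is evaluated only on pairs whose keys already match.
    """
    # Build phase: index the right-side rows by join key.
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)

    # Probe phase: key equality first, residual filter second.
    out = []
    for l in left:
        for r in index.get(l[key], []):
            if residual(l, r):
                out.append((l, r))
    return out


# Illustrative sample rows (assumed data, not taken from the PR's test).
left = [{"id": 1, "a": 10}, {"id": 2, "a": 20}, {"id": 3, "a": 30}]
right = [{"id": 1, "b": 5}, {"id": 2, "b": 25}, {"id": 4, "b": 1}]

# Literal `true`: behaves like a plain inner join on "id".
always_true = inner_join_with_residual_filter(left, right, "id", lambda l, r: True)

# Literal `false`: every key-matched pair is rejected.
always_false = inner_join_with_residual_filter(left, right, "id", lambda l, r: False)

# Predicate referencing columns from both sides.
both_sides = inner_join_with_residual_filter(
    left, right, "id", lambda l, r: l["a"] > r["b"]
)
```

With these rows, the `true` filter keeps both key matches (ids 1 and 2), the `false` filter keeps none, and `a > b` keeps only id 1. This sketch covers the inner-join case only and makes no claim about how other join types handle rows whose matches are all filtered out.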