zanmato1984 commented on code in PR #46566:
URL: https://github.com/apache/arrow/pull/46566#discussion_r2115109420
##########
python/pyarrow/acero.py:
##########
@@ -114,6 +114,8 @@ def _perform_join(join_type, left_operand, left_keys,
in the join result.
output_type: Table or InMemoryDataset
The output type for the exec plan result.
+ filter_expression: Expression
+ Expression that will be used during join operation.
Review Comment:
```suggestion
Residual filter which is applied to matching rows.
```
##########
python/pyarrow/tests/test_acero.py:
##########
@@ -300,6 +300,37 @@ def test_order_by():
_ = OrderByNodeOptions([("b", "ascending")], null_placement="start")
+def test_hash_join_with_filter():
Review Comment:
```suggestion
def test_hash_join_with_residual_filter():
```
##########
python/pyarrow/tests/test_acero.py:
##########
@@ -300,6 +300,37 @@ def test_order_by():
_ = OrderByNodeOptions([("b", "ascending")], null_placement="start")
+def test_hash_join_with_filter():
Review Comment:
We can add cases with special filters such as `true`, `false`, and an
expression referencing columns from both sides.
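To make the three cases concrete, here is a rough pure-Python model of what I would expect an inner join with a residual filter to return. It is only a sketch of the semantics, not the Acero implementation: `inner_join_with_residual_filter` and the sample rows are hypothetical names and data made up for illustration, and the real tests would of course go through the pyarrow API instead.

```python
def inner_join_with_residual_filter(left, right, key, residual):
    """Toy inner join: match rows on `key`, then keep only the matched
    pairs for which `residual(left_row, right_row)` is true."""
    # Build phase: index the right side by join key.
    index = {}
    for rrow in right:
        index.setdefault(rrow[key], []).append(rrow)

    # Probe phase: the residual filter runs only on key-matched pairs.
    out = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            if residual(lrow, rrow):
                out.append((lrow, rrow))
    return out


# Hypothetical sample data: keys 1 and 2 match, keys 3 and 4 do not.
left = [{"k": 1, "a": 10}, {"k": 2, "a": 20}, {"k": 3, "a": 30}]
right = [{"k": 1, "b": 5}, {"k": 2, "b": 25}, {"k": 4, "b": 0}]

# Filter `true`: same result as a plain inner join.
all_matches = inner_join_with_residual_filter(
    left, right, "k", lambda lrow, rrow: True)

# Filter `false`: every matched pair is rejected.
no_matches = inner_join_with_residual_filter(
    left, right, "k", lambda lrow, rrow: False)

# Filter referencing columns from both sides.
both_sides = inner_join_with_residual_filter(
    left, right, "k", lambda lrow, rrow: lrow["a"] > rrow["b"])
```

With this data, the `true` case keeps both key matches, the `false` case keeps none, and the two-sided case keeps only the `k == 1` pair, since `10 > 5` holds but `20 > 25` does not.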
##########
python/pyarrow/table.pxi:
##########
@@ -5665,6 +5665,8 @@ cdef class Table(_Tabular):
in the join result.
use_threads : bool, default True
Whether to use multithreading or not.
+ filter_expression: Expression
+ Expression that will be used during join operation.
Review Comment:
```suggestion
Residual filter which is applied to matching rows.
```
##########
python/pyarrow/_acero.pyx:
##########
@@ -273,14 +273,15 @@ cdef class _HashJoinNodeOptions(ExecNodeOptions):
def _set_options(
self, join_type, left_keys, right_keys, left_output=None,
right_output=None,
- output_suffix_for_left="", output_suffix_for_right="",
+ output_suffix_for_left="", output_suffix_for_right="", Expression filter_expression=None,
Review Comment:
I think we can simply call it `filter`, here and in the other places
that pass this parameter through.