zanmato1984 commented on code in PR #46566:
URL: https://github.com/apache/arrow/pull/46566#discussion_r2115109420


##########
python/pyarrow/acero.py:
##########
@@ -114,6 +114,8 @@ def _perform_join(join_type, left_operand, left_keys,
         in the join result.
     output_type: Table or InMemoryDataset
         The output type for the exec plan result.
+    filter_expression: Expression
+        Expression that will be used during join operation.

Review Comment:
   ```suggestion
           Residual filter that is applied to matching rows.
   ```



##########
python/pyarrow/tests/test_acero.py:
##########
@@ -300,6 +300,37 @@ def test_order_by():
         _ = OrderByNodeOptions([("b", "ascending")], null_placement="start")
 
 
+def test_hash_join_with_filter():

Review Comment:
   ```suggestion
   def test_hash_join_with_residual_filter():
   ```



##########
python/pyarrow/tests/test_acero.py:
##########
@@ -300,6 +300,37 @@ def test_order_by():
         _ = OrderByNodeOptions([("b", "ascending")], null_placement="start")
 
 
+def test_hash_join_with_filter():

Review Comment:
   We could add cases with special filters such as `true` and `false`, as well as an expression referencing columns from both sides.
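
To make the three suggested cases concrete, here is a minimal pure-Python sketch of what a residual filter means for an inner join: rows are first matched on the join key, and the filter is then evaluated on each matched pair. It is only an illustration under stated assumptions, not the Acero implementation or the PR's test; `inner_join_with_residual_filter` and the sample rows are hypothetical names and data, and outer-join behaviour is not covered.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]


def inner_join_with_residual_filter(
    left: List[Row],
    right: List[Row],
    key: str,
    residual: Callable[[Row, Row], bool],
) -> List[Row]:
    """Hypothetical helper: inner hash join on `key`, keeping only the
    key-matched pairs for which `residual(left_row, right_row)` is true."""
    # Build phase: index the right-side rows by their join key.
    index: Dict[int, List[Row]] = {}
    for r in right:
        index.setdefault(r[key], []).append(r)

    # Probe phase: the residual filter only ever sees pairs whose keys match.
    out: List[Row] = []
    for l in left:
        for r in index.get(l[key], []):
            if residual(l, r):
                out.append({**l, **r})
    return out


# Made-up sample data for illustration only.
left = [{"k": 1, "a": 10}, {"k": 2, "a": 20}, {"k": 3, "a": 30}]
right = [{"k": 1, "b": 5}, {"k": 2, "b": 25}, {"k": 4, "b": 40}]

# Literal `true`: behaves like a plain inner join on the key.
always_true = inner_join_with_residual_filter(left, right, "k", lambda l, r: True)

# Literal `false`: every key match is rejected, so the result is empty.
always_false = inner_join_with_residual_filter(left, right, "k", lambda l, r: False)

# Expression referencing columns from both sides.
both_sides = inner_join_with_residual_filter(
    left, right, "k", lambda l, r: l["a"] > r["b"]
)
```

In this sketch the `true` filter keeps both key matches (`k` of 1 and 2), the `false` filter keeps none, and `a > b` keeps only the `k == 1` pair.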



##########
python/pyarrow/table.pxi:
##########
@@ -5665,6 +5665,8 @@ cdef class Table(_Tabular):
             in the join result.
         use_threads : bool, default True
             Whether to use multithreading or not.
+        filter_expression: Expression
+            Expression that will be used during join operation.

Review Comment:
   ```suggestion
               Residual filter that is applied to matching rows.
   ```



##########
python/pyarrow/_acero.pyx:
##########
@@ -273,14 +273,15 @@ cdef class _HashJoinNodeOptions(ExecNodeOptions):
 
     def _set_options(
         self, join_type, left_keys, right_keys, left_output=None, right_output=None,
-        output_suffix_for_left="", output_suffix_for_right="",
+        output_suffix_for_left="", output_suffix_for_right="", Expression filter_expression=None,

Review Comment:
   I think we can simply call this parameter `filter`, both here and in the other places that pass it through.


