zanmato1984 commented on PR #46566:
URL: https://github.com/apache/arrow/pull/46566#issuecomment-2921384026
> I see, so if I understand this correctly, ideally, we probably should
> assign distinct key for both columns before using filter expression since
> output_suffix_for_left would only works for output at the end of the workflow,
> right? (sorry if this is a dumb question...) i.e., something like this won't
> work
>
> ```python
> join_opts = HashJoinNodeOptions(
>     "inner", left_keys="key", right_keys="key",
>     output_suffix_for_left="_left", output_suffix_for_right="_right",
>     filter=pc.equal(pc.field('key_left'), 2))  # <------------ will hit key not found in both schemas.
> joined = Declaration(
>     "hashjoin", options=join_opts, inputs=[left_source, right_source])
> result = joined.to_table()
> ```
Sorry, I made a mistake. You are right about this. Thanks for clarifying.
If you want to write a similar test case, let's just work around the
constraint by using unique column names in the two inputs.