Hello,
While testing joins in Solr streaming expressions, we noticed that when the
fields in the on clause are reversed, the results are incorrect: the join
returns tuples as if everything matched.
For example, if you have streamA and streamB with the following tuples:
streamA:
{
item_id_1: "123",
item_id_2: "456"
}
streamB:
{
item_id: "789",
user_id: "0"
}
Executing a stream like below:
leftOuterJoin(
  search(collection-a, q=*:*, fq="item_id_1:123", fl="item_id_1,item_id_2", qt="/export", sort="item_id_2 desc"),
  search(collection-b, fq="user_id:0", q="*:*", qt="/export", fl="item_id,user_id", sort="item_id desc"),
  on="item_id=item_id_2")
This returns something like the following, where all tuples are joined even
though item_id doesn't match item_id_2:
{
item_id_1: "123",
item_id_2: "456",
item_id: "789",
user_id: "0"
}
Note that the first field in the on clause (item_id) comes from the second
stream (collection-b), not the first.
Is this expected behavior? We're running Solr 8.11.1 and noticed it
while setting up a new query. It's an easy fix to switch the on clause, but
it seems like Solr should either throw an error or handle the reversed order
properly. Happy to open a bug ticket if this isn't expected.
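For reference, this is the switched on clause we mean, with the field from
the first stream (collection-a) on the left and the field from the second
stream (collection-b) on the right. We're assuming on="leftField=rightField"
is the intended order; the rest of the expression is unchanged:

```
leftOuterJoin(
  search(collection-a, q=*:*, fq="item_id_1:123", fl="item_id_1,item_id_2", qt="/export", sort="item_id_2 desc"),
  search(collection-b, fq="user_id:0", q="*:*", qt="/export", fl="item_id,user_id", sort="item_id desc"),
  on="item_id_2=item_id")
```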
Thanks,
--
*Geren White | Senior Director, Engineering*
*(e)* [email protected]