Hi Courtney,

> Is it the case that as long as the affinity key is in the join predicate
that it would be a colocated JOIN

This is true for cases when you have a predicate with equality by affinity
keys. If a join predicate includes affinity keys eq condition then it also
can have any other conditions.

In your case, you have 2 tables and affinity keys T0.releaseId, T1.t0Id.
Then the valid join predicate will be: "T0.releaseId = T1.t0Id and (...
anything you wish)"

> In other words, if both tables share the same affinity key is it still a
collocated join if there are other filters in the join predicate?

Yes.

You can check examples of valid and non-valid joins here:
https://github.com/timoninmaxim/ignite/blob/master/modules/indexing/src/test/java/org/apache/ignite/sqltests/CheckWarnJoinPartitionedTables.java

Since Apache Ignite 2.12 it writes a warning message for SQL queries with
non-colocated joins.


On Sat, Feb 12, 2022 at 10:04 AM Courtney Robinson <
courtney.robin...@crlog.info> wrote:

> Hi Maksim,
>
> Interesting, thanks for your reply.
> Okay, I misunderstood (I also thought being on a single node that it
> didn't matter).
>
> Is it the case that as long as the affinity key is in the join predicate
> that it would be a colocated JOIN (I'm concerned about the impact of
> setDistributedJoins(true))?
> Or is it the case that if you're joining on partitioned tables, you must
> do so with ONLY the affinity key in the join predicate?
>
> SELECT tbl.releaseId, tbl.name FROM T0 tbl
>
>  INNER JOIN T1 col ON tbl.releaseId = <What goes here?>
>
>
> In the previous tables, T1 does not have the releaseId as a column so does
> that mean it is impossible to do a co-located JOIN with this setup?
>
> If we modify T1 so that it also has releaseId and we make releaseId the
> affinity key of T1 will both of these work?
>
> SELECT tbl.releaseId, tbl.name FROM T0 tbl
>
>  INNER JOIN T1 col ON tbl.releaseId = col.releaseId
>
>
> AND
>
> SELECT tbl.releaseId, tbl.name FROM T0 tbl
>
>  INNER JOIN T1 col ON tbl.releaseId = col.releaseId AND col.tableId =
>> tbl.id AND col.x = y
>
>
> In other words, if both tables share the same affinity key is it still a
> collocated join if there are other filters in the join predicate?
>
> If the answer to this yes, does it matter if the filters in the join
> predicate are all = i.e. does it have to be an equi-join? or could the
> predicate be
>
>> ON tbl.releaseId = col.releaseId AND* col.tableId > tbl.id
>> <http://tbl.id/>* AND *col.x >= y*
>>
>
> Thanks
>
> On Fri, Feb 11, 2022 at 6:42 PM Maksim Timonin <timoninma...@apache.org>
> wrote:
>
>> Hi Courtney,
>>
>> > I don't expect collocation issues to be in play here
>>
>> Do you check this doc:
>> https://ignite.apache.org/docs/latest/SQL/distributed-joins ?
>>
>> It says: "A distributed join is a SQL statement with a join clause that
>> combines two or more partitioned tables. If the tables are joined on the
>> partitioning column (affinity key), the join is called a colocated join.
>> Otherwise, it is called a non-colocated join"
>>
>> You definitely have a collocation issue due to non-collocated join: T0
>> partitioned by "releaseId", T1 by "t0Id", and you make a join by columns
>> that aren't affinity columns (id = tableId).
>>
>> You should specify the flag "SqlFieldsQuery.setDistributedJoins(true)" to
>> make your join return correct results.
>>
>> Maksim
>>
>>
>> On Fri, Feb 11, 2022 at 8:09 PM Courtney Robinson <
>> courtney.robin...@crlog.info> wrote:
>>
>>>
>>> I have a query like this:
>>>
>>> SELECT
>>>> tbl.id AS tbl_id, tbl.releaseId AS tbl_releaseId, col.type AS col_type
>>>> FROM T0 tbl
>>>> INNER JOIN T1 col ON tbl.id = col.tableId
>>>> *WHERE tbl.releaseId = ? AND tbl.name <http://tbl.name> = ?*
>>>> LIMIT 100
>>>>
>>>
>>> This returns no results so after investigating, I ended up changing it
>>> to the below
>>>
>>> SELECT
>>>> tbl.id AS tbl_id, tbl.releaseId AS tbl_releaseId, col.type AS col_type
>>>> FROM *(SELECT * FROM T0 t WHERE t.releaseId = ? AND t.name
>>>> <http://t.name> = ?) *tbl
>>>> INNER JOIN T1 col ON tbl.id = col.tableId
>>>> LIMIT 100
>>>>
>>>
>>>  This returns the results expected.
>>> Can anyone offer any insight into what is going wrong here?
>>>
>>> The tables here look like this (I removed some columns from the tables
>>> and the query to help make it easier on the eyes to parse):
>>>
>>> CREATE TABLE IF NOT EXISTS T0
>>>> (
>>>>   id        LONG,
>>>>   releaseId VARCHAR,
>>>>   name      VARCHAR,
>>>>   PRIMARY KEY (releaseId, id)
>>>> ) WITH "template=hypi_tpl,affinity_key=releaseId";
>>>
>>> CREATE INDEX IF NOT EXISTS VirtualTable_idx0 ON VirtualTable (releaseId,
>>>> name);
>>>>
>>>> CREATE TABLE IF NOT EXISTS T1
>>>> (
>>>>   id      LONG,
>>>>   t0Id LONG,
>>>>   name    VARCHAR,
>>>>   type    VARCHAR,
>>>>   PRIMARY KEY (t0Id, id)
>>>> ) WITH "template=hypi_tpl,affinity_key=t0Id";
>>>>
>>>
>>> Note here it is a single node locally (so I don't expect collocation
>>> issues to be in play here) - in development so not in a production cluster
>>> yet.
>>> Running Ignite 2.8.0
>>>
>>> This is not the first time we've had something like this but it's the
>>> first time I've been able to reproduce it myself and consistently.
>>>
>>> Best,
>>> Courtney
>>>
>>

Reply via email to