Good afternoon,

I am facing a strange performance issue when doing SQL queries on my
cluster.
Here is the scenario (I cannot use real config etc because this source code
is protected):

I have 3 caches (a subset of a STAR schema).

CACHE1 is the fact table: EVENTS. It is a partitioned cache. It has an
affinityKey on USER ID.
CACHE2 is the user table: USERS. It is a partitioned cache. It has an
affinityKey on USER ID as well.
CACHE3 is the document table: DOCS. It is a replicated cache.

I also have the following config:

Every event from the event table has one USER ID and one DOCUMENT ID.
All the columns are indexed.
I run the query with setLocal(true) and setEnforceJoinOrder(true).



I would like to do the following:

1. SELECT *
2. FROM EVENTS

3.     INNER JOIN USERS ON EVENTS.USER_ID = USERS.USER_ID
4.     INNER JOIN DOCS  ON EVENTS.DOC_ID   = DOCS.DOC_ID

5. WHERE   DOCS.COL1 = 'some filter'
6. AND       USERS.COL2 = 'some other filter'


Here is what I observe:

When I run the query without line 5 (filter on documents), it is instant.
When I run the query with line 5 (both filters), it is 20X slower. (Even
though those filters are on indexed columns)- The EXPLAIN form the logs
indicates lots of scans.

If I run the query as so (syntax is not exact but the idea is there):

SELECT * 
FROM (

          SELECT USERS.*, DOCS.*

          FROM EVENTS

               INNER JOIN USERS ON EVENTS.USER_ID = USERS.USER_ID
               INNER JOIN DOCS  ON EVENTS.DOC_ID   = DOCS.DOC_ID

          WHERE   DOCS.COL1 = 'some filter'

)

WHERE USERS.COL2 = 'some other filter'

I have the expected performances again.


It seems that the index on COL2 is being ignored when doing two joins and a
filter on each table.
What do you think about it ?


Thank yo very much !
Kind regards
Emmanuel










--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Reply via email to