[jira] [Commented] (IGNITE-6085) SQL: JOIN with multiple conditions is extremely slow

2017-11-23 Thread Roman Kondakov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265020#comment-16265020
 ] 

Roman Kondakov commented on IGNITE-6085:


After the deep research it was found out that such kind of problems occurs not 
only in the JOIN statements, but also in any SELECT ... WHERE statement with 
the OR clause. For example, in this query H2 will use the full table scan 
instead of indexes, which could lead to unacceptable execution time:
{code}
SELECT * FROM tbl WHERE field1=1 OR field2=2
{code}
H2 does not use indexes here because if it iterate over the index on the field1 
with the value 1, it would not get all rows with the field2=2 because not all 
rows with field2=2 have also field1=1. So, to obtain a correct result set using 
indexes, H2 should iterate over the index on field1 with the condition field1=1 
and also iterate over the index on field2 with the condition field2=2. And then 
merge both results fetched during these iterations. But this scenario is not 
implemented in H2 yet, so it uses a full scan instead of the indexes in the OR 
clauses.
There is a  TODO comment in org.h2.expression.ConditionAndOr (since 2008): 
{code:java}
// TODO optimization: convert .. OR .. to UNION if the cost is lower
{code}
After this optimization in the H2 engine has been done, this ticket can be 
closed.

> SQL: JOIN with multiple conditions is extremely slow
> 
>
> Key: IGNITE-6085
> URL: https://issues.apache.org/jira/browse/IGNITE-6085
> Project: Ignite
>  Issue Type: Task
>  Components: sql
>Affects Versions: 2.1
>Reporter: Vladimir Ozerov
>Assignee: Roman Kondakov
>  Labels: performance
> Fix For: 2.4
>
>
> Consider the following query:
> {code}
> SELECT ... FROM A a
> INNER JOIN B b ON b.id = a.foreign_id1 OR b.id = a.foreign_id2
> {code}
> In this case H2 cannot use indexes on {{foreign_id1}} or {{foreign_id2}} 
> columns and query execution takes extraordinary time. Known workaround for a 
> problem is to apply multiple JOINs, e.g.:
> {code}
> SELECT ... FROM A a
> LEFT OUTER JOIN B b1 ON b1.id = a.foreign_id1 
> LEFT OUTER JOIN B b2 ON b2.id = a.foreign_id2
> WHERE b1.id IS NOT NULL AND b2.id IS NOT NULL
> {code}
> On a single real-world scenario it improved exeution time by a factor of 500 
> (from 4s to 80ms).
> Something is terribly wrong here. Probably, H2 cannot perform necessary query 
> re-write, or cannot understand how to use index. Let's find a way to fix that.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (IGNITE-6085) SQL: JOIN with multiple conditions is extremely slow

2017-11-16 Thread Roman Kondakov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255155#comment-16255155
 ] 

Roman Kondakov commented on IGNITE-6085:


This problem occurs only with the OR condition in ON clause.  If you change 
condition to AND this query works well. The problem could be in the method 

{code:java}
public void createIndexConditions(Session session, TableFilter filter) {
if (andOrType == AND) {
left.createIndexConditions(session, filter);
right.createIndexConditions(session, filter);
}
}
{code}

in ConditionAndOr class. I asked a question about it in H2/issues: 
https://github.com/h2database/h2database/issues/670. Waiting for a response now.


> SQL: JOIN with multiple conditions is extremely slow
> 
>
> Key: IGNITE-6085
> URL: https://issues.apache.org/jira/browse/IGNITE-6085
> Project: Ignite
>  Issue Type: Task
>  Components: sql
>Affects Versions: 2.1
>Reporter: Vladimir Ozerov
>Assignee: Roman Kondakov
>  Labels: performance
> Fix For: 2.4
>
>
> Consider the following query:
> {code}
> SELECT ... FROM A a
> INNER JOIN B b ON b.id = a.foreign_id1 OR b.id = a.foreign_id2
> {code}
> In this case H2 cannot use indexes on {{foreign_id1}} or {{foreign_id2}} 
> columns and query execution takes extraordinary time. Known workaround for a 
> problem is to apply multiple JOINs, e.g.:
> {code}
> SELECT ... FROM A a
> LEFT OUTER JOIN B b1 ON b1.id = a.foreign_id1 
> LEFT OUTER JOIN B b2 ON b2.id = a.foreign_id2
> WHERE b1.id IS NOT NULL AND b2.id IS NOT NULL
> {code}
> On a single real-world scenario it improved exeution time by a factor of 500 
> (from 4s to 80ms).
> Something is terribly wrong here. Probably, H2 cannot perform necessary query 
> re-write, or cannot understand how to use index. Let's find a way to fix that.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (IGNITE-6085) SQL: JOIN with multiple conditions is extremely slow

2017-11-10 Thread Vladimir Ozerov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247236#comment-16247236
 ] 

Vladimir Ozerov commented on IGNITE-6085:
-

See IGNITE-4150. Probably similar fix ti H2 could be applied.

> SQL: JOIN with multiple conditions is extremely slow
> 
>
> Key: IGNITE-6085
> URL: https://issues.apache.org/jira/browse/IGNITE-6085
> Project: Ignite
>  Issue Type: Task
>  Components: sql
>Affects Versions: 2.1
>Reporter: Vladimir Ozerov
>  Labels: performance
> Fix For: 2.4
>
>
> Consider the following query:
> {code}
> SELECT ... FROM A a
> INNER JOIN B b ON b.id = a.foreign_id1 OR b.id = a.foreign_id2
> {code}
> In this case H2 cannot use indexes on {{foreign_id1}} or {{foreign_id2}} 
> columns and query execution takes extraordinary time. Known workaround for a 
> problem is to apply multiple JOINs, e.g.:
> {code}
> SELECT ... FROM A a
> LEFT OUTER JOIN B b1 ON b1.id = a.foreign_id1 
> LEFT OUTER JOIN B b2 ON b2.id = a.foreign_id2
> WHERE b1.id IS NOT NULL AND b2.id IS NOT NULL
> {code}
> On a single real-world scenario it improved exeution time by a factor of 500 
> (from 4s to 80ms).
> Something is terribly wrong here. Probably, H2 cannot perform necessary query 
> re-write, or cannot understand how to use index. Let's find a way to fix that.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)