[ 
https://issues.apache.org/jira/browse/DERBY-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Knut Anders Hatlen updated DERBY-4007:
--------------------------------------

    Attachment: CreateDatabase4007.java

Attached is a java class (CreateDatabase4007.java) which creates a database 
with the schema Mikkel posted and populates it with about the same number of 
rows as he reported.

When I try the update statement I see that it takes ~19 seconds.

The time is reduced to ~4.5 seconds if I create an extra index:

    create unique index cp on summa_relations(childid, parentid);

This index is more efficient than the existing PC index when there's a 
restriction on childid. However, the statement still performs a full scan of 
SUMMA_RECORDS, which sounds sub-optimal.

> Optimization of IN with nested SELECT
> -------------------------------------
>
>                 Key: DERBY-4007
>                 URL: https://issues.apache.org/jira/browse/DERBY-4007
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.4.2.0
>         Environment: Linux
>            Reporter: Mikkel Kamstrup Erlandsen
>            Priority: Minor
>         Attachments: CreateDatabase4007.java, dblook.log, dblook_p_index.log, 
> derby.log, derby_p_index.log
>
>
> The problem is with the following query:
> UPDATE summa_records SET base='foobar' WHERE id IN ( SELECT parentId FROM 
> summa_relations WHERE childId='horizon_2615441');
> It takes in the order of 30s to run when we expect something in the order of 
> 1-2ms.
> We have a setup with two tables
> summa_records:  1,5M rows
> summa_relations: ~350000 rows
> summa_records have and 'id' column that is also indexed and is the primary 
> key. The summa_relations table holds mappings between different ids.
> In our case the nested SELECT produces 2 hits, say, 'foo' and 'bar'. So the 
> UPDATE on these two hits should be quite snappy. If we run the SELECT alone 
> it runs in an instant, and also if we run with hardcoded ids for the IN 
> clause:
> UPDATE summa_records SET base='foobar' WHERE id IN ('foo', 'bar');
> We have instant execution. I'll attach a query plan in a sec.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to