Doing an inner join on TABLE() does, at least, eliminate the 50k row scan, 
and the index is clearly being used now. Scan counts are now under 4.

It wasn't completely straightforward to do the conversion. The probes 
table is indexed on two columns, eid and name. The full query looked 
something like

select * from probes ... left join foo ... left join bar ... where eid = ? 
and name in (?,?,?....)
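
Replacing the IN list with a join on TABLE() gives something like this (a 
sketch: the column name and type in TABLE() are placeholders from my 
schema, and the name list is bound as a single array parameter rather than 
500 individual placeholders):

select * from probes 
inner join table(name varchar = ?) t on t.name = probes.name 
left join foo ... left join bar ... 
where eid = ?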

Adding the inner join to TABLE() initially made the row scans dramatically 
larger, seemingly because the "eid = ?" clause was being applied later in 
the plan (though I'm mostly guessing here): from the scan counts, it 
appeared to be working with all the probes matching the name list, rather 
than just those with a matching eid, until the last step.

I tried moving the inner join left and right, without apparent improvement. 
What does work is making a subquery, like

select * from (select * from probes inner join table(...) ... where eid = 
?) left join foo ... left join bar ...
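
Written out a little more fully, the shape that works is roughly this (a 
sketch: the derived-table alias and the TABLE() column definition are 
illustrative, and the outer joins are elided as before):

select * from (
    select probes.* from probes 
    inner join table(name varchar = ?) t on t.name = probes.name 
    where probes.eid = ?
) p 
left join foo ... 
left join bar ...

This keeps the "eid = ?" filter and the name-list join inside the derived 
table, so the outer left joins only see the already-filtered probe rows.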

On Monday, October 7, 2013 2:20:20 PM UTC-7, Brian Craft wrote:
>
> Sorry, the 120 sec includes some application code that is apparently 
> slower than I expected. The h2 trace log shows 9 sec for the query. There 
> is an index on the column, but it doesn't appear to be used for the IN 
> clause with 500 values. "explain analyze" shows it scanning 50k rows or so.
>
> I'll try the TABLE() thing instead of the big IN clause.
>
> On Monday, October 7, 2013 12:33:36 PM UTC-7, Thomas Mueller wrote:
>>
>> Hi,
>>
>> > a "where ... in" clause with 500 values (like "where x in [v0, v1, v2, 
>> > .... v499]"), which ran with average time 120 seconds.
>>
>> 120 seconds for 500 entries sounds like a problem with the data or the 
>> query. Do you have an index on that column? See also 
>> http://h2database.com/html/performance.html#explain_plan
>>
>> Regards,
>> Thomas
>>
>>
>> On Sun, Oct 6, 2013 at 6:29 AM, Brian Craft <craft...@gmail.com> wrote:
>>
>>> Reading over the archive on the subject of threads has left me mostly 
>>> still confused about how h2 handles concurrency, so I've been doing some 
>>> tests, instead.
>>>
>>> I started with largish queries that all have a "where ... in" clause 
>>> with 500 values (like "where x in [v0, v1, v2, .... v499]"), which ran with 
>>> average time 120 seconds.
>>>
>>>  I then re-ran the same queries as sets of 500 queries (each with a 
>>> single value, like "where x in [v0]"), splitting the queries across 
>>> different threads. These sets of 500 queries ran much faster, average 30 
>>> seconds. It also made much more use of the disk: I/O stats were much 
>>> higher, which is consistent with the queries running concurrently.
>>>
>>> The docs seem to indicate that a query in one thread blocks queries in 
>>> other threads, so I'm not sure how to interpret these numbers. Maybe I 
>>> misunderstood the docs, and queries do run concurrently? I'm not using 
>>> MULTI_THREADED.
>>>
