Re: [h2] Poor performance

2016-01-01 Thread James Sheets
That's improved things greatly for my test cases, thanks!


On Saturday, December 26, 2015 at 7:35:43 AM UTC-5, Noel Grandin wrote:
>
> I have puhed a fix for this to master. Please try it out and let me 
> know how it works for you. 
>
> The fix changes the index costing quite a bit, so it might affect 
> other queries too. 
>

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at https://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.


Re: [h2] Poor performance

2015-12-26 Thread Noel Grandin
I have puhed a fix for this to master. Please try it out and let me
know how it works for you.

The fix changes the index costing quite a bit, so it might affect
other queries too.

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at https://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.


Re: [h2] Poor performance

2015-12-24 Thread Steve McLeod
James, I think query optimizer hints would really help here. Unfortunately, 
no one has offered to implement them yet, I think.



On Wednesday, 23 December 2015 18:07:00 UTC+1, James Sheets wrote:
>
> Just to be clear, by "covering" I meant an index that covers the predicate 
> and the projection.
>

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at https://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.


Re: [h2] Poor performance

2015-12-23 Thread James Sheets
Just to be clear, by "covering" I meant an index that covers the predicate 
and the projection.

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at https://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.


Re: [h2] Poor performance

2015-12-23 Thread James Sheets
Yes, but there's a very large real-world performance difference between 
using the index I've created and the unique key that H2 chooses.  I don't 
know anything about the internals of H2, but I could make some guesses as 
to why. 

I personally think the most intuitive behavior would be to use a covering 
index over a PK or UK that doesn't cover. 


Unique Key index is *422 ms*

-- Performance
SELECT MAX(b.id) as id
FROM tbl_b b
GROUP BY b.tbl_a_id
[2015-12-23 11:38:17] 100 rows retrieved starting from 1 in 422ms 
(execution: 410ms, fetching: 12ms)

-- Explain Analyze
SELECT
MAX(B.ID) AS ID
FROM PUBLIC.TBL_B B
/* PUBLIC.TBL_B_UK_INDEX_4 */
/* scanCount: 5 */
GROUP BY B.TBL_A_ID
/* group sorted */
/*
reads: 31554
*/


The TBL_B_IDX is *58 ms*

-- Performance
SELECT MAX(b.id) as id
FROM tbl_b b
-- This where predicate is just to trick H2 to using tbl_b_idx
WHERE b.tbl_a_id > 0
GROUP BY b.tbl_a_id
[2015-12-23 11:44:50] 100 rows retrieved starting from 1 in 58ms 
(execution: 50ms, fetching: 8ms)

-- Explain Analyze
SELECT
MAX(B.ID) AS ID
FROM PUBLIC.TBL_B B
/* PUBLIC.TBL_B_IDX: TBL_A_ID > 0 */
/* scanCount: 5 */
WHERE B.TBL_A_ID > 0
GROUP BY B.TBL_A_ID



On Tuesday, December 22, 2015 at 1:12:18 AM UTC-5, Noel Grandin wrote:
>
>
>
> On 2015-12-22 05:28 AM, James Sheets wrote: 
> > Try making another unit test, checking that the below query uses 
> 'tbl_b_idx', and the test case will fail: 
> > 
> > EXPLAIN ANALYZE 
> > SELECT MAX(b.id)as id 
> > FROM tbl_b b 
> > GROUP BY b.tbl_a_id; 
> > 
>
> For this particular query, you need to read all of the rows, so there is 
> no cost difference in using one index over another. 
>

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at https://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.


Re: [h2] Poor performance

2015-12-21 Thread Noel Grandin



On 2015-12-22 05:28 AM, James Sheets wrote:

Try making another unit test, checking that the below query uses 'tbl_b_idx', 
and the test case will fail:

EXPLAIN ANALYZE
SELECT MAX(b.id)as id
FROM tbl_b b
GROUP BY b.tbl_a_id;



For this particular query, you need to read all of the rows, so there is no 
cost difference in using one index over another.

--
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at https://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.


Re: [h2] Poor performance

2015-12-21 Thread James Sheets
Thanks!

I tried out your changes, and it's made some improvement. The query I 
originally submitted does seem to use the correct index.

However, if you just slightly modify the test query (which I see you made a 
unit test for), the same issue comes back up.

Try making another unit test, checking that the below query uses 'tbl_b_idx', 
and the test case will fail:

EXPLAIN ANALYZE
SELECT MAX(b.id) as id
FROM tbl_b b
GROUP BY b.tbl_a_id;




On Sunday, December 20, 2015 at 9:19:06 AM UTC-5, Noel Grandin wrote:
>
> I have committed a fix for this to master. You can either build from 
> source or try out the nightly build later 
>

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at https://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.


Re: [h2] Poor performance

2015-12-20 Thread Noel Grandin
I have committed a fix for this to master. You can either build from
source or try out the nightly build later

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at https://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.


Re: [h2] Poor performance

2015-12-19 Thread Noel Grandin
I added some debug and this is the output:

2015-12-19 14:36:15 jdbc[3]: Plan   : calculate cost for plan [B, A]
2015-12-19 14:36:15 jdbc[3]: Plan   :   for table filter B
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
509,990 index PUBLIC.TBL_B.tableScan
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
509,990 index PUBLIC.PRIMARY_KEY_4B
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
509,990 index PUBLIC.TBL_B_UK_INDEX_4
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
509,990 index PUBLIC.TBL_B_IDX
2015-12-19 14:36:15 jdbc[3]: Plan   :   best plan item cost
504,890 index PUBLIC.TBL_B.tableScan
2015-12-19 14:36:15 jdbc[3]: Plan   :   for table filter A
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
11,000 index PUBLIC.TBL_A.tableScan
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
30 index PUBLIC.PRIMARY_KEY_4
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
11,000 index PUBLIC.TBL_A_UK_INDEX_4
2015-12-19 14:36:15 jdbc[3]: Plan   :   best plan item cost 29
index PUBLIC.PRIMARY_KEY_4
2015-12-19 14:36:15 jdbc[3]: Plan   : plan cost 15,500,156
2015-12-19 14:36:15 jdbc[3]: Plan   : calculate cost for plan [A, B]
2015-12-19 14:36:15 jdbc[3]: Plan   :   for table filter A
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
11,000 index PUBLIC.TBL_A.tableScan
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
11,000 index PUBLIC.PRIMARY_KEY_4
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
11,000 index PUBLIC.TBL_A_UK_INDEX_4
2015-12-19 14:36:15 jdbc[3]: Plan   :   best plan item cost 10,780
index PUBLIC.TBL_A.tableScan
2015-12-19 14:36:15 jdbc[3]: Plan   :   for table filter B
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
509,990 index PUBLIC.TBL_B.tableScan
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
509,990 index PUBLIC.PRIMARY_KEY_4B
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
1,020 index PUBLIC.TBL_B_UK_INDEX_4
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
1,020 index PUBLIC.TBL_B_IDX
2015-12-19 14:36:15 jdbc[3]: Plan   :   best plan item cost 1,014
index PUBLIC.TBL_B_UK_INDEX_4
2015-12-19 14:36:15 jdbc[3]: Plan   : plan cost 10,952,417


The interesting part is the following two lines:
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
1,020 index PUBLIC.TBL_B_UK_INDEX_4
2015-12-19 14:36:15 jdbc[3]: Table  : potential plan item cost
1,020 index PUBLIC.TBL_B_IDX

which means that our index cost calculation does not know how to take
into account that retrieving from a primary key index is way cheaper
than retrieving from any other index.

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at https://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.


Re: [h2] Poor performance

2015-12-17 Thread Noel Grandin

This is the output of EXPLAIN ANALYZE for me:

EXPLAIN ANALYZE
SELECT MAX(b.id) as id
FROM tbl_b b
JOIN tbl_a a ON b.tbl_a_id = a.id
GROUP BY b.tbl_a_id
HAVING a.active = true;
PLAN
SELECT
MAX(B.ID) AS ID
FROM PUBLIC.TBL_A A
/* PUBLIC.TBL_A.tableScan */
/* scanCount: 101 */
INNER JOIN PUBLIC.TBL_B B
/* PUBLIC.TBL_B_UK_INDEX_4: TBL_A_ID = A.ID */
ON 1=1
/* scanCount: 50099 */
WHERE B.TBL_A_ID = A.ID
GROUP BY B.TBL_A_ID
HAVING A.ACTIVE = TRUE
/*
total: 892
TBL_B.TBL_B_DATA read: 614 (68%)
TBL_B.TBL_B_UK_INDEX_4 read: 278 (31%)
*/
(1 row, 229 ms)



EXPLAIN ANALYZE
SELECT MAX(b.id) as id
FROM tbl_b b
JOIN tbl_a a ON b.tbl_a_id = a.id
GROUP BY b.tbl_a_id
HAVING a.active = true;
PLAN
SELECT
MAX(B.ID) AS ID
FROM PUBLIC.TBL_A A
/* PUBLIC.TBL_A.tableScan */
/* scanCount: 101 */
INNER JOIN PUBLIC.TBL_B B
/* PUBLIC.TBL_B_IDX: TBL_A_ID = A.ID */
ON 1=1
/* scanCount: 50099 */
WHERE B.TBL_A_ID = A.ID
GROUP BY B.TBL_A_ID
HAVING A.ACTIVE = TRUE
/*
TBL_B.TBL_B_IDX read: 46
*/
(1 row, 25 ms)

--
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at https://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.


[h2] Poor performance

2015-12-10 Thread James Sheets
I've got a fairly simple query using a GROUP BY, but the query optimizer 
won't pick the index I've created. Instead, it'll use the indexes for the 
constraints on the table instead, which is much slower.

I've tested this on 1.4.189 and 1.4.190 with the following JDBC url: 
jdbc:h2:c:/temp/database;AUTO_SERVER=TRUE;MODE=MSSQLServer;CACHE_SIZE=16384

I've attached an example script to replicate the issue

DROP TABLE TBL_A;
DROP TABLE TBL_B;

CREATE TABLE TBL_A
(
  id IDENTITY PRIMARY KEY NOT NULL,
  name VARCHAR_IGNORECASE(255) NOT NULL,
  createDate TIMESTAMP DEFAULT NOW(),
  createUser VARCHAR(100) NOT NULL,
  comments VARCHAR(2000),
  active BOOLEAN DEFAULT TRUE,


  UNIQUE KEY TBL_A_UK (name),
);


CREATE TABLE TBL_B
(
  id IDENTITY PRIMARY KEY NOT NULL,
  tbl_a_id BIGINT NOT NULL,
  name VARCHAR(50),
  status VARCHAR(30) NOT NULL,
  createDate TIMESTAMP DEFAULT NOW(),
  createUser VARCHAR(100) NOT NULL,


  UNIQUE KEY TBL_B_UK (tbl_a_id, createDate),
  FOREIGN KEY (tbl_a_id) REFERENCES public.TBL_A(id) ON DELETE NO ACTION ON 
UPDATE NO ACTION,
);


INSERT INTO TBL_A (name, createUser, comments)
  SELECT 'package_' || CAST(X as VARCHAR), 'testUser', 'testComment'
  FROM SYSTEM_RANGE(1, 100)
  WHERE X <= 100;


INSERT INTO TBL_B (tbl_a_id, name, status, createUser, createDate)
  SELECT CASE WHEN tbl_a_id = 0 THEN 1 ELSE tbl_a_id END, name, status, 
createUser, createDate
  FROM (
SELECT
  ROUND((RAND() * 100)) AS tbl_a_id,
  'item_' || CAST(X AS VARCHAR) as name,
  'ok' as status,
  'testUser2' as createUser,
  DATEADD('SECOND', X, NOW()) as createDate
FROM SYSTEM_RANGE(1, 5)
WHERE X < 5
  );


CREATE INDEX tbl_b_idx ON tbl_b(tbl_a_id, id);


ANALYZE;


-- Won't use any of the indexes created above. Uses the constraint keys 
instead.
-- If you remove both the Foreign Key and the Unique Key on TBL_B, it'll use 
the tbl_b_idx index and be much faster
--EXPLAIN ANALYZE
SELECT MAX(b.id) as id
FROM tbl_b b
JOIN tbl_a a ON b.tbl_a_id = a.id
GROUP BY b.tbl_a_id
HAVING a.active = true;


Rerun the script above with the following changes to TBL_B

CREATE TABLE TBL_B
(
  id IDENTITY PRIMARY KEY NOT NULL,
  tbl_a_id BIGINT NOT NULL,
  name VARCHAR(50),
  status VARCHAR(30) NOT NULL,
  createDate TIMESTAMP DEFAULT NOW(),
  createUser VARCHAR(100) NOT NULL,

  -- Remove constraints so it chooses our index
  --UNIQUE KEY TBL_B_UK (tbl_a_id, createDate),
  --FOREIGN KEY (tbl_a_id) REFERENCES public.TBL_A(id) ON DELETE NO ACTION 
ON UPDATE NO ACTION,
);



-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at http://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.