Hi all,
I've seen performance problem when selecting rows within date range.

My table schema is:

CREATE TABLE MY_TABLE (
   OBJ_ID varchar(20) not null,
   CREATE_DT timestamp not null,
   KEYWORD varchar(100) not null,
   COUNT integer,
   CONSTRAINT PK PRIMARY KEY (OBJ_ID,CREATE_DT,KEYWORD)
);

MY_TABLE has almost 5,200,000 rows,
CREATE_DT has about 6 months range.

And executed query:

SELECT KEYWORD, SUM(COUNT)
FROM MY_TABLE
WHERE CREATE_DT > to_timestamp('2016-03-01 00:00')
AND CREATE_DT < to_timestamp('2016-04-01 00:00')
GROUP BY KEYWORD;

It tooks 46 seconds, too slow than expected, cause CREATE_DT is one of row
key.
I created a secondary index on CREATE_DT but there's no improvement.

Query plan looks weird:
CLIENT 2-CHUNK 0 ROWS 0 BYTES PARALLEL 2-WAY FULL SCAN OVER
NEWS_KEYWORD_COUNT
     SERVER FILTER BY (CREATE_DT > TIMESTAMP '2017-03-01 00:00:00.000' AND
CREATE_DT < TIMESTAMP '2017-04-01 00:00:00.000')
     SERVER AGGREGATE INTO DISTINCT ROWS BY [KEYWORD]
CLIENT MERGE SORT
CLIENT 100 ROW LIMIT

BUT If CREATE_DT comes first of row key, plan says range scan will be done.

Any suggestion? : )

Thanks,
NaHeon

Reply via email to