Hi,

I'm using SQLite v3.8.4.3 and the issue happens in a table with the
following schema:

CREATE TABLE HistoryEntry (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
sourceType INTEGER NOT NULL,
sourceId INTEGER,
uri TEXT NOT NULL,
historyOrder BIGINT NOT NULL CHECK (historyOrder >= 0) DEFAULT
9223372036854775807,
historyTimestamp BIGINT NOT NULL CHECK (historyTimestamp > 0),
entryState INTEGER,
entryIsSpam INTEGER,
entryRelatedTech INTEGER,
entryDisplayed INTEGER,
entryPayload TEXT
);
CREATE INDEX idx_HistoryEntry_historyOrder_historyTimestamp ON
HistoryEntry(historyOrder, historyTimestamp);
CREATE INDEX idx_HistoryEntry_sourceType_sourceId ON
HistoryEntry(sourceType, sourceId);
CREATE INDEX idx_HistoryEntry_uri_historyOrder ON HistoryEntry(uri,
historyOrder);

The query being used is the following:

WITH maxHist (uri, historyOrder) AS (
SELECT uri AS uri, max(historyOrder) AS historyOrder FROM HistoryEntry
WHERE ((sourceType & 8191) <> 0)
GROUP BY uri
)
SELECT opH.uri AS uri, opH.historyOrder AS historyOrder,
max(opH.historyTimestamp) AS historyTimestamp, opH.id AS historyId
FROM HistoryEntry opH, maxHist
WHERE (((opH.sourceType & 8191) <> 0) AND (opH.uri = maxHist.uri) AND
(opH.historyOrder = maxHist.historyOrder))
GROUP BY opH.uri, opH.historyOrder;

This is the associated query plan:

SCAN TABLE HistoryEntry USING INDEX idx_HistoryEntry_uri_historyOrder
SCAN SUBQUERY 1
SEARCH TABLE HistoryEntry AS opH USING INDEX
idx_HistoryEntry_uri_historyOrder (uri=? AND historyOrder=?)
USE TEMP B-TREE FOR GROUP BY

Everything looks fine and the query runs smoothly, but at around 15000
entries, the query plan changes to:

SCAN TABLE HistoryEntry USING INDEX idx_HistoryEntry_uri_historyOrder
SCAN SUBQUERY 1
SEARCH TABLE HistoryEntry AS opH USING INDEX
idx_HistoryEntry_historyOrder_historyTimestamp (historyOrder=?)
USE TEMP B-TREE FOR GROUP BY

This causes the query to take 2.2s instead 47,20ms!
The issues is rather obvious: the planner decided to change to a much worse
index, which in turn causes the slowdown.
I was able to bypass the problem by providing an hint in the query, like so:

WITH maxHist (uri, historyOrder) AS (
SELECT uri AS uri, max(historyOrder) AS historyOrder FROM HistoryEntry
WHERE ((sourceType & 8191) <> 0)
GROUP BY uri
)
SELECT opH.uri AS uri, opH.historyOrder AS historyOrder,
max(opH.historyTimestamp) AS historyTimestamp, opH.id AS historyId
FROM HistoryEntry opH, maxHist
WHERE (((opH.sourceType & 8191) <> 0) AND (opH.uri = maxHist.uri) AND
+(opH.historyOrder = maxHist.historyOrder))
GROUP BY opH.uri, opH.historyOrder;

This isn't optimal because the query plan changes to:

SCAN TABLE HistoryEntry USING INDEX idx_HistoryEntry_uri_historyOrder
SCAN SUBQUERY 1
SEARCH TABLE HistoryEntry AS opH USING INDEX
idx_HistoryEntry_uri_historyOrder (uri=?)
USE TEMP B-TREE FOR GROUP BY

But at least the correct index gets chosen.
The statistics where all updated before running every query and the DB
vacuumed: yet the planner always chooses the incorrect index after reaching
around 15000 entries.
If necessary, I can provide a DB where this happens.

Thank you,
João
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to