Re: [sqlite] FTS4: OFFSET Performance

2011-10-12 Thread Fabian
2011/10/12 Dan Kennedy 

>
> (INDEX 1) == lookup by rowid.
>
>
Okay! Then maybe my problem is unrelated to FTS, I will create a new
question.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS4: OFFSET Performance

2011-10-12 Thread Dan Kennedy

On 10/12/2011 09:39 PM, Fabian wrote:

2011/10/12 Dan Kennedy



Can we see the output of EXPLAIN for this query?







Without selecting table2.data2:

0 0 0 SEARCH TABLE table1 USING INDEX data1idx (data1=?) (~3 rows)

Including selecting table2.data2:

0 0 0 SEARCH TABLE table1 USING INDEX data1idx (data1=?) (~3 rows)
0 1 1 SCAN TABLE table2 VIRTUAL TABLE INDEX 1: (~0 rows)

So it looks like the whole problem is caused by the 'SCAN' on table2. Is
there any reason why SQLite does this SCAN instead of a SEARCH, since I JOIN
the two tables on ROWID, why does it have to do a full table scan?


(INDEX 1) == lookup by rowid.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS4: OFFSET Performance

2011-10-12 Thread Fabian
2011/10/12 Dan Kennedy 

>
> Can we see the output of EXPLAIN for this query?
>

>

Without selecting table2.data2:

0 0 0 SEARCH TABLE table1 USING INDEX data1idx (data1=?) (~3 rows)

Including selecting table2.data2:

0 0 0 SEARCH TABLE table1 USING INDEX data1idx (data1=?) (~3 rows)
0 1 1 SCAN TABLE table2 VIRTUAL TABLE INDEX 1: (~0 rows)

So it looks like the whole problem is caused by the 'SCAN' on table2. Is
there any reason why SQLite does this SCAN instead of a SEARCH, since I JOIN
the two tables on ROWID, why does it have to do a full table scan?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS4: OFFSET Performance

2011-10-12 Thread Fabian
2011/10/12 Dan Kennedy 

>
> Are you using 3.7.7 or newer? And specifying the order=DESC
> option when creating the FTS4 table? If not, it might be a big
> help in this case.
>
>  
> http://www.sqlite.org/fts3.**html#fts4order
>
>
Yes I'm using 3.7.7, and I already considered specifying order=DESC upon
creation, but I suspect it will reverse the problem: Slow queries with
OFFSET=0 and fast queries with a large OFFSET, which would be no solution.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS4: OFFSET Performance

2011-10-12 Thread Fabian
2011/10/12 Petite Abeille 

>
> In any case, you can easily refactor the query into two steps:
>
> (1) First get your pagination
> (2) Then join to your fts
>
>
I tried implementing your refactoring, but immediately run into a problem.
The user must be able to sort the data on any of the columns returned.
So 'ORDER
BY table1.rowid DESC' is really a variabele, which could easily be 'ORDER BY
table2.data2 DESC', in which cases the query fails, because .data2 is not
selected in the inner query. And I cannot move 'ORDER BY' into the outer
loop, because else it will only sort the 250 results returned, not the
underlying data.

So the questions remains: why does increasing OFFSET kill performance when
selecting table2.data2 in additation to the columns of table1.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS4: OFFSET Performance

2011-10-12 Thread Fabian
>
>
> > How would I optimize the above queries, to take advantage of the
> > LIMIT/OFFSET values, making them suitable for fast pagination?
>
> Are you sure the issue is with the fts table? Or is it the order by desc?
>
>
If I omit the FTS table in the query, like so:

SELECT table1.data1
FROM table1
WHERE table1.data1 = 10
ORDER BY table1.rowid DESC
LIMIT 250
OFFSET 0

The query always executes fast, independant of how large OFFSET is. The only
difference is that I don't include 'table2.data2' in the output, so that
leads me to think that the issue is related to FTS, or is there another
possibility? And in the FTS docs I read:

"Because of the way SQLite's virtual table interface works, retrieving the
value of the "title" column requires loading the entire row from disk
(including the "content" field, which may be quite large). This means that
if the users query matches several thousand documents, many megabytes of
"title" and "content" data may be loaded from disk into memory even though
they will never be used for any purpose."

Which seems to confirm that the reason the query is so slow, is because I
include the value of 'table2.data2' in the output?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS4: OFFSET Performance

2011-10-12 Thread Dan Kennedy

On 10/12/2011 04:36 PM, Fabian wrote:

I have two tables, one normal table (table1), and one FTS4 table (table2).
The ROWID's between the tables are in sync. I often run queries on table1,
but every row returned must also include a value from table2 (data2), so I
create a query like this:

SELECT table1.data1, table2.data2
FROM table1 JOIN table2 ON table1.rowid=table2.rowid
WHERE table1.data1 = 10
ORDER BY table1.rowid DESC
LIMIT 250
OFFSET 0

This query returns the results as expected, and performs well. But as soon
as I raise the OFFSET to a large value (for pagination) the performance
drops drastically. After a long search through the documentation, I found
out the reason: the SELECT on table2.data2 causes FTS4 to load the entire
column from disk, and doesn't take into account the LIMIT/OFFSET clauses.


Can we see the output of EXPLAIN for this query?


I have a related query, where I do a MATCH on table2.data2,

SELECT table1.data1, table2.data2
FROM table2 JOIN table1 ON table2.rowid=table1.rowid
WHERE table2.data2 MATCH 'value' AND table1.data1 = 10
ORDER BY table1.rowid DESC
LIMIT 250
OFFSET 0


Are you using 3.7.7 or newer? And specifying the order=DESC
option when creating the FTS4 table? If not, it might be a big
help in this case.

  http://www.sqlite.org/fts3.html#fts4order

Dan.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS4: OFFSET Performance

2011-10-12 Thread Petite Abeille

On Oct 12, 2011, at 11:36 AM, Fabian wrote:

> How would I optimize the above queries, to take advantage of the
> LIMIT/OFFSET values, making them suitable for fast pagination?

Are you sure the issue is with the fts table? Or is it the order by desc?

In any case, you can easily refactor the query into two steps:

(1) First get your pagination
(2) Then join to your fts

selecttable1.data1, 
  table2.data2
from  (
selecttable1.data1
from  table1

where table1.data1 = 10

order by  table1.rowid desc
limit 250
offset0
  )
astable1 

join  table2 
ontable1.rowid = table2.rowid


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] FTS4: OFFSET Performance

2011-10-12 Thread Fabian
I have two tables, one normal table (table1), and one FTS4 table (table2).
The ROWID's between the tables are in sync. I often run queries on table1,
but every row returned must also include a value from table2 (data2), so I
create a query like this:

SELECT table1.data1, table2.data2
FROM table1 JOIN table2 ON table1.rowid=table2.rowid
WHERE table1.data1 = 10
ORDER BY table1.rowid DESC
LIMIT 250
OFFSET 0

This query returns the results as expected, and performs well. But as soon
as I raise the OFFSET to a large value (for pagination) the performance
drops drastically. After a long search through the documentation, I found
out the reason: the SELECT on table2.data2 causes FTS4 to load the entire
column from disk, and doesn't take into account the LIMIT/OFFSET clauses.

I have a related query, where I do a MATCH on table2.data2,

SELECT table1.data1, table2.data2
FROM table2 JOIN table1 ON table2.rowid=table1.rowid
WHERE table2.data2 MATCH 'value' AND table1.data1 = 10
ORDER BY table1.rowid DESC
LIMIT 250
OFFSET 0

Here the problem is even worse. When MATCH 'value' returns a lot of results,
the query is very slow (even with OFFSET 0) because it ignores the fact it
only has to return the first 250 results.

How would I optimize the above queries, to take advantage of the
LIMIT/OFFSET values, making them suitable for fast pagination?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users