Re: [sqlite] Force the use of a specified index?

2009-09-19 Thread Bart Smissaert
I tried that and it picked up the index, but the query was slower plus
adding the compound index took some time as well, so the other way is
better.

RBS


On Sat, Sep 19, 2009 at 6:12 AM, Jon Dixon <j.di...@ieee.org> wrote:
> Out of curiosity, would it work any faster to switch the date clause to be
> T1.ADDED_DATE > DATE(T2.ADDED_DATE,'-15 month')
> and used an index (PATIENT_ID,ADDED_DATE)?
>
> Jon
>
> You said:
>
> OK, it looks the construction with DATE instead of Julianday is a bit
> faster, so best option here seems to be:
>
> DELETE FROM TABLE1 WHERE
> ENTRY_ID NOT IN (
>   SELECT T1.ENTRY_ID FROM TABLE1 T1 INNER JOIN TABLE2 T2 ON
>   (T1.PATIENT_ID = T2.PATIENT_ID)
>   WHERE
>   DATE(T1.ADDED_DATE, '+15 month') > T2.ADDED_DATE)
>
> 
> From: Bart Smissaert <bart.smissa...@gmail.com>
> To: Jon Dixon <j.di...@ieee.org>
> Sent: Friday, September 18, 2009 6:38:34 PM
> Subject: Re: [sqlite] Force the use of a specified index?
>
> Had at this suggestion now and it works and uses the PATIENT_ID,
> ADDED_DATE index, but it is as slow as my delete with Julianday. It
> looks Pavel's suggestion is the way to do this. Just will have a look
> now and see if doing the construction with DATE( instead of Julianday
> is any faster.
>
> RBS
>
>
> On Fri, Sep 18, 2009 at 10:48 PM, Jon Dixon <j.di...@ieee.org> wrote:
>> From:
>> "Bart Smissaert" <bart.smissa...@gmail.com>
>> Then the SQL I was trying to improve:
>>
>> DELETE
>> FROM
>> TABLE2
>> WHERE
>> PATIENT_ID NOT IN (
>>   SELECT
>>   TABLE1.PATIENT_ID
>>   FROM
>>   TABLE1
>>   WHERE
>>   JULIANDAY(TABLE2.START_DATE, '-14 month') >
>>   JULIANDAY(TABLE1.START_DATE) AND
>>   TABLE1.PATIENT_ID = TABLE2.PATIENT_ID)
>>
>> -
>>
>> To my understanding, SQLite will not use indices on function results. I
>> wonder if you would have more luck (since START_DATE is -mm-dd) using
>>
>> DELETE FROM TABLE2 WHERE PATIENT_ID NOT IN (
>>   SELECT TABLE1.PATIENT_ID FROM TABLE1 WHERE TABLE1.START_DATE <
>>  DATE(TABLE2.START_DATE,'-14 month') AND TABLE1.PATIENT_ID =
>> TABLE2.PATIENT_ID
>>   )
>>
>> I believe this will make use of a joint index on PATIENT_ID and
>> START_DATE,
>> with the preferred order depending on which is the more restrictive term
>> (I'd guess best would be and index on (PATIENT_ID, START_DATE)).
>>
>> Jon Dixon
>>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Force the use of a specified index?

2009-09-18 Thread Bart Smissaert
OK, it looks the construction with DATE instead of Julianday is a bit
faster, so best option here seems to be:

DELETE FROM TABLE1 WHERE
ENTRY_ID NOT IN (
   SELECT T1.ENTRY_ID FROM TABLE1 T1 INNER JOIN TABLE2 T2 ON
   (T1.PATIENT_ID = T2.PATIENT_ID)
   WHERE
   DATE(T1.ADDED_DATE, '+15 month') > T2.ADDED_DATE)

This will use the index on the single field PATIENT_ID, which I need
in any case for other
queries. So, I now have a faster, simpler query and also less indexes needed.
Thanks again for all the suggestions.


RBS


On Sat, Sep 19, 2009 at 12:03 AM, Bart Smissaert
 wrote:
> Had a look at this suggestion now and it works and uses the PATIENT_ID,
> ADDED_DATE index, but it is as slow as my delete with Julianday. It
> looks Pavel's suggestion is the way to do this. Just will have a look
> now and see if doing the construction with DATE( instead of Julianday
> is any faster.
>
> RBS
>
>
> On Fri, Sep 18, 2009 at 10:48 PM, Jon Dixon  wrote:
>> From:
>> "Bart Smissaert" 
>> Then the SQL I was trying to improve:
>>
>> DELETE
>> FROM
>> TABLE2
>> WHERE
>> PATIENT_ID NOT IN (
>>   SELECT
>>   TABLE1.PATIENT_ID
>>   FROM
>>   TABLE1
>>   WHERE
>>   JULIANDAY(TABLE2.START_DATE, '-14 month') >
>>   JULIANDAY(TABLE1.START_DATE) AND
>>   TABLE1.PATIENT_ID = TABLE2.PATIENT_ID)
>>
>> -
>>
>> To my understanding, SQLite will not use indices on function results. I
>> wonder if you would have more luck (since START_DATE is -mm-dd) using
>>
>> DELETE FROM TABLE2 WHERE PATIENT_ID NOT IN (
>>   SELECT TABLE1.PATIENT_ID FROM TABLE1 WHERE TABLE1.START_DATE <
>>  DATE(TABLE2.START_DATE,'-14 month') AND TABLE1.PATIENT_ID =
>> TABLE2.PATIENT_ID
>>   )
>>
>> I believe this will make use of a joint index on PATIENT_ID and START_DATE,
>> with the preferred order depending on which is the more restrictive term
>> (I'd guess best would be and index on (PATIENT_ID, START_DATE)).
>>
>> Jon Dixon
>>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Force the use of a specified index?

2009-09-18 Thread Bart Smissaert
Had a look at this suggestion now and it works and uses the PATIENT_ID,
ADDED_DATE index, but it is as slow as my delete with Julianday. It
looks Pavel's suggestion is the way to do this. Just will have a look
now and see if doing the construction with DATE( instead of Julianday
is any faster.

RBS


On Fri, Sep 18, 2009 at 10:48 PM, Jon Dixon  wrote:
> From:
> "Bart Smissaert" 
> Then the SQL I was trying to improve:
>
> DELETE
> FROM
> TABLE2
> WHERE
> PATIENT_ID NOT IN (
>   SELECT
>   TABLE1.PATIENT_ID
>   FROM
>   TABLE1
>   WHERE
>   JULIANDAY(TABLE2.START_DATE, '-14 month') >
>   JULIANDAY(TABLE1.START_DATE) AND
>   TABLE1.PATIENT_ID = TABLE2.PATIENT_ID)
>
> -
>
> To my understanding, SQLite will not use indices on function results. I
> wonder if you would have more luck (since START_DATE is -mm-dd) using
>
> DELETE FROM TABLE2 WHERE PATIENT_ID NOT IN (
>   SELECT TABLE1.PATIENT_ID FROM TABLE1 WHERE TABLE1.START_DATE <
>  DATE(TABLE2.START_DATE,'-14 month') AND TABLE1.PATIENT_ID =
> TABLE2.PATIENT_ID
>   )
>
> I believe this will make use of a joint index on PATIENT_ID and START_DATE,
> with the preferred order depending on which is the more restrictive term
> (I'd guess best would be and index on (PATIENT_ID, START_DATE)).
>
> Jon Dixon
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Force the use of a specified index?

2009-09-18 Thread Bart Smissaert
DELETE
FROM
TABLE2
WHERE
ENTRY_ID NOT IN
 (SELECT T2.ENTRY_ID FROM
 TABLE2 T2 INNER JOIN TABLE1 T1 ON
 (T1.PATIENT_ID = T2.PATIENT_ID)
 WHERE julianday(T2.START_DATE, '+15 month') >
   julianday(T1.START_DATE)
 )

That is indeed a lot faster and then slightly faster than my approach
with the intermediate table. Will now have a look at Jon's suggestion.

RBS


On Fri, Sep 18, 2009 at 10:04 PM, Pavel Ivanov  wrote:
> *I'm leaving aside the rant that your first delete is not identical to
> combination of the select and delete in the second approach and select
> in second approach contains typos...*
>
> But did you try to combine your insert and delete statements from the
> second approach? This approach quicker because of exactly that -
> select and delete statements are independent whereas in first approach
> your select is executed again and again for each row in TABLE2. So
> just make it like this:
>
> DELETE
> FROM
> TABLE2
> WHERE
> ENTRY_ID NOT IN
>  (SELECT T2.ENTRY_ID FROM
>  TABLE2 T2 INNER JOIN TABLE1 T1 ON
>  (T1.PATIENT_ID = T2.PATIENT_ID)
>  WHERE julianday(T2.START_DATE, '+15 month') >
>        julianday(T1.START_DATE)
>  )
>
> And about indexes: for this query index on (PATIENT_ID, START_DATE)
> doesn't do any better than on (PATIENT_ID) because SQLite must to
> check all rows with given PATIENT_ID anyway. And that is because
> START_DATE is in the query inside function call to julianday(). Index
> on 2 fields could help only if your condition was e.g. T2.START_DATE >
> T1.START_DATE.
> And in most cases there's no benefit creating index on 1 field
> (PATIENT_ID) when there's already index on 2 fields (PATIENT_ID,
> START_DATE) where PATIENT_ID is the first field.
>
> Pavel
>
> On Fri, Sep 18, 2009 at 4:46 PM, Bart Smissaert
>  wrote:
>> This is what I am dealing with:
>> 2 tables with exactly the same schema (but could be slightly
>> different, so can't put in same table):
>>
>> CREATE TABLE TABLE1(
>>   [PATIENT_ID] INTEGER,
>>   [ENTRY_ID] INTEGER PRIMARY KEY,
>>   [READ_CODE] TEXT,
>>   [ADDED_DATE] TEXT,
>>   [START_DATE] TEXT)
>>
>> The last 2 date fields are in the format -mm-dd
>>
>> Again in both tables the following indexes:
>> CREATE INDEX IDX_TABLE1_PATIENT_ID ON TABLE1(PATIENT_ID)
>> CREATE INDEX IDX_TABLE1_PATIENT_ID_START_DATE ON TABLE1(PATIENT_ID, 
>> START_DATE)
>> CREATE INDEX IDX_TABLE1_PATIENT_ID_ENTRY_ID ON TABLE1(PATIENT_ID, ENTRY_ID)
>> CREATE INDEX TABLE1_START_DATE ON TABLE1(START_DATE)
>>
>> Then some delete SQL's will be run so that in the end PATIENT_ID is
>> unique in both tables.
>>
>> Then the SQL I was trying to improve:
>>
>> DELETE
>> FROM
>> TABLE2
>> WHERE
>> PATIENT_ID NOT IN (
>>   SELECT
>>   TABLE1.PATIENT_ID
>>   FROM
>>   TABLE1
>>   WHERE
>>   JULIANDAY(TABLE2.START_DATE, '-14 month') >
>>   JULIANDAY(TABLE1.START_DATE) AND
>>   TABLE1.PATIENT_ID = TABLE2.PATIENT_ID)
>>
>> The index it will use for this SQL is IDX_TABLE1_PATIENT_ID_ENTRY_ID
>> and that surprises me as ENTRY_ID is not in the above SQL. If I drop
>> that index then it will
>> use IDX_TABLE1_PATIENT_ID and only if that is dropped as well will it
>> use IDX_TABLE1_PATIENT_ID_START_DATE, which I thought would be the
>> best, but turns
>> out to be no better than the first. Whatever way I do this it is slow
>> and I can do it a lot quicker by doing this:
>>
>> CREATE TEMP TABLE DATE_COMPARE_TEMP([E_ID] INTEGER)
>>
>> INSERT INTO DATE_COMPARE_TEMP (E_ID)
>>   SELECT T2.ENTRY_ID FROM
>>   TABLE2 T2 INNER JOIN TABLE1 T2 ON
>>   (T1.PATIENT_ID = T2.PATIENT_ID)
>>   WHERE julianday(T2.START_DATE, '+15 month') >
>>         julianday(T1.START_DATE)
>>
>> CREATE INDEX IDX1_DATE_COMPARE_TEMP_E_ID ON DATE_COMPARE_TEMP(E_ID)
>>
>> analyze DATE_COMPARE_TEMP
>>
>> and then the delete SQL like this:
>>
>> DELETE
>> FROM
>> TABLE2
>> WHERE
>> ENTRY_ID NOT IN
>>   (SELECT E_ID FROM DATE_COMPARE_TEMP)
>>
>> Although this involves a third temp table this method is about twice
>> as fast as the first one.
>> Maybe that is just the way it is and there just is no way to do this
>> as fast without the intermediate temp table, but I just wondered.
>> Again there is no problem here as the second method is simple and
>> fast. Just trying to increase my understanding of SQLite and indexes.
>>
>>
>> RBS
>>
>>
>>
>>
>> On Fri, Sep 18, 2009 at 8:25 PM, Simon Slavin
>>  wrote:
>>>
>>> On 18 Sep 2009, at 4:49pm, bartsmissa...@blueyonder.co.uk wrote:
>>>
> Did something in the documentation make
> you think SQLite wouldn't use a
> multi-column index unless you forced it ?

 No, but I just noticed it didn't use the index I thought would be
 best. As
 it turned out it looks I was wrong in that that index didn't give the
 quickest result.
>>>
>>> Ah.  Okay, that's fine.  It can be quite surprising what's best.  Glad
>>> you got a good result.
>>>
>>> Simon.
>>> ___
>>> 

Re: [sqlite] Force the use of a specified index?

2009-09-18 Thread Jon Dixon
From: 
"Bart Smissaert" 
Then the SQL I was trying to improve:


DELETE
FROM
TABLE2
WHERE
PATIENT_ID NOT IN (
   SELECT
   TABLE1.PATIENT_ID
   FROM
   TABLE1
   WHERE
   JULIANDAY(TABLE2.START_DATE, '-14 month') >
   JULIANDAY(TABLE1.START_DATE) AND
   TABLE1.PATIENT_ID = TABLE2.PATIENT_ID)

-

To my understanding, SQLite will not use indices on function results. I wonder 
if you would have more luck (since START_DATE is -mm-dd) using

DELETE FROM TABLE2 WHERE PATIENT_ID NOT IN (
  SELECT TABLE1.PATIENT_ID FROM TABLE1 WHERE TABLE1.START_DATE < 
 DATE(TABLE2.START_DATE,'-14 month') AND TABLE1.PATIENT_ID = 
TABLE2.PATIENT_ID
  )

I believe this will make use of a joint index on PATIENT_ID and START_DATE, 
with the preferred order depending on which is the more restrictive term (I'd 
guess best would be and index on (PATIENT_ID, START_DATE)).

Jon Dixon
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Force the use of a specified index?

2009-09-18 Thread Bart Smissaert
Thanks, will have a look at your suggestion and yes, I had a feeling I
was overlooking some elemental things here. I typed it out all bit
quick (hence the typo's and difference in the deletes), but I thought
it would make clear what was going on. Will test now and see if your
suggestion is indeed quicker.

RBS



On Fri, Sep 18, 2009 at 10:04 PM, Pavel Ivanov  wrote:
> *I'm leaving aside the rant that your first delete is not identical to
> combination of the select and delete in the second approach and select
> in second approach contains typos...*
>
> But did you try to combine your insert and delete statements from the
> second approach? This approach quicker because of exactly that -
> select and delete statements are independent whereas in first approach
> your select is executed again and again for each row in TABLE2. So
> just make it like this:
>
> DELETE
> FROM
> TABLE2
> WHERE
> ENTRY_ID NOT IN
>  (SELECT T2.ENTRY_ID FROM
>  TABLE2 T2 INNER JOIN TABLE1 T1 ON
>  (T1.PATIENT_ID = T2.PATIENT_ID)
>  WHERE julianday(T2.START_DATE, '+15 month') >
>        julianday(T1.START_DATE)
>  )
>
> And about indexes: for this query index on (PATIENT_ID, START_DATE)
> doesn't do any better than on (PATIENT_ID) because SQLite must to
> check all rows with given PATIENT_ID anyway. And that is because
> START_DATE is in the query inside function call to julianday(). Index
> on 2 fields could help only if your condition was e.g. T2.START_DATE >
> T1.START_DATE.
> And in most cases there's no benefit creating index on 1 field
> (PATIENT_ID) when there's already index on 2 fields (PATIENT_ID,
> START_DATE) where PATIENT_ID is the first field.
>
> Pavel
>
> On Fri, Sep 18, 2009 at 4:46 PM, Bart Smissaert
>  wrote:
>> This is what I am dealing with:
>> 2 tables with exactly the same schema (but could be slightly
>> different, so can't put in same table):
>>
>> CREATE TABLE TABLE1(
>>   [PATIENT_ID] INTEGER,
>>   [ENTRY_ID] INTEGER PRIMARY KEY,
>>   [READ_CODE] TEXT,
>>   [ADDED_DATE] TEXT,
>>   [START_DATE] TEXT)
>>
>> The last 2 date fields are in the format -mm-dd
>>
>> Again in both tables the following indexes:
>> CREATE INDEX IDX_TABLE1_PATIENT_ID ON TABLE1(PATIENT_ID)
>> CREATE INDEX IDX_TABLE1_PATIENT_ID_START_DATE ON TABLE1(PATIENT_ID, 
>> START_DATE)
>> CREATE INDEX IDX_TABLE1_PATIENT_ID_ENTRY_ID ON TABLE1(PATIENT_ID, ENTRY_ID)
>> CREATE INDEX TABLE1_START_DATE ON TABLE1(START_DATE)
>>
>> Then some delete SQL's will be run so that in the end PATIENT_ID is
>> unique in both tables.
>>
>> Then the SQL I was trying to improve:
>>
>> DELETE
>> FROM
>> TABLE2
>> WHERE
>> PATIENT_ID NOT IN (
>>   SELECT
>>   TABLE1.PATIENT_ID
>>   FROM
>>   TABLE1
>>   WHERE
>>   JULIANDAY(TABLE2.START_DATE, '-14 month') >
>>   JULIANDAY(TABLE1.START_DATE) AND
>>   TABLE1.PATIENT_ID = TABLE2.PATIENT_ID)
>>
>> The index it will use for this SQL is IDX_TABLE1_PATIENT_ID_ENTRY_ID
>> and that surprises me as ENTRY_ID is not in the above SQL. If I drop
>> that index then it will
>> use IDX_TABLE1_PATIENT_ID and only if that is dropped as well will it
>> use IDX_TABLE1_PATIENT_ID_START_DATE, which I thought would be the
>> best, but turns
>> out to be no better than the first. Whatever way I do this it is slow
>> and I can do it a lot quicker by doing this:
>>
>> CREATE TEMP TABLE DATE_COMPARE_TEMP([E_ID] INTEGER)
>>
>> INSERT INTO DATE_COMPARE_TEMP (E_ID)
>>   SELECT T2.ENTRY_ID FROM
>>   TABLE2 T2 INNER JOIN TABLE1 T2 ON
>>   (T1.PATIENT_ID = T2.PATIENT_ID)
>>   WHERE julianday(T2.START_DATE, '+15 month') >
>>         julianday(T1.START_DATE)
>>
>> CREATE INDEX IDX1_DATE_COMPARE_TEMP_E_ID ON DATE_COMPARE_TEMP(E_ID)
>>
>> analyze DATE_COMPARE_TEMP
>>
>> and then the delete SQL like this:
>>
>> DELETE
>> FROM
>> TABLE2
>> WHERE
>> ENTRY_ID NOT IN
>>   (SELECT E_ID FROM DATE_COMPARE_TEMP)
>>
>> Although this involves a third temp table this method is about twice
>> as fast as the first one.
>> Maybe that is just the way it is and there just is no way to do this
>> as fast without the intermediate temp table, but I just wondered.
>> Again there is no problem here as the second method is simple and
>> fast. Just trying to increase my understanding of SQLite and indexes.
>>
>>
>> RBS
>>
>>
>>
>>
>> On Fri, Sep 18, 2009 at 8:25 PM, Simon Slavin
>>  wrote:
>>>
>>> On 18 Sep 2009, at 4:49pm, bartsmissa...@blueyonder.co.uk wrote:
>>>
> Did something in the documentation make
> you think SQLite wouldn't use a
> multi-column index unless you forced it ?

 No, but I just noticed it didn't use the index I thought would be
 best. As
 it turned out it looks I was wrong in that that index didn't give the
 quickest result.
>>>
>>> Ah.  Okay, that's fine.  It can be quite surprising what's best.  Glad
>>> you got a good result.
>>>
>>> Simon.
>>> ___
>>> sqlite-users mailing list
>>> 

Re: [sqlite] Force the use of a specified index?

2009-09-18 Thread Pavel Ivanov
*I'm leaving aside the rant that your first delete is not identical to
combination of the select and delete in the second approach and select
in second approach contains typos...*

But did you try to combine your insert and delete statements from the
second approach? This approach quicker because of exactly that -
select and delete statements are independent whereas in first approach
your select is executed again and again for each row in TABLE2. So
just make it like this:

DELETE
FROM
TABLE2
WHERE
ENTRY_ID NOT IN
  (SELECT T2.ENTRY_ID FROM
  TABLE2 T2 INNER JOIN TABLE1 T1 ON
  (T1.PATIENT_ID = T2.PATIENT_ID)
  WHERE julianday(T2.START_DATE, '+15 month') >
julianday(T1.START_DATE)
  )

And about indexes: for this query index on (PATIENT_ID, START_DATE)
doesn't do any better than on (PATIENT_ID) because SQLite must to
check all rows with given PATIENT_ID anyway. And that is because
START_DATE is in the query inside function call to julianday(). Index
on 2 fields could help only if your condition was e.g. T2.START_DATE >
T1.START_DATE.
And in most cases there's no benefit creating index on 1 field
(PATIENT_ID) when there's already index on 2 fields (PATIENT_ID,
START_DATE) where PATIENT_ID is the first field.

Pavel

On Fri, Sep 18, 2009 at 4:46 PM, Bart Smissaert
 wrote:
> This is what I am dealing with:
> 2 tables with exactly the same schema (but could be slightly
> different, so can't put in same table):
>
> CREATE TABLE TABLE1(
>   [PATIENT_ID] INTEGER,
>   [ENTRY_ID] INTEGER PRIMARY KEY,
>   [READ_CODE] TEXT,
>   [ADDED_DATE] TEXT,
>   [START_DATE] TEXT)
>
> The last 2 date fields are in the format -mm-dd
>
> Again in both tables the following indexes:
> CREATE INDEX IDX_TABLE1_PATIENT_ID ON TABLE1(PATIENT_ID)
> CREATE INDEX IDX_TABLE1_PATIENT_ID_START_DATE ON TABLE1(PATIENT_ID, 
> START_DATE)
> CREATE INDEX IDX_TABLE1_PATIENT_ID_ENTRY_ID ON TABLE1(PATIENT_ID, ENTRY_ID)
> CREATE INDEX TABLE1_START_DATE ON TABLE1(START_DATE)
>
> Then some delete SQL's will be run so that in the end PATIENT_ID is
> unique in both tables.
>
> Then the SQL I was trying to improve:
>
> DELETE
> FROM
> TABLE2
> WHERE
> PATIENT_ID NOT IN (
>   SELECT
>   TABLE1.PATIENT_ID
>   FROM
>   TABLE1
>   WHERE
>   JULIANDAY(TABLE2.START_DATE, '-14 month') >
>   JULIANDAY(TABLE1.START_DATE) AND
>   TABLE1.PATIENT_ID = TABLE2.PATIENT_ID)
>
> The index it will use for this SQL is IDX_TABLE1_PATIENT_ID_ENTRY_ID
> and that surprises me as ENTRY_ID is not in the above SQL. If I drop
> that index then it will
> use IDX_TABLE1_PATIENT_ID and only if that is dropped as well will it
> use IDX_TABLE1_PATIENT_ID_START_DATE, which I thought would be the
> best, but turns
> out to be no better than the first. Whatever way I do this it is slow
> and I can do it a lot quicker by doing this:
>
> CREATE TEMP TABLE DATE_COMPARE_TEMP([E_ID] INTEGER)
>
> INSERT INTO DATE_COMPARE_TEMP (E_ID)
>   SELECT T2.ENTRY_ID FROM
>   TABLE2 T2 INNER JOIN TABLE1 T2 ON
>   (T1.PATIENT_ID = T2.PATIENT_ID)
>   WHERE julianday(T2.START_DATE, '+15 month') >
>         julianday(T1.START_DATE)
>
> CREATE INDEX IDX1_DATE_COMPARE_TEMP_E_ID ON DATE_COMPARE_TEMP(E_ID)
>
> analyze DATE_COMPARE_TEMP
>
> and then the delete SQL like this:
>
> DELETE
> FROM
> TABLE2
> WHERE
> ENTRY_ID NOT IN
>   (SELECT E_ID FROM DATE_COMPARE_TEMP)
>
> Although this involves a third temp table this method is about twice
> as fast as the first one.
> Maybe that is just the way it is and there just is no way to do this
> as fast without the intermediate temp table, but I just wondered.
> Again there is no problem here as the second method is simple and
> fast. Just trying to increase my understanding of SQLite and indexes.
>
>
> RBS
>
>
>
>
> On Fri, Sep 18, 2009 at 8:25 PM, Simon Slavin
>  wrote:
>>
>> On 18 Sep 2009, at 4:49pm, bartsmissa...@blueyonder.co.uk wrote:
>>
 Did something in the documentation make
 you think SQLite wouldn't use a
 multi-column index unless you forced it ?
>>>
>>> No, but I just noticed it didn't use the index I thought would be
>>> best. As
>>> it turned out it looks I was wrong in that that index didn't give the
>>> quickest result.
>>
>> Ah.  Okay, that's fine.  It can be quite surprising what's best.  Glad
>> you got a good result.
>>
>> Simon.
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Force the use of a specified index?

2009-09-18 Thread Bart Smissaert
This is what I am dealing with:
2 tables with exactly the same schema (but could be slightly
different, so can't put in same table):

CREATE TABLE TABLE1(
   [PATIENT_ID] INTEGER,
   [ENTRY_ID] INTEGER PRIMARY KEY,
   [READ_CODE] TEXT,
   [ADDED_DATE] TEXT,
   [START_DATE] TEXT)

The last 2 date fields are in the format -mm-dd

Again in both tables the following indexes:
CREATE INDEX IDX_TABLE1_PATIENT_ID ON TABLE1(PATIENT_ID)
CREATE INDEX IDX_TABLE1_PATIENT_ID_START_DATE ON TABLE1(PATIENT_ID, START_DATE)
CREATE INDEX IDX_TABLE1_PATIENT_ID_ENTRY_ID ON TABLE1(PATIENT_ID, ENTRY_ID)
CREATE INDEX TABLE1_START_DATE ON TABLE1(START_DATE)

Then some delete SQL's will be run so that in the end PATIENT_ID is
unique in both tables.

Then the SQL I was trying to improve:

DELETE
FROM
TABLE2
WHERE
PATIENT_ID NOT IN (
   SELECT
   TABLE1.PATIENT_ID
   FROM
   TABLE1
   WHERE
   JULIANDAY(TABLE2.START_DATE, '-14 month') >
   JULIANDAY(TABLE1.START_DATE) AND
   TABLE1.PATIENT_ID = TABLE2.PATIENT_ID)

The index it will use for this SQL is IDX_TABLE1_PATIENT_ID_ENTRY_ID
and that surprises me as ENTRY_ID is not in the above SQL. If I drop
that index then it will
use IDX_TABLE1_PATIENT_ID and only if that is dropped as well will it
use IDX_TABLE1_PATIENT_ID_START_DATE, which I thought would be the
best, but turns
out to be no better than the first. Whatever way I do this it is slow
and I can do it a lot quicker by doing this:

CREATE TEMP TABLE DATE_COMPARE_TEMP([E_ID] INTEGER)

INSERT INTO DATE_COMPARE_TEMP (E_ID)
   SELECT T2.ENTRY_ID FROM
   TABLE2 T2 INNER JOIN TABLE1 T2 ON
   (T1.PATIENT_ID = T2.PATIENT_ID)
   WHERE julianday(T2.START_DATE, '+15 month') >
 julianday(T1.START_DATE)

CREATE INDEX IDX1_DATE_COMPARE_TEMP_E_ID ON DATE_COMPARE_TEMP(E_ID)

analyze DATE_COMPARE_TEMP

and then the delete SQL like this:

DELETE
FROM
TABLE2
WHERE
ENTRY_ID NOT IN
   (SELECT E_ID FROM DATE_COMPARE_TEMP)

Although this involves a third temp table this method is about twice
as fast as the first one.
Maybe that is just the way it is and there just is no way to do this
as fast without the intermediate temp table, but I just wondered.
Again there is no problem here as the second method is simple and
fast. Just trying to increase my understanding of SQLite and indexes.


RBS




On Fri, Sep 18, 2009 at 8:25 PM, Simon Slavin
 wrote:
>
> On 18 Sep 2009, at 4:49pm, bartsmissa...@blueyonder.co.uk wrote:
>
>>> Did something in the documentation make
>>> you think SQLite wouldn't use a
>>> multi-column index unless you forced it ?
>>
>> No, but I just noticed it didn't use the index I thought would be
>> best. As
>> it turned out it looks I was wrong in that that index didn't give the
>> quickest result.
>
> Ah.  Okay, that's fine.  It can be quite surprising what's best.  Glad
> you got a good result.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Force the use of a specified index?

2009-09-18 Thread Simon Slavin

On 18 Sep 2009, at 4:49pm, bartsmissa...@blueyonder.co.uk wrote:

>> Did something in the documentation make
>> you think SQLite wouldn't use a
>> multi-column index unless you forced it ?
>
> No, but I just noticed it didn't use the index I thought would be  
> best. As
> it turned out it looks I was wrong in that that index didn't give the
> quickest result.

Ah.  Okay, that's fine.  It can be quite surprising what's best.  Glad  
you got a good result.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Force the use of a specified index?

2009-09-18 Thread bartsmissaert
I was trying to force the use of a multi-column index. Will have a better
look and see what is going on here. For now I get best performance with a
2-stage approach with the use of a intermediate temp table. Will post the
exact details of this later.

RBS


> If neither index individually offers a performance boost, it's possible a
> single multi-column index might be better.
> Sam
>
>
> On Thu, Sep 17, 2009 at 3:43 PM, Bart Smissaert
> wrote:
>
>> Have tried INDEXED BY and it does indeed work and force the use of the
>> specified index.
>> It didn't however make the query faster, so maybe the SQLite plan
>> generator is better than I thought!
>>
>> RBS
>>
>>
>>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Force the use of a specified index?

2009-09-18 Thread Samuel Neff
If neither index individually offers a performance boost, it's possible a
single multi-column index might be better.
Sam


On Thu, Sep 17, 2009 at 3:43 PM, Bart Smissaert wrote:

> Have tried INDEXED BY and it does indeed work and force the use of the
> specified index.
> It didn't however make the query faster, so maybe the SQLite plan
> generator is better than I thought!
>
> RBS
>
>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Force the use of a specified index?

2009-09-17 Thread Bart Smissaert
Have tried INDEXED BY and it does indeed work and force the use of the
specified index.
It didn't however make the query faster, so maybe the SQLite plan
generator is better than I thought!

RBS


On Thu, Sep 17, 2009 at 10:07 AM, Dan Kennedy  wrote:
>
> On Sep 17, 2009, at 5:02 AM, Bart Smissaert wrote:
>
>> Is it possible to tell SQLite to use a specified index?
>> I know you can use the + to excludes fields being used in an index,
>> but this doesn't help me in this particular case. I remember a
>> discussion about this and that this option might be added to SQLite,
>> but couldn't find it anywhere.
>
> There is the "INDEXED BY" clause. But many are of the opinion that
> this feature is prone to misuse.
>
>   http://www.sqlite.org/lang_indexedby.html
>
> Dan.
>
>>
>> RBS
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Force the use of a specified index?

2009-09-17 Thread bartsmissaert
Ah, thanks, that was the one and will give that a try.

RBS


>
> On Sep 17, 2009, at 5:02 AM, Bart Smissaert wrote:
>
>> Is it possible to tell SQLite to use a specified index?
>> I know you can use the + to excludes fields being used in an index,
>> but this doesn't help me in this particular case. I remember a
>> discussion about this and that this option might be added to SQLite,
>> but couldn't find it anywhere.
>
> There is the "INDEXED BY" clause. But many are of the opinion that
> this feature is prone to misuse.
>
>http://www.sqlite.org/lang_indexedby.html
>
> Dan.
>
>>
>> RBS
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Force the use of a specified index?

2009-09-17 Thread Dan Kennedy

On Sep 17, 2009, at 5:02 AM, Bart Smissaert wrote:

> Is it possible to tell SQLite to use a specified index?
> I know you can use the + to excludes fields being used in an index,
> but this doesn't help me in this particular case. I remember a
> discussion about this and that this option might be added to SQLite,
> but couldn't find it anywhere.

There is the "INDEXED BY" clause. But many are of the opinion that
this feature is prone to misuse.

   http://www.sqlite.org/lang_indexedby.html

Dan.

>
> RBS
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users