Re: [sqlite] Partial Index on "~" malfunctions when used with likely/unlikely

2019-05-04 Thread Manuel Rigger
Thanks for the explanation and the quick fix!

Best,
Manuel

On Sat, May 4, 2019 at 7:41 PM Richard Hipp  wrote:

> Here is another case:
>
> CREATE TABLE t1(a,b,c);
> INSERT INTO t1 VALUES(NULL,8,'yes');
> CREATE INDEX t1b ON t1(b) WHERE a IS NOT NULL;
> SELECT c FROM t1 WHERE b=8 AND (a OR 1);
>
> The problem was in the theorem prover that determines when a partial
> index can be used.  The problem goes all the way back to the initial
> introduction of partial indexes in SQLite version 3.8.0 (2013-08-26).
> The theorem prover was (incorrectly) assuming that if the expression
> "a OR 1" is true, then "a IS NOT NULL" must also be true.  And that
> assumption is correct for most binary operators - just not for OR.
> Fixed now.
>
> On 5/4/19, Manuel Rigger  wrote:
> > This similar test case, that I just found now, demonstrates that this
> could
> > be a pattern that is used in practice (TRUE can also be computed):
> >
> > CREATE TABLE t0 (c0);
> > CREATE INDEX index_0 ON t0(c0) WHERE c0 NOTNULL;
> > INSERT INTO t0(c0) VALUES (NULL);
> > SELECT * FROM t0 WHERE (c0 OR TRUE);
> >
> > Also here, the row is not fetched.
> >
> > Best,
> > Manuel
> >
> > On Sat, May 4, 2019 at 3:45 PM Manuel Rigger 
> > wrote:
> >
> >> Hi,
> >>
> >> I discovered a bug, which is demonstrated through the following test
> case:
> >>
> >> CREATE TABLE t0(c0);
> >> CREATE INDEX index_0 ON t0(c0) WHERE (~c0) NOT NULL;
> >> INSERT INTO t0(c0) VALUES (NULL);
> >> SELECT * FROM t0 WHERE (LIKELY(~c0) OR TRUE);
> >>
> >> No row is fetched, although the WHERE clause is always TRUE. I could
> >> reproduce this bug only when creating a partial index, and when using
> >> either LIKELY or UNLIKELY. The datatype of the c0 column seems to
> >> irrelevant. PRAGMA integrity_check; and REINDEX could not detect this
> >> error.
> >>
> >> Best,
> >> Manuel
> >>
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Partial Index on "~" malfunctions when used with likely/unlikely

2019-05-04 Thread Richard Hipp
Here is another case:

CREATE TABLE t1(a,b,c);
INSERT INTO t1 VALUES(NULL,8,'yes');
CREATE INDEX t1b ON t1(b) WHERE a IS NOT NULL;
SELECT c FROM t1 WHERE b=8 AND (a OR 1);

The problem was in the theorem prover that determines when a partial
index can be used.  The problem goes all the way back to the initial
introduction of partial indexes in SQLite version 3.8.0 (2013-08-26).
The theorem prover was (incorrectly) assuming that if the expression
"a OR 1" is true, then "a IS NOT NULL" must also be true.  And that
assumption is correct for most binary operators - just not for OR.
Fixed now.

On 5/4/19, Manuel Rigger  wrote:
> This similar test case, that I just found now, demonstrates that this could
> be a pattern that is used in practice (TRUE can also be computed):
>
> CREATE TABLE t0 (c0);
> CREATE INDEX index_0 ON t0(c0) WHERE c0 NOTNULL;
> INSERT INTO t0(c0) VALUES (NULL);
> SELECT * FROM t0 WHERE (c0 OR TRUE);
>
> Also here, the row is not fetched.
>
> Best,
> Manuel
>
> On Sat, May 4, 2019 at 3:45 PM Manuel Rigger 
> wrote:
>
>> Hi,
>>
>> I discovered a bug, which is demonstrated through the following test case:
>>
>> CREATE TABLE t0(c0);
>> CREATE INDEX index_0 ON t0(c0) WHERE (~c0) NOT NULL;
>> INSERT INTO t0(c0) VALUES (NULL);
>> SELECT * FROM t0 WHERE (LIKELY(~c0) OR TRUE);
>>
>> No row is fetched, although the WHERE clause is always TRUE. I could
>> reproduce this bug only when creating a partial index, and when using
>> either LIKELY or UNLIKELY. The datatype of the c0 column seems to
>> irrelevant. PRAGMA integrity_check; and REINDEX could not detect this
>> error.
>>
>> Best,
>> Manuel
>>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Partial Index on "~" malfunctions when used with likely/unlikely

2019-05-04 Thread Manuel Rigger
This similar test case, that I just found now, demonstrates that this could
be a pattern that is used in practice (TRUE can also be computed):

CREATE TABLE t0 (c0);
CREATE INDEX index_0 ON t0(c0) WHERE c0 NOTNULL;
INSERT INTO t0(c0) VALUES (NULL);
SELECT * FROM t0 WHERE (c0 OR TRUE);

Also here, the row is not fetched.

Best,
Manuel

On Sat, May 4, 2019 at 3:45 PM Manuel Rigger 
wrote:

> Hi,
>
> I discovered a bug, which is demonstrated through the following test case:
>
> CREATE TABLE t0(c0);
> CREATE INDEX index_0 ON t0(c0) WHERE (~c0) NOT NULL;
> INSERT INTO t0(c0) VALUES (NULL);
> SELECT * FROM t0 WHERE (LIKELY(~c0) OR TRUE);
>
> No row is fetched, although the WHERE clause is always TRUE. I could
> reproduce this bug only when creating a partial index, and when using
> either LIKELY or UNLIKELY. The datatype of the c0 column seems to
> irrelevant. PRAGMA integrity_check; and REINDEX could not detect this error.
>
> Best,
> Manuel
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Partial Index and Query Planner

2016-08-05 Thread Richard Hipp
On 8/5/16, Detlef Golze  wrote:
> As documented, the query planner only uses the index if the SELECT contains
> that exact same condition.
>
> According to .eqp with 3.12.1 the following statement indeed uses the index:
>
> SELECT Value1 FROM MyTable WHERE Value1<>0 AND Value1=7;
> --EQP-- 0,0,0,SEARCH TABLE MyTable USING COVERING INDEX MyIndex1 (Value1=?)
>
> Can I safely assume that this 'workaround' works with all future versions?

Yes.

In a future release of SQLite, the theorem prover inside the query
planner that determines when it is acceptable to use a partial index
might grow smarter, and realize that value1=7 implies value1<>0.  At
that point the extra value1<>0 term in the WHERE clause will become
superfluous.  But it will not cause problems.  The partial index will
continue to be used.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Partial index to find maximum

2014-12-29 Thread Clemens Ladisch
Hick Gunter wrote:
>create the primary key index ordered properly
>
>CREATE TABLE t (..., PRIMARY KEY ( a ASC, b DESC)...);

DESC is not necessary here; SQLite has no problem reading the index
in reverse order, if needed.

(DESC in an index is useful only when you want to optimize multiple
ORDER BYs in a single query.)

>SELECT b FROM t WHERE a = ? LIMIT 1;

This is wrong with or without a DESC index: there is no guarantee that
the returned row is the first in the index unless you're using ORDER BY
or MAX().


Regards,
Clemens

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Partial index to find maximum

2014-12-29 Thread Hick Gunter
create the primary key index ordered properly

CREATE TABLE t (..., PRIMARY KEY ( a ASC, b DESC)...);
SELECT b FROM t WHERE a = ? LIMIT 1;

If you insist on using a partial index for this (for example if each a has a 
lot of b entries) you could add a field b_is_max and keep it current using 
triggers.

CREATE INDEX t_max ON t (a,b) WHERE b_is_max;
SELECT b FROM t WHERE a=? AND b_is_max;

CREATE TRIGGER t_ins AFTER INSERT ON t WHEN new.b > (SELECT b FROM t WHERE a = 
new.a AND b_is_max) BEGIN
 UPDATE t SET b_is_max = 0 WHERE a = new.a AND b_is_max;
 UPDATE t SET b_is_max = 1 WHERE a = new.a AND b = new.b;
END;

CREATE TRIGGER t_del BEFORE DELETE ON t WHEN old. b_is_max) BEGIN
 UPDATE t SET b_is_max = 0 WHERE a = old.a AND b = old.b);
 UPDATE t SET b_is_max = 1 WHERE a = old.a AND b = (SELECT max(b) FROM t 
WHERE a = old.a and b < old.b);
END;

CREATE TRIGGER t_upd_new AFTER UPDATE ON t WHEN new.b > (SELECT b FROM t WHERE 
a = new.a AND b_is_max) BEGIN
 UPDATE t SET b_is_max = 0 WHERE a = new.a AND b_is_max;
 UPDATE t SET b_is_max = 1 WHERE a = new.a AND b = new.b;
END;

CREATE TRIGGER t_upd_old BEFORE UPDATE ON t WHEN old.b_is_max) BEGIN
 UPDATE t SET b_is_max = 0 WHERE a = old.a AND b = old.b);
 UPDATE t SET b_is_max = 1 WHERE a = old.a AND b =(SELECT max(b) FROM t 
WHERE a = old.a and b < old.b);
END;

Triggers shown are untested. 

-Ursprüngliche Nachricht-
Von: Baruch Burstein [mailto:bmburst...@gmail.com] 
Gesendet: Montag, 29. Dezember 2014 09:34
An: General Discussion of SQLite Database
Betreff: [sqlite] Partial index to find maximum

Hi,

I have a table with a 2 column PK, say 'a' and 'b'. I need to find, for a given 
value of 'a', the highest matching 'b'. The query itself it simple:

SELECT max(b) FROM t WHERE a=:whatever

To speed this up, I would add an index on 'a'. Now, the question is is there 
some way to tell the index that I am only interested in the maximum value of b? 
For example, for the following table:

a|b
1|1
1|2
2|2
2|3

I only need the index to contain the rows (1,2) and (2,3). The docs for partial 
indexes say that they can't contain functions (like max()). Any suggestions?

--
˙uʍop-ǝpısdn sı ɹoʇıuoɯ ɹnoʎ 'sıɥʇ pɐǝɹ uɐɔ noʎ ɟı 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Partial index to find maximum

2014-12-29 Thread Keith Medcalf

You should include both a and b in the index to be most helpful.

CREATE INDEX whatever ON t (a, b);

However, you say that (a, b) is already the primary key and therefore this 
index already exists and you do not need to create another one.

Although the index will contain all rows, finding the max(b) given an "a" value 
should take just 1 index lookup.  If the optimizer does not do this for you (I 
believe it does) then you can just do:

select b from t where a=:whatever order by b desc limit 1;

to find the answer you seek with nothing more than a single lookup in the 
index.  So lets test what SQLite is going to do:

On SQLite 3.8.8 (head of trunk) we see that:

SQLite version 3.8.8 2014-12-25 12:19:56
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table t(a INTEGER, b INTEGER, primary key (a, b));
sqlite> insert into t values (1,1),(1,5),(2,3),(1,8),(2,7);
sqlite> .explain
sqlite> explain select max(b) from t where a=1;
addr  opcode p1p2p3p4 p5  comment
  -        -  --  -
0 Init   0 21000  Start at 21
1 Null   0 1 200  r[1..2]=NULL
2 OpenRead   1 3 0 k(3,nil,nil,nil)  00  root=3 iDb=0; 
sqlite_autoindex_t_1
3 Explain0 0 0 SEARCH TABLE t USING COVERING INDEX 
sqlite_autoindex_t_1 (a=?)  00
4 Noop   0 0 000  Begin WHERE-loop0: t
5 Integer1 3 000  r[3]=1
6 SeekLE 1 153 1  00  key=r[3]
7   IdxLT  1 153 1  00  key=r[3]
8   Noop   0 0 000  Begin WHERE-core
9   Column 1 1 400  r[4]=t.b
10  CollSeq0 0 0 (BINARY)   00
11  AggStep0 4 1 max(1) 01  accum=r[1] 
step(r[4])
12  Goto   0 16000  max() by index
13  Noop   0 0 000  End WHERE-core
14Prev   1 7 000
15Noop   0 0 000  End WHERE-loop0: t
16Close  1 0 000
17AggFinal   1 1 0 max(1) 00  accum=r[1] N=1
18Copy   1 5 000  r[5]=r[1]
19ResultRow  5 1 000  output=r[5]
20Halt   0 0 000
21Transaction0 0 1 0  01
22TableLock  0 2 0 t  00  iDb=0 root=2 write=0
23Goto   0 1 000
sqlite> explain select b from t where a=1 order by b desc limit 1;
addr  opcode p1p2p3p4 p5  comment
  -        -  --  -
0 Init   0 18000  Start at 18
1 Noop   0 0 000
2 Integer1 1 000  r[1]=1; LIMIT counter
3 OpenRead   2 3 0 k(3,nil,nil,nil)  00  root=3 iDb=0; 
sqlite_autoindex_t_1
4 Explain0 0 0 SEARCH TABLE t USING COVERING INDEX 
sqlite_autoindex_t_1 (a=?)  00
5 Noop   0 0 000  Begin WHERE-loop0: t
6 Integer1 2 000  r[2]=1
7 SeekLE 2 152 1  00  key=r[2]
8   IdxLT  2 152 1  00  key=r[2]
9   Noop   0 0 000  Begin WHERE-core
10  Column 2 1 300  r[3]=t.b
11  ResultRow  3 1 000  output=r[3]
12  IfZero 1 16-1   00  r[1]+=-1, if 
r[1]==0 goto 16
13  Noop   0 0 000  End WHERE-core
14Prev   2 8 000
15Noop   0 0 000  End WHERE-loop0: t
16Close  2 0 000
17Halt   0 0 000
18Transaction0 0 1 0  01
19TableLock  0 2 0 t  00  iDb=0 root=2 write=0
20Goto   0 1 000
sqlite>

So both forms of the query will run mostly the same plan and retrieve the 
result you desire in a single index lookup.

---
Theory is when you know everything but nothing works.  Practice is when 
everything works but no one knows why.  Sometimes theory and practice are 
combined:  nothing works and no one knows why.

>-Original Message-
>From: sqlite-users-boun...@

Re: [sqlite] Partial index to find maximum

2014-12-29 Thread Mohit Sindhwani

On 29/12/2014 4:33 PM, Baruch Burstein wrote:

Hi,

I have a table with a 2 column PK, say 'a' and 'b'. I need to find, for a
given value of 'a', the highest matching 'b'. The query itself it simple:

 SELECT max(b) FROM t WHERE a=:whatever

To speed this up, I would add an index on 'a'. Now, the question is is
there some way to tell the index that I am only interested in the maximum
value of b? For example, for the following table:

a|b
1|1
1|2
2|2
2|3

I only need the index to contain the rows (1,2) and (2,3). The docs for
partial indexes say that they can't contain functions (like max()). Any
suggestions?



I don't know if you can do a partial index for this, but if this is a 
use case that is slow for you and is very critical, I would use a 
trigger to update a separate table that stores just the value of "a" and 
its corresponding maximum.  It's quite easy to manage this for the 
create and insert cases.  You may have to run that whole query in the 
trigger again if there a deletion done.  So, in a way this table is the 
cache you want.


Not sure if that helps your case.

Best Regards,
Mohit.





___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-22 Thread Ralf Junker
On 20.08.2010 00:18, Eric Smith wrote:

> Our app is caching what are basically csv files.  Hundreds of files, 
> about 2m records per file.  Sometimes we want to delete all the cache
> rows for one of the files.  We know ahead of time which file it will
> be -- let's say it's file 7.
> 
> The schema is roughly
> 
> create table records(__recno INTEGER PRIMARY KEY, fileId, data);
> 
> So sometimes we would *like* to say "delete from records where 
> fileId=7".
> 
> But that is bad because does a full table scan.
> 
> So the next cut is to say "create index recordsIdxFileId on 
> records(fileId)".
> 
> But that is bad because it is a huge burden during INSERT and is not
>  used often enough (or with enough values) to justify its existence.
> 
> What I really want is to be able to say "create index 
> recordsIdxFileIdOnFile3 on records(fileId) where fileId=7". But 
> sqlite doesn't do that.

I believe you can do this with an index helper table managed by triggers
on your main table. Use the insert trigger's when clause to control
which records to "index".

The delete statement is obviously a little more complex than an ordinary
delete, but "explain query plan" shows that it uses indexes instead of a
full table scan so it should run quite fast.

Example SQL follows below.

Ralf



drop table if exists records;
create table records (id INTEGER PRIMARY KEY, fileid, value);

drop table if exists records_idx;
create table records_idx (id INTEGER PRIMARY KEY, fileid);
create index records_idx_idx on records_idx (fileid);

create trigger records_insert_after after insert on records
  when new.fileid = 7
begin
  insert into records_idx values (new.rowid, new.fileid);
end;

create trigger records_delete after delete on records
  begin
delete from records_idx where id = old.rowid;
  end;

insert into records (fileid, value) values (1, 'one');
insert into records (fileid, value) values (7, 'seven 1');
insert into records (fileid, value) values (7, 'seven 2');
insert into records (fileid, value) values (8, 'eight');

select * from records_idx;

delete from records
  where rowid in (select rowid from records_idx where fileid = 7);

select * from records;
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-21 Thread Tim Romano
OK. Thanks for the clarification, Igor and Filip. I was misunderstanding the
partial index to work, in effect, like a standard index on a virtual column
based on a function that returns null for the "irrelevant" values, with the
index defined to ignore nulls.

I see now from the "inclusion" test described in the paper that the partial
index will not be used if the query itself does not contain the same set of
conditions that were used to define the index.  That makes the partial index
safe, not the trouble I was envisioning.

Regards
Tim Romano



On Fri, Aug 20, 2010 at 8:01 AM, Igor Tandetnik  wrote:

> Tim Romano  wrote:
> > Igor,
> > Here's the example where a partial index can "hide" rows.
> >
> > From the wikipedia article cited by the OP:
> >
> > 
> > It is not necessary that the condition be the same as the index
> criterion;
> > Stonebraker's paper below presents a number of examples with indexes
> similar
> > to the following:
> >
> >  create index partial_salary on employee(age) where salary > 2100;
> >
> > 
> >
> > What would happen if you issued these queries?
> >
> >  select max(age) from employee
> >  select avg(age) from employee
> >
> > Would the ages of employees earning <= 2100 be included?
>
> Of course. The presence or absence of an index never affect the meaning of
> a query - just its performance.
>
> > Is the
> > partial-index used under those circumstances?
>
> No, I don't see how it could be beneficial for these queries.
> --
> Igor Tandetnik
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-21 Thread Max Vlasov
> Maybe it'll be clearer if I describe my (quite simple) use case.  Our
> app is caching what are basically csv files.  Hundreds of files, about 2m
> records per file.  Sometimes we want to delete all the cache rows for one
> of the files.  We know ahead of time which file it will be -- let's say
> it's file 7.
>
> The schema is roughly
>
> create table records(__recno INTEGER PRIMARY KEY, fileId, data);
>
> So sometimes we would *like* to say "delete from records where
> fileId=7".
>
> But that is bad because does a full table scan.
>
>

If your csv inserts are made as a whole, what about keeping track of rowid
ranges (rowidfrom..rowitto) for every csv insert in a separate table? You
can be sure it will be a new range not intersecting with any previous if
rowid is AUTOINCREMENT (or you can provide your own id starting current
max(id) to be absolutely sure). In this case you will get fast deletes and
you no longer need an extra indexed field.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Simon Slavin

On 19 Aug 2010, at 11:18pm, Eric Smith wrote:

> The schema is roughly
> 
> create table records(__recno INTEGER PRIMARY KEY, fileId, data);
> 
> So sometimes we would *like* to say "delete from records where
> fileId=7".
> 
> But that is bad because does a full table scan.
> 
> So the next cut is to say "create index recordsIdxFileId on
> records(fileId)".
> 
> But that is bad because it is a huge burden during INSERT and is not
> used often enough (or with enough values) to justify its existence.
> 
> What I really want is to be able to say "create index
> recordsIdxFileIdOnFile3 on records(fileId) where fileId=7".  
> But sqlite doesn't do that.

But in order to create that partial index, SQLite would have to scan the entire 
table anyway.  Otherwise how would it know which records had fileId=7 ?

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Eric Smith
Simon Slavin wrote: 

> http://www.sqlite.org/lang_createview.html 
> 
> This is the SQL standard way to reduce your view of a table to just 
> certain rows.  If I understand your request, this feature should provide 
> exactly what you want.  Appropriate indexes will be used when consulting 
> any VIEW you've defined.  

I don't think that helps either.  A view in sqlite is just syntactic
sugar for a select statement.  I don't want to define any real indices 
-- they are a performance burden.

> > Something like DELETE FROM records WHERE __recno IN (SELECT __recno 
> > FROM idxTable), where __recno is the INTEGER PRIMARY KEY on records.  
> 
> I don't understand what you're looking up here.  If you have some 
> method of recognising which rows of a table should be deleted just use the 
> appropriate 
> 
> DELETE FROM ...  WHERE ...  
> 
> command.  No need for any sub-SELECT clause.  

Maybe it'll be clearer if I describe my (quite simple) use case.  Our
app is caching what are basically csv files.  Hundreds of files, about 2m 
records per file.  Sometimes we want to delete all the cache rows for one 
of the files.  We know ahead of time which file it will be -- let's say
it's file 7.

The schema is roughly

create table records(__recno INTEGER PRIMARY KEY, fileId, data);

So sometimes we would *like* to say "delete from records where
fileId=7".

But that is bad because does a full table scan.

So the next cut is to say "create index recordsIdxFileId on
records(fileId)".

But that is bad because it is a huge burden during INSERT and is not
used often enough (or with enough values) to justify its existence.

What I really want is to be able to say "create index
recordsIdxFileIdOnFile3 on records(fileId) where fileId=7".  
But sqlite doesn't do that.

So let's assume SQLite is much faster at deleting rows by the INTEGER
PRIMARY KEY than it is by deleting rows by some other value.  Then we
can optimize by keeping track of which __recnos we will want to delete.

Hence my idea for having a separate table that maps fileId --> __recno,
but only for the fileId we care about.

Eric

-- 
Eric A. Smith

Keeping Young #6:
Don't look back. Something might be gaining on you.
-- Satchel Paige
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Eric Smith
Kees Nuyt wrote:

> You could add a "deleted" column with value range (0,1) and
> create an index on it if benchmarks show that makes it
> faster. As a bonus it is easier to code and maintain than a
> separate table with references and triggers.
> 
> Alternatively, you can create an composite index with the
> "deleted" column as one of the components.
> 
> From a theoretical view, if you care about the visibility of
> a row, you should express it as an attribute of the entity.
> The solutions above comply with that notion.

I think you misunderstand what I want.

I don't care about keeping the row around after it is deleted.  I don't
care about visibility.

I only want to find rows quickly in order to delete them.

--
Eric A. Smith

Bowlikinetics (boh lih kih neh' tiks), n.:
The act of trying to control a released bowling ball 
by twisting one's body in the direction one wants it to go.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Eric Smith

Sorry, let me amend that:

> The schema is roughly
> 
> create table records(__recno INTEGER PRIMARY KEY, fileId, data);

Forget the INTEGER PRIMARY KEY.  My partial index would reference 
the _rowid_.  I don't permit vacuums on the database so, if I'm not
mistaken, this shouldn't be an issue.

--
Eric A. Smith

Parkinson's Fourth Law:
The number of people in any working group tends to increase
regardless of the amount of work to be done.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Eric Smith
Stephen Oberholtzer wrote:

> I believe what he's getting at is this:
{snip explanation}

You exactly understand what I'm going for and my use case.

Is there a better way to implement it in sql itself than what I
outlined?  I.e. create my own index table that points to the proper
rows and keep it updated via triggers?

Eric

--
Eric A. Smith

Gnagloot, n.:
A person who leaves all his ski passes on his jacket just to
impress people.
-- Rich Hall, "Sniglets"
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Eric Smith
Igor Tandetnik wrote: 

> > How would you find a row whose column X contained value Y if the 
> > "partial" index on column X specified that rows containing value Y 
> > in column X should never be returned?  
> 
> No one suggests partial index should be capable of hiding anything.  The 
> idea is that, when the query can be proven to only involve rows covered 
> by the partial index, the index can be used to speed up the query.  
> Otherwise, it simply won't be used.  

Right.

-- 
Eric A. Smith

Well then, let's go on.  Sorry, there's nothing to go on to.
Let's digress.
-- Tim Pulju, Winter 2005
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Jay A. Kreibich
On Fri, Aug 20, 2010 at 06:52:40AM -0700, Cory Nelson scratched on the wall:
> On Thu, Aug 19, 2010 at 1:30 PM, Eric Smith  wrote:
> > Afaict sqlite doesn't support indices on subsets of rows in a table, ??
> > la http://en.wikipedia.org/wiki/Partial_index -- right?
> >
> > Any plans to implement that?
> 
> +1 for this feature request.  They've got a very specific and fairly
> rare use case, but when opportunity strikes partial indexes are much
> more convenient, straightforward, and efficient than the alternative.

  This strikes me as more trouble than it is worth.

  You admit it has a very specific and rare use case, yet implementing
  it is going to add complexity and size to the query processor and
  optimizer.  This is not a "low cost" feature.  Any time a partial
  index comes into play, the optimizer needs to determine if the index
  can be used or not by verifying that the query domain is a subset of
  the index domain, based off the conditions on the query and on the
  partial index.  This may or may not involve some significant logic (or
  it will tend to be rather dumb).  It requires noticeable amounts of
  development and testing time-- especially testing, as it introduces
  the possibility of returning a looks-correct-but-is-wrong answer.  And,
  because you're talking about modifications to a very central part
  of the database engine, adding complexity adds upkeep costs from here
  on out.

  It is also likely you would need to bump the file format, since you
  now need the ability to attach conditions to an index and flag it
  as partial and unusable for the general case.

  I have no doubt that it is useful for those cases where it applies,
  but that seems like a high costs for such a narrow gain.

> > Are there any known hacks to implement something similar?
> 
> There's not really any good solution.  You can create a separate table
> with the subset in it, that's pretty much it.  

  Right.  Create a shadow table that has the fields you want to
  partially index.  Map the ROWID values one-to-one.  Index the shadow
  table (or not, depending on your needs).  Put triggers on the main
  table to keep the shadow table up to date on INSERT/UPDATE/DELETEs.
  If you index the shadow table, you now have a partial index.

  You have to make the call with each query about using the partial
  index or not, and you have to manually add an extra JOIN into your
  query, but for most of the situations described in the tread so far,
  that doesn't seem that difficult.  Performance-wise, queries should
  be the same as a native partial index, since the ROWID value can be
  pulled directly from the index and mapped back to the main table,
  exactly like a native index.

> But that has some overhead too, and can complicate your queries.

  The complexity is there no matter what.  The difference is if the
  person writing the query sees it or not.  
  
  My personal opinion is that this is a unique enough feature, and
  workarounds exist (even if they aren't exactly pretty), that it
  does not justify the long-term testing and upkeep costs.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Jim Wilcoxson
On 8/20/10, Cory Nelson  wrote:
> On Fri, Aug 20, 2010 at 7:39 AM, Jim Wilcoxson  wrote:
>> ...
>> The best I could come up with is a separate table.  The problem is,
>> indexing the SHA1 normally means there is a copy in the row and a copy
>> in the index.  Using a separate table, which still has to be indexed,
>> means there is a copy in the row of the main table, a copy in the
>> separate table, and a copy in the separate table's index.
>>
>
> You might want to index the "sha" column on the primary table, then
> the secondary table can just be indexedblocks(blockid INTEGER PRIMARY
> KEY).  No wasted space that way, but will require more I/Os to JOIN
> the tables.

In my case, the point of a partial index would be to save space.  If
the sha column were indexed completely in the primary table, no other
tables or indexes are necessary, but then no space is saved.  If I
only want to index 80% of the rows, I'd save 20*.2n bytes with a
partial index, where n is the number of rows.  (SHA1 is 20 bytes)

Using a separate table, I'd need 40 bytes (roughly) for each sha
indexed: 20 bytes for the main table, 20 for the index; or 40m for m
entries.  This saves space if half or fewer of all blocks are indexed.
 If more than than half are indexed, indexing all rows in the main
table uses less space.

Jim
--
HashBackup: easy onsite and offsite Unix backup
http://sites.google.com/site/hashbackup
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Cory Nelson
On Fri, Aug 20, 2010 at 7:39 AM, Jim Wilcoxson  wrote:
> ...
> The best I could come up with is a separate table.  The problem is,
> indexing the SHA1 normally means there is a copy in the row and a copy
> in the index.  Using a separate table, which still has to be indexed,
> means there is a copy in the row of the main table, a copy in the
> separate table, and a copy in the separate table's index.
>

You might want to index the "sha" column on the primary table, then
the secondary table can just be indexedblocks(blockid INTEGER PRIMARY
KEY).  No wasted space that way, but will require more I/Os to JOIN
the tables.

>
> I guess one way to do this using the SQL partial index feature being
> discussed would be to have an extra column in the row called
> "isindexed", setting that to 1 or 0 depending on whether the SHA
> should be included in the index, and using create index ... where
> isindexed=1.  Then on the queries, say "select blockid from blocks
> where isindexed=1 and sha=X".  And any query that didn't have
> isindexed=1 wouldn't be able to use the index at all.
>
> Is that how it would work?

Yes, that's exactly how it works.

Partial indexes are best used when:

-   You only need to search a specific subset of your rows.
"isindexed" is a good candidate for this.

-   You have no other indexes that can already narrow down searches efficiently.

-   You have existing queries that use rows from both inside and
outside of the partial index.  If you don't, then there's no reason to
group them together and you might as well put the subset in a separate
table if it's not too inconvenient.

-   The column is used outside of the index, or the storage used by it
is acceptable.  Ie. if 99% of your rows have "isindexed=1", you've got
a lot of redundant columns taking up space and might investigate
JOINing with another smaller table instead.

-- 
Cory Nelson
http://int64.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Jim Wilcoxson
On 8/20/10, Cory Nelson  wrote:
> +1 for this feature request.  They've got a very specific and fairly
> rare use case, but when opportunity strikes partial indexes are much
> more convenient, straightforward, and efficient than the alternative.
>
> -   If a table has 100,000,000 rows and each index page for it holds
> 100 rows, a full index will need 1,000,000 pages of storage and 4 page
> reads to search.
> -   With a partial index covering only 10,000 rows of the same table,
> it will only need 100 pages of storage and 2 page reads to search.
>
> Big improvement while keeping query syntax and results exactly the same!
>
>> Are there any known hacks to implement something similar?
>
> There's not really any good solution.  You can create a separate table
> with the subset in it, that's pretty much it.  But that has some
> overhead too, and can complicate your queries.

A while back I needed something like a partial index.  The backup
program I'm working on does block dedup, so I made an index on each
block's SHA1.  But then I wanted to only dedup some blocks, not all.
I needed to say "insert this block row and also index the sha in the
sha index", or "insert this block row but don't index the sha in the
sha index".  Then I wanted to be able to say "is X in the sha index?",
ie, only look in the index.  SQL doesn't really have a way to do that.
 It's a simple concept, but it doesn't seem to fit within the SQL
framework.

The best I could come up with is a separate table.  The problem is,
indexing the SHA1 normally means there is a copy in the row and a copy
in the index.  Using a separate table, which still has to be indexed,
means there is a copy in the row of the main table, a copy in the
separate table, and a copy in the separate table's index.

I guess one way to do this using the SQL partial index feature being
discussed would be to have an extra column in the row called
"isindexed", setting that to 1 or 0 depending on whether the SHA
should be included in the index, and using create index ... where
isindexed=1.  Then on the queries, say "select blockid from blocks
where isindexed=1 and sha=X".  And any query that didn't have
isindexed=1 wouldn't be able to use the index at all.

Is that how it would work?

Jim
--
HashBackup: easy onsite and offsite Unix backup
http://sites.google.com/site/hashbackup
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Cory Nelson
On Thu, Aug 19, 2010 at 1:30 PM, Eric Smith  wrote:
> Afaict sqlite doesn't support indices on subsets of rows in a table, Ю
> la http://en.wikipedia.org/wiki/Partial_index -- right?
>
> Any plans to implement that?

+1 for this feature request.  They've got a very specific and fairly
rare use case, but when opportunity strikes partial indexes are much
more convenient, straightforward, and efficient than the alternative.

-   If a table has 100,000,000 rows and each index page for it holds
100 rows, a full index will need 1,000,000 pages of storage and 4 page
reads to search.
-   With a partial index covering only 10,000 rows of the same table,
it will only need 100 pages of storage and 2 page reads to search.

Big improvement while keeping query syntax and results exactly the same!

> Are there any known hacks to implement something similar?

There's not really any good solution.  You can create a separate table
with the subset in it, that's pretty much it.  But that has some
overhead too, and can complicate your queries.

-- 
Cory Nelson
http://int64.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Igor Tandetnik
Tim Romano  wrote:
> Igor,
> Here's the example where a partial index can "hide" rows.
> 
> From the wikipedia article cited by the OP:
> 
> 
> It is not necessary that the condition be the same as the index criterion;
> Stonebraker's paper below presents a number of examples with indexes similar
> to the following:
> 
>  create index partial_salary on employee(age) where salary > 2100;
> 
> 
> 
> What would happen if you issued these queries?
> 
>  select max(age) from employee
>  select avg(age) from employee
> 
> Would the ages of employees earning <= 2100 be included?

Of course. The presence or absence of an index never affect the meaning of a 
query - just its performance.

> Is the
> partial-index used under those circumstances?

No, I don't see how it could be beneficial for these queries.
-- 
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Filip Navara
On Fri, Aug 20, 2010 at 1:47 PM, Tim Romano  wrote:
> Igor,
> Here's the example where a partial index can "hide" rows.
>
> From the wikipedia article cited by the OP:
>
> 
> It is not necessary that the condition be the same as the index criterion;
> Stonebraker's paper below presents a number of examples with indexes similar
> to the following:
>
>  create index partial_salary on employee(age) where salary > 2100;
>
> 
>
> What would happen if you issued these queries?
>
>                  select max(age) from employee
>                  select avg(age) from employee
>
> Would the ages of employees earning <= 2100 be included?

Yes

> Is the partial-index used under those circumstances?

No, it would change outcome of the query. The partial index is used
only for optimizing queries that satisfy the index condition.

F.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-20 Thread Tim Romano
Igor,
Here's the example where a partial index can "hide" rows.

>From the wikipedia article cited by the OP:


It is not necessary that the condition be the same as the index criterion;
Stonebraker's paper below presents a number of examples with indexes similar
to the following:

  create index partial_salary on employee(age) where salary > 2100;



What would happen if you issued these queries?

  select max(age) from employee
  select avg(age) from employee

Would the ages of employees earning <= 2100 be included?  Is the
partial-index used under those circumstances?

--
Tim Romano






On Thu, Aug 19, 2010 at 9:16 PM, Igor Tandetnik  wrote:

> Tim Romano  wrote:
> > How would you find a row whose column X contained value Y if the
> "partial"
> > index on column X specified that rows containing value Y in column X
> should
> > never be returned?
>
> No one suggests partial index should be capable of hiding anything. The
> idea is that, when the query can be proven to only involve rows covered by
> the partial index, the index can be used to speed up the query. Otherwise,
> it simply won't be used.
> --
> Igor Tandetnik
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-19 Thread Artur Reilin

> Tim Romano  wrote:
>> How would you find a row whose column X contained value Y if the
>> "partial"
>> index on column X specified that rows containing value Y in column X
>> should
>> never be returned?
>
> No one suggests partial index should be capable of hiding anything. The
> idea is that, when the query can be proven to only involve rows covered by
> the partial index, the index can be used to speed up the query. Otherwise,
> it simply won't be used.
> --
> Igor Tandetnik
>
>
The thing, that I don't get is: Why should the rows be hidden? If the
index ist just used to cover some special values, it didn't hide others.

If you create an index on some columns, it didn't hide the other columns, or?


Artur Reilin

sqlite.yuedream.de
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-19 Thread Igor Tandetnik
Tim Romano  wrote:
> How would you find a row whose column X contained value Y if the "partial"
> index on column X specified that rows containing value Y in column X should
> never be returned?

No one suggests partial index should be capable of hiding anything. The idea is 
that, when the query can be proven to only involve rows covered by the partial 
index, the index can be used to speed up the query. Otherwise, it simply won't 
be used.
-- 
Igor Tandetnik


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-19 Thread Tim Romano
Typo:
"... more performant than partial query" should read "more performant than a
partial index".
Tim Romano


>
>>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-19 Thread Tim Romano
Eric,
How would you find a row whose column X contained value Y if the "partial"
index on column X specified that rows containing value Y in column X should
never be returned?  If the index hides the row, how do you cause the row to
become visible to a query? You have to drop the index.

However, I would be willing to accept an index on a *virtual* column whose
set of discrete possible values was a subset of the values in the actual
underlying table, or some translated form of those values, for example a
column that was the result of a function that converted a date to 'Q1',
'Q2', 'Q3', or 'Q4'.

Compare: http://www.oracle-base.com/articles/11g/VirtualColumns_11gR1.php

If your goal is performance, moving rows out of the table when they cease to
meet your business rule's definition of relevance will be more performant
than partial query: not only will the index  contain just as few nodes, but
the table itself will contain fewer rows than the table when using a partial
index.

And programming would not be more difficult: you'd simply substitute a
trigger for the partial index declaration. Moreover, this technique would be
highly portable. Partial indexes, not.

Regards
Tim Romano
Swarthmore PA



>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-19 Thread Stephen Oberholtzer
On Thu, Aug 19, 2010 at 5:53 PM, Kees Nuyt  wrote:
> On Thu, 19 Aug 2010 17:39:14 -0400, Eric Smith
>  wrote:
>
>>Am I missing something?
>
> You could add a "deleted" column with value range (0,1) and
> create an index on it if benchmarks show that makes it
> faster. As a bonus it is easier to code and maintain than a
> separate table with references and triggers.
>
> Alternatively, you can create an composite index with the
> "deleted" column as one of the components.
>
> From a theoretical view, if you care about the visibility of
> a row, you should express it as an attribute of the entity.
> The solutions above comply with that notion.
> --
>  (  Kees Nuyt

I think you've missed the point. I believe what he's getting at is this:

>> CREATE INDEX foo ON bar (to_be_deleted) <<

Imagine if he had 100 million rows in his table, and 100 of them were
marked "to_be_deleted".
His index will have 100 million rows, probably 500MB or 900MB (not
sure if rowid is 32- or 64-bit), consisting of 99,999,900 "0"s and 100
"1"s.

If he could create what MSSQL calls a "filtered index", using a syntax
like this:

>> CREATE INDEX foo_filtered ON bar (to_be_deleted) WHERE to_be_deleted = 1 <<

he could speed up the statement

>> DELETE FROM bar WHERE to_be_deleted = 1 <<

using that index, just like he could with the unfiltered "foo" index.
The only difference is that where foo has 100 million rows,
foo_filtered only contains 100 rows, taking up only 500-900 bytes
(thus actually having like 300% overhead due to page sizes!)


Now, in order to implement this, the following changes would have to be made:

1. Conditional logic would have to be generated inside the VDBE
programs for INSERT statements.  This is pretty straightforward.

2. Conditional logic would have to be generated inside the VDBE
programs for UPDATE statements. Care must be taken to make sure that
the index is updated properly when the column(s) referenced in the
WHERE clause are updated, but other than that, it's probably pretty
straightforward.

3. Depending on how the IdxDelete operator handles "key not found in
index" errors, the VDBE code generated for DELETE statements may also
need to be updated.

4. The statement parser needs to be modified to parse this syntax.

5. The schema parser needs to be modified to decode this syntax.

6. The optimizer needs to flatten and check that every possible branch
of the WHERE clause on a SELECT/DML statement is compatible with the
WHERE clause of the index, before it can use that index.

Now, I personally could do #1-3, because they're pretty easy.
I could probably even manage #4 and #5 if I spent a week familiarizing
myself with the code.
But #6, as far as I can tell, is a LOT harder.  Consider the following examples:

create index ix1 on Bar (baz) where quux between 30 and 95;

select * from baz where quux = 35; -- Index is viable
select * from baz where quux between 31 and 94; -- Index is viable
select * from baz where quux = 38 or quux between 80 and 90; -- Index is viable
select * from baz where quux in (40,50,60,70); -- again, index is viable
select * from baz where quux between 25 and 35; -- index is NOT viable
select * from baz where quux = 38 or baz = 5; -- index is NOT viable



-- 
-- Stevie-O
Real programmers use COPY CON PROGRAM.EXE
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-19 Thread Simon Slavin

On 19 Aug 2010, at 10:39pm, Eric Smith wrote:

> I want an index that only can be used to find rows with a particular 
> value or set of values.

Take a look at VIEWs:

http://www.sqlite.org/lang_createview.html

This is the SQL standard way to reduce your view of a table to just certain 
rows.  If I understand your request, this feature should provide exactly what 
you want.  Appropriate indexes will be used when consulting any VIEW you've 
defined.

> Since SQLite doesn't support partial indices directly, I'm 
> thinking about making my own index as a separate table and 
> populating/depopulating it using triggers on the main table.  I only 
> need it for fast lookups during deletion of the relevant rows, so I'll 
> hijack the app logic that wants to delete those rows and instead use 
> the secondary table to get the row ids, and delete those directly.  
> 
> Something like DELETE FROM records WHERE __recno IN (SELECT __recno 
> FROM idxTable), where __recno is the INTEGER PRIMARY KEY on records.

I don't understand what you're looking up here.  If you have some method of 
recognising which rows of a table should be deleted just use the appropriate

DELETE FROM ... WHERE ...

command.  No need for any sub-SELECT clause.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-19 Thread Kees Nuyt
On Thu, 19 Aug 2010 17:39:14 -0400, Eric Smith
 wrote:

>Am I missing something?

You could add a "deleted" column with value range (0,1) and
create an index on it if benchmarks show that makes it
faster. As a bonus it is easier to code and maintain than a
separate table with references and triggers.

Alternatively, you can create an composite index with the
"deleted" column as one of the components.

>From a theoretical view, if you care about the visibility of
a row, you should express it as an attribute of the entity.
The solutions above comply with that notion.
-- 
  (  Kees Nuyt
  )
c[_]
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-19 Thread Eric Smith
Tim Romano wrote: 

> The partial index is one very messy thing, fraught with ambiguities, 
> something to avoid.  

I want an index that only can be used to find rows with a particular 
value or set of values.  In what way is that ambiguous?  Other databases 
(e.g. postgres) seem to support this kind of thing.  

> I can imagine other business rules being really 
> bollixed up by the sudden reappearance of zombie rows.  

This isn't a 'business rule', this is an optimization.  No high 
level logic will change.  Just like when we use other sql indices.  

> Under the partial index method, how would 
> you ever find a row again once it has become invisible, unless you were 
> perhaps to change or suspend the partial index rule, and cause the missing 
> rows to reappear?  

"Become invisible", meaning it no longer contains data that I care 
about?  I don't need to find it quickly because it no longer contains 
data that I care about.  

So, I'm not sure I understand your concerns.  

Since SQLite doesn't support partial indices directly, I'm 
thinking about making my own index as a separate table and 
populating/depopulating it using triggers on the main table.  I only 
need it for fast lookups during deletion of the relevant rows, so I'll 
hijack the app logic that wants to delete those rows and instead use 
the secondary table to get the row ids, and delete those directly.  

Something like DELETE FROM records WHERE __recno IN (SELECT __recno 
FROM idxTable), where __recno is the INTEGER PRIMARY KEY on records.

Am I missing something?

Eric 

-- 
Eric A. Smith

Furbling, v.:
Having to wander through a maze of ropes at an airport or 
bank even when you are the only person in line.
-- Rich Hall, "Sniglets"
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-19 Thread Kees Nuyt
On Thu, 19 Aug 2010 17:15:40 -0400, Tim Romano
 wrote:

>Ah, an opportunity for another purist tirade presents itself.
>
>I don't have a hack for SQLite but something I consider to be a much better
>practice that accomplishes the same goal. If your business rules would
>declare that rows with value X in column Y no longer belong to the set, the
>most straightforward way to implement such a rule is to move those rows to
>another table where they do belong. Use an after update/insert trigger to do
>this.

[...]

>The partial index is one very messy thing, fraught with ambiguities,
>something to avoid.  I can imagine other business rules being really
>bollixed up by the sudden reappearance of zombie rows.

+1

My 2 cents:
Pure SQL doesn't mind about indexes, in RDBMS
implementations they are just an optimization feature and a
way to implement unique constraints. Anything more is a can
of worms indeed.

Optional indexes are a codasyl hierarchical or network
database feature, where indexes are exposed to the DML.
-- 
  (  Kees Nuyt
  )
c[_]
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] partial index?

2010-08-19 Thread Tim Romano
Ah, an opportunity for another purist tirade presents itself.

I don't have a hack for SQLite but something I consider to be a much better
practice that accomplishes the same goal. If your business rules would
declare that rows with value X in column Y no longer belong to the set, the
most straightforward way to implement such a rule is to move those rows to
another table where they do belong. Use an after update/insert trigger to do
this

Splitting the rows into separate tables In that manner, you could move an
inactive|invisible row back into active|visible status if the need should
ever arise, simply by changing the column value and moving the row back into
the active table. Under the partial index method, how would you ever find a
row again once it has become invisible, unless you were perhaps to change or
suspend the partial index rule, and cause the missing rows to reappear?

The partial index is one very messy thing, fraught with ambiguities,
something to avoid.  I can imagine other business rules being really
bollixed up by the sudden reappearance of zombie rows.

Regards
Tim Romano
Swarthmore PA





   on the Gender column.

On Thu, Aug 19, 2010 at 4:30 PM, Eric Smith  wrote:

> Afaict sqlite doesn't support indices on subsets of rows in a table, Ю
> la http://en.wikipedia.org/wiki/Partial_index -- right?
>
> Any plans to implement that?
>
> Are there any known hacks to implement something similar?
>
> --
> Eric A. Smith
>
> Keeping Young #3:
> Keep the juices flowing by janglin round gently as you move.
>-- Satchel Paige
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users