[sqlite] Clustered indicies Was: [sqlite] A proposal for SQLite version 3.0

ben . carlyle Wed, 07 Apr 2004 20:05:22 -0700

Jeff,

Jeff Pleimling <[EMAIL PROTECTED]>
08/04/2004 12:42 PM

        To:     [EMAIL PROTECTED]
        cc: 
        Subject:        Re: [sqlite] A proposal for SQLite version 3.0

At 12:08 PM 4/8/2004 +1000, [EMAIL PROTECTED] wrote:
> I believe you're thinking of a 'clustered index'. This puts the data
> into the order of the index. There can be, of course, only one clustered
> index per table.
> Since at least some of the data in the table is moved around on every 
insert,
> regular indexs need to take this into account (usually by indirection, 
rather
> then modifying all of the indexes with each insert).

Ahh, I didn't think of that. I don't have any other indices on my table, 
so this wouldn't be a problem for me... but I can see now how it would 
harm the general case. I guess the simplest implementation would have a 
proabition on having -any- other indicies on the table.

> >If a table could be ordered according to an index, rather than having 
an
> >external index, I think it would significantly improve the time and 
space
> >performance of my databases. I don't know whether my experience would 
be
> >shared by other users. It it were something that could go into 3.0 it
> >would at least do me some good.
> Clustered indexes can really slow the performance for OLTP (On-Line
> Transaction Processing) and other systems where data is added/deleted in 
a
> mixed fashion. Every time a record is inserted, data is possibly moved 
on
> the disk (with page splits causing even more slowdowns).

Yes, that's what's happening already in the index whenever I do an insert. 
My thinking was that maintaining an index-ordered table would be less work 
overall than maintaining a table with an ordered index. I could be wrong 
on that, but I'm not sure I see the flaw in my logic.

> If your system is entirely historic data, that would be great - but if 
your
> system is inserting meter readings in (near) real-time, you'd probably
> get a big performance hit.

It's certainly real-time, with the occasional query.

> There are many pros and cons. A google search turns up articles (usually 
for
> MS SQL Server) on both side - some people saying 'always' and some 
'never'.

I can certainly see how the usefulness of this feature could be limited.

I guess the problem is primarily related to how the rowid is chosen. If I 
could choose a rowid that would put the table in the correct order, and 
renumber as required I might make some ground. I could order by ROWID, 
then... although there may have to be some kind of code tweakage to allow 
the where clauses to operate correctly.

Hmm... maybe something like this:
BEGIN TRANSACTION;
SELECT * FROM mytable WHERE <whatever is required to find the insertion 
point>
-- if the insertion point is after current data:
INSERT INTO mytable VALUES (MAX(ROWID) + 10, <values>)
-- else if insertion point is between two values
INSERT INTO mytable VALUES ((<previous rowid> + <next rowid>)/2, <values>)
-- else rearrange contiguous values
UPDATE mytable SET ROWID = ROWID+1 WHERE ROWID >= <insertion point> AND 
ROWID < <next free space>
INSERT INTO mytable VALUES (<insertion point>, <values>)
END TRANSACTION;

Perhaps the changes to sqlite could be as minimal as providing a facility 
to say:
"I promise to keep these rows in an order consistent with this index"
so sqlite will use the index in queries.

Benjamin.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

[sqlite] Clustered indicies Was: [sqlite] A proposal for SQLite version 3.0

Reply via email to