> let's say i have a table MY_TABLE(INT_ID int not null, STRING_ID 
> varchar(255), DATA blob, primary key(INT_ID)). table has millions of 
> rows. i need to map from STRING_ID to INT_ID, but only for some rows. at 
> the time of insertion i know whether the new row needs to be in that map 
> or not. in some environments, most rows are mapped and in others, most 
> rows are not mapped. assume single insert per transaction. i would like 
> to avoid insertion-time performance impact of updating an extra index 
> when i don't need it. i've considered adding a second table with just 
> INT_ID and STRING_ID columns and inserting into that table only when i 
> need that mapping. however, when most rows are mapped, performance of 
> that solution seems worse than just an index on STRING_ID in MY_TABLE 
> table. i have also considered having two tables, one with an index on 
> STRING_ID and one without and inserting into one table or the other 
> table as appropriate. but, that would as much as double the cost of all 
> my INT_ID-based SELECTs and DELETEs because i would need to execute them 
> on two tables.
> 
> ideally, it would be nice if there was a way to index just the rows that 
> i need. is there any SQL/SQLite trick that i am missing?

As far as I know, SQLite does not support Partial Indices.  Your idea of 
potentially using a separate mapping table for just those items that 
need to be indexed seems a reasonable approximation of a partial index. 
  However, as you indicate, there may be substantial performance and/or 
space impacts with this approach-- perhaps worse than indexing everything.

I would take a step back and ask why you are worried about indexing 
"unneeded" rows:  You seem to indicate that "insertion-time performance" 
is your primary concern.  However, you also state that you plan to do a 
single insert per transaction.  If this is the case, my guess is that 
the time required for the disk flush at the end of each transaction* 
will be an order of magnitude longer than the index update.

In other words, if you are okay with the (slowish) performance of single 
insert per transaction, then I doubt you would notice the additional 
time to update an index.  Of course, you should test this hypothesis.

~Eric

* Assuming you haven't turned off synchronous write mode.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to