On 8/20/10, Cory Nelson <phro...@gmail.com> wrote:
> On Fri, Aug 20, 2010 at 7:39 AM, Jim Wilcoxson <pri...@gmail.com> wrote:
>> ...
>> The best I could come up with is a separate table.  The problem is,
>> indexing the SHA1 normally means there is a copy in the row and a copy
>> in the index.  Using a separate table, which still has to be indexed,
>> means there is a copy in the row of the main table, a copy in the
>> separate table, and a copy in the separate table's index.
>>
>
> You might want to index the "sha" column on the primary table, then
> the secondary table can just be indexedblocks(blockid INTEGER PRIMARY
> KEY).  No wasted space that way, but will require more I/Os to JOIN
> the tables.

In my case, the point of a partial index would be to save space.  If
the sha column were indexed completely in the primary table, no other
tables or indexes are necessary, but then no space is saved.  If I
only want to index 80% of the rows, I'd save 20*.2n bytes with a
partial index, where n is the number of rows.  (SHA1 is 20 bytes)

Using a separate table, I'd need 40 bytes (roughly) for each sha
indexed: 20 bytes for the main table, 20 for the index; or 40m for m
entries.  This saves space if half or fewer of all blocks are indexed.
 If more than than half are indexed, indexing all rows in the main
table uses less space.

Jim
--
HashBackup: easy onsite and offsite Unix backup
http://sites.google.com/site/hashbackup
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to