On 6/20/07, Scott Hess <[EMAIL PROTECTED]> wrote:

On 6/20/07, Andrew Finkenstadt <[EMAIL PROTECTED]> wrote:
> How difficult do you think it would be to support an alternative method
of
> indexing within SQLite specifically to support O(1) retrieval of the
rowid
> for a table, and then potentially O(1) retrieval of the row data for a
> table, when in-order retrieval is undesired?
>
> My database design is highly specialized to never, ever retrieve data
except
> by whole-key equality, and would benefit greatly from this optimization.
> I'm willing to code it up myself, although I'm not quite set up to do
SQL
> syntax parser changes, preferring to use the amalgamation.

Would you be able to live with converting your primary key to a 64-bit
hash value?


Yes.  I could... or even smaller, if I can do collision exclusion by a
select against the "real primary key".

In development of fts2, I found that direct rowid -> row
access is pretty fast.  It's still technically O(logN), but the log's
base is absurdly large, so it comes reasonably close to O(1).  Then
you could do something like:

   SELECT x FROM t WHERE rowid = hashfn(?0) AND real_primary = ?0

hashfn() would take a string and return a 64-bit rowid.  The test
against real_primary is to guard against hash collisions, which you
may-or-may-not care to bother with (that said, it should be
essentially free, at least in disk I/O terms).


It's funny you should say that, as my current "best" (fastest) 64-bit hash
of a string involves concatenating Ye Olde Standard CRC-32 and Paul Hsieh's
SuperFastHash function, and I've been considering using that internally as
an in-memory surrogate of what would otherwise be a long path/filename.



BTW, this may be an example of a case where *dbm or bdb would be
justified.  It would also be interesting for someone to build a
virtual table interface implementing this idiom.


I gave those options thought as well, but ended up liking the vdbe/SQL
interface for development convenience, and the fact that "out of the box"
all of the cross-thread cross-process consistency issues were dealt with.
I'm a 17-year veteran of Oracle databases, and (perl)dbm databases before
that, so I really liked the lesser "mental surface area" needed to program
real programs with these options.

--a

Reply via email to