On Mon, 2004-04-19 at 20:36, D. Richard Hipp wrote:
> Mrs. Brisby wrote:
> >>
> >>The linked-list structure of overflow storage is part of the problem.
> >>But the fact that SQLite uses synchronous I/O is also a factor.  In
> >>order to make BLOBs fast in SQLite, I would have to change to a different
> >>indexing technique for overflow storage *and* come up with some kind
> >>of cross-platform, asynchronous disk read mechanism.
> >
> > D.R.Morrison (1968)'s PATRICIA would certainly be faster for indexing
> > large objects.
> >
>
> A key feature of SQLite 3.0 (needed to support internationalization)
> is the ability of users to specify their own comparison functions then
> have SQLite use that comparison function to order indices.  PATRICIA
> does not support user-defined comparison functions.  Keys in PATRICIA
> must occur in memcmp() order, as far as I am aware.

Why not fold the strings at insert time to keep your indexing simple? You
can still get internationalization, but require that the user supply a
function which performs this folding: á -> a, for example.

> > Asynchronous read isn't necessary, but vectored reads are. Consider
> > readv() POSIX 1003.1-2001 -- in fact, you could probably make
> > result-fields return a struct iovec * that would "point" to the value
> > within the database.
> >
>
> readv() doesn't help, actually.  BLOBs are stored in 1k blocks scattered
> all over the file.  readv() reads a contiguous range of bytes - it
> puts those bytes into scattered buffers but the bytes must originate
> from a contiguous region of the file.  I'd still have to do 1024
> sequential readv()s in order to extract a 1MB blob.

My brain fizzled out there for a moment. I don't know where I was.

Sadly, you're right. While POSIX 1003.1-2003 does define aio_read(), it is
still a portability nightmare.

On systems where context switches are cheap, one could use fork() or POSIX
threads to populate a number of pipes, but I doubt this would buy much (if
anything)... ever.

Wouldn't it be nice if poll() actually did something interesting with
regular files? :)

Looks to me like you can either make two I/O policies (or more), sort
your reads/seeks, OR move the blobs into another file :)
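
To make the folding suggestion from this message concrete, here is a minimal
sketch in Python. It is not SQLite code: fold_key() is a hypothetical
user-supplied function, and the choice of Unicode NFKD decomposition plus case
folding is just one assumed policy. The point is that once keys are folded at
insert time, plain byte-wise (memcmp()-style) comparison is enough to order
the index.

```python
import unicodedata


def fold_key(text: str) -> bytes:
    """Hypothetical user-supplied folding: strip accents, fold case."""
    # Decompose so that "á" becomes "a" followed by a combining accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Python orders bytes lexicographically by unsigned byte value,
    # which is the same ordering memcmp() produces.
    return stripped.casefold().encode("utf-8")


# Fold once at insert time; the index stores (folded key, original value)
# and is ordered by the folded bytes alone.
names = ["zebra", "Ávila", "apple", "ábaco"]
index = sorted((fold_key(n), n) for n in names)
ordered = [original for _, original in index]
```

Folding is lossy (here "á" and "a" fold to the same key), so the original
string has to be stored next to the folded key, as the tuples above do.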