I'm going to combine three posts here because they all involve jcone.

First, as to my message heading:
The 'search forum' mechanism can't find his posts under the 'jcone' name (I was curious, because they're interesting or strange, depending on how one looks at them). I've also noticed (once in his case, once in Louwtjie's) that the 'last post' column of one thread may reflect a post made to a different thread.

Second, in response to your "Indexing other than hash tables" post:

The only way you could get a file system like ZFS to perform indexed look-ups for you would be to make each of your 'records' an entire file with the appropriate look-up name, and ReiserFS may be the only current file system that could handle this reasonably well.

This is an outgrowth of the Unix mindset that files must only be byte-streams rather than anything more powerful (such as the single- and multi-key indexed files of traditional minicomputer and mainframe systems) - and that's especially unfortunate in ZFS's case, because system-managed COW mechanisms just happen to be a dynamite way to handle b-trees. (You could do so at the application level on top of ZFS via use of a sparse file plus a facility to deallocate space in it explicitly, but you'd still need an entire separate level of in-file space-allocation/deallocation mechanism.) B-trees are the obvious solution to the kind of partial-key and/or key-range queries that you described.

Finally, in response to your current post (which sounds more as if it had come from a hardware engineer than from a database type):

All the facilities that you describe are traditionally handled by transactions of one form or another, and only read-only transactions can normally be non-blocking (because they simply capture a consistent point-in-time database state and operate upon that, ignoring any subsequent changes that may occur during their lifetimes).
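
To make the read-only case concrete, here is a minimal sketch of why such a transaction never has to block: it just grabs a reference to the current committed state, and writers publish a fresh copy instead of modifying that state in place. The SnapshotStore class and its snapshot/commit methods are names made up for this illustration; this is a toy in-memory model, not how ZFS or any particular database implements it.

```python
import threading
from types import MappingProxyType


class SnapshotStore:
    """Toy copy-on-write key/value store (hypothetical, illustration only)."""

    def __init__(self):
        # Current committed state, wrapped so that readers cannot modify it.
        self._root = MappingProxyType({})
        # Writers serialize on this lock; readers never take it.
        self._write_lock = threading.Lock()

    def snapshot(self):
        # A read-only transaction simply captures the current root reference.
        return self._root

    def commit(self, updates):
        # Build the new version off to the side, then publish it in one step.
        with self._write_lock:
            new_state = dict(self._root)
            new_state.update(updates)
            self._root = MappingProxyType(new_state)


store = SnapshotStore()
store.commit({"a": 1, "b": 2})

view = store.snapshot()            # read-only transaction begins
store.commit({"a": 100, "c": 3})   # later write activity

print(view["a"], "c" in view)      # 1 False  (the reader still sees the old state)
print(store.snapshot()["a"])       # 100      (a new reader sees the new state)
```

Copying the whole dictionary on every commit is obviously wasteful; a real system would share the unchanged parts of a tree between versions, which is exactly where COW and b-trees fit together.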
Other less-popular but more general non-blocking approaches exist which simply abort upon detecting a conflict rather than attempt to wait for the conflict to evaporate. This tends not to scale very well because (unlike the case with non-blocking low-level hardware synchronization) restarting a transaction when you don't have to can very often result in a *lot* of redundant work being performed. These approaches include some multi-version schemes that implement more general 'time domain addressing' than that just described for read-only transactions, and the rare implementations based upon 'optimistic' concurrency control that let conflicts occur and then decide whether to abort someone when they attempt to commit.

ZFS supports transactions only for its internal use, and cannot feasibly support arbitrarily complex transactions because its atomicity approach depends upon gathering all transaction updates in RAM before writing them back atomically to disk (yes, it could perhaps do so in stages, since the entire new tree structure doesn't become visible until its root has been made persistent, but that could arbitrarily delay other write activity in the system).

While I think that supporting user-level transactions is a useful file-system feature, and a few file systems such as Transarc's Structured File System have actually done so, ZFS would have to change significantly to support anything other than *very* limited user-level transactions - hence I wouldn't hold my breath waiting for such support in ZFS.

- bill

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss