Re: [zfs-discuss] User-visible non-blocking / atomic ops in ZFS

can you guess? Wed, 21 Nov 2007 19:21:17 -0800

I'm going to combine three posts here because they all involve jcone:

First, as to my message heading:


The 'search forum' mechanism can't find his posts under the 'jcone' name (I was 
curious, because they're interesting/strange, depending on how one looks at 
them).  I've also noticed (once in his case, once in Louwtjie's) that the 'last 
post' column of one thread may reflect a post made to a different thread.

Second, in response to your "Indexing other than hash tables" post:

The only way you could get a file system like ZFS to perform indexed look-ups 
for you would be to make each of your 'records' an entire file with the 
appropriate look-up name, and ReiserFS may be the only current file system that 
could handle this reasonably well.

This is an outgrowth of the Unix mindset that files must only be byte-streams 
rather than anything more powerful (such as the single- and multi-key indexed 
files of traditional minicomputer and mainframe systems) - and that's 
especially unfortunate in ZFS's case, because system-managed COW mechanisms 
just happen to be a dynamite way to handle b-trees (you could do so at the 
application level on top of ZFS via use of a sparse file plus a facility to 
deallocate space in it explicitly, but you'd still need an entire separate 
level of in-file space-allocation/deallocation mechanism).  B-trees are the 
obvious solution to the kind of partial-key and/or key-range queries that you 
described.
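
To make the partial-key and key-range point concrete, here is a minimal Python sketch of an ordered index.  It is only an illustration: a sorted in-memory list stands in for the leaf level of a b-tree, and the names (OrderedIndex, put, key_range, prefix) are hypothetical, not part of ZFS or any real library.  The point is that once keys are kept in order, both query types reduce to one search plus a sequential scan, which a hash table cannot offer.

```python
import bisect


class OrderedIndex:
    """Toy ordered index; a sorted list stands in for b-tree leaves."""

    def __init__(self):
        self._keys = []      # keys kept in sorted order
        self._values = {}    # key -> record

    def put(self, key, value):
        # Insert the key in sorted position the first time it is seen.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def key_range(self, lo, hi):
        """Return (key, value) pairs with lo <= key < hi, in key order."""
        start = bisect.bisect_left(self._keys, lo)
        end = bisect.bisect_left(self._keys, hi)
        return [(k, self._values[k]) for k in self._keys[start:end]]

    def prefix(self, partial):
        """Return (key, value) pairs whose key starts with `partial`."""
        start = bisect.bisect_left(self._keys, partial)
        matches = []
        for k in self._keys[start:]:
            if not k.startswith(partial):
                break        # sorted order: no later key can match
            matches.append((k, self._values[k]))
        return matches


if __name__ == "__main__":
    idx = OrderedIndex()
    for name in ["smith,john", "jones,bob", "smythe,al", "smith,anne"]:
        idx.put(name, {"name": name})
    print([k for k, _ in idx.prefix("smith")])
    print([k for k, _ in idx.key_range("j", "s")])
```

A real b-tree adds paging and node splitting so that the same ordered scan works on disk, but the query logic is the same.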

Finally, in response to your current post (which sounds more as if it had come 
from a hardware engineer than from a database type):

All the facilities that you describe are traditionally handled by transactions 
of one form or another, and only read-only transactions can normally be 
non-blocking (because they simply capture a consistent point-in-time database 
state and operate upon that, ignoring any subsequent changes that may occur 
during their lifetimes).  Other less-popular but more general non-blocking 
approaches exist which simply abort upon detecting conflict rather than attempt 
to wait for the conflict to evaporate.  These tend not to scale very well 
because (unlike the case with non-blocking low-level hardware synchronization) 
restarting a transaction when you don't have to can very often result in a 
*lot* of redundant work being performed.  They include some multi-version 
approaches that implement more general 'time domain addressing' than that just 
described for read-only transactions, and the rare implementations based upon 
'optimistic' concurrency control that let conflicts occur and then decide 
whether to abort someone when they attempt to commit.
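
As a rough illustration of the 'optimistic' style, the Python sketch below lets transactions read and buffer writes without any locking and only checks for conflicts at commit time.  Everything in it (Store, Txn, ConflictError, the per-key version counters) is a hypothetical toy, not any database's actual mechanism.  It is also single-threaded: a real implementation would have to make the validate-and-write step in commit() atomic.

```python
class ConflictError(Exception):
    """Raised at commit when a key this transaction read has since changed."""


class Store:
    def __init__(self):
        self.data = {}               # key -> (version, value)

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self._store = store
        self._reads = {}             # key -> version first observed
        self._writes = {}            # key -> buffered new value

    def read(self, key):
        if key in self._writes:      # see our own uncommitted write
            return self._writes[key]
        version, value = self._store.data.get(key, (0, None))
        self._reads.setdefault(key, version)
        return value

    def write(self, key, value):
        self._writes[key] = value    # nothing is visible until commit

    def commit(self):
        # Validation: abort if anything we read was committed over.
        for key, seen in self._reads.items():
            current, _ = self._store.data.get(key, (0, None))
            if current != seen:
                raise ConflictError(key)
        # Write phase: install buffered values, bumping each version.
        for key, value in self._writes.items():
            current, _ = self._store.data.get(key, (0, None))
            self._store.data[key] = (current + 1, value)


if __name__ == "__main__":
    store = Store()
    t1, t2 = store.begin(), store.begin()
    t1.write("x", (t1.read("x") or 0) + 1)
    t2.write("x", (t2.read("x") or 0) + 1)
    t1.commit()
    try:
        t2.commit()
    except ConflictError:
        print("t2 aborted; all of its work must be redone")
```

Neither transaction ever waits, but the loser's work is thrown away and repeated, which is the scaling cost mentioned above.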

ZFS supports transactions only for its internal use, and cannot feasibly 
support arbitrarily complex transactions because its atomicity approach depends 
upon gathering all transaction updates in RAM before writing them back 
atomically to disk (yes, it could perhaps do so in stages, since the entire new 
tree structure doesn't become visible until its root has been made persistent, 
but that could arbitrarily delay other write activity in the system).  While I 
think that supporting user-level transactions is a useful file-system feature, 
and a few file systems such as Transarc's Encina Structured File Server have 
actually provided it, ZFS would have to change significantly to offer anything 
beyond *very* limited user-level transactions - hence I wouldn't hold my breath 
waiting for such support in ZFS.
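
The 'nothing is visible until the root is made persistent' idea can be sketched in a few lines of Python.  This is an in-memory toy under stated assumptions, not ZFS code: a plain binary search tree with path copying stands in for the on-disk block tree, and Node, insert, lookup and Tree are hypothetical names.  A batch of updates is built up against a private new root, and a single root assignment publishes the whole batch at once.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: str
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key, value):
    """Return a new root; only nodes on the path to `key` are copied."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return node._replace(left=insert(node.left, key, value))
    if key > node.key:
        return node._replace(right=insert(node.right, key, value))
    return node._replace(value=value)


def lookup(node, key):
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


class Tree:
    def __init__(self):
        self.root = None

    def commit(self, updates):
        # Gather every update of the transaction in memory first...
        new_root = self.root
        for key, value in updates:
            new_root = insert(new_root, key, value)
        # ...then publish them all with one root swap.
        self.root = new_root


if __name__ == "__main__":
    tree = Tree()
    tree.commit([("m", "1"), ("f", "2"), ("t", "3")])
    snapshot = tree.root             # a reader's point-in-time view
    tree.commit([("a", "4"), ("f", "5")])
    print(lookup(snapshot, "f"), lookup(tree.root, "f"))
```

A reader holding the old root keeps a consistent view without blocking anyone, and untouched subtrees are shared between the old and new versions.  The sketch also shows the limitation: the whole batch has to be held in memory until the swap.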

- bill
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
