2012-10-24 23:58, Timothy Coalson wrote:
> I doubt I would like the outcome of having some software make
> arbitrary decisions of which real filesystem to put each file on,
> and then having one filesystem fail, so if you really expect this,
> you may be happier keeping the two pools separate and deciding where
> to put stuff yourself (since if you are expecting a set of disks to
> fail, I expect you would have some idea as to which ones it would
> be, for instance an external enclosure).
To an extent this sounds similar to (and doable with) hierarchical storage management, such as Sun's SAMFS/QFS solution. Essentially, this is a (virtual) filesystem where you set up storage rules based on last access times and frequencies, data types, etc., and where you have many tiers of storage (ranging from fast, small and expensive to slow, bulky and cheap), such as SSD arrays - 15K SAS arrays - 7200 RPM SATA - tape. New incoming data ends up on the fast tier; old stale data lives on tapes; data that is still used occasionally migrates between tiers. The rules you define for the HSM system regulate how many copies you'd store on which tier, so that the loss of some devices should not be fatal - as well as how space on the faster tiers is cleaned up to receive new data or to cache old data requested by users and fetched from the slower tiers.

I did propose adding some HSM-type capabilities to ZFS, mostly with the goal of power-saving on home-NAS machines, so that the box could live with a couple of active disks (i.e. the rpool and the "active-data" part of the data pool) while most of the data pool's disks remain spun down.

Whenever a user reads some data from the pool (watching a movie, listening to music or processing his photos), the system would prefetch the data (perhaps a folder with MP3's) onto the cache disks and let the big ones spin down - with a home NAS and few users, it is likely that if you're watching a movie, your system is otherwise unused for a couple of hours. Likewise, and this happens to be the trickier part, new writes to the data pool should go to the active disks and occasionally sync to and spread over the main pool disks.

I had hoped this could all be done within ZFS, transparently to users, but the discussions ultimately led to the conclusion that this would be better done not within ZFS itself, but with some daemons (perhaps a dtrace-abusing script) doing the data migration and the abstraction (the transparency to users). Besides, with the introduction of and advances in the generic L2ARC, and with the possibility of file-level prefetch, much of that discussion became moot ;)

Hope this small historical insight helps you :)

//Jim Klimov
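
P.S. For illustration only, a rough sketch of the kind of userland migration daemon mentioned above might look like the following (Python, with entirely hypothetical paths and thresholds - FAST_TIER, SLOW_TIER and HOT_AGE are made-up names, and this resembles neither SAMFS/QFS nor any actual code from those discussions). It just shuffles files between a "fast" and a "slow" tier based on last-access time:

    #!/usr/bin/env python3
    # Illustrative sketch only: one naive two-tier migration pass driven by
    # last-access time. Paths and thresholds below are hypothetical examples.
    import os
    import shutil
    import time

    FAST_TIER = "/tank/cache"   # e.g. disks that stay spinning (hypothetical path)
    SLOW_TIER = "/tank/bulk"    # e.g. disks allowed to spin down (hypothetical path)
    HOT_AGE   = 7 * 24 * 3600   # files read within a week are considered "hot"

    def last_access_age(path):
        """Seconds since the file was last read."""
        return time.time() - os.stat(path).st_atime

    def migrate(src_root, dst_root, keep_if):
        """Move every file under src_root to dst_root unless keep_if(path) is true."""
        for dirpath, _dirs, files in os.walk(src_root):
            for name in files:
                src = os.path.join(dirpath, name)
                if keep_if(src):
                    continue
                rel = os.path.relpath(src, src_root)
                dst = os.path.join(dst_root, rel)
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                # A real daemon would leave a stub or symlink behind to keep
                # the namespace intact; here we just relocate the file.
                shutil.move(src, dst)

    if __name__ == "__main__":
        # Demote stale data from the fast tier to the slow tier...
        migrate(FAST_TIER, SLOW_TIER, keep_if=lambda p: last_access_age(p) < HOT_AGE)
        # ...and promote recently-read data from the slow tier back to the fast one.
        migrate(SLOW_TIER, FAST_TIER, keep_if=lambda p: last_access_age(p) >= HOT_AGE)

A real implementation would of course also have to preserve a single namespace for the users (stub files, symlinks or some vnode-level redirection), batch the spin-ups, and handle the write-side staging - but that is the gist of the policy.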