[zfs-discuss] fsflush and zfs
Is there any change regarding fsflush, such as the autoup tunable, for ZFS?

Thanks

This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Thumper and ZFS
Do you want data availability, data retention, space, or performance?
 -- richard

Robert Milkowski wrote:
> Hello zfs-discuss,
>
> While waiting for Thumpers to come I'm thinking how to configure them.
> I would like to use raid-z. As thumper has 6 SATA controllers each
> 8-port then maybe it would make sense to create raid-z groups from
> 6 disks each from separate controller. Then combine 7 such groups into
> one pool. Then there're 6 disks remaining with two of them designated
> for system (mirror) which leaves 4 disks probably as hot-spares.
>
> That way if one controller fails entire pool will still be ok.
>
> What do you think?
>
> ps. there still will be SPOF for boot disks and hot spares but it looks
> like there's no choice anyway.
Re: [zfs-discuss] Re: [nfs-discuss] Re: Re: NFS Performance and Tar
As far as ZFS performance is concerned, O_DSYNC and O_SYNC are
equivalent. This is because ZFS saves all POSIX-layer transactions
(e.g. WRITE, SETATTR, RENAME...) in the log, so both metadata and data
are always re-created if a replay is needed.

Anton B. Rang wrote On 10/12/06 15:42,:
> fsync() should theoretically be better because O_SYNC requires that
> each write() include writing not only the data but also the inode and
> all indirect blocks back to the disk.
[zfs-discuss] Thumper and ZFS
Hello zfs-discuss,

While waiting for the Thumpers to arrive I'm thinking about how to
configure them. I would like to use raid-z. As the Thumper has 6 SATA
controllers, each with 8 ports, maybe it would make sense to create
raid-z groups of 6 disks, each disk on a separate controller, and then
combine 7 such groups into one pool. That leaves 6 disks, two of them
designated for the system (mirror), which leaves 4 disks, probably as
hot spares.

That way, if one controller fails, the entire pool will still be OK.

What do you think?

P.S. There will still be a SPOF for the boot disks and hot spares, but
it looks like there's no choice anyway.

-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                 http://milek.blogspot.com
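
The arithmetic of this proposal can be checked with a small sketch. The controller, port, and group counts come from the post; the function names and the (controller, port) numbering are hypothetical and only illustrate the idea that each raid-z group takes exactly one disk from every controller.

```python
# Illustrative check of the proposed Thumper layout. The counts are taken
# from the post; names and disk numbering are assumptions for this sketch.
CONTROLLERS = 6            # SATA controllers
PORTS_PER_CONTROLLER = 8   # disks per controller
RAIDZ_GROUPS = 7           # raid-z groups, one disk per controller each


def plan_layout():
    """Return (groups, system_disks, spares) as lists of (controller, port)."""
    disks = [(c, p) for c in range(CONTROLLERS)
             for p in range(PORTS_PER_CONTROLLER)]
    # Group g uses port g on every controller, so no group has two disks
    # behind the same controller.
    groups = [[(c, g) for c in range(CONTROLLERS)]
              for g in range(RAIDZ_GROUPS)]
    used = {d for group in groups for d in group}
    remaining = [d for d in disks if d not in used]
    return groups, remaining[:2], remaining[2:]


def worst_loss_per_group(groups, failed_controller):
    """Largest number of disks any single group loses if a controller fails."""
    return max(sum(1 for c, _ in group if c == failed_controller)
               for group in groups)


groups, system_disks, spares = plan_layout()
print(len(groups), "groups of", len(groups[0]), "disks;",
      len(system_disks), "system disks;", len(spares), "spares")
```

Under this numbering a failed controller costs every group exactly one disk, which single-parity raid-z can tolerate; the two system disks and the spares still sit behind individual controllers, as the P.S. notes.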
Re: [zfs-discuss] ZFS patches for S10 6/06
On Oct 5, 2006, at 2:28 AM, George Wilson wrote:
> Andreas,
>
> The first ZFS patch will be released in the upcoming weeks. For now,
> the latest available bits are the ones from s10 6/06.

George, will there at least be a T patch available?

I'm anxious for these because my ZFS-backed NFS server just isn't
having it in terms of client i/o rates.

/dale
Re: [zfs-discuss] Re: zfs/raid configuration question for an application
Hello Anton,

Thursday, October 12, 2006, 11:45:40 PM, you wrote:

ABR> Yes, set the block size to 8K, to avoid a read-modify-write cycle inside ZFS.

Unfortunately it won't help on 06/06 until a patch is released that
fixes a bug (the fix is not to read the old block if it is being
"overwritten"). However, it is still wise to set it to 8K, because once
the patch is released it could actually gain some performance.

P.S. It has been fixed in snv for some time.

-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                 http://milek.blogspot.com
[zfs-discuss] Re: zfs/raid configuration question for an application
Yes, set the block size to 8K, to avoid a read-modify-write cycle inside ZFS.

As you suggest, using a separate mirror for the transaction log will
only be useful if you're on different disks -- otherwise you will be
forcing the disk head to move back and forth between slices each time
you write.
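
To make the read-modify-write point concrete, here is a deliberately simplified model, not a description of ZFS internals. It assumes the record being updated is not cached, that the write is aligned, and that a write smaller than the record forces the whole record to be read and rewritten. The 128K figure is the default ZFS recordsize; the function name is hypothetical.

```python
KB = 1024


def update_io_bytes(write_size, record_size):
    """Return (bytes_read, bytes_written) for one aligned update.

    Simplified model (an assumption of this sketch): a write that covers a
    whole record replaces it outright; a smaller write must first read the
    existing record, patch it in memory, and write the full record back.
    """
    if write_size > record_size:
        raise ValueError("model only covers writes up to one record")
    if write_size == record_size:
        return (0, record_size)
    return (record_size, record_size)


# An 8K database page written into a 128K record versus an 8K record.
print(update_io_bytes(8 * KB, 128 * KB))
print(update_io_bytes(8 * KB, 8 * KB))
```

In this model, matching the record size to the database page size removes the read entirely and shrinks the write to the page itself, which is the effect the advice above is after.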
[zfs-discuss] Re: [nfs-discuss] Re: Re: NFS Performance and Tar
fsync() should theoretically be better because O_SYNC requires that
each write() include writing not only the data but also the inode and
all indirect blocks back to the disk.
Re: [zfs-discuss] Re: ZFS Inexpensive SATA Whitebox
This really bothers me too. I was an early x2100 adopter and have been
waiting almost a year for this. Come on Sun, please release a patch to
fully support your own hardware on Solaris 10!!

On Oct 11, 2006, at 9:23 PM, Frank Cusack wrote:
> On October 11, 2006 11:14:59 PM -0400 Dale Ghent <[EMAIL PROTECTED]> wrote:
> > Today, in 2006 - much different story. I even had Linux AND Solaris
> > problems with my machine's MCP51 chipset when it first came out. Both
> > forcedeth and nge croaked on it. Welcome to the bleeding edge. You're
> > unfortunately on the bleeding edge of hardware AND software.
>
> Yeah, Solaris x86 is so bleeding edge that it doesn't even support
> Sun's own hardware! (x2100 SATA, which is now already in its second
> generation)
>
> -frank
Re: [zfs-discuss] Re: Where is the ZFS configuration data stored?
Bart Smaalders wrote:
> Sergey wrote:
> > + a little addition to the original question:
> >
> > Imagine that you have a RAID attached to Solaris server. There's ZFS
> > on RAID. And someday you lost your server completely (fired
> > motherboard, physical crash, ...). Is there any way to connect the
> > RAID to some another server and restore ZFS layout (not losing all
> > data on RAID)?
>
> JBODs are simple, easy and relatively foolproof when used w/ ZFS.

Yeah... but their management is lacking and you rarely see them in data
centers. :-P
Re: [zfs-discuss] Where is the ZFS configuration data stored?
Steven Goldberg wrote:
> Thanks Matt. So is the config/meta info for the pool that is stored
> within the pool kept in a file? Is the file user readable or binary?

It is not user-readable. See the on-disk format document, linked here:

http://www.opensolaris.org/os/community/zfs/docs/

--matt
Re: [zfs-discuss] Where is the ZFS configuration data stored?
Thanks Matt. So is the config/meta info for the pool that is stored
within the pool kept in a file? Is the file user readable or binary?

Steve

Matthew Ahrens wrote:
> James McPherson wrote:
> > On 10/12/06, Steve Goldberg <[EMAIL PROTECTED]> wrote:
> > > Where is the ZFS configuration (zpools, mountpoints, filesystems,
> > > etc) data stored within Solaris? Is there something akin to vfstab
> > > or perhaps a database?
> >
> > Have a look at the contents of /etc/zfs for an in-filesystem artefact
> > of zfs. Apart from that, the information required is stored on the
> > disk itself.
> >
> > There is really good documentation on ZFS at the ZFS community
> > pages found via http://www.opensolaris.org/os/community/zfs.
>
> FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot
> up. Everything else (mountpoints, filesystems, etc) is stored in the
> pool itself.
>
> --matt

-- 
Steven Goldberg
Engagement Architect
Sun Microsystems, Inc.
15395 SE 30th PL, Suite 120
Bellevue, WA 98011 USA
Phone +1 425 467 4349
Email [EMAIL PROTECTED]
Re: [zfs-discuss] Best way to carve up 8 disks
Brian Hechinger wrote:
> Ok, previous threads have led me to believe that I want to make raidz
> vdevs [0] either 3, 5 or 9 disks in size [1]. Let's say I have 8 disks.
> Do I want to create a zfs pool with a 5-disk vdev and a 3-disk vdev?
>
> Are there performance issues with mixing differently sized raidz vdevs
> in a pool? If there *is* a performance hit to mix like that, would it
> be greater or lesser than building an 8-disk vdev?

Unless you are running a database (or other record-structured
application), or have specific performance data for your workload that
supports your choice, I wouldn't worry about using the
power-of-two-plus-parity size stripes.

I'd choose between (in order of decreasing available io/s):

	4x 2-way mirrors (most io/s and most read bandwidth)
	2x 4-way raidz1
	1x 8-way raidz1 (most write bandwidth)
	1x 8-way raidz2 (most redundant)

> [0] - Just for clarity, what are the "sub-pools" in a pool, the actual
> raidz/mirror/etc "containers" called. What is the correct term to refer
> to them? I don't want any extra confusion here. ;)

We would usually just call them "vdevs" (or to be more specific,
"top-level vdevs").

--matt
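
For comparison, the usable space of the four candidate layouts can be worked out with a short sketch. It assumes equal-sized disks and counts capacity in whole disks, ignoring metadata overhead: an N-way mirror stores one disk's worth of data, raidz1 gives up one disk per vdev to parity, and raidz2 gives up two. The names below are hypothetical.

```python
# Parity (or redundancy) cost per vdev, in disks; a mirror of any width
# holds one disk's worth of data. Capacity is counted in whole disks.
def usable_disks(vdevs):
    """vdevs is a list of (kind, width) tuples for one pool."""
    total = 0
    for kind, width in vdevs:
        if kind == "mirror":
            total += 1
        elif kind == "raidz1":
            total += width - 1
        elif kind == "raidz2":
            total += width - 2
        else:
            raise ValueError("unknown vdev kind: %r" % kind)
    return total


LAYOUTS = {
    "4x 2-way mirrors": [("mirror", 2)] * 4,
    "2x 4-way raidz1": [("raidz1", 4)] * 2,
    "1x 8-way raidz1": [("raidz1", 8)],
    "1x 8-way raidz2": [("raidz2", 8)],
}

for name, vdevs in LAYOUTS.items():
    print("%-18s usable disks: %d of 8" % (name, usable_disks(vdevs)))
```

The result (4, 6, 7 and 6 disks of usable space respectively) shows the space side of the trade-off against the io/s ordering given above.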
Re: [zfs-discuss] Re: Where is the ZFS configuration data stored?
Sergey wrote:
> + a little addition to the original question:
>
> Imagine that you have a RAID attached to Solaris server. There's ZFS
> on RAID. And someday you lost your server completely (fired
> motherboard, physical crash, ...). Is there any way to connect the
> RAID to some another server and restore ZFS layout (not losing all
> data on RAID)?

If the RAID controller is undamaged, just hook it up and go; you can
import the ZFS pool on another system seamlessly.

If the RAID controller gets damaged, you'll need to follow the
manufacturer's documentation to restore your data.

JBODs are simple, easy and relatively foolproof when used w/ ZFS.

- Bart

-- 
Bart Smaalders			Solaris Kernel Performance
[EMAIL PROTECTED]		http://blogs.sun.com/barts
Re: [zfs-discuss] Where is the ZFS configuration data stored?
James C. McPherson wrote:
> Dick Davies wrote:
> > On 12/10/06, Matthew Ahrens <[EMAIL PROTECTED]> wrote:
> > > FYI, /etc/zfs/zpool.cache just tells us what pools to open when you
> > > boot up. Everything else (mountpoints, filesystems, etc) is stored
> > > in the pool itself.
> >
> > Does anyone know of any plans or strategies to remove this dependency?
>
> What do you suggest in its place?

Directory service hooks perhaps?
Re: [zfs-discuss] Where is the ZFS configuration data stored?
> I was asking if it was going to be replaced because it would really
> simplify ZFS root.
>
> Dick.
>
> [0] going from:
> http://solaristhings.blogspot.com/2006/06/zfs-root-on-solaris-part-3.html

I don't know about "replaced", but presumably with the addition of
hostid to the pool data, it could be enhanced.

For instance, VxVM has a similar file, 'volboot'. It's needed to import
any diskgroup that isn't "normal" (using entire disks). So boot-time
imports consist of scanning devices, looking for VxVM signatures, then
importing any with the correct hostid and flags, then attempting to
import other things explicitly mentioned in the volboot file.

Since you can put a ZFS pool in files, you're going to have to have a
cache or something like it, since you're never going to scan them for
importable pools.

So one possible enhancement in ZFS could be that even if the cache
didn't have a given (full disk) pool, it could be scanned and imported
anyway. I don't know if such a feature would be useful for the
implementation of ZFS root or not.

Either way it would have to wait for the hostid stuff to go in.

-- 
Darren Dunham                                           [EMAIL PROTECTED]
Senior Technical Consultant         TAOS            http://www.taos.com/
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >
Re: [zfs-discuss] system hangs on POST after giving zfs a drive
John Sonnenschein wrote:
> I *just* figured out this problem, looking for a potential solution
> (or at the very least some validation that I'm not crazy)
>
> Okay, so here's the deal. I've been using this terrible horrible
> no-good very bad hackup of a couple partitions spread across 3 drives
> as a zpool.
>
> I got sick of having to dig up the info of what slices do what every
> time I need to do something, so I shuffled around some data & created
> a new zpool out of my SATA drive.
> ( # zpool create xenophanes c2d0 ).
>
> Everything works, etc. for a while, then I rebooted the machine.
>
> As it turns out now, something about the drive is causing the machine
> to hang on POST. It boots fine if the drive isn't connected, and if I
> hot plug the drive after the machine boots, it works fine, but the
> computer simply will not boot with the drive attached.
>
> any thoughts on resolution?

When you give ZFS a whole drive, it writes an EFI label on the disk.
Some BIOS vendors' products choke when they taste this label; the Tyan
2865 does this.

Go to the BIOS screen and disable probing for that disk; tell the BIOS
there is nothing there (turn off AUTO mode). Then power cycle (NOT
reset) the box and it should boot w/o further problems.

- Bart (who ran into this on his home server).

-- 
Bart Smaalders			Solaris Kernel Performance
[EMAIL PROTECTED]		http://blogs.sun.com/barts
Re: [nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
On Thu, Joerg Schilling wrote:
> Spencer Shepler <[EMAIL PROTECTED]> wrote:
>
> > On Thu, Joerg Schilling wrote:
> > > Spencer Shepler <[EMAIL PROTECTED]> wrote:
> > >
> > > > The close-to-open behavior of NFS clients is what ensures that the
> > > > file data is on stable storage when close() returns.
> > >
> > > In the 1980s this was definitely not the case. When did this change?
> >
> > It has not. NFS clients have always flushed (written) modified file data
> > to the server before returning to the applications close(). The NFS
> > client also asks that the data be committed to disk in this case.
>
> This is definitely wrong.
>
> Our developers did lose many files in the 1980s when the NFS file server
> did fill up the exported filesystem while several NFS clients did try to
> write back edited files at the same time.
>
> VI at that time did not call fsync and for this reason did not notice that
> the file could not be written back properly.
>
> What happens: All clients did call statfs() and did assume that there is
> still space on the server. They all did allow to put blocks into the local
> clients buffer cache. VI did call close, but the client did notice the
> no space problem after the close did return and VI did not notice that the
> file was damaged and allowed the user to quit VI.
>
> Some time later, Sun did enhance VI to first call fsync() and then call
> close(). Only if both return 0, the file is granted to be on the server.
> Sun also did inform us to write applications this way in order to prevent
> lost file content.

I didn't comment on the error conditions that can occur during the
writing of data upon close(). What you describe is the preferred method
of obtaining any errors that occur during the writing of data. This is
necessary because the NFS client is writing asynchronously, and the only
way the application can retrieve the error information is from the
fsync() or close() call. At close(), it is too late to recover, so
fsync() can be used to obtain any asynchronous error state.

This doesn't change the fact that upon close() the NFS client will
write data back to the server. This is done to meet the close-to-open
semantics of NFS.

> > > > Having tar create/write/close files concurrently would be a
> > > > big win over NFS mounts on almost any system.
> > >
> > > Do you have an idea on how to do this?
> >
> > My naive thought would be to have multiple threads that create and
> > write file data upon extraction. This multithreaded behavior would
> > provide better overall throughput of an extraction given NFS' response
> > time characteristics. More outstanding requests results in better
> > throughput. It isn't only the file data being written to disk that
> > is the overhead of the extraction, it is the creation of the directories
> > and files that must also be committed to disk in the case of NFS.
> > This is the other part that makes things slower than local access.
>
> Doing this with tar (which fetches the data from a serial data stream)
> would only make sense in case that there will be threads that only have
> the task to wait for a final fsync()/close().
>
> It would also make it harder to implement error control as it may be that
> a problem is detected late while another large file is being extracted.
> Star could not just quit with an error message but would need to delay the
> error caused exit.

Sure, I can see that it would be difficult. My point is that tar is not
only waiting upon the fsync()/close() but also on file and directory
creation. There is a longer delay not only because of the network
latency but also because of the latency of writing the filesystem data
to stable storage. Parallel requests will tend to overcome the
delay/bandwidth issues. Not easy, but it can be an advantage with
respect to performance.

Spencer
[zfs-discuss] Re: zfs/raid configuration question for an application
Quite helpful, thank you. I think I should set the zfs mirror block
size to 8K to match the db, right? And do you think I should create
another zfs mirror for the pgsql transaction log? Or is this only
useful if I create the zfs mirror on a different set of disks rather
than on slices?

Mete
Re: [zfs-discuss] Where is the ZFS configuration data stored?
Ceri Davies wrote:
> On Thu, Oct 12, 2006 at 02:06:15PM +0100, Dick Davies wrote:
> > On 12/10/06, Ceri Davies <[EMAIL PROTECTED]> wrote:
> > > On Wed, Oct 11, 2006 at 11:49:48PM -0700, Matthew Ahrens wrote:
> > > > FYI, /etc/zfs/zpool.cache just tells us what pools to open when
> > > > you boot up. Everything else (mountpoints, filesystems, etc) is
> > > > stored in the pool itself.
> > >
> > > What happens if the file does not exist? Are the devices searched
> > > for metadata?
> >
> > My understanding (I'll be delighted if I'm wrong) is that you would be
> > stuffed.
> >
> > I'd expect:
> >
> > zpool import -f
> >
> > (see the manpage)
> > to probe /dev/dsk/ and rebuild the zpool.cache file,
> > but my understanding is that this a) doesn't work yet or b) does
> > horrible things to your chances of surviving a reboot [0].
>
> So how do I import a pool created on a different host for the first
> time?

zpool import [ -f ]

(provided it's not in use *at the same time* by another host)

-- 
Michael Schuster                  +49 89 46008-2974 / x62974
visit the online support center:  http://www.sun.com/osc/

Recursion, n.: see 'Recursion'
Re: [nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
Spencer Shepler <[EMAIL PROTECTED]> wrote:

> On Thu, Joerg Schilling wrote:
> > Spencer Shepler <[EMAIL PROTECTED]> wrote:
> >
> > > The close-to-open behavior of NFS clients is what ensures that the
> > > file data is on stable storage when close() returns.
> >
> > In the 1980s this was definitely not the case. When did this change?
>
> It has not. NFS clients have always flushed (written) modified file data
> to the server before returning to the applications close(). The NFS
> client also asks that the data be committed to disk in this case.

This is definitely wrong.

Our developers lost many files in the 1980s when the NFS file server
filled up the exported filesystem while several NFS clients tried to
write back edited files at the same time.

VI at that time did not call fsync and for this reason did not notice
that the file could not be written back properly.

What happened: all clients called statfs() and assumed that there was
still space on the server. They all allowed blocks to be put into the
local client's buffer cache. VI called close, but the client noticed
the no-space problem only after close had returned, so VI did not
notice that the file was damaged and allowed the user to quit VI.

Some time later, Sun enhanced VI to first call fsync() and then call
close(). Only if both return 0 is the file guaranteed to be on the
server. Sun also told us to write applications this way in order to
prevent lost file content.

> > > Having tar create/write/close files concurrently would be a
> > > big win over NFS mounts on almost any system.
> >
> > Do you have an idea on how to do this?
>
> My naive thought would be to have multiple threads that create and
> write file data upon extraction. This multithreaded behavior would
> provide better overall throughput of an extraction given NFS' response
> time characteristics. More outstanding requests results in better
> throughput. It isn't only the file data being written to disk that
> is the overhead of the extraction, it is the creation of the directories
> and files that must also be committed to disk in the case of NFS.
> This is the other part that makes things slower than local access.

Doing this with tar (which fetches the data from a serial data stream)
would only make sense if there were threads whose only task is to wait
for a final fsync()/close().

It would also make it harder to implement error control, as a problem
may be detected late, while another large file is being extracted. Star
could not just quit with an error message but would need to delay the
error-caused exit.

Jörg

-- 
 EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
       [EMAIL PROTECTED]                (uni)
       [EMAIL PROTECTED]     (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
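
The fsync()-then-close() pattern described above can be sketched as follows. This is a minimal illustration, not the code that went into VI; the function name is hypothetical. Python's os module raises OSError where the C calls would return -1, so "both return 0" becomes "neither call raises".

```python
import os


def write_durably(path, data):
    """Write data to path and report any deferred write error.

    os.fsync() is called before os.close() so that an asynchronous write
    failure (for example ENOSPC on the server) surfaces as an OSError
    while the caller can still react to it.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)   # may be a short write
            view = view[written:]
        os.fsync(fd)                       # flush and collect write errors
    finally:
        os.close(fd)                       # close() can also raise OSError
```

A caller would treat the file as safely stored only if write_durably() returns without raising; an editor, for instance, should refuse to discard its buffer otherwise.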
[zfs-discuss] Re: zfs/raid configuration question for an application
Mirroring will give you the best performance for small write operations.

If you can get by with two disks, I’d divide each of them into two
slices, s0 and s1, say. Set up an SVM mirror between d0s0 and d1s0 and
use that for your root. Set up a ZFS mirror between d0s1 and d1s1 and
use that for your data.

Any 3-disk configuration would be significantly more awkward and waste
some space.

For mirroring in your situation, I would tend to avoid hardware RAID.
You won’t be maxing out the controller rates so you won’t get much
benefit.
[zfs-discuss] Re: Where is the ZFS configuration data stored?
The configuration data is stored on the disk devices themselves, at
least primarily.

There is also a copy of the basic configuration data in the file
/etc/zfs/zpool.cache on the boot device. If this file is missing, ZFS
will not automatically import pools, but you can manually import them.

(I’m not sure how ZFS deals with the typical mirror failure case:
configure a mirror between A and B; run for a while with B failed;
reboot, and B is visible but A fails on reboot. The goal is to avoid
running on the stale mirror data, B. I’m not sure if the zpool.cache
handles this [it would have to track the state of individual devices].)
Re: [zfs-discuss] ZFS Inexpensive SATA Whitebox
On Thu, 12 Oct 2006, Ian Collins wrote:

> Al Hopper wrote:
>
> >On Wed, 11 Oct 2006, Dana H. Myers wrote:
> >
> >>Al Hopper wrote:
> >>
> >>>Memory: DDR-400 - your choice but Kingston is always a safe bet. 2*512Mb
> >>>sticks for a starter, cost effective, system. 4*512Mb for a good long
> >>>term solution.
> >>
> >>Due to fan-out considerations, every BIOS I've seen will run DDR400
> >>memory at 333MHz when connected to more than 1 DIMM-per-channel (I
> >>believe at AMD's urging).
> >
> >Really!? That's surprising. Is there a way to verify that on an Ultra20
> >running Solaris 06/06?
>
> I can't remember which ones (I think it was the dual socket 940 Tyan
> boards) listed different memory limits for DDR400 and DDR333 RAM.

[ Off Topic . ]

Sure - the situation is quite different for dual socket Opteron
motherboards - the layout can be quite a challenge for these. But we
were talking about a single (939-pin) socket motherboard where the
layout is much easier to do.

Also - if you look at Opteron motherboards with 6 or more DIMM slots on
a processor, the memory speed will (usually) go down to 333MHz for > 4
and <= 6 DIMMs and 266MHz for > 6 (populated) DIMMs.

This issue (limited # of physical DIMM slots) was supposed to be
resolved by now with FB-DIMMs. Not. The current generation are too
expensive, run too hot and introduce high latencies.

Details emerging on the upcoming quad core AMD (Barcelona) say that
it'll have two memory controllers, one for each set of 2 cores, and the
controller will support both DDR3 and FB-DIMM memory parts. Should be
interesting.

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
OpenSolaris Governing Board (OGB) Member - Feb 2006
Re: [nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
On Thu, Joerg Schilling wrote:
> Spencer Shepler <[EMAIL PROTECTED]> wrote:
>
> > The close-to-open behavior of NFS clients is what ensures that the
> > file data is on stable storage when close() returns.
>
> In the 1980s this was definitely not the case. When did this change?

It has not. NFS clients have always flushed (written) modified file
data to the server before returning to the application's close(). The
NFS client also asks that the data be committed to disk in this case.

> > The meta-data requirements of NFS is what ensures that file creation,
> > removal, renames, etc. are on stable storage when the server
> > sends a response.
> >
> > So, unless the NFS server is behaving badly, the NFS client has
> > a synchronous behavior and for some that means more "safe" but
> > usually means that it is also slower than local access.
>
> In any case, calling fsync before close does not seem to be a problem.

Not for the NFS client, because the default behavior has the same
effect as fsync()/close().

> > > You tell me ? We have 2 issues
> > >
> > > can we make 'tar x' over direct attach, safe (fsync)
> > > and posix compliant while staying close to current
> > > performance characteristics ? In other words do we
> > > have the posix leeway to extract files in parallel ?
> > >
> > > For NFS, can we make 'tar x' fast and reliable while
> > > keeping a principle of least surprise for users on
> > > this non-posix FS.
> >
> > Having tar create/write/close files concurrently would be a
> > big win over NFS mounts on almost any system.
>
> Do you have an idea on how to do this?

My naive thought would be to have multiple threads that create and
write file data upon extraction. This multithreaded behavior would
provide better overall throughput of an extraction given NFS' response
time characteristics. More outstanding requests result in better
throughput. It isn't only the file data being written to disk that is
the overhead of the extraction; it is also the creation of the
directories and files, which must be committed to disk in the case of
NFS. This is the other part that makes things slower than local access.

Spencer
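
The "multiple threads that create and write file data" idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the archive is read serially by the caller and handed over as (relative_name, bytes) pairs, all names are hypothetical, and nothing here reflects how tar or star is actually implemented. Errors are collected and reported at the end rather than aborting mid-extraction.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def _write_one(dest_dir, name, data):
    """Create one file under dest_dir, write it, then fsync() and close()."""
    path = os.path.join(dest_dir, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            view = view[os.write(fd, view):]
        os.fsync(fd)
    finally:
        os.close(fd)


def extract_parallel(members, dest_dir, workers=4):
    """Write (name, data) pairs concurrently; return [(name, error), ...].

    The serial reader stays in the calling thread; only the create, write,
    fsync and close steps, which wait on the server, run in the pool.
    Note that this sketch queues every member in memory before waiting.
    """
    failures = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = [(name, pool.submit(_write_one, dest_dir, name, data))
                   for name, data in members]
        for name, future in pending:
            err = future.exception()       # blocks until that file is done
            if err is not None:
                failures.append((name, err))
    return failures
```

Keeping several requests outstanding is what hides the per-file latency; the cost is that a failure is only seen after later files have already been started, so the caller has to decide afterwards how to exit.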
Re: [zfs-discuss] Best way to carve up 8 disks
On Thu, Oct 12, 2006 at 08:52:34AM -0500, Al Hopper wrote:
> On Thu, 12 Oct 2006, Brian Hechinger wrote:
>
> > Ok, previous threads have led me to believe that I want to make raidz
> > vdevs [0] either 3, 5 or 9 disks in size [1]. Let's say I have 8 disks.
> > Do I want to create a zfs pool with a 5-disk vdev and a 3-disk vdev?
>
> Personally I think that 5 disks for raidz is the "sweet spot". With the
> remaining 3 disks, consider a 3-way mirror if redundancy is more important
> than space. This will give you two pools with different operational
> characteristics.

I'll take space over redundancy in this case. :)

> > Are there performance issues with mixing differently sized raidz vdevs
> > in a pool? If there *is* a performance hit to mix like that, would it
>
> Can you tell us what size your disks are?

They will be 400GB or 500GB disks.

> > be greater or lesser than building an 8-disk vdev?
>
> In this situation a little experimenting will probably answer all your
> questions. ZFS is so quick at, well, at everything, that its fun and
> productive to experiment with different disk configurations.

As soon as I get the disks in, I will certainly have to give them a run
for their money. ;)

> > [1] - For RAID-Z. How does RAID-Z2 affect this?
>
> For raidz2, the corresponding #s would be 4, 6 and 10 (not recommended)

That's what I guessed, but wanted to make sure.

Thanks!!!

-brian
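
The vdev sizes quoted in this thread (3, 5 and 9 for raidz; 4, 6 and 10 for raidz2) are consistent with a power-of-two number of data disks plus the parity disks. The sketch below reproduces the numbers under that reading; the function name and the cut-off at eight data disks are assumptions made for illustration.

```python
def stripe_widths(parity, max_exponent=3):
    """Vdev widths with 2, 4, ... 2**max_exponent data disks plus parity."""
    return [2 ** k + parity for k in range(1, max_exponent + 1)]


print("raidz :", stripe_widths(parity=1))
print("raidz2:", stripe_widths(parity=2))
```

With one parity disk this yields 3, 5 and 9, and with two it yields 4, 6 and 10, matching the figures given in the replies above.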
Re: [zfs-discuss] Where is the ZFS configuration data stored?
On Thu, Oct 12, 2006 at 02:54:05PM +0100, Ceri Davies wrote:
> On Thu, Oct 12, 2006 at 02:06:15PM +0100, Dick Davies wrote:
> > On 12/10/06, Ceri Davies <[EMAIL PROTECTED]> wrote:
> > >On Wed, Oct 11, 2006 at 11:49:48PM -0700, Matthew Ahrens wrote:
> >
> > >> FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot
> > >> up. Everything else (mountpoints, filesystems, etc) is stored in the
> > >> pool itself.
> > >
> > >What happens if the file does not exist? Are the devices searched for
> > >metadata?
> >
> > My understanding (I'll be delighted if I'm wrong) is that you would be
> > stuffed.
> >
> > I'd expect:
> >
> > zpool import -f
> >
> > (see the manpage)
> > to probe /dev/dsk/ and rebuild the zpool.cache file,
> > but my understanding is that this a) doesn't work yet or b) does
> > horrible things to your chances of surviving a reboot [0].
>
> So how do I import a pool created on a different host for the first
> time?

Never mind, Mark just answered that.

Ceri
-- 
That must be wonderful! I don't understand it at all.
                                                  -- Moliere
Re: [zfs-discuss] Where is the ZFS configuration data stored?
On Thu, Oct 12, 2006 at 07:53:37AM -0600, Mark Maybee wrote:
> Ceri Davies wrote:
> >On Wed, Oct 11, 2006 at 11:49:48PM -0700, Matthew Ahrens wrote:
> >
> >>James McPherson wrote:
> >>
> >>>On 10/12/06, Steve Goldberg <[EMAIL PROTECTED]> wrote:
> >>>
> >>>>Where is the ZFS configuration (zpools, mountpoints, filesystems,
> >>>>etc) data stored within Solaris? Is there something akin to vfstab
> >>>>or perhaps a database?
> >>>
> >>>Have a look at the contents of /etc/zfs for an in-filesystem artefact
> >>>of zfs. Apart from that, the information required is stored on the
> >>>disk itself.
> >>>
> >>>There is really good documentation on ZFS at the ZFS community
> >>>pages found via http://www.opensolaris.org/os/community/zfs.
> >>
> >>FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot
> >>up. Everything else (mountpoints, filesystems, etc) is stored in the
> >>pool itself.
> >
> >What happens if the file does not exist? Are the devices searched for
> >metadata?
> >
> >Ceri
>
> If the file does not exist then ZFS will not attempt to open any
> pools at boot. You must issue an explicit 'zpool import' command to
> probe the available devices for metadata to re-discover your pools.

OK, that's fine then.

Ceri
-- 
That must be wonderful! I don't understand it at all.
                                                  -- Moliere
Re: [zfs-discuss] Where is the ZFS configuration data stored?
On Thu, Oct 12, 2006 at 02:06:15PM +0100, Dick Davies wrote:
> On 12/10/06, Ceri Davies <[EMAIL PROTECTED]> wrote:
> > On Wed, Oct 11, 2006 at 11:49:48PM -0700, Matthew Ahrens wrote:
> > > FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot up. Everything else (mountpoints, filesystems, etc) is stored in the pool itself.
> >
> > What happens if the file does not exist? Are the devices searched for metadata?
>
> My understanding (I'll be delighted if I'm wrong) is that you would be stuffed.
>
> I'd expect:
>
>   zpool import -f
>
> (see the manpage) to probe /dev/dsk/ and rebuild the zpool.cache file, but my understanding is that this a) doesn't work yet or b) does horrible things to your chances of surviving a reboot [0].

So how do I import a pool created on a different host for the first time?

Ceri
--
That must be wonderful! I don't understand it at all.
    -- Moliere
Re: [zfs-discuss] Where is the ZFS configuration data stored?
Ceri Davies wrote:
> On Wed, Oct 11, 2006 at 11:49:48PM -0700, Matthew Ahrens wrote:
> > James McPherson wrote:
> > > On 10/12/06, Steve Goldberg <[EMAIL PROTECTED]> wrote:
> > > > Where is the ZFS configuration (zpools, mountpoints, filesystems, etc) data stored within Solaris? Is there something akin to vfstab or perhaps a database?
> > >
> > > Have a look at the contents of /etc/zfs for an in-filesystem artefact of zfs. Apart from that, the information required is stored on the disk itself.
> > >
> > > There is really good documentation on ZFS at the ZFS community pages found via http://www.opensolaris.org/os/community/zfs.
> >
> > FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot up. Everything else (mountpoints, filesystems, etc) is stored in the pool itself.
>
> What happens if the file does not exist? Are the devices searched for metadata?
>
> Ceri

If the file does not exist then ZFS will not attempt to open any pools at boot. You must issue an explicit 'zpool import' command to probe the available devices for metadata to re-discover your pools.

-Mark
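The rule Mark describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `pools_opened_at_boot` and `import_command` and the pool name `tank` are hypothetical, not part of ZFS. The code only builds a `zpool import` command line and never runs it.

```python
import os

CACHE_FILE = "/etc/zfs/zpool.cache"  # path named in this thread


def pools_opened_at_boot(cache_file=CACHE_FILE):
    """Return True if the cache file exists, i.e. ZFS has a list of pools to open at boot."""
    return os.path.exists(cache_file)


def import_command(pool=None, force=False):
    """Build (but do not run) a 'zpool import' command line.

    With no pool name, 'zpool import' scans the devices and lists the
    pools it finds; with a name it imports that pool. '-f' forces the
    import of a pool that still looks in use by another host.
    """
    cmd = ["zpool", "import"]
    if force:
        cmd.append("-f")
    if pool is not None:
        cmd.append(pool)
    return cmd


if not pools_opened_at_boot():
    # No cache file: nothing is opened automatically, so an explicit import is needed.
    print("run:", " ".join(import_command()))
```

For a pool last used on another host (Ceri's follow-up question), the same hypothetical helper would give `import_command("tank", force=True)`, that is `zpool import -f tank`.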
Re: [zfs-discuss] Best way to carve up 8 disks
On Thu, 12 Oct 2006, Brian Hechinger wrote:

> Ok, previous threads have led me to believe that I want to make raidz vdevs [0] either 3, 5 or 9 disks in size [1]. Let's say I have 8 disks. Do I want to create a zfs pool with a 5-disk vdev and a 3-disk vdev?

Personally I think that 5 disks for raidz is the "sweet spot". With the remaining 3 disks, consider a 3-way mirror if redundancy is more important than space. This will give you two pools with different operational characteristics.

> Are there performance issues with mixing differently sized raidz vdevs in a pool? If there *is* a performance hit to mix like that, would it

Can you tell us what size your disks are?

> be greater or lesser than building an 8-disk vdev?

In this situation a little experimenting will probably answer all your questions. ZFS is so quick at, well, at everything, that it's fun and productive to experiment with different disk configurations.

> -brian
>
> [0] - Just for clarity, what are the "sub-pools" in a pool, the actual raidz/mirror/etc "containers" called. What is the correct term to refer to them? I don't want any extra confusion here. ;)
>
> [1] - For RAID-Z. How does RAID-Z2 affect this?

For raidz2, the corresponding #s would be 4, 6 and 10 (not recommended).

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
OpenSolaris Governing Board (OGB) Member - Feb 2006
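The widths quoted in this exchange (3, 5, 9 for raidz and 4, 6, 10 for raidz2) are consistent with a power-of-two number of data disks plus the parity disks. The Python sketch below reproduces those numbers and compares the raw usable space of a few 8-disk layouts. The 500 GB disk size is an assumption (Brian did not state his disk size), the function names are illustrative, and the arithmetic ignores metadata overhead.

```python
PARITY = {"raidz": 1, "raidz2": 2}


def recommended_widths(parity, max_data_disks=8):
    """Vdev widths with a power-of-two number of data disks plus parity disks."""
    widths = []
    data = 2
    while data <= max_data_disks:
        widths.append(data + parity)
        data *= 2
    return widths


def usable_disks(kind, width):
    """Disks' worth of usable space in one vdev (metadata overhead ignored)."""
    if kind == "mirror":
        return 1  # an n-way mirror stores one disk's worth of data
    return width - PARITY[kind]


def layout_capacity(vdevs, disk_gb):
    """Total usable GB for a list of (kind, width) vdevs."""
    return sum(usable_disks(kind, width) for kind, width in vdevs) * disk_gb


DISK_GB = 500  # assumed disk size, not given in the thread

print(recommended_widths(1))  # [3, 5, 9]
print(recommended_widths(2))  # [4, 6, 10]
print(layout_capacity([("raidz", 5), ("raidz", 3)], DISK_GB))   # 3000
print(layout_capacity([("raidz", 8)], DISK_GB))                 # 3500
print(layout_capacity([("raidz", 5), ("mirror", 3)], DISK_GB))  # 2500 (Al's suggestion, summed over both pools)
```

Capacity is only one axis. The performance question Brian asks is not answered by this arithmetic and, as Al says, is best settled by experiment.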
[zfs-discuss] [Moving to solaris x86 Interest] Re: ZFS Inexpensive SATA Whitebox
Dick Davies wrote:
> On 11/10/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > Dick Davies wrote:
> > > On 11/10/06, Peter van Gemert <[EMAIL PROTECTED]> wrote:
> > > > You might want to check the HCL at http://www.sun.com/bigadmin/hcl to find out which hardware is supported by Solaris 10.
> > >
> > > I tried that myself - there really isn't very much on there. I can't believe Solaris runs on so little hardware (well, I know most of my kit isn't on there), so I assume it isn't updated that much...
> >
> > There are tools around that can tell you if hardware is supported by Solaris. One such tool can be found at:
> > http://www.sun.com/bigadmin/hcl/hcts/install_check.html
>
> That doesn't help with buying hardware though - I'm quite happy to buy hardware specifically for an OS (like I've always done for my BSD boxes and linux) but it's annoying to be forced to do trial and error.
>
> > There is a process for submitting input back to Sun on driver testing
>
> I thought so (had that experience trying to get a variant of iprb added to device_aliases) and I can understand why, but an overly conservative HCL just feeds the 'Solaris supports hardly any hardware' argument against adoption.

Is there any chance that we can get a better maintained list of hardware devices supported under Nevada (OpenSolaris) on the OpenSolaris.org site somewhere? And have it actually updated quickly after developers putback driver support into NV?

Given that Nevada has a much greater reach for drivers, the standard Solaris 10 HCL page on BigAdmin is pretty far behind.

At least, a listing of the various chipsets supported (not necessarily specific motherboards/add-in cards, but at least the base chips) would be really nice for everyone.

-Erik
Re: [zfs-discuss] Where is the ZFS configuration data stored?
Dick Davies wrote:
> On 12/10/06, Michael Schuster <[EMAIL PROTECTED]> wrote:
> > James C. McPherson wrote:
> > > Dick Davies wrote:
> > > > On 12/10/06, Matthew Ahrens <[EMAIL PROTECTED]> wrote:
> > > > > FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot up. Everything else (mountpoints, filesystems, etc) is stored in the pool itself.
> > > >
> > > > Does anyone know of any plans or strategies to remove this dependency?
> > >
> > > What do you suggest in its place?
> >
> > and why? what's your objection to the current scheme?
>
> Just the hassle of having to create a cache file in boot archives etc.

Why is that a hassle? bootadm update-archive does that for you.

--
Darren J Moffat
Re: [zfs-discuss] Where is the ZFS configuration data stored?
On 12/10/06, Ceri Davies <[EMAIL PROTECTED]> wrote:
> On Wed, Oct 11, 2006 at 11:49:48PM -0700, Matthew Ahrens wrote:
> > FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot up. Everything else (mountpoints, filesystems, etc) is stored in the pool itself.
>
> What happens if the file does not exist? Are the devices searched for metadata?

My understanding (I'll be delighted if I'm wrong) is that you would be stuffed.

I'd expect:

  zpool import -f

(see the manpage) to probe /dev/dsk/ and rebuild the zpool.cache file, but my understanding is that this a) doesn't work yet or b) does horrible things to your chances of surviving a reboot [0].

This means that for zfs root and failsafe boots, you need to have a zpool.cache in your boot/miniroot archive (I probably have the terminology wrong) otherwise the boot will fail.

I was asking if it was going to be replaced because it would really simplify ZFS root.

Dick.

[0] going from: http://solaristhings.blogspot.com/2006/06/zfs-root-on-solaris-part-3.html

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
Re: [zfs-discuss] Re: Where is the ZFS configuration data stored?
On Thu, Oct 12, 2006 at 05:46:24PM +1000, Nathan Kroenert wrote:
>
> A few of the RAID controllers I have played with have an option to 'rebuild' a raid set, which I get the impression (though have never tried) allows you to essentially tell the controller there is a raid set there, and if you set it up the same way as before, it will just work.

Experience has taught me that if this works for you, you are probably one of the 10 luckiest people in the world. ;)

Especially on the PERC garbage. Yeah, don't even bother. ;)

-brian
[zfs-discuss] Best way to carve up 8 disks
Ok, previous threads have led me to believe that I want to make raidz vdevs [0] either 3, 5 or 9 disks in size [1]. Let's say I have 8 disks. Do I want to create a zfs pool with a 5-disk vdev and a 3-disk vdev?

Are there performance issues with mixing differently sized raidz vdevs in a pool? If there *is* a performance hit to mix like that, would it be greater or lesser than building an 8-disk vdev?

-brian

[0] - Just for clarity, what are the "sub-pools" in a pool, the actual raidz/mirror/etc "containers" called. What is the correct term to refer to them? I don't want any extra confusion here. ;)

[1] - For RAID-Z. How does RAID-Z2 affect this?
Re: [zfs-discuss] Where is the ZFS configuration data stored?
On Wed, Oct 11, 2006 at 11:49:48PM -0700, Matthew Ahrens wrote:
> James McPherson wrote:
> > On 10/12/06, Steve Goldberg <[EMAIL PROTECTED]> wrote:
> > > Where is the ZFS configuration (zpools, mountpoints, filesystems, etc) data stored within Solaris? Is there something akin to vfstab or perhaps a database?
> >
> > Have a look at the contents of /etc/zfs for an in-filesystem artefact of zfs. Apart from that, the information required is stored on the disk itself.
> >
> > There is really good documentation on ZFS at the ZFS community pages found via http://www.opensolaris.org/os/community/zfs.
>
> FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot up. Everything else (mountpoints, filesystems, etc) is stored in the pool itself.

What happens if the file does not exist? Are the devices searched for metadata?

Ceri
--
That must be wonderful! I don't understand it at all.
    -- Moliere
Re: [zfs-discuss] Re: ZFS Inexpensive SATA Whitebox
Yeah, I looked at the tool. Unfortunately it doesn't help at all with choosing what to buy.

On 10/12/06, Dick Davies <[EMAIL PROTECTED]> wrote:
> On 11/10/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > Dick Davies wrote:
> > > On 11/10/06, Peter van Gemert <[EMAIL PROTECTED]> wrote:
> > > > You might want to check the HCL at http://www.sun.com/bigadmin/hcl to find out which hardware is supported by Solaris 10.
> > >
> > > I tried that myself - there really isn't very much on there. I can't believe Solaris runs on so little hardware (well, I know most of my kit isn't on there), so I assume it isn't updated that much...
> >
> > There are tools around that can tell you if hardware is supported by Solaris. One such tool can be found at:
> > http://www.sun.com/bigadmin/hcl/hcts/install_check.html
>
> That doesn't help with buying hardware though - I'm quite happy to buy hardware specifically for an OS (like I've always done for my BSD boxes and linux) but it's annoying to be forced to do trial and error.
>
> > There is a process for submitting input back to Sun on driver testing
>
> I thought so (had that experience trying to get a variant of iprb added to device_aliases) and I can understand why, but an overly conservative HCL just feeds the 'Solaris supports hardly any hardware' argument against adoption.
>
> --
> Rasputin :: Jack of All Trades - Master of Nuns
> http://number9.hellooperator.net/
Re: [zfs-discuss] Where is the ZFS configuration data stored?
On 12/10/06, Michael Schuster <[EMAIL PROTECTED]> wrote:
> James C. McPherson wrote:
> > Dick Davies wrote:
> > > On 12/10/06, Matthew Ahrens <[EMAIL PROTECTED]> wrote:
> > > > FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot up. Everything else (mountpoints, filesystems, etc) is stored in the pool itself.
> > >
> > > Does anyone know of any plans or strategies to remove this dependency?
> >
> > What do you suggest in its place?
>
> and why? what's your objection to the current scheme?

Just the hassle of having to create a cache file in boot archives etc.

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
Re: [zfs-discuss] Where is the ZFS configuration data stored?
James C. McPherson wrote:
> Dick Davies wrote:
> > On 12/10/06, Matthew Ahrens <[EMAIL PROTECTED]> wrote:
> > > FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot up. Everything else (mountpoints, filesystems, etc) is stored in the pool itself.
> >
> > Does anyone know of any plans or strategies to remove this dependency?
>
> What do you suggest in its place?

and why? what's your objection to the current scheme?

Michael
--
Michael Schuster  +49 89 46008-2974 / x62974
visit the online support center: http://www.sun.com/osc/
Recursion, n.: see 'Recursion'
Re: [zfs-discuss] Where is the ZFS configuration data stored?
James C. McPherson wrote:
> Dick Davies wrote:
> > On 12/10/06, Matthew Ahrens <[EMAIL PROTECTED]> wrote:
> > > FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot up. Everything else (mountpoints, filesystems, etc) is stored in the pool itself.
> >
> > Does anyone know of any plans or strategies to remove this dependency?
>
> What do you suggest in its place?

Or better yet, exactly what is the problem with having the cache?

--
Darren J Moffat
Re: [zfs-discuss] Where is the ZFS configuration data stored?
Dick Davies wrote:
> On 12/10/06, Matthew Ahrens <[EMAIL PROTECTED]> wrote:
> > FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot up. Everything else (mountpoints, filesystems, etc) is stored in the pool itself.
>
> Does anyone know of any plans or strategies to remove this dependency?

What do you suggest in its place?

James C. McPherson
--
Solaris kernel software engineer, system admin and troubleshooter
http://www.jmcp.homeunix.com/blog
Find me on LinkedIn @ http://www.linkedin.com/in/jamescmcpherson
Re: [nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
"Frank Batschulat (Home)" <[EMAIL PROTECTED]> wrote: > On Tue, 10 Oct 2006 01:25:36 +0200, Roch <[EMAIL PROTECTED]> wrote: > > > You tell me ? We have 2 issues > > > > can we make 'tar x' over direct attach, safe (fsync) > > and posix compliant while staying close to current > > performance characteristics ? In other words do we > > have the posix leeway to extract files in parallel ? > > why fsync(3C) ? it is usually more heavy weight then > opening the file with O_SYNC - and both provide > POSIX synchronized file integrity completion. I believe that I did run tests that show that fsync is better. Jörg -- EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin [EMAIL PROTECTED](uni) [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
Spencer Shepler <[EMAIL PROTECTED]> wrote:

> The close-to-open behavior of NFS clients is what ensures that the file data is on stable storage when close() returns.

In the 1980s this was definitely not the case. When did this change?

> The meta-data requirements of NFS are what ensure that file creation, removal, renames, etc. are on stable storage when the server sends a response.
>
> So, unless the NFS server is behaving badly, the NFS client has a synchronous behavior and for some that means more "safe" but usually means that it is also slower than local access.

In any case, calling fsync before close does not seem to be a problem.

> > You tell me ? We have 2 issues
> >
> > can we make 'tar x' over direct attach, safe (fsync) and posix compliant while staying close to current performance characteristics ? In other words do we have the posix leeway to extract files in parallel ?
> >
> > For NFS, can we make 'tar x' fast and reliable while keeping a principle of least surprise for users on this non-posix FS.
>
> Having tar create/write/close files concurrently would be a big win over NFS mounts on almost any system.

Do you have an idea on how to do this?

Jörg

--
EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
      [EMAIL PROTECTED] (uni)
      [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
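One possible shape for the concurrent create/write/close idea raised above is a small pool of worker threads, each of which writes one file and calls fsync before close. The Python sketch below is only an illustration of that idea; it is not how star, Sun tar or GNU tar work, and the names `write_member` and `extract_concurrently` as well as the in-memory `(name, bytes)` member list are assumptions made for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def write_member(path, data):
    """Create one file, write all of its data, and fsync before close."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may be a short write
            view = view[written:]
        os.fsync(fd)  # a write error surfaces here or in os.write
    finally:
        os.close(fd)
    return path


def extract_concurrently(members, dest_dir, workers=8):
    """Write (name, bytes) members into dest_dir using a pool of threads.

    Returns the written paths in input order; result() re-raises any
    OSError from a worker, so a failed write is not silently lost.
    """
    os.makedirs(dest_dir, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(write_member, os.path.join(dest_dir, name), data)
            for name, data in members
        ]
        return [future.result() for future in futures]
```

The design choice here is that several files can be waiting on their synchronous I/O at the same time, so the per-file latency overlaps instead of adding up. Ordering between files is not preserved, which matters for archives where a later member depends on an earlier one (directories, hard links); a real extractor would have to handle those cases separately.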
Re: [zfs-discuss] Re: NFS Performance and Tar
Roch <[EMAIL PROTECTED]> wrote:

> > Neither Sun tar nor GNU tar call fsync which is the only way to enforce data integrity over NFS.
>
> I tend to agree with this although I'd say that in practice, from performance perspective, calling fsync should be more relevant for direct attach. For NFS, doesn't close-to-open and other aspect of the protocol need to enforce much more synchronous operations ? For tar x over nfs I'd bet the fsync will be an over-the-wire ops (say 0.5ms) but will not add an additional I/O latency (5ms) on each file extract.

I never tested the performance aspects over NFS, as in my experience calling fsync is the only way to guarantee detection of file write problems.

> My target for a single threaded 'tar x' of small files is to be able to run over NFS at 1 file per I/O latency, no matter what the backend FS is. I guess that 'star -yes-fsync' over direct attach should behave the same ? Or do you have concurrency in there...see below.

No, star has not changed in the last 15 years:

- One process (the second) reads/writes the archive. In copy mode, this process extracts the internal stream.

- One process (the first) does the file I/O and archive generation.

> > If you like to test this, use star. Star by default calls fsync before it closes a written file in x mode. To switch this off, use star -no-fsync.
> >
> > > Net Net, for single threaded 'tar x', data integrity consideration forces NFS to provide a high quality slow service. For direct attach, we don't have those data integrity issues, and the community has managed to get by the lower quality higher speed service.
> >
> > What do you have in mind?
> >
> > A tar that calls fsync in detached threads?
>
> You tell me ? We have 2 issues
>
> can we make 'tar x' over direct attach, safe (fsync) and posix compliant while staying close to current performance characteristics ? In other words do we have the posix leeway to extract files in parallel ?

How do you believe fsync is related to POSIX?

When I introduced fsync calls 7 years ago, I ran some performance tests, and it seems that on UFS, calling fsync reduces the extract performance by 10-20%, which looks OK to me.

> For NFS, can we make 'tar x' fast and reliable while keeping a principle of least surprise for users on this non-posix FS.

Someone should start with star -x vs. star -x -no-fsync tests and report timings...

Jörg

--
EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
      [EMAIL PROTECTED] (uni)
      [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
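As a rough stand-in for the star -x vs. star -x -no-fsync comparison suggested above, the following Python sketch times the creation of many small files with and without an fsync before close. It does not use star at all; the function names, file count and file size are assumptions chosen for illustration, and the numbers it produces depend entirely on the filesystem behind the temporary directory.

```python
import os
import tempfile
import time


def extract_small_files(dest_dir, count, size, use_fsync):
    """Write `count` files of `size` bytes; optionally fsync each before close.

    Returns the elapsed wall-clock time in seconds.
    """
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(dest_dir, "file%05d" % i)
        with open(path, "wb") as f:
            f.write(payload)
            if use_fsync:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to push it to stable storage
    return time.perf_counter() - start


def compare(count=200, size=4096):
    """Return {use_fsync: seconds} for one run without and one run with fsync."""
    results = {}
    for use_fsync in (False, True):
        with tempfile.TemporaryDirectory() as dest_dir:
            results[use_fsync] = extract_small_files(dest_dir, count, size, use_fsync)
    return results
```

Calling `compare()` and printing the two timings gives a first impression. To measure a particular filesystem (UFS, ZFS or an NFS mount), point the temporary directory at it, for example through the `TMPDIR` environment variable, since a memory-backed /tmp makes fsync nearly free and hides the effect being discussed.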
[zfs-discuss] zfs/raid configuration question for an application
Hi all,

I am going to run Solaris 06/06, a database (PostgreSQL), and an application server (Tomcat or Sun App. Server) on x86 with 2GB of RAM and at most 3 SCSI disks (hardware RAID possible). There will be no e-mail or file serving. The ratio of write to read operations on the database is >=2 (and there is no other significant disk I/O). The PostgreSQL block size is the default (I think it is 8K). Our application performance is bound by database write (and then read) operations, so db write/read performance is important. I need redundancy across the whole system, not only for the db data but also for the OS itself. Capacity efficiency is not that important (I can use full mirroring).

Although I am not a sysadmin, I know a little about RAID levels and ZFS, and I would like your comments on disk configuration. Do you recommend mirroring, or striping plus some form of redundancy? Specifically, RAID 1 or RAID 5? I am not going to use ZFS as a root partition (since it is not officially possible in Solaris yet), so I need some UFS partition. So, do you recommend hardware RAID or ZFS? If you recommend ZFS, how should I configure it (I mean which disk/slice as ZFS, and which disk/slice as UFS), given that I only have 3 disks?

Thank you in advance.

Mete
Re: [zfs-discuss] Where is the ZFS configuration data stored?
On 12/10/06, Matthew Ahrens <[EMAIL PROTECTED]> wrote:
> FYI, /etc/zfs/zpool.cache just tells us what pools to open when you boot up. Everything else (mountpoints, filesystems, etc) is stored in the pool itself.

Does anyone know of any plans or strategies to remove this dependency?

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
Re: [zfs-discuss] Re: ZFS Inexpensive SATA Whitebox
On 11/10/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Dick Davies wrote:
> > On 11/10/06, Peter van Gemert <[EMAIL PROTECTED]> wrote:
> > > You might want to check the HCL at http://www.sun.com/bigadmin/hcl to find out which hardware is supported by Solaris 10.
> >
> > I tried that myself - there really isn't very much on there. I can't believe Solaris runs on so little hardware (well, I know most of my kit isn't on there), so I assume it isn't updated that much...
>
> There are tools around that can tell you if hardware is supported by Solaris. One such tool can be found at:
> http://www.sun.com/bigadmin/hcl/hcts/install_check.html

That doesn't help with buying hardware though - I'm quite happy to buy hardware specifically for an OS (like I've always done for my BSD boxes and linux) but it's annoying to be forced to do trial and error.

> There is a process for submitting input back to Sun on driver testing

I thought so (had that experience trying to get a variant of iprb added to device_aliases) and I can understand why, but an overly conservative HCL just feeds the 'Solaris supports hardly any hardware' argument against adoption.

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
Re: [zfs-discuss] Re: ZFS Inexpensive SATA Whitebox
On 10/11/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> So are there any pci-e SATA cards that are supported ? I was hoping to go with a sempron64. Using old-pci seems like a waste.

I recently built an AM2 Sempron64 based ZFS box.

motherboard: ASUS M2NPV-MX
cpu: AMD AM2 Sempron64 2800+

The motherboard has 2 IDE ports and 4 SATA ports provided by the nVidia MCP51. The IDE and SATA ports work in compatibility mode. The onboard nge ethernet works. The motherboard has built-in GeForce based video but I haven't tested this.

I'm using 2 IDE disks for mirrored boot/root and a 4-disk raidz on the SATA ports. It has been running snv_47 for the last couple of weeks with no problems.

Paul
Re: [zfs-discuss] Re: ZFS Inexpensive SATA Whitebox
On Wed, Oct 11, 2006 at 06:36:28PM -0500, David Dyer-Bennet wrote:
> On 10/11/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > > The more I learn about Solaris hardware support, the more I see it as a minefield.
> >
> > I've found this to be true for almost all open source platforms where you're trying to use something that hasn't been explicitly used and tested by the developers.
>
> I've been running Linux since kernel 0.99pl13, I think it was, and have had amazingly little trouble. Whereas I'm now sitting on $2k of hardware that won't do what I wanted it to do under Solaris, so it's a bit of a hot-button issue for me right now. I've never had to consider Linux issues in selecting hardware (in fact I haven't selected hardware, my linux boxes have all been castoffs originally purchased to run Windows)

Perhaps that's true of most Linux development machines too :)

Ceri
--
That must be wonderful! I don't understand it at all.
    -- Moliere
Re: [zfs-discuss] Re: Where is the ZFS configuration data stored?
I'll take a crack at this.

First off, I'm assuming that the RAID you are talking about is provided by the hardware and not by ZFS. If that's the case, then it will depend on the way you created the raid set, the BIOS of the controller, and whether or not these two things match up with any other systems.

A few of the RAID controllers I have played with have an option to 'rebuild' a raid set, which I get the impression (though I have never tried it) allows you to essentially tell the controller there is a raid set there, and if you set it up the same way as before, it will just work.

Personally, unless I was moving the disks to another system with the same RAID controller and BIOS, I would have no expectation it would work. It might, but I would not be surprised (or disappointed) if it did not.

If you are talking about using ZFS's raid, then you won't need to do anything. It should just work, as ZFS will be able to just import the zpool.

I hope I understood your question. (And I hope I'm telling no lies... ;)

Nathan.

Sergey wrote:
> + a little addition to the original question: Imagine that you have a RAID attached to a Solaris server. There's ZFS on the RAID. And someday you lose your server completely (fried motherboard, physical crash, ...). Is there any way to connect the RAID to another server and restore the ZFS layout (without losing all the data on the RAID)?
Re: [zfs-discuss] Re: system hangs on POST after giving zfs a drive
On 10/12/06, John Sonnenschein <[EMAIL PROTECTED]> wrote:
> well, it's an SiS 960 board, and it appears my only option to turn off probing of the drives is to enable RAID mode (which makes them inaccessible by the OS)

I think the option is in the standard CMOS setup section, and allows you to set the disk geometry, translation, etc. There should be options for each disk, something like: auto detect/manual/not present. Hopefully your BIOS has a similar setting.

> what would be my next (cheapest) option, a proper SATA add-in card? I've heard good things about the silicon image 3132 based cards, but I'm not sure if they'll still leave my BIOS in the same position if i run the drives in ATA mode

The best supported card is the Supermicro AOC-SAT2-MV8. Drivers are also present for the SiI 3132/3124 based cards in the SATA framework, but they haven't been updated in a while, and don't support NCQ yet. Either way, unless you are using a recent nevada build, any controller will only run in compatibility mode.

Chris
Re: [zfs-discuss] Re: ZFS Inexpensive SATA Whitebox
Hi Darren, The Solaris Operating System for x86 Installation Check Tool 1.1 is designed to report whether Solaris drivers are available for the devices the tool detects on a x86 system and determine quickly whether your system is likely to be able to install the Solaris OS. It is not designed to make sure that the driver fully follows a certain specification or it is bug free. The Solaris Operating System for x86 Installation Check Tool 1.1 is based on Solaris 10 Update 2 (06/06) kernel. The supported driver list also generated from s10u2. In s10u2, the MCP55 build-in NIC was not supported, so the tool reports it doesn't support. It's possible that nv_44 can detect that card, but snv is not officially released, so this Installation Check Tool won't support it. I'd like to take this chance to introduce Hardware Certification Test Suite. The Hardware Certification Test Suite (HCTS) is the application and set of test suites that you can use to test your system or component to verify that it is compatible with the Solaris OS on x86 platforms. HCTS testing enables you to certify server, desktop, and laptop systems and to certify many different types of controllers. All hardware that passes certification testing is eligible to be included in the Solaris OS Hardware Compatibility List (HCL) as a certified system or component. Please note HCTS certifies hardware, but not drivers. If you are interest in this suite, go to http://www.sun.com/bigadmin/hcl/hcts/index.html and have a try. 
Best regards, Ni, Zhiqi Original Message Subject: Re: [zfs-discuss] Re: ZFS Inexpensive SATA Whitebox Date: Wed, 11 Oct 2006 16:24:50 -0700 From: [EMAIL PROTECTED] To: David Dyer-Bennet <[EMAIL PROTECTED]> CC: zfs-discuss@opensolaris.org References: <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> David Dyer-Bennet wrote: > On 10/11/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > >> There are tools around that can tell you if hardware is supported by >> Solaris. >> One such tool can be found at: >> http://www.sun.com/bigadmin/hcl/hcts/install_check.html > > > Beware of this tool. It reports "Y" for both 32-bit and 64-bit on the > nVidia MCP55 SATA controller -- but in the real world, it's supported > only in compatibility mode, and (fatal flaw for me) *it doesn't > support hot-swap with this controller*. So apparently even a clean > result from this utility isn't a safe indication that the device is > fully supported. > > Also, it says that the nVidia MCP55 ethernet is NOT supported in > either 32 or 64 bit, but actually nv_44 found the ethernet without any > trouble. Maybe that's just that the support was extended recently; > the install tool is based on S10 6/06. Driver support for Solaris Nevada is not the same as Solaris 10 Update 2, so it is not surprising to see these discrepencies. In some cases, getting Solaris to support a piece of hardware is as simple as running the update_drv command to tell it about a new PCI id (these change often and are central to driver support on all x86 platforms.) > The more I learn about Solaris hardware support, the more I see it as > a minefield. I've found this to be true for almost all open source platforms where you're trying to use something that hasn't been explicitly used and tested by the developers. 
Darren
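Darren's remark about using update_drv to tell Solaris about a new PCI ID can be illustrated with a small sketch. It only builds and prints the command line; it does not run anything. The driver name (nge) and the vendor/device pair (0x10de, 0x0373) are assumptions chosen for illustration and are not taken from the thread; check the real IDs on your own hardware before using anything like this.

```python
import shlex


def solaris_pci_alias(vendor_id, device_id):
    """Format a PCI vendor/device pair as a Solaris driver alias.

    Solaris writes these aliases as lowercase hex without leading
    zeros, for example pci10de,373.
    """
    return f"pci{vendor_id:x},{device_id:x}"


def update_drv_add_alias(driver, vendor_id, device_id):
    """Build the argument list for adding one alias to an existing driver.

    The alias is passed to -i wrapped in literal double quotes.
    """
    alias = solaris_pci_alias(vendor_id, device_id)
    return ["update_drv", "-a", "-i", f'"{alias}"', driver]


if __name__ == "__main__":
    # Hypothetical example: driver name and IDs are assumptions.
    argv = update_drv_add_alias("nge", 0x10DE, 0x0373)
    # Print the command as it would be typed in a shell.
    print(shlex.join(argv))  # update_drv -a -i '"pci10de,373"' nge
```

Binding an alias this way only tells the system which driver to try for the device; it says nothing about whether that driver actually works with the hardware, which is the gap discussed above.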
[zfs-discuss] Re: Where is the ZFS configuration data stored?
+ a small addition to the original question: Imagine that you have a RAID array attached to a Solaris server, with ZFS on the RAID. One day you lose the server completely (fried motherboard, physical crash, ...). Is there any way to connect the RAID to another server and restore the ZFS layout (without losing all the data on the RAID)? This message posted from opensolaris.org
Re: [zfs-discuss] ZFS Inexpensive SATA Whitebox
Al Hopper wrote: > On Wed, 11 Oct 2006, Dana H. Myers wrote: >> Al Hopper wrote: >>> Memory: DDR-400 - your choice but Kingston is always a safe bet. 2*512MB sticks for a starter, cost effective, system. 4*512MB for a good long term solution. >> Due to fan-out considerations, every BIOS I've seen will run DDR400 memory at 333MHz when connected to more than 1 DIMM-per-channel (I believe at AMD's urging). > Really!? That's surprising. Is there a way to verify that on an Ultra20 running Solaris 06/06? I can't remember which ones (I think it was the dual socket 940 Tyan boards) listed different memory limits for DDR400 and DDR333 RAM. Ian
[zfs-discuss] Re: system hangs on POST after giving zfs a drive
Well, it's an SiS 960 board, and it appears my only option for turning off probing of the drives is to enable RAID mode (which makes them inaccessible to the OS). What would be my next (cheapest) option, a proper SATA add-in card? I've heard good things about the Silicon Image 3132 based cards, but I'm not sure whether they'll still leave my BIOS in the same position if I run the drives in ATA mode. This message posted from opensolaris.org