Re: [zfs-discuss] ZFS vq_max_pending value ?
>>>> 4-disk raidz group issues 128k/3=42.6k I/O to each individual data
>>>> disk. If 35 concurrent 128k I/Os are enough to saturate a disk (vdev),
>>>> then 35*3=105 concurrent 42k I/Os will be required to saturate the
>>>> same disk.
>>>
>>> ZFS doesn't know anything about disk saturation. It will send
>>> up to vq_max_pending I/O requests per vdev (usually a vdev is a
>>> disk). It will try to keep vq_max_pending I/O requests queued to
>>> the vdev.
>>
>> I can see the "avg pending I/Os" hitting my vq_max_pending limit, so
>> raising the limit would be a good thing. I think it's due to the many
>> 42k read I/Os to each individual disk in the 4-disk raidz group.
>
> You're dealing with a queue here. iostat's average pending I/Os
> represents the queue depth. Some devices can't handle a large queue.
> In any case, queuing theory applies.
>
> Note that for reads, the disk will likely have a track cache, so it is
> not a good assumption that a read I/O will require a media access.

My workload issues around 5000 MB of read I/O, and iopattern says around 55% of the I/O is random in nature. I don't know how much prefetching through the track cache is going to help here. Probably I can try disabling the vdev_cache through "set 'zfs_vdev_cache_max' 1".

Thanks
Manoj Nayak

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
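For reference, the /etc/system form of the tunable mentioned above would look something like the following. This is a sketch only; the tunable name and its effect are build-dependent assumptions, so verify the symbol exists in your kernel before relying on it:

```shell
# /etc/system fragment (sketch). Shrinking zfs_vdev_cache_max to 1 byte
# means no read is small enough to be inflated and cached by the vdev
# cache, which effectively disables it.
set zfs:zfs_vdev_cache_max = 1
```

The same value can usually be poked on a live system with something like `echo 'zfs_vdev_cache_max/W 1' | mdb -kw` (again an assumption; check the symbol with `::nm` or the source first).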
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Albert Chin wrote:
> On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
>> Anyone know the answer to this? I'll be ordering 2 of the 7K's for
>> my x346's this week. If neither A nor B will work I'm not sure
>> there's any advantage to using the 7k card considering I want ZFS to
>> do the mirroring.
>
> Why even bother with a H/W RAID array when you won't use the H/W RAID?
> Better to find a decent SAS/FC JBOD with cache. Would definitely be
> cheaper.

I've never heard of such a thing. Do you have any links (cheap or not)? Do they exist for less than $350? That's what the 7k will run me. Do they include an enclosure for at least 6 disks? The 7k will use the 6 U320 hot-swap bays already in my IBM x346 chassis.

I'm not being sarcastic; if something better exists, even for a little more, I'm interested. I'd especially love to switch to SATA, as I'm about to pay about $550 each for 300GB U320 drives, and with SATA I could go bigger, or save money, or both. :)

-Kyle
Re: [zfs-discuss] ZFS vq_max_pending value ?
manoj nayak wrote:
> - Original Message - From: "Richard Elling" <[EMAIL PROTECTED]>
> To: "manoj nayak" <[EMAIL PROTECTED]> Cc: Sent: Wednesday, January 23,
> 2008 7:20 AM Subject: Re: [zfs-discuss] ZFS vq_max_pending value ?
>
>> manoj nayak wrote:
>>>> Manoj Nayak wrote:
>>>>> Hi All.
>>>>>
>>>>> ZFS document says ZFS schedules its I/O in such a way that it
>>>>> manages to saturate a single disk's bandwidth using enough
>>>>> concurrent 128K I/O. The number of concurrent I/Os is decided by
>>>>> vq_max_pending. The default value for vq_max_pending is 35.
>>>>>
>>>>> We have created a 4-disk raid-z group inside a ZFS pool on Thumper.
>>>>> The ZFS record size is set to 128k. When we read/write a 128K
>>>>> record, it issues a 128K/3 I/O to each of the 3 data disks in the
>>>>> 4-disk raid-z group.
>>>>
>>>> Yes, this is how it works for a read without errors. For a write,
>>>> you should see 4 writes, each 128KBytes/3. Writes may also be
>>>> coalesced, so you may see larger physical writes.
>>>>
>>>>> We need to saturate all three data disks' bandwidth in the raid-z
>>>>> group. Is it required to set the vq_max_pending value to 35*3=105?
>>>>
>>>> No. vq_max_pending applies to each vdev.
>>>
>>> 4-disk raidz group issues 128k/3=42.6k I/O to each individual data
>>> disk. If 35 concurrent 128k I/Os are enough to saturate a disk (vdev),
>>> then 35*3=105 concurrent 42k I/Os will be required to saturate the
>>> same disk.
>>
>> ZFS doesn't know anything about disk saturation. It will send
>> up to vq_max_pending I/O requests per vdev (usually a vdev is a
>> disk). It will try to keep vq_max_pending I/O requests queued to
>> the vdev.
>
> I can see the "avg pending I/Os" hitting my vq_max_pending limit, so
> raising the limit would be a good thing. I think it's due to the many
> 42k read I/Os to each individual disk in the 4-disk raidz group.

You're dealing with a queue here. iostat's average pending I/Os represents the queue depth. Some devices can't handle a large queue. In any case, queuing theory applies.
Note that for reads, the disk will likely have a track cache, so it is not a good assumption that a read I/O will require a media access. -- richard
[zfs-discuss] ZFS volume Block Size & Record Size
Hi All, How is the ZFS volblocksize related to the ZFS recordsize? Thanks Manoj Nayak
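A quick illustration of the distinction (a sketch; the pool and dataset names are made up): recordsize applies to filesystems and can be changed at any time, while volblocksize applies to zvols and is fixed at creation time:

```shell
# recordsize: per-filesystem upper bound on file block size; tunable at
# any time, but only affects newly written blocks.
zfs set recordsize=128k pool01/fs        # hypothetical filesystem

# volblocksize: the fixed block size of an emulated volume (zvol); it
# can only be set at creation time, via -b or -o volblocksize=.
zfs create -V 10g -b 8k pool01/vol       # hypothetical zvol
zfs get volblocksize pool01/vol
```

So they play the same role (the unit ZFS checksums and copies on write), but for the two different dataset types.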
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Albert Chin wrote:
> On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
>> Anyone know the answer to this? I'll be ordering 2 of the 7K's for
>> my x346's this week. If neither A nor B will work I'm not sure
>> there's any advantage to using the 7k card considering I want ZFS to
>> do the mirroring.
>
> Why even bother with a H/W RAID array when you won't use the H/W RAID?
> Better to find a decent SAS/FC JBOD with cache. Would definitely be
> cheaper.

Please name some candidate cards matching that description - I don't know of any.

-- Carson
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
> Anyone know the answer to this? I'll be ordering 2 of the 7K's for
> my x346's this week. If neither A nor B will work I'm not sure
> there's any advantage to using the 7k card considering I want ZFS to
> do the mirroring.

Why even bother with a H/W RAID array when you won't use the H/W RAID? Better to find a decent SAS/FC JBOD with cache. Would definitely be cheaper.

-- albert chin ([EMAIL PROTECTED])
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Kyle McDonald wrote:
> Now I just need to determine if a) the cache is used by the card even
> when using the disks on it as JBOD, or b) if the card will allow me to
> make 5 or 6 RAID 0 LUNs with only 1 disk in each, to simulate (a) and
> activate the write cache.

I found docs at IBM that make me think that (b) at least will work. The 7k can use up to 30 disks, and can allow the host to see as many as 8 LUNs. IBM's description of RAID 0 is that it requires a minimum of 1 drive, so I can't see why I can't create 5 or 6 single-drive RAID 0 LUNs to use the 7k's 256MB cache with ZFS.

Next question is: with a single-drive RAID 0 LUN, will the card's stripe unit size be a factor? And if so, how do I set the card's stripe unit size? I know ZFS likes to write to the disk in 128K chunks. Is that 128K to each vdev? Or 128K/n to each vdev? This card allows stripe unit sizes of 8k, 16k, 32k, and 64k. I'm guessing that if ZFS will be sending 128K to each vdev at once nearly all the time, I should use 64k. If it's 128K across the 5 vdevs, then 16k or 32k might be better? In either case, is there an advantage to tuning ZFS's size down to match the card's?

> If this all does work, it should speed up all the writes to the disk,
> including the ZIL writes. Is there still an advantage to investigating a
> Solid State Disk, or Flash Drive device to relocate the ZIL to?

I'm still going to investigate this further. I think I'll try to calculate the max ZIL size once I'm up and running, and see if I can't get a cheap USB flash drive of a decent size to test this with.

-Kyle
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Carson Gaspar wrote:
> Kyle McDonald wrote:
> ...
>> I know, but for that card you need a driver to make it appear as a
>> device. Plus it would take a PCI slot.
>> I was hoping to make use of the battery-backed RAM on a RAID card that I
>> already have (but can't use since I want to let ZFS do the redundancy.)
>> If I had a card with battery-backed RAM, how would I go about testing
>> the commit semantics to see if it is only obeying ZFS commits when the
>> battery is bad?
>
> Any _sane_ controller that supports battery backed cache will disable
> its write cache if its battery goes bad. It should also log this. I'd
> check the docs or contact your vendor's tech support to verify the card
> you have is sane, and if it reports the error to its monitoring tools so
> you find out about it quickly.

You're right. I forgot that. Not only would the commits need to happen right away, but the cache should be disabled completely. Now that you mention it, I know from experience that for the ServeRAID 7k/8k controllers, the cache is disabled if/when the battery fails. Good point.

Now I just need to determine if a) the cache is used by the card even when using the disks on it as JBOD, or b) if the card will allow me to make 5 or 6 RAID 0 LUNs with only 1 disk in each, to simulate (a) and activate the write cache.

Anyone know the answer to this? I'll be ordering 2 of the 7K's for my x346's this week. If neither A nor B will work I'm not sure there's any advantage to using the 7k card considering I want ZFS to do the mirroring.

If this all does work, it should speed up all the writes to the disk, including the ZIL writes. Is there still an advantage to investigating a Solid State Disk, or Flash Drive device to relocate the ZIL to?

> Now you'll probably _still_ need to disable the ZFS cache flushes, which
> is a global option, so you'd need to make sure that _all_ your ZFS
> devices had battery backed write caches or no write caches at all.

I guess this is a better solution than chasing down firmware authors to get them to ignore flush requests. It's just too bad it's not settable on a pool-by-pool basis rather than server by server. It won't affect me, though; this will be the only pool on this machine.

-Kyle
Re: [zfs-discuss] ZFS vq_max_pending value ?
- Original Message - From: "Richard Elling" <[EMAIL PROTECTED]> To: "manoj nayak" <[EMAIL PROTECTED]> Cc: Sent: Wednesday, January 23, 2008 7:20 AM Subject: Re: [zfs-discuss] ZFS vq_max_pending value ?

> manoj nayak wrote:
>>> Manoj Nayak wrote:
>>>> Hi All.
>>>>
>>>> ZFS document says ZFS schedules its I/O in such a way that it manages
>>>> to saturate a single disk's bandwidth using enough concurrent 128K I/O.
>>>> The number of concurrent I/Os is decided by vq_max_pending. The default
>>>> value for vq_max_pending is 35.
>>>>
>>>> We have created a 4-disk raid-z group inside a ZFS pool on Thumper.
>>>> The ZFS record size is set to 128k. When we read/write a 128K record,
>>>> it issues a 128K/3 I/O to each of the 3 data disks in the 4-disk
>>>> raid-z group.
>>>
>>> Yes, this is how it works for a read without errors. For a write, you
>>> should see 4 writes, each 128KBytes/3. Writes may also be
>>> coalesced, so you may see larger physical writes.
>>>
>>>> We need to saturate all three data disks' bandwidth in the raid-z
>>>> group. Is it required to set the vq_max_pending value to 35*3=105?
>>>
>>> No. vq_max_pending applies to each vdev.
>>
>> 4-disk raidz group issues 128k/3=42.6k I/O to each individual data disk.
>> If 35 concurrent 128k I/Os are enough to saturate a disk (vdev),
>> then 35*3=105 concurrent 42k I/Os will be required to saturate the same
>> disk.
>
> ZFS doesn't know anything about disk saturation. It will send
> up to vq_max_pending I/O requests per vdev (usually a vdev is a
> disk). It will try to keep vq_max_pending I/O requests queued to
> the vdev.

I can see the "avg pending I/Os" hitting my vq_max_pending limit, so raising the limit would be a good thing. I think it's due to the many 42k read I/Os to each individual disk in the 4-disk raidz group.

Thanks
Manoj Nayak

> For writes, you should see them become coalesced, so rather than
> sending 3 42.6kByte write requests to a vdev, you might see one
> 128kByte write request.
>
> In other words, ZFS has an I/O scheduler which is responsible
> for sending I/O requests to vdevs.
> -- richard
Re: [zfs-discuss] Sparc zfs root/boot status ?
On Jan 22, 2008, at 18:24, Lori Alt wrote:
> ZFS boot supported by the installation software, plus
> support for having swap and dump be zvols within
> the root pool (i.e., no longer requiring a separate
> swap/dump slice), plus various other features, such
> as support for failsafe-archive booting.

Will there be any support for tying into patching / Live Upgrade with the ZFS boot putback, or is that a separate project?

Thanks for any info.
Re: [zfs-discuss] ZFS vq_max_pending value ?
manoj nayak wrote:
>> Manoj Nayak wrote:
>>> Hi All.
>>>
>>> ZFS document says ZFS schedules its I/O in such a way that it manages
>>> to saturate a single disk's bandwidth using enough concurrent 128K I/O.
>>> The number of concurrent I/Os is decided by vq_max_pending. The default
>>> value for vq_max_pending is 35.
>>>
>>> We have created a 4-disk raid-z group inside a ZFS pool on Thumper.
>>> The ZFS record size is set to 128k. When we read/write a 128K record,
>>> it issues a 128K/3 I/O to each of the 3 data disks in the 4-disk
>>> raid-z group.
>>
>> Yes, this is how it works for a read without errors. For a write, you
>> should see 4 writes, each 128KBytes/3. Writes may also be
>> coalesced, so you may see larger physical writes.
>>
>>> We need to saturate all three data disks' bandwidth in the raid-z
>>> group. Is it required to set the vq_max_pending value to 35*3=105?
>>
>> No. vq_max_pending applies to each vdev.
>
> 4-disk raidz group issues 128k/3=42.6k I/O to each individual data
> disk. If 35 concurrent 128k I/Os are enough to saturate a disk (vdev),
> then 35*3=105 concurrent 42k I/Os will be required to saturate the same
> disk.

ZFS doesn't know anything about disk saturation. It will send up to vq_max_pending I/O requests per vdev (usually a vdev is a disk). It will try to keep vq_max_pending I/O requests queued to the vdev.

For writes, you should see them become coalesced, so rather than sending 3 42.6kByte write requests to a vdev, you might see one 128kByte write request.

In other words, ZFS has an I/O scheduler which is responsible for sending I/O requests to vdevs.

-- richard
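The queue-depth point in the reply above can be made concrete with Little's Law (L = lambda * W). A back-of-the-envelope sketch; the numbers here (35 outstanding I/Os, 10 ms average service time) are illustrative assumptions, not measured values:

```shell
queue_depth=35     # vq_max_pending: I/Os ZFS tries to keep queued per vdev
service_ms=10      # assumed average disk service time, in milliseconds

# If the queue stays full, sustainable throughput is lambda = L / W.
iops=$((queue_depth * 1000 / service_ms))
echo "$iops"       # prints 3500 (I/Os per second per vdev)
```

Past the depth the device can actually service concurrently, extra queue entries add latency rather than throughput, which is why some devices "can't handle" a large queue.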
Re: [zfs-discuss] ZFS vdev_cache
On Jan 22, 2008, at 5:39 PM, manoj nayak wrote:
>> Manoj Nayak writes:
>>> Hi All,
>>>
>>> Is any dtrace script available to figure out the vdev_cache (or
>>> software track buffer) reads in kilobytes?
>>>
>>> The document says the default size of the read is 128k. However, the
>>> vdev_cache source code implementation says the default size is 64k.
>>>
>>> Thanks
>>> Manoj Nayak
>>
>> Which document? It's 64K when it applies.
>> Nevada won't use the vdev_cache for data blocks anymore.
>
> How is readahead or the software track buffer going to be used in Nevada
> without the vdev_cache? Any pointer to documents regarding that?

The vdev cache is still used - just for metadata. See:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/vdev_cache.c#35
http://blogs.sun.com/erickustarz/entry/vdev_cache_improvements_to_help

eric
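On the original dtrace question, one way to watch the vdev cache from userland is an fbt one-liner. This is only a sketch: fbt probe names are not a stable interface, and the assumption that vdev_cache_read() is exported and returns 0 on a cache hit should be verified against the source linked above:

```shell
# Count vdev_cache_read() calls, bucketed by return value
# (assumption: 0 = satisfied from the cache, non-zero = pass-through).
dtrace -n 'fbt::vdev_cache_read:return { @[arg1] = count(); }'
```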
Re: [zfs-discuss] ZFS vdev_cache
> Manoj Nayak writes:
>> Hi All,
>>
>> Is any dtrace script available to figure out the vdev_cache (or
>> software track buffer) reads in kilobytes?
>>
>> The document says the default size of the read is 128k. However, the
>> vdev_cache source code implementation says the default size is 64k.
>>
>> Thanks
>> Manoj Nayak
>
> Which document? It's 64K when it applies.
> Nevada won't use the vdev_cache for data blocks anymore.

How is readahead or the software track buffer going to be used in Nevada without the vdev_cache? Any pointer to documents regarding that?

Thanks
Manoj Nayak

> -r
Re: [zfs-discuss] ZFS vq_max_pending value ?
> Manoj Nayak wrote:
>> Hi All.
>>
>> ZFS document says ZFS schedules its I/O in such a way that it manages to
>> saturate a single disk's bandwidth using enough concurrent 128K I/O.
>> The number of concurrent I/Os is decided by vq_max_pending. The default
>> value for vq_max_pending is 35.
>>
>> We have created a 4-disk raid-z group inside a ZFS pool on Thumper. The
>> ZFS record size is set to 128k. When we read/write a 128K record, it
>> issues a 128K/3 I/O to each of the 3 data disks in the 4-disk raid-z group.
>
> Yes, this is how it works for a read without errors. For a write, you
> should see 4 writes, each 128KBytes/3. Writes may also be
> coalesced, so you may see larger physical writes.
>
>> We need to saturate all three data disks' bandwidth in the raid-z group.
>> Is it required to set the vq_max_pending value to 35*3=105?
>
> No. vq_max_pending applies to each vdev.

4-disk raidz group issues 128k/3=42.6k I/O to each individual data disk. If 35 concurrent 128k I/Os are enough to saturate a disk (vdev), then 35*3=105 concurrent 42k I/Os will be required to saturate the same disk.

Thanks
Manoj Nayak

> Use iostat to see what
> the device load is. For the commonly used Hitachi 500 GByte disks
> in a thumper, the read media bandwidth is 31-64.8 MBytes/s. Writes
> will be about 80% of reads, or 24.8-51.8 MBytes/s. In a thumper,
> the disk bandwidth will be the limiting factor for the hardware.
> -- richard
Re: [zfs-discuss] Sparc zfs root/boot status ?
andrewk9 wrote:
>> zfs boot on sparc will not be putback on its own.
>> It will be putback with the rest of zfs boot support,
>> sometime around build 86.
>
> Since we already have ZFS boot on x86, what else will be added in addition to
> ZFS boot for SPARC?

ZFS boot supported by the installation software, plus support for having swap and dump be zvols within the root pool (i.e., no longer requiring a separate swap/dump slice), plus various other features, such as support for failsafe-archive booting.

Lori
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Kyle McDonald wrote:
...
> I know, but for that card you need a driver to make it appear as a
> device. Plus it would take a PCI slot.
> I was hoping to make use of the battery-backed RAM on a RAID card that I
> already have (but can't use since I want to let ZFS do the redundancy.)
> If I had a card with battery-backed RAM, how would I go about testing
> the commit semantics to see if it is only obeying ZFS commits when the
> battery is bad?

Any _sane_ controller that supports battery-backed cache will disable its write cache if its battery goes bad. It should also log this. I'd check the docs or contact your vendor's tech support to verify the card you have is sane, and that it reports the error to its monitoring tools so you find out about it quickly.

Now you'll probably _still_ need to disable the ZFS cache flushes, which is a global option, so you'd need to make sure that _all_ your ZFS devices had battery-backed write caches or no write caches at all.

-- Carson
[zfs-discuss] zpool attach problem
On a V240 running s10u4 (no additional patches), I had a pool which looked like this:

> # zpool status
>   pool: pool01
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME                               STATE  READ WRITE CKSUM
>         pool01                             ONLINE    0     0     0
>           mirror                           ONLINE    0     0     0
>             c8t600C0FF0082668310F838000d0  ONLINE    0     0     0
>             c8t600C0FF007E4BE4C38F4ED00d0  ONLINE    0     0     0
>           mirror                           ONLINE    0     0     0
>             c8t600C0FF008266812A0877700d0  ONLINE    0     0     0
>             c8t600C0FF007E4BE2BEDBC9600d0  ONLINE    0     0     0
>
> errors: No known data errors

Since this system is not in production yet, I wanted to do a little disk juggling as follows:

> # zpool detach pool01 c8t600C0FF007E4BE4C38F4ED00d0
> # zpool detach pool01 c8t600C0FF007E4BE2BEDBC9600d0

New pool status:

> # zpool status
>   pool: pool01
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME                               STATE  READ WRITE CKSUM
>         pool01                             ONLINE    0     0     0
>           c8t600C0FF0082668310F838000d0    ONLINE    0     0     0
>           c8t600C0FF008266812A0877700d0    ONLINE    0     0     0
>
> errors: No known data errors

Finally, I wanted to re-establish mirrors, but am seeing the following errors:

> # zpool attach pool01 c8t600C0FF008266812A0877700d0 c8t600C0FF007E4BE4C38F4ED00d0
> cannot attach c8t600C0FF007E4BE4C38F4ED00d0 to c8t600C0FF008266812A0877700d0: device is too small
> # zpool attach pool01 c8t600C0FF0082668310F838000d0 c8t600C0FF007E4BE2BEDBC9600d0
> cannot attach c8t600C0FF007E4BE2BEDBC9600d0 to c8t600C0FF0082668310F838000d0: device is too small

Is this expected behavior? The 'zpool' man page says:

    If device is not currently part of a mirrored configuration, device
    automatically transforms into a two-way mirror of device and new_device.

But this isn't what I'm seeing . . . did I do something wrong? Here's the format output for the disks:

    4. c8t600C0FF007E4BE2BEDBC9600d0  /scsi_vhci/[EMAIL PROTECTED]
    5. c8t600C0FF007E4BE4C38F4ED00d0  /scsi_vhci/[EMAIL PROTECTED]
    6. c8t600C0FF008266812A0877700d0  /scsi_vhci/[EMAIL PROTECTED]
    7. c8t600C0FF0082668310F838000d0  /scsi_vhci/[EMAIL PROTECTED]

Rob
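One way to confirm a size mismatch like the one above before attaching is to compare the accessible size of each device directly. A sketch using prtvtoc (the device names come from the listing above; the new device must be at least as large as the one it will mirror):

```shell
# Print the accessible size, in sectors, of each candidate device.
# prtvtoc reports an "accessible sectors" line; compare the values.
for d in c8t600C0FF008266812A0877700d0 c8t600C0FF007E4BE4C38F4ED00d0; do
    echo "$d:"
    prtvtoc /dev/rdsk/${d}s2 | grep -i 'accessible'
done
```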
Re: [zfs-discuss] Sparc zfs root/boot status ?
> zfs boot on sparc will not be putback on its own.
> It will be putback with the rest of zfs boot support,
> sometime around build 86.

Since we already have ZFS boot on x86, what else will be added in addition to ZFS boot for SPARC?

Thanks Andrew.
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Albert Chin wrote:
> On Tue, Jan 22, 2008 at 12:47:37PM -0500, Kyle McDonald wrote:
>> My primary use case is NFS-based storage to a farm of software build
>> servers, and developer desktops.
>
> For the above environment, you'll probably see a noticeable improvement
> with a battery-backed NVRAM-based ZIL. Unfortunately, no inexpensive
> cards exist for the common consumer (with ECC memory anyways). If you
> convince http://www.micromemory.com/ to sell you one, let us know :)

I know, but for that card you need a driver to make it appear as a device. Plus it would take a PCI slot. I was hoping to make use of the battery-backed RAM on a RAID card that I already have (but can't use since I want to let ZFS do the redundancy.) If I had a card with battery-backed RAM, how would I go about testing the commit semantics to see if it is only obeying ZFS commits when the battery is bad?

Does anyone know if the IBM ServeRAID 7k or 8k do this correctly? If not, any chance of getting IBM to 'fix' the firmware? In the Solaris RedBooks I've read, they seem to think highly of ZFS.

Back on the subject of NVRAM for ZIL devices: what are people using for ZIL devices on the budget-limited side of things? I've found some SATA flash drives, and a bunch that are IDE. Unfortunately the HW I'd like to stick this in is a little older... It's got a U320 SCSI controller in it. Has anyone found a good U320 flash disk that's not overkill size-wise, and not outrageously expensive? Google found what appear to be a few OEM vendors, but no resellers in the qty I'd be interested in.

Anyone using a USB flash drive? Is USB fast enough to gain any benefits?

-Kyle

> Set "set zfs:zil_disable = 1" in /etc/system to gauge the type of
> improvement you can expect. Don't use this in production though.
Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
On Tue, Jan 22, 2008 at 12:47:37PM -0500, Kyle McDonald wrote:
> My primary use case is NFS-based storage to a farm of software build
> servers, and developer desktops.

For the above environment, you'll probably see a noticeable improvement with a battery-backed NVRAM-based ZIL. Unfortunately, no inexpensive cards exist for the common consumer (with ECC memory anyways). If you convince http://www.micromemory.com/ to sell you one, let us know :)

Set "set zfs:zil_disable = 1" in /etc/system to gauge the type of improvement you can expect. Don't use this in production though.

-- albert chin ([EMAIL PROTECTED])
Re: [zfs-discuss] ZFS vq_max_pending value ?
Manoj Nayak wrote:
> Hi All.
>
> ZFS document says ZFS schedules its I/O in such a way that it manages to
> saturate a single disk's bandwidth using enough concurrent 128K I/O.
> The number of concurrent I/Os is decided by vq_max_pending. The default
> value for vq_max_pending is 35.
>
> We have created a 4-disk raid-z group inside a ZFS pool on Thumper. The
> ZFS record size is set to 128k. When we read/write a 128K record, it
> issues a 128K/3 I/O to each of the 3 data disks in the 4-disk raid-z group.

Yes, this is how it works for a read without errors. For a write, you should see 4 writes, each 128KBytes/3. Writes may also be coalesced, so you may see larger physical writes.

> We need to saturate all three data disks' bandwidth in the raid-z group.
> Is it required to set the vq_max_pending value to 35*3=105?

No. vq_max_pending applies to each vdev. Use iostat to see what the device load is. For the commonly used Hitachi 500 GByte disks in a thumper, the read media bandwidth is 31-64.8 MBytes/s. Writes will be about 80% of reads, or 24.8-51.8 MBytes/s. In a thumper, the disk bandwidth will be the limiting factor for the hardware.

-- richard
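The 128K/3 split discussed above works out as follows (simple arithmetic, shown as a sketch):

```shell
recordsize=131072                     # a 128K record, in bytes
data_disks=3                          # 4-disk raidz = 3 data + 1 parity
echo $((recordsize / data_disks))     # prints 43690, i.e. ~42.6K per data disk
```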
[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?
Are there, or does it make any sense to try to find, RAID cards with battery backup that will ignore the ZFS commit commands when the battery is able to guarantee stable storage?

I don't know if they do this, but I've recently had good non-ZFS performance with the IBM ServeRAID 8k RAID that was in an xSeries server I was using. The 8k has 256MB of battery-backed cache. The server it was in only had 6 drive bays, and I'm not looking to have it do RAID5 for ZFS, but I just had the idea: "Hey, I wonder if I could set up the card with 5 (single drive) RAID 0 LUNs, and gain the advantage of the 256MB battery-backed cache, when I tell ZFS to do RAIDZ across them?"

I know battery-backed cache and the proper commit semantics are generally found only on higher-end RAID controllers and arrays (right?) But I'm wondering now if I couldn't get an 8-port SATA controller that would let me map each single drive as a RAID 0 LUN and use its cache to boost performance.

My primary use case is NFS-based storage to a farm of software build servers, and developer desktops.

Anyone searched for this already? Anyone found any reasons why it wouldn't work already?

-Kyle
Re: [zfs-discuss] Sparc zfs root/boot status ?
zfs boot on sparc will not be putback on its own. It will be putback with the rest of zfs boot support, sometime around build 86.

Lori

Mauro Mozzarelli wrote:
> Back in October/November 2007 when I asked about Sparc zfs boot and root
> capabilities, I got a reply indicating late December 2007 for a possible
> release.
>
> I was wondering what is the status right now, will this feature make it into
> build 79?
Re: [zfs-discuss] Sparc zfs root/boot status ?
Mauro Mozzarelli wrote:
> Back in October/November 2007 when I asked about Sparc zfs boot and root
> capabilities, I got a reply indicating late December 2007 for a possible
> release.
>
> I was wondering what is the status right now, will this feature make it into
> build 79?

No, build 79 has long since closed and SPARC ZFS boot isn't in it.

-- Darren J Moffat
Re: [zfs-discuss] Swap on ZVOL safe to use?
Lori Alt wrote:
> The bug is being actively worked at this time (it just got a boost
> in urgency as a result of the issues it was causing for the
> zfs boot project). It is likely that there will be a fix soon
> (sooner than zfs boot will be available). In the
> meantime, I know of no workaround. Maybe someone
> else does.

Is the fix to make it safe to swap on a ZVOL, or is it the introduction of the raw (non-COW) volumes mentioned previously?

-- Darren J Moffat
Re: [zfs-discuss] ZFS vdev_cache
Manoj Nayak writes:
> Hi All,
>
> Is any dtrace script available to figure out the vdev_cache (or
> software track buffer) reads in kilobytes?
>
> The document says the default size of the read is 128k; however, the
> vdev_cache source code says the default size is 64k.
>
> Thanks
> Manoj Nayak

Which document? It's 64K when it applies. Nevada won't use the vdev_cache for data blocks anymore.

-r
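To make the 64K figure concrete, here is a rough Python model of how a vdev-cache-style read inflation works: small reads are rounded out to one aligned 64K block, on the theory that the disk's track cache makes the extra bytes nearly free. The 64K block size matches the number discussed in this thread; the 16K small-read threshold, the function name, and the exact rounding policy are illustrative assumptions, not the actual ZFS implementation.

```python
# Illustrative model of vdev_cache read inflation (NOT the actual ZFS code).
VDEV_CACHE_BS = 64 * 1024    # aligned block size discussed in the thread (64K)
VDEV_CACHE_MAX = 16 * 1024   # assumed small-read threshold (hypothetical value)

def inflate_read(offset, size, bs=VDEV_CACHE_BS, max_small=VDEV_CACHE_MAX):
    """Return the (offset, size) actually issued to the disk."""
    if size >= max_small:
        return offset, size                       # large reads bypass the cache
    start = (offset // bs) * bs                   # round start down to a boundary
    end = ((offset + size + bs - 1) // bs) * bs   # round end up to a boundary
    return start, end - start

# A 4K read at offset 70K becomes one aligned 64K read.
print(inflate_read(70 * 1024, 4 * 1024))   # (65536, 65536)
```

Under this model, a dtrace script counting physical reads would see 64K I/Os for small logical reads, which may explain the 64k-vs-128k confusion in the question.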
[zfs-discuss] Updated ZFS Automatic Snapshot Service - version 0.10.
Hi all,

I've got a slightly updated version of the ZFS Automatic Snapshot SMF Service on my blog. This version contains a few bugfixes (many thanks to Reid Spencer and Breandan Dezendorf!) as well as a small new feature: by default we now avoid taking snapshots for any datasets on a pool that's currently being scrubbed or resilvered, to avoid running into 6343667.

More at: http://blogs.sun.com/timf/entry/zfs_automatic_snapshots_0_10

Is this service something that we'd like to put into OpenSolaris, or are there plans for something similar that achieves the same goal (and perhaps integrates more neatly with the rest of ZFS)? Otherwise, should I start filling in an ARC one-pager template, or is this sort of utility better left to sysadmins to implement themselves, rather than baking it into the OS?

cheers,
tim

--
Tim Foster, Sun Microsystems Inc, Solaris Engineering Ops
http://blogs.sun.com/timf
[zfs-discuss] ZFS vq_max_pending value ?
Hi All.

The ZFS documentation says ZFS schedules its I/O in such a way that it can saturate a single disk's bandwidth with enough concurrent 128K I/Os. The number of concurrent I/Os is governed by vq_max_pending; the default value is 35.

We have created a 4-disk raid-z group inside a ZFS pool on a Thumper, with the ZFS record size set to 128K. When we read/write a 128K record, ZFS issues a 128K/3 (~43K) I/O to each of the 3 data disks in the 4-disk raid-z group.

We need to saturate the bandwidth of all three data disks in the raid-z group. Is it required to set vq_max_pending to 35*3=105?

Thanks
Manoj Nayak
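The arithmetic behind this question can be sketched in a few lines. The 128K record size, the 4-disk single-parity raidz layout, and the vq_max_pending default of 35 are all from the post; whether queue depth should actually be scaled this way is exactly what the follow-ups in this thread debate.

```python
# Back-of-the-envelope arithmetic from the post: a 128K record on a 4-disk
# raidz group is split across the 3 data disks (the 4th disk's worth is parity).
RECORD_SIZE_KB = 128          # ZFS recordsize used in the post
RAIDZ_DISKS = 4               # total disks in the raidz group
DATA_DISKS = RAIDZ_DISKS - 1  # single-parity raidz leaves 3 data disks

# Size of the I/O each data disk sees for one full record.
per_disk_io_kb = RECORD_SIZE_KB / DATA_DISKS
print(f"per-disk I/O: {per_disk_io_kb:.1f}K")   # ~42.7K

# If 35 concurrent 128K I/Os saturate a disk, the post reasons that three
# times as many ~43K I/Os would be needed to saturate the same disk.
VQ_MAX_PENDING = 35
scaled_queue = VQ_MAX_PENDING * DATA_DISKS
print(f"scaled queue depth: {scaled_queue}")    # 105
```

Note this yields 105 (35*3), and as the replies point out, vq_max_pending only bounds the queue depth per vdev; it says nothing about whether the device can usefully service that many outstanding requests.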
[zfs-discuss] ZFS vdev_cache
Hi All,

Is any dtrace script available to figure out the vdev_cache (or software track buffer) reads in kilobytes?

The document says the default size of the read is 128k; however, the vdev_cache source code says the default size is 64k.

Thanks
Manoj Nayak
[zfs-discuss] Sparc zfs root/boot status ?
Back in October/November 2007, when I asked about Sparc zfs boot and root capabilities, I got a reply indicating late December 2007 for a possible release.

I was wondering what the status is right now: will this feature make it into build 79?
Re: [zfs-discuss] Ditto blocks in S10U4 ?
On 22 January, 2008 - [EMAIL PROTECTED] sent me these 1,6K bytes:

> bash-3.00# cat /etc/release
>                     Solaris 10 8/07 s10x_u4wos_12b X86
>        Copyright 2007 Sun Microsystems, Inc. All Rights Reserved.
>                     Use is subject to license terms.
>                        Assembled 16 August 2007
>
> (with all the latest patches)
>
> bash-3.00# zpool list
> NAME     SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
> zpool1  20.8T  5.44G  20.8T    0%  ONLINE  -
>
> bash-3.00# zpool upgrade -v
> This system is currently running ZFS version 4.
>
> The following versions are supported:
>
> VER  DESCRIPTION
> ---  ---------------------------------------
>  1   Initial ZFS version
>  2   Ditto blocks (replicated metadata)
>  3   Hot spares and double parity RAID-Z
>  4   zpool history
>
> For more information on a particular version, including supported
> releases, see:
>
> http://www.opensolaris.org/os/community/zfs/version/N
>
> Where 'N' is the version number.
>
> bash-3.00# zfs set copies=2 zpool1
> cannot set property for 'zpool1': invalid property 'copies'
>
> From http://www.opensolaris.org/os/community/zfs/version/2/
> "... This version includes support for "Ditto Blocks", or replicated
> metadata."
>
> Can anybody shed any light on it ?

The 'copies' property in zfs set is ditto blocks for data; the ditto blocks in version 2 are for metadata only.

/Tomas

--
Tomas Ögren, [EMAIL PROTECTED], http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
[zfs-discuss] Ditto blocks in S10U4 ?
bash-3.00# cat /etc/release
                    Solaris 10 8/07 s10x_u4wos_12b X86
       Copyright 2007 Sun Microsystems, Inc. All Rights Reserved.
                    Use is subject to license terms.
                       Assembled 16 August 2007

(with all the latest patches)

bash-3.00# zpool list
NAME     SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
zpool1  20.8T  5.44G  20.8T    0%  ONLINE  -

bash-3.00# zpool upgrade -v
This system is currently running ZFS version 4.

The following versions are supported:

VER  DESCRIPTION
---  ---------------------------------------
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history

For more information on a particular version, including supported
releases, see:

http://www.opensolaris.org/os/community/zfs/version/N

Where 'N' is the version number.

bash-3.00# zfs set copies=2 zpool1
cannot set property for 'zpool1': invalid property 'copies'

From http://www.opensolaris.org/os/community/zfs/version/2/
"... This version includes support for "Ditto Blocks", or replicated metadata."

Can anybody shed any light on it ?

Regards
przemol

--
http://przemol.blogspot.com/
Re: [zfs-discuss] problem with nfs share of zfs storage
Hello Francois,

Monday, January 21, 2008, 9:51:22 PM, you wrote:

FD> I have a need to stream video over nfs. video is stored on zfs. every 10
FD> minutes or so, the video will freeze, and then 1 minute later it
FD> resumes. This doesn't happen from an nfs mount on ufs. zfs server is a
FD> 32 bit P4 box with 512MB, running nexenta in plain text mode, and
FD> nothing else, really. Tried playback from different OSes and the same is
FD> happening. Network has more than 10x the capacity that is required, no
FD> compression on zfs

FD> Any idea what is going on? cpu is not pegged on server or playback
FD> client. Not sure what to look for.

Try running 'iostat -xnz 1' while you are streaming and catch the moment you experience the problem. Also try 'vmstat -p 1' at the same time and catch the same moment.

--
Best regards,
Robert    mailto:[EMAIL PROTECTED]
http://milek.blogspot.com
Re: [zfs-discuss] ATA UDMA data parity error
For the archive: I swapped the mobo and all is good now (I copied 100GB into the pool without a crash).

One problem I had was that Solaris would hang whenever booting, even when all the aoc-sat2-mv8 cards were pulled out. It turns out that switching the BIOS field "USB 2.0 Controller Mode" from "HiSpeed" to "FullSpeed" makes the difference. Any ideas why?

Thanks,
Kent