Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)
Victor, I've reproduced the crash and have vmdump.0 and dump device files. How do I query the stack on crash for your analysis? What other analysis should I provide? Thanks -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
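For anyone landing here with the same question: the usual first steps on OpenSolaris are roughly as follows (a command sketch, not Victor's reply; the dump suffix and paths will differ on your system):

```
# Expand the compressed crash dump into unix.0 / vmcore.0
savecore -vf vmdump.0

# Open the dump with the modular debugger
mdb unix.0 vmcore.0

# Inside mdb:
::status    # panic message summary
::stack     # stack of the panicking thread
::msgbuf    # recent console messages leading up to the panic
```

The `::stack` output is normally what engineering asks for first, along with `::status` and `::msgbuf`.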
Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
> -----Original Message----- > From: Erik Trimble [mailto:erik.trim...@oracle.com] > Sent: Thursday, July 01, 2010 11:45 > To: Fred Liu > Cc: Bob Friesenhahn; 'OpenSolaris ZFS discuss' > Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers > > On 6/30/2010 7:17 PM, Fred Liu wrote: > > See. Thanks. > > Does it have the hardware functionality to detect the power outage > and do force cache flush when the cache is enabled? > > Any more detailed info about the capacity (farad) of this supercap > and how long one discharge will be? > > > > Thanks. > > > > Fred > > > > > > I don't think the actual size of the supercap matters. As Bob said, > it only needs to be sized to be big enough to allow all on-board DRAM > to > be flushed out to Flash. How big that should be is easily determined by > the manufacturer, and they'd be grossly negligent if it wasn't at least > that size. Any capacity beyond that needed to do a single full flush is > excess, so I would hazard a guess that the supercap is "just big enough, > and no more". That is, just enough to power the SSD for the partial > second or so it takes to flush to flash. I don't think we need to worry > how big that actually is.

Understand and agree. It is sort of picky to ask this without the manufacturer's help ;-)

> > Answering your second question first - my reading of the info is that > it > will force a cache flush (if the cache is enabled) upon loss of power > under any circumstance. That is, it will flush the cache in both a > controlled power-down (regardless of whether the OS says to do so) and > in an immediate power loss. All this is in the SSD's firmware.

That is exactly what I expect. Speaking a bit more broadly, cache-flush behaviour is somewhat ambiguous across HDDs in general. Is it OS-controlled, firmware-controlled, or both? At least in this case I have got what I expected. But what about HDDs in general? Thanks.
Fred

> > -Erik > > > > -----Original Message----- > > From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us] > > Sent: Thursday, July 01, 2010 10:01 > > To: Fred Liu > > Cc: 'OpenSolaris ZFS discuss' > > Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers > > > > On Wed, 30 Jun 2010, Fred Liu wrote: > > > > > >> Any duration limit on the supercap? How long can it sustain the data? > >> > > A supercap on a SSD drive only needs to sustain the data until it has > > been saved (perhaps 10 milliseconds). It is different than a RAID > > array battery. > > > > Bob > > > > > -- > Erik Trimble > Java System Support > Mailstop: usca22-123 > Phone: x17195 > Santa Clara, CA > Timezone: US/Pacific (GMT-0800)
Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
On 6/30/2010 7:17 PM, Fred Liu wrote: > See. Thanks. > Does it have the hardware functionality to detect the power outage and do > force cache flush when the cache is enabled? > Any more detailed info about the capacity (farad) of this supercap and how > long one discharge will be? > > Thanks. > > Fred > > I don't think the actual size of the supercap matters. As Bob said, it only needs to be sized to be big enough to allow all on-board DRAM to be flushed out to Flash. How big that should be is easily determined by the manufacturer, and they'd be grossly negligent if it wasn't at least that size. Any capacity beyond that needed to do a single full flush is excess, so I would hazard a guess that the supercap is "just big enough, and no more". That is, just enough to power the SSD for the partial second or so it takes to flush to flash. I don't think we need to worry how big that actually is. Answering your second question first - my reading of the info is that it will force a cache flush (if the cache is enabled) upon loss of power under any circumstance. That is, it will flush the cache in both a controlled power-down (regardless of whether the OS says to do so) and in an immediate power loss. All this is in the SSD's firmware. -Erik > -----Original Message----- > From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us] > Sent: Thursday, July 01, 2010 10:01 > To: Fred Liu > Cc: 'OpenSolaris ZFS discuss' > Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers > > On Wed, 30 Jun 2010, Fred Liu wrote: > > >> Any duration limit on the supercap? How long can it sustain the data? >> > A supercap on a SSD drive only needs to sustain the data until it has > been saved (perhaps 10 milliseconds). It is different than a RAID > array battery.
> > Bob > -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800)
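Erik's "just big enough, and no more" estimate above is easy to sanity-check with a back-of-envelope calculation. Every number below is an illustrative assumption (cache size, flash write speed, power draw, voltages), not an OCZ spec:

```python
# Back-of-envelope supercap sizing for an SSD cache flush.
# All numbers are illustrative assumptions, not vendor specifications.

flush_bytes = 64 * 1024**2      # assume 64 MiB of on-board DRAM to flush
write_bw    = 100 * 1024**2     # assume 100 MiB/s sustained flash write speed
power_w     = 3.0               # assume the SSD draws 3 W while flushing
v_start, v_end = 5.0, 3.0       # assume the cap runs from 5 V down to a 3 V cutoff

flush_time = flush_bytes / write_bw      # seconds needed to empty the cache
energy_j   = power_w * flush_time        # energy the cap must supply
# Usable cap energy: E = 1/2 * C * (V1^2 - V2^2)  =>  C = 2E / (V1^2 - V2^2)
cap_farads = 2 * energy_j / (v_start**2 - v_end**2)

print(round(flush_time, 2), "s,", round(cap_farads, 2), "F")
```

With these made-up numbers the flush takes well under a second and the required capacitance is a fraction of a farad, which is consistent with the "partial second or so" in the post.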
Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?
On 6/30/2010 2:01 PM, valrh...@gmail.com wrote: Another question on SSDs in terms of performance vs. capacity. Between $150 and $200, there are at least four SSDs that would fit the rough specifications for the L2ARC on my system: 1. Crucial C300, 64 GB: $150: medium performance, medium capacity. 2. OCZ Vertex 2, 50 GB: $180: higher performance, lower capacity. (The Agility 2 is similar, but $15 cheaper) 3. Corsair Force 60 GB, $195: similar performance, slightly higher capacity (more over-provisioning with the same SandForce controller). 4. Intel X25M G2, 80 GB: $200: largest capacity, probably lowest(?) performance. So which would be the best choice for L2ARC? Is it size, or is it throughput, that really matters for this? Within this range, price doesn't make much difference. Thanks, as always, for the guidance. For L2ARC, random read performance is the primary important factor. Given what I've seen of the C300's random read performance for NON-4k block sizes, and for short queue depths (both of which are highly likely for an L2ARC device), my recommendation of the above four is #4, the Intel device. #2 is possibly the best performing for ZIL usage, should you choose to use a portion of the device for that purpose. For what you've said your usage pattern is, I think the Intel X25M is the best fit for good performance and size for the dollar. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800)
Re: [zfs-discuss] Use of blocksize (-b) during zfs zvol create, poor performance
Hi Eff, There are a significant number of variables to work through with dedup and compression enabled. So the first suggestion I have is to disable those features for now, so you're not working with too many elements at once.

With those features set aside: an NTFS cluster operation does not equal a 64k raw I/O block, and likewise the ZFS 64k blocksize does not equal one I/O operation. We may also need to consider the overall network performance behavior, the iSCSI protocol characteristics, and the Windows network stack. iperf is a good tool to rule those out.

What I primarily suspect is that write I/O operations are not aligned and are waiting for an I/O completion over multiple vdevs. Alignment is important for write I/O optimization, and how the I/O maps at the software RAID mode will make a significant impact on the DMU and SPA operations on a specific vdev layout.

You may also have an issue with write cache operations: by default, large I/O calls such as 64K will not use a ZIL cache vdev, if you have one defined, but will be written directly to your array vdevs, which will also include a transaction group write operation. To ensure ZIL log usage with 64k I/O's you can apply the following: edit the /etc/system file with

set zfs:zfs_immediate_write_sz = 131071

A reboot is required to activate the system file change.

You have also not indicated what your zpool configuration looks like; that would be helpful in the discussion. It appears that you're applying the x4500 as a backup target, which means you should (if not already) enable write caching on the COMSTAR LU properties for this type of application, e.g.

stmfadm modify-lu -p wcd=false 600144F02F2280004C1D62010001

To help triage the perf issue further you could post two 'kstat zfs' + two 'kstat stmf' outputs on a 5 min interval and a 'zpool iostat -v 30 5', which would help visualize the I/O behavior.
Regards, Mike http://blog.laspina.ca/
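To make Mike's point about the tunable concrete, here is a toy sketch (not the actual ZFS source) of the decision zfs_immediate_write_sz influences: synchronous writes larger than the threshold are logged indirectly, bypassing a separate log device, while smaller ones go to the ZIL/slog:

```python
# Toy model of the zfs_immediate_write_sz decision (not real ZFS code).
ZFS_IMMEDIATE_WRITE_SZ = 131071  # value suggested in the post: 128K - 1

def log_destination(write_size, threshold=ZFS_IMMEDIATE_WRITE_SZ):
    """Where a synchronous write's data is logged in this simplified model."""
    if write_size > threshold:
        # Indirect write: data goes straight to the pool vdevs,
        # and the log only records a pointer to it.
        return "main pool vdevs (indirect, bypasses slog)"
    return "ZIL / slog device"

# With the threshold raised to 131071, a 64K write (65536 bytes)
# stays at or below it and therefore still uses the slog.
print(log_destination(64 * 1024))
print(log_destination(256 * 1024))
```

This is why raising the tunable to 131071 keeps the 64K iSCSI writes on the log device instead of forcing them through a transaction-group write on the array vdevs.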
Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
See. Thanks. Does it have the hardware functionality to detect the power outage and do force cache flush when the cache is enabled? Any more detailed info about the capacity (farad) of this supercap and how long one discharge will be? Thanks. Fred -----Original Message----- From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us] Sent: Thursday, July 01, 2010 10:01 To: Fred Liu Cc: 'OpenSolaris ZFS discuss' Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers On Wed, 30 Jun 2010, Fred Liu wrote: > Any duration limit on the supercap? How long can it sustain the data? A supercap on a SSD drive only needs to sustain the data until it has been saved (perhaps 10 milliseconds). It is different than a RAID array battery. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
On Wed, 30 Jun 2010, Fred Liu wrote: Any duration limit on the supercap? How long can it sustain the data? A supercap on a SSD drive only needs to sustain the data until it has been saved (perhaps 10 milliseconds). It is different than a RAID array battery. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
Any duration limit on the supercap? How long can it sustain the data? Thanks. Fred -----Original Message----- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of David Magda Sent: Saturday, June 26, 2010 21:48 To: Arne Jansen Cc: 'OpenSolaris ZFS discuss' Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers On Jun 26, 2010, at 02:09, Arne Jansen wrote: > Geoff Nordli wrote: >> Is this the one >> (http://www.ocztechnology.com/products/solid-state-drives/2-5--sata-ii/maximum-performance-enterprise-solid-state-drives/ocz-vertex-2-pro-series-sata-ii-2-5--ssd-.html) with the built in supercap? > > Yes. Crickey. Who's the genius who thinks of these URLs?
Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
Ragnar Sundblad wrote: I was referring to the case where zfs has written data to the drive but still hasn't issued a cache flush, and before the cache flush the drive is reset. If zfs finally issues a cache flush and then isn't informed that the drive has been reset, data is lost. I hope this is not the case, on any SCSI-based protocol or SATA. The nasty race that occurs is if your system crashes or is powered off *after* the log has acknowledged the write, but before the bits get shoved to main pool storage. This is a data loss situation. With "log", do you mean the ZIL (with or without a slog device)? If so, that should not be an issue and is exactly what the ZIL is for - it will be replayed at the next filesystem attach and the data will be pushed to the main pool storage. Do I misunderstand you? See your case above - written, ack'd, but not cache flushed. We're talking about the same thing. -- Carson
Re: [zfs-discuss] zfs rpool corrupt?????
On 07/ 1/10 01:36 AM, Tony MacDoodle wrote: Hello, Has anyone encountered the following error message, running Solaris 10 u8 in an LDom. bash-3.00# devfsadm devfsadm: write failed for /dev/.devfsadm_dev.lock: Bad exchange descriptor Not specifically. But it is clear from what follows that ZFS has detected an error in a pool which lacks redundancy.

bash-3.00# zpool status -v rpool
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub in progress for 0h1m, 22.57% done, 0h5m to go
config:

        NAME      STATE     READ WRITE CKSUM
        rpool     DEGRADED     0     0    17
          c0d0s0  DEGRADED     0     0    34  too many errors

errors: Permanent errors have been detected in the following files:

        //dev/.devfsadm_dev.lock
        //var/svc/log/system-tsol-zones:default.log
        //var/svc/log/system-labeld:default.log
        //var/svc/log/system-filesystem-volfs:default.log

-- Ian.
Re: [zfs-discuss] Kernel Panic on zpool clean
Aha: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6794136 I think I'll try booting from a b134 Live CD and see if that will let me fix things.
Re: [zfs-discuss] b134 pool borked!
Just in case any stray searches find their way here, this is what happened to my pool: http://phrenetic.to/zfs
Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On 30 jun 2010, at 22.46, Garrett D'Amore wrote: > On Wed, 2010-06-30 at 22:28 +0200, Ragnar Sundblad wrote: > >> To be safe, the protocol needs to be able to discover that the devices >> (host or disk) have been disconnected and reconnected or have been reset >> and that either part's assumptions about the state of the other have to >> be invalidated. >> >> I don't know enough about either SAS or SATA to say if they guarantee that >> you will be notified. But if they don't, they aren't safe for cached writes. > > Generally, ZFS will only notice a removed disk when it is trying to > write to it -- or when it probes. ZFS does not necessarily get notified > on hot device removal -- certainly not immediately. That should be fine, as soon as it is informed on the next access. > (I've written some > code so that *will* notice, even if no write ever goes there... that's > the topic of another message.) > > The other thing is that disk writes are generally idempotent. So, if a > drive was removed between the time an IO was finished and the time > the response was returned to the host, it isn't a problem. When > the disk is returned, ZFS should automatically retry the I/O. (In fact, > ZFS automatically retries failed I/O operations several times before > finally "failing".) I was referring to the case where zfs has written data to the drive but still hasn't issued a cache flush, and before the cache flush the drive is reset. If zfs finally issues a cache flush and then isn't informed that the drive has been reset, data is lost. I hope this is not the case, on any SCSI-based protocol or SATA. > The nasty race that occurs is if your system crashes or is powered off > *after* the log has acknowledged the write, but before the bits get > shoved to main pool storage. This is a data loss situation. With "log", do you mean the ZIL (with or without a slog device)?
If so, that should not be an issue and is exactly what the ZIL is for - it will be replayed at the next filesystem attach and the data will be pushed to the main pool storage. Do I misunderstand you? /ragge
Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?
On Wed, 2010-06-30 at 16:41 -0500, Nicolas Williams wrote: > On Wed, Jun 30, 2010 at 01:35:31PM -0700, valrh...@gmail.com wrote: > > Finally, for my purposes, it doesn't seem like a ZIL is necessary? I'm > > the only user of the fileserver, so there probably won't be more than > > two or three computers, maximum, accessing stuff (and writing stuff) > > remotely. > > It depends on what you're doing. > > The perennial complaint about NFS is the synchronous open()/close() > operations and the fact that archivers (tar, ...) will generally unpack > archives in a single-threaded manner, which means all those synchronous > ops punctuate the archiver's performance with pauses. This is a load > type for which ZIL devices come in quite handy. If you write lots of > small files often and in single-threaded ways _and_ want to guarantee > you don't lose transactions, then you want a ZIL device. (The recent > knob for controlling whether synchronous I/O gets done asynchronously > would help you if you don't care about losing a few seconds worth of > writes, assuming that feature makes it into any release of Solaris.) Btw, that feature will be in the NexentaStor 3.0.4 release (which is currently in late development/early QA, and should be out soon.) Archivers are not the only thing that acts this way, btw. Databases, and even something as benign as compiling a large software suite can have similar implications where a fast slog device can help. - Garrett
Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?
On Wed, Jun 30, 2010 at 01:35:31PM -0700, valrh...@gmail.com wrote: > Finally, for my purposes, it doesn't seem like a ZIL is necessary? I'm > the only user of the fileserver, so there probably won't be more than > two or three computers, maximum, accessing stuff (and writing stuff) > remotely. It depends on what you're doing. The perennial complaint about NFS is the synchronous open()/close() operations and the fact that archivers (tar, ...) will generally unpack archives in a single-threaded manner, which means all those synchronous ops punctuate the archiver's performance with pauses. This is a load type for which ZIL devices come in quite handy. If you write lots of small files often and in single-threaded ways _and_ want to guarantee you don't lose transactions, then you want a ZIL device. (The recent knob for controlling whether synchronous I/O gets done asynchronously would help you if you don't care about losing a few seconds worth of writes, assuming that feature makes it into any release of Solaris.) > But, from what I can gather, by spending a little under $400, I should > substantially increase the performance of my system with dedup? Many > thanks, again, in advance. If you have deduplicatious data, yes. Nico --
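Nicolas's point about single-threaded archivers can be put in rough numbers. The sketch below is a deliberately simplified latency model; the per-operation latencies and operation counts are assumptions, not measurements:

```python
# Simplified model: a single-threaded untar over NFS pays one synchronous
# round trip per file operation, so total time scales with commit latency.
# All latencies and counts below are illustrative assumptions.

def untar_seconds(n_files, sync_ops_per_file, sync_latency_s):
    """Total wall time when every sync op must complete before the next."""
    return n_files * sync_ops_per_file * sync_latency_s

n = 10_000                               # files in the hypothetical archive
ops = 3                                  # e.g. open + write + close, all sync
hdd_zil = untar_seconds(n, ops, 0.008)   # ~8 ms commit to a spinning disk
ssd_zil = untar_seconds(n, ops, 0.0002)  # ~0.2 ms commit to a fast slog

print(hdd_zil, "s vs", ssd_zil, "s")     # roughly minutes vs seconds
```

Even with generous assumptions, the slog turns a minutes-long pause-punctuated extract into seconds, which is why these workloads are the canonical slog use case.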
[zfs-discuss] Zpool mirror fail testing - odd resilver behaviour after reconnect
I have an OpenSolaris snv_134 machine with 2 x 1.5TB drives. One is a Samsung Silencer, the other is a dreaded Western Digital Green. I'm testing the mirror for failure by simply yanking out the SATA cable while the machine is running. The system never skips a beat, which is great. But the reconnect behaviour is vastly different on the two drives.

1. Samsung reconnect. `cfgadm` reported the drive as connected but unconfigured. After running `cfgadm -c configure sata1/1`, the drive automatically came online in the zpool mirror and resilvered its differences, which completed in about 10 seconds. This is excellent.

2. WD Green reconnect. `cfgadm` reported the drive as disconnected. I had to use the '-f' option to connect the drive and then configure it:

m...@vault:~$ cfgadm
Ap_Id                Type       Receptacle    Occupant      Condition
sata1/0              sata-port  disconnected  unconfigured  failed
sata1/1::dsk/c8t1d0  disk       connected     configured    ok
m...@vault:~$ pfexec cfgadm -c connect sata1/0
cfgadm: Insufficient condition
m...@vault:~$ pfexec cfgadm -f -c connect sata1/0
Activate the port: /devices/p...@0,0/pci8086,4...@1f,2:0
This operation will enable activity on the SATA port
Continue (yes/no)? yes
m...@vault:~$ cfgadm
Ap_Id                Type       Receptacle    Occupant      Condition
sata1/0              disk       connected     unconfigured  unknown
sata1/1::dsk/c8t1d0  disk       connected     configured    ok
m...@vault:~$ pfexec cfgadm -c configure sata1/0
m...@vault:~$ cfgadm
Ap_Id                Type       Receptacle    Occupant      Condition
sata1/0::dsk/c8t0d0  disk       connected     configured    ok
sata1/1::dsk/c8t1d0  disk       connected     configured    ok

After this point, zpool resilvered the entire 243GB dataset. I suspect that the failure to reconnect automatically is simply a firmware problem and yet another reason to NOT BUY Western Digital Green drives. But my real question is: why does zpool want to resilver the entire dataset on one drive, but not the other?
Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?
Another question on SSDs in terms of performance vs. capacity. Between $150 and $200, there are at least four SSDs that would fit the rough specifications for the L2ARC on my system: 1. Crucial C300, 64 GB: $150: medium performance, medium capacity. 2. OCZ Vertex 2, 50 GB: $180: higher performance, lower capacity. (The Agility 2 is similar, but $15 cheaper) 3. Corsair Force 60 GB, $195: similar performance, slightly higher capacity (more over-provisioning with the same SandForce controller). 4. Intel X25M G2, 80 GB: $200: largest capacity, probably lowest(?) performance. So which would be the best choice for L2ARC? Is it size, or is it throughput, that really matters for this? Within this range, price doesn't make much difference. Thanks, as always, for the guidance.
Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On Wed, 2010-06-30 at 22:28 +0200, Ragnar Sundblad wrote: > To be safe, the protocol needs to be able to discover that the devices > (host or disk) have been disconnected and reconnected or have been reset > and that either part's assumptions about the state of the other have to > be invalidated. > > I don't know enough about either SAS or SATA to say if they guarantee that > you will be notified. But if they don't, they aren't safe for cached writes. Generally, ZFS will only notice a removed disk when it is trying to write to it -- or when it probes. ZFS does not necessarily get notified on hot device removal -- certainly not immediately. (I've written some code so that *will* notice, even if no write ever goes there... that's the topic of another message.) The other thing is that disk writes are generally idempotent. So, if a drive was removed between the time an IO was finished and the time the response was returned to the host, it isn't a problem. When the disk is returned, ZFS should automatically retry the I/O. (In fact, ZFS automatically retries failed I/O operations several times before finally "failing".) The nasty race that occurs is if your system crashes or is powered off *after* the log has acknowledged the write, but before the bits get shoved to main pool storage. This is a data loss situation. But assuming you don't take a system crash or some other fault, I would guess that removal of a log device and reinsertion would not cause any problems. (Except for possibly delaying synchronous writes.) That said, I've not actually *tested* it. - Garrett > > /ragge
Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?
Thanks to everyone for such helpful and detailed answers. Contrary to some of the trolls in other threads, I've had a fantastic experience here, and am grateful to the community. Based on the feedback, I'll upgrade my machine to 8 GB of RAM. I only have two slots on the motherboard, so I can either add two 2 GB DIMMs to the two I have there, or throw those away and start over with 4 GB DIMMs, which is not something I'm quite ready to do yet (before this is all working, for instance). Now, for the SSD, Crucial appears to have their (recommended above) C300 64 GB drive for $150, which seems like a good deal. Intel's X25M G2 is $200 for 80 GB. Does anyone have a strong opinion as to which would work better for the L2ARC? I am having a hard time understanding, from the performance numbers given, which would be a better choice. Finally, for my purposes, it doesn't seem like a ZIL is necessary? I'm the only user of the fileserver, so there probably won't be more than two or three computers, maximum, accessing stuff (and writing stuff) remotely. But, from what I can gather, by spending a little under $400, I should substantially increase the performance of my system with dedup? Many thanks, again, in advance.
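For readers sizing RAM and L2ARC for dedup: a ballpark often quoted on this list is on the order of a few hundred bytes of dedup-table entry per unique block. The sketch below uses an assumed ~320 bytes per entry and made-up pool numbers; on a real pool, measure with 'zdb -DD <pool>' rather than trusting this estimate:

```python
# Rough dedup-table (DDT) RAM estimate. The ~320 bytes/entry figure is an
# oft-quoted ballpark, not an exact number -- verify with 'zdb -DD <pool>'.

def ddt_ram_bytes(pool_bytes, avg_block_bytes, dedup_ratio, entry_bytes=320):
    """Estimated in-core DDT size for a pool of the given shape."""
    unique_blocks = pool_bytes / avg_block_bytes / dedup_ratio
    return unique_blocks * entry_bytes

tib = 1024**4
est = ddt_ram_bytes(pool_bytes=1 * tib,        # assume 1 TiB of data
                    avg_block_bytes=64 * 1024, # assume 64K average block size
                    dedup_ratio=2.0)           # assume half the blocks dedup away
print(round(est / 1024**3, 2), "GiB of DDT")
```

The point of the exercise: the DDT for even a modest pool can exceed the ARC that 8 GB of RAM leaves you, which is exactly why an L2ARC device helps dedup so much.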
Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On 12 apr 2010, at 22.32, Carson Gaspar wrote: > Carson Gaspar wrote: >> Miles Nordin wrote: "re" == Richard Elling writes: >>> How do you handle the case when a hotplug SATA drive is powered off >>> unexpectedly with data in its write cache? Do you replay the writes, or do >>> they go down the ZFS hotplug write hole? >> If zfs never got a positive response to a cache flush, that data is still in >> memory and will be re-written. Unless I greatly misunderstand how ZFS >> works... >> If the drive _lies_ about a cache flush, you're screwed (well, you can >> probably roll back a few TXGs...). Don't buy broken drives / bridge chipsets. > > Hrm... thinking about this some more, I'm not sure what happens if the drive > comes _back_ after a power loss, quickly enough that ZFS is never told about > the disappearance (assuming that can happen without a human cfgadm'ing it > back online - I don't know). > > Does anyone who understands the internals better than I do care to take a stab at > what happens if: > > - ZFS writes data to /dev/foo > - /dev/foo loses power with the data from the above write not yet flushed to > rust (say a field tech pulls the wrong drive...) > - /dev/foo powers back on (field tech quickly goes whoops and plugs it back > in) > > In the case of a redundant zpool config, when will ZFS notice the uberblocks > are out of sync and repair? If this is a non-redundant zpool, how does the > response differ? To be safe, the protocol needs to be able to discover that the devices (host or disk) have been disconnected and reconnected or have been reset, and that either part's assumptions about the state of the other have to be invalidated. I don't know enough about either SAS or SATA to say if they guarantee that you will be notified. But if they don't, they aren't safe for cached writes. /ragge
Re: [zfs-discuss] Forcing resilver?
> This may not work for you, but it worked for me, and I was pleasantly > surprised. Replace a drive with itself. > > zpool replace tank c0t2d0 c0t2d0 I tried that - it didn't work. I replaced the drive with a new one, which worked, and then I made a new zpool on the old drive with zfs-fuse in Linux, destroyed the pool, and put the old drive in the "cold spares" drawer. Vennlige hilsener / Best regards roy -- Roy Sigurd Karlsbakk (+47) 97542685 r...@karlsbakk.net http://blogg.karlsbakk.net/ -- In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Re: [zfs-discuss] ZFS on Ubuntu
----- Original Message ----- > I think zfs on ubuntu currently is a rather bad idea. See test below > with ubuntu Lucid 10.04 (amd64) Quick update on this - it seems this is due to a bug in the Linux kernel where it can't deal with partition changes on a drive with mounted filesystems. I'm not 100% sure about this, but it still looks that way. Testing with disks with non-mounted filesystems shows this works better. Just my two cents Vennlige hilsener / Best regards roy -- Roy Sigurd Karlsbakk (+47) 97542685 r...@karlsbakk.net http://blogg.karlsbakk.net/
Re: [zfs-discuss] Announce: zfsdump
On Wed, Jun 30, 2010 at 12:54 PM, Edward Ned Harvey wrote: >> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- >> boun...@opensolaris.org] On Behalf Of Asif Iqbal >> >> would be nice if i could pipe the zfs send stream to a split and then >> send of those splitted stream over the >> network to a remote system. it would help sending it over to remote >> system quicker. can your tool do that? > > Does that make sense? I assume the network is the bottleneck; the only way > the multiple streams would go any faster than a single stream would be > because you're multithreading and hogging all the bandwidth for yourself, > instead of sharing fairly with the httpd or whatever other server is trying > to use the bandwidth. Currently, to speed up zfs send | zfs recv, I am using mbuffer. It moves the data a lot faster than using netcat (or ssh) as the transport method. That is why I thought transporting it the way axel does might work better than wget: axel lets you create multiple pipes, so you get the data multiple times faster than with wget. > > If you're talking about streaming to a bunch of separate tape drives (or > whatever) on a bunch of separate systems because the recipient storage is > the bottleneck instead of the network ... then "split" probably isn't the > most useful way to distribute those streams. Because "split" is serial. > You would really want to "stripe" your data to all those various > destinations, so they could all be writing simultaneously. But this seems > like a very specialized scenario, that I think is probably very unusual. > > -- Asif Iqbal PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing?
Re: [zfs-discuss] Forcing resilver?
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- > boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk > > There was some mix-up with switching of drives and an unexpected > reboot, so I suddenly have a drive in my pool that is partly > resilvered. zpool status shows the pool is fine, but after a scrub, it > shows the drive faulted. I've been told on #opensolaris that making a > new pool on the drive and then destroying that pool, and then zpool > replace the drive will help, or moving the drive out and putting > another filesystem on it, then replacing it in the pool, might help. > But then, is it possible to forcibly resilver a drive without this > hassle? This may not work for you, but it worked for me, and I was pleasantly surprised: replace the drive with itself. zpool replace tank c0t2d0 c0t2d0 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On Wed, Jun 30, 2010 at 09:47:15AM -0700, Edward Ned Harvey wrote: > > From: Arne Jansen [mailto:sensi...@gmx.net] > > > > Edward Ned Harvey wrote: > > > Due to recent experiences, and discussion on this list, my colleague > > and > > > I performed some tests: > > > > > > Using solaris 10, fully upgraded. (zpool 15 is latest, which does > > not > > > have log device removal that was introduced in zpool 19) In any way > > > possible, you lose an unmirrored log device, and the OS will crash, > > and > > > the whole zpool is permanently gone, even after reboots. > > > > > > > I'm a bit confused. I tried hard, but haven't been able to reproduce > > this > > using Sol10U8. I have a mirrored slog device. While putting it > > under load doing synchronous file creations, we pulled the power cords > > and unplugged the slog devices. After powering on zfs imported the > > pool, > > but prompted to acknowledge the missing slog devices with zpool clear. > > After that the pool was accessible again. That's exactly how it should > > be. > > Very interesting. I did this test some months ago, so I may not recall the > relevant details, but here are the details I do remember: > > I don't recall if I did this test on osol2009.06, or sol10. > > In Sol10u6 (and I think Sol10u8) the default zpool version is 10, but if you > apply all your patches, then 15 becomes available. I am sure that I've > never upgraded any of my sol10 zpools higher than 10. So it could be that > an older zpool version might exhibit the problem, and you might be using a > newer version. > > In osol2009.06, IIRC, the default is zpool 14, and if you upgrade fully, > you'll get to something around 24. So again, it's possible the bad behavior > went away in zpool 15, or any other number from 11 to 15. > > I'll leave it there for now. If that doesn't shed any light, I'll try to > dust out some more of my mental cobwebs. Anyone else done any testing with zpool version 15 (on Solaris 10 U8)? 
Have a new system coming in shortly and will test myself, but knowing this is a recoverable scenario would help me rest easier as I have an unmirrored slog setup hanging around still. Ray ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
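The version gate being discussed can be written down in a few lines. The only mapping taken as given in this thread is that log (slog) device removal arrived in zpool version 19, which is why Solaris 10 U8 pools (version 15 at most) cannot remove a slog; the sketch below encodes just that.

```python
# Minimal sketch: gate a feature on the pool's on-disk version number.
# Per this thread, log (slog) device removal was introduced in zpool
# version 19; treat any other mapping added here as approximate.

FEATURE_MIN_VERSION = {
    "log_device_removal": 19,
}


def pool_supports(pool_version, feature):
    """Return True if a pool at `pool_version` offers `feature`."""
    return pool_version >= FEATURE_MIN_VERSION[feature]
```

So a fully patched Solaris 10 U8 pool (version 15) fails the check, while a fully upgraded OpenSolaris 2009.06 pool (around version 24, per Edward's recollection above) passes.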
Re: [zfs-discuss] zfs-discuss Digest, Vol 56, Issue 126
I searched and searched but was not able to find your added text in this long quoted message. Please re-submit using the english language in simple ASCII text intended for humans. Thanks, Bob On Wed, 30 Jun 2010, Eric Andersen wrote: On Jun 28, 2010, at 10:03 AM, zfs-discuss-requ...@opensolaris.org wrote: Send zfs-discuss mailing list submissions to zfs-discuss@opensolaris.org To subscribe or unsubscribe via the World Wide Web, visit http://mail.opensolaris.org/mailman/listinfo/zfs-discuss or, via email, send a message with subject or body 'help' to zfs-discuss-requ...@opensolaris.org You can reach the person managing the list at zfs-discuss-ow...@opensolaris.org When replying, please edit your Subject line so it is more specific than "Re: Contents of zfs-discuss digest..." Today's Topics: 1. Re: ZFS bug - should I be worried about this? (Gabriele Bulfon) 2. Re: ZFS bug - should I be worried about this? (Victor Latushkin) 3. Re: OCZ Vertex 2 Pro performance numbers (Frank Cusack) 4. Re: ZFS bug - should I be worried about this? (Garrett D'Amore) 5. Announce: zfsdump (Tristram Scott) 6. Re: Announce: zfsdump (Brian Kolaci) 7. Re: zpool import hangs indefinitely (retry post in parts; too long?) (Andrew Jones) 8. Re: Announce: zfsdump (Tristram Scott) 9. Re: Announce: zfsdump (Brian Kolaci) -- Message: 1 Date: Mon, 28 Jun 2010 05:16:00 PDT From: Gabriele Bulfon To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] ZFS bug - should I be worried about this? Message-ID: <593812734.121277727391600.javamail.tweb...@sf-app1> Content-Type: text/plain; charset=UTF-8 Yes...they're still running...but being aware that a power failure causing an unexpected poweroff may make the pool unreadable is a pain Yes. Patches should be available. Or adoption may be lowering a lot... 
-- This message posted from opensolaris.org -- Message: 2 Date: Mon, 28 Jun 2010 18:14:12 +0400 From: Victor Latushkin To: Gabriele Bulfon Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] ZFS bug - should I be worried about this? Message-ID: <4c28ae34.1030...@sun.com> Content-Type: text/plain; CHARSET=US-ASCII; format=flowed On 28.06.10 16:16, Gabriele Bulfon wrote: Yes...they're still running...but being aware that a power failure causing an unexpected poweroff may make the pool unreadable is a pain Pool integrity is not affected by this issue. -- Message: 3 Date: Mon, 28 Jun 2010 07:26:45 -0700 From: Frank Cusack To: 'OpenSolaris ZFS discuss' Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers Message-ID: <5f1b59775f3ffc0e1781f...@cusack.local> Content-Type: text/plain; charset=us-ascii; format=flowed On 6/26/10 9:47 AM -0400 David Magda wrote: Crickey. Who's the genius who thinks of these URLs? SEOs -- Message: 4 Date: Mon, 28 Jun 2010 08:17:21 -0700 From: "Garrett D'Amore" To: Gabriele Bulfon Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] ZFS bug - should I be worried about this? Message-ID: <1277738241.5596.4325.ca...@velocity> Content-Type: text/plain; charset="UTF-8" On Mon, 2010-06-28 at 05:16 -0700, Gabriele Bulfon wrote: Yes...they're still running...but being aware that a power failure causing an unexpected poweroff may make the pool unreadable is a pain Yes. Patches should be available. Or adoption may be lowering a lot... I don't have access to the information, but if this problem is the same one I think it is, then the pool does not become unreadable. Rather, its state after such an event represents a *consistent* state from some point of time *earlier* than that confirmed fsync() (or a write on a file opened with O_SYNC or O_DSYNC). For most users, this is not a critical failing. For users using databases or requiring transactional integrity for data stored on ZFS, then yes, this is a very nasty problem indeed. 
I suspect that this is the problem I reported earlier in my blog (http://gdamore.blogspot.com) about certain kernels having O_SYNC and O_DSYNC problems. I can't confirm this though, because I don't have access to the SunSolve database to read the report. (This is something I'll have to check into fixing... it seems like my employer ought to have access to that information...) - Garrett -- Message: 5 Date: Mon, 28 Jun 2010 08:26:02 PDT From: Tristram Scott To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] Announce: zfsdump Message-ID: <311835455.361277738793747.javamail.tweb...@sf-app1> Content-Type: text/plain; charset=UTF-8 For quite some time I have been using zfs send -R fsn...@snapname | dd of=/dev/rmt/1ln to make a tape backup of my zfs file system. A few weeks back the size of the file system grew to larger than would fit on a single DAT72 tape, and I once ag
Re: [zfs-discuss] ZFS on Caviar Blue (Hard Drive Recommendations)
On Tue, Jun 29, 2010 at 11:25 AM, Patrick Donnelly wrote: > I googled around but couldn't find anything on whether someone has > good or bad experiences with the Caviar *Blue* drives? I saw in the > archives Caviar Blacks are *not* recommended for ZFS arrays (excluding > apparently RE3 and RE4?). Specifically I'm looking to buy Western > Digital Caviar Blue WD10EALS 1TB drives [1]. Does anyone have any > experience with these drives? We use a mix of WD Caviar Blue 500 GB, Caviar Black 500 GB, and RE2 500 GB drives in one of our storage servers without any issues. Attached to 3Ware 9550SXU and 9650SE RAID controllers, configured as Single Drive arrays. There are also 8 WD Caviar Green 1.5 TB drives in there, which are not very good (even after twiddling the idle timeout setting via wdidle3). Definitely avoid the Green/GP line of drives. -- Freddie Cash fjwc...@gmail.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] What are requirements for "zpool split" ?
Hey Mitch, The zpool split feature is available in the OpenSolaris release if you upgrade to build 131. You can read about the requirements here: http://hub.opensolaris.org/bin/view/Community+Group+zfs/docs See the ZFS Admin Guide, page 89-90 Thanks, Cindy On 06/29/10 13:37, Mitchell Petty wrote: Hi, Is "zpool split" available ? If not when will it be ? If it is what are the prerequisites ? Thanks In Advance , Mitch ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zfs rpool corrupt?????
Hello, Has anyone encountered the following error message, running Solaris 10 u8 in an LDom?

bash-3.00# devfsadm
devfsadm: write failed for /dev/.devfsadm_dev.lock: Bad exchange descriptor
bash-3.00# zpool status -v rpool
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub in progress for 0h1m, 22.57% done, 0h5m to go
config:

        NAME      STATE     READ WRITE CKSUM
        rpool     DEGRADED     0     0    17
          c0d0s0  DEGRADED     0     0    34  too many errors

errors: Permanent errors have been detected in the following files:

        //dev/.devfsadm_dev.lock
        //var/svc/log/system-tsol-zones:default.log
        //var/svc/log/system-labeld:default.log
        //var/svc/log/system-filesystem-volfs:default.log

Thanks ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
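Output like the above can also be scanned mechanically. A rough Python sketch (my own, assuming the plain-text column layout `zpool status` prints; the format varies between releases, so treat it as illustrative, not robust):

```python
# Rough sketch: list devices with nonzero READ/WRITE/CKSUM counters in
# `zpool status` output. Assumes the column layout shown above; real
# output varies across releases, so this is illustrative only.

def devices_with_errors(status_text):
    """Return (name, read, write, cksum) for every row with a nonzero count."""
    bad = []
    for line in status_text.splitlines():
        fields = line.split()
        # A device row looks like: NAME STATE READ WRITE CKSUM [notes...]
        if len(fields) >= 5 and all(f.isdigit() for f in fields[2:5]):
            read, write, cksum = (int(f) for f in fields[2:5])
            if read or write or cksum:
                bad.append((fields[0], read, write, cksum))
    return bad
```

Against the status block above, this flags both rpool (17 checksum errors) and c0d0s0 (34).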
Re: [zfs-discuss] Permanent errors detected in :<0x13>
Well, I was doing a ZFS send / receive to back up a large (60 GB) set of data, which never completed. A zpool clear at that point just hung and I had to reboot the system, after which it appeared to come up clean. As soon as I tried the backup again I noticed the pool reported the error you see below - but the backup did complete as the pool remained online. Thanks for your help Cindy, Brian Cindy Swearingen wrote: I reviewed the zpool clear syntax (looking at my own docs) and didn't remember that a one-device pool probably doesn't need the device specified. For pools with many devices, you might want to just clear the errors on a particular device. USB sticks for pools are problematic. It would be good to know what caused these errors to try to prevent them in the future. We know that USB devices don't generate/fabricate device IDs so they are prone to problems when moving/changing/re-inserting but without more info, it's hard to tell what happened. cs On 06/29/10 14:13, W Brian Leonard wrote: Interesting, this time it worked! Does specifying the device to clear cause the command to behave differently? I had assumed w/out the device specification, the clear would just apply to all devices in the pool (which are just the one). Thanks, Brian Cindy Swearingen wrote: Hi Brian, Because the pool is still online and the metadata is redundant, maybe these errors were caused by a brief hiccup from the USB device's physical connection. You might try: # zpool clear external c0t0d0p0 Then, run a scrub: # zpool scrub external If the above fails, then please identify the Solaris release and what events preceded this problem. Thanks, Cindy On 06/29/10 11:15, W Brian Leonard wrote: Hi Cindy, The scrub didn't help and yes, this is an external USB device. Thanks, Brian Cindy Swearingen wrote: Hi Brian, You might try running a scrub on this pool. Is this an external USB device? 
Thanks, Cindy On 06/29/10 09:16, Brian Leonard wrote: Hi, I have a zpool which is currently reporting that the ":<0x13>" file is corrupt:

bleon...@opensolaris:~$ pfexec zpool status -xv external
  pool: external
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME       STATE     READ WRITE CKSUM
        external   ONLINE       0     0     0
          c0t0d0p0 ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        :<0x13>

Otherwise, as you can see, the pool is online. As it's unclear to me how to restore the ":<0x13>" file, is my only option for correcting this error to destroy and recreate the pool? Thanks, Brian -- W Brian Leonard Principal Product Manager 860.206.6093 http://blogs.sun.com/observatory ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] What are requirements for "zpool split" ?
Hi, Is "zpool split" available? If not, when will it be? If it is, what are the prerequisites? Thanks in advance, Mitch ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS on Caviar Blue (Hard Drive Recommendations)
Hi list, I googled around but couldn't find anything on whether someone has good or bad experiences with the Caviar *Blue* drives? I saw in the archives Caviar Blacks are *not* recommended for ZFS arrays (excluding apparently RE3 and RE4?). Specifically I'm looking to buy Western Digital Caviar Blue WD10EALS 1TB drives [1]. Does anyone have any experience with these drives? If this is the wrong way to go, does anyone have a recommendation for 1TB drives I can get for <= 90$? [1] http://www.wdc.com/en/products/products.asp?driveid=793 Thanks for any help, -- - Patrick Donnelly ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Kernel Panic on zpool clean
> Please try
>
> zdb -U /dev/null -ebcsv storage2

r...@crypt:~# zdb -U /dev/null -ebcsv storage2
zdb: can't open storage2: No such device or address

If I try

r...@crypt:~# zdb -C storage2

Then it prints what appears to be a valid configuration but then the same error message about being unable to find the device (output attached).

George -- This message posted from opensolaris.org

r...@crypt:~# zdb -C storage2
version=14
name='storage2'
state=0
txg=1807366
pool_guid=14701046672203578408
hostid=8522651
hostname='crypt'
vdev_tree
    type='root'
    id=0
    guid=14701046672203578408
    children[0]
        type='raidz'
        id=0
        guid=15861342641545291969
        nparity=1
        metaslab_array=14
        metaslab_shift=35
        ashift=9
        asize=3999672565760
        is_log=0
        children[0]
            type='disk'
            id=0
            guid=14390766171745861103
            path='/dev/dsk/c9t4d2s0'
            devid='id1,s...@n600d0230006c8a5f0c3fd863ea736d00/a'
            phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1/s...@4,2:a'
            whole_disk=1
            DTL=301
        children[1]
            type='disk'
            id=1
            guid=14806610527738068493
            path='/dev/dsk/c9t4d3s0'
            devid='id1,s...@n600d0230006c8a5f0c3fd8514ed8d900/a'
            phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1/s...@4,3:a'
            whole_disk=1
            DTL=300
        children[2]
            type='disk'
            id=2
            guid=4272121319363331595
            path='/dev/dsk/c10t4d2s0'
            devid='id1,s...@n600d0230006c8a5f0c3fd84312aa6d00/a'
            phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1,1/s...@4,2:a'
            whole_disk=1
            DTL=299
        children[3]
            type='disk'
            id=3
            guid=16286569401176941639
            path='/dev/dsk/c10t4d4s0'
            devid='id1,s...@n600d0230006c8a5f0c3fd8415c62ae00/a'
            phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1,1/s...@4,4:a'
            whole_disk=1
            DTL=296
    children[1]
        type='raidz'
        id=1
        guid=12601468074885676119
        nparity=1
        metaslab_array=172
        metaslab_shift=35
        ashift=9
        asize=3999672565760
        is_log=0
        children[0]
            type='disk'
            id=0
            guid=7040280703157905854
            path='/dev/dsk/c10t4d0s0'
            devid='id1,s...@n600d0230006c8a5f0c3fd83eda0a4a00/a'
            phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1,1/s...@4,0:a'
            whole_disk=1
            DTL=305
        children[1]
            type='replacing'
            id=1
            guid=16928413524184799719
            whole_disk=0
            children[0]
                type='disk'
                id=0
                guid=9102173991259789741
                path='/dev/dsk/c9t4d0s0'
                devid='id1,s...@n600d0230006c8a5f0c3fd86eee69a300/a'
                phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1/s...@4,0:a'
                whole_disk=1
                DTL=304
            children[1]
                type='disk'
                id=1
                guid=16888611779137638814
                path='/dev/dsk/c9t4d4s0'
                devid='id1,s...@n600d0230006c8a5f0c3fd8612edc7d00/a'
                phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1/s...@4,4:a'
                whole_disk=1
                DTL=321
        children[2]
            type='disk'
            id=2
            guid=4025009484028197162
            path='/dev/dsk/c10t4d1s0'
            devid='id1,s...@n600d0230006c8a5f0c3fd8609d147700/a'
            phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1,1/s...@4,1:a'
            whole_disk=1
            DTL=303
        children[3
Re: [zfs-discuss] Kernel Panic on zpool clean
On Jun 30, 2010, at 10:48 AM, George wrote: >> I suggest you to try running 'zdb -bcsv storage2' and >> show the result. > > r...@crypt:/tmp# zdb -bcsv storage2 > zdb: can't open storage2: No such device or address > > then I tried > > r...@crypt:/tmp# zdb -ebcsv storage2 > zdb: can't open storage2: File exists Please try zdb -U /dev/null -ebcsv storage2 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss