Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-30 Thread Andrew Jones
Victor,

I've reproduced the crash and have vmdump.0 and dump device files. How do I 
query the stack on crash for your analysis? What other analysis should I 
provide?
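
In case it helps, this is the sequence I was planning to run against the saved
dump (file names assumed from the vmdump.0 above; happy to run different dcmds
if you prefer):

savecore -vf vmdump.0        # expands the compressed dump into unix.0 / vmcore.0
mdb unix.0 vmcore.0
::status                     # panic string and dump summary
::stack                      # stack of the panicking thread
::msgbuf                     # console messages leading up to the panic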

Thanks
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers

2010-06-30 Thread Fred Liu


> -Original Message-
> From: Erik Trimble [mailto:erik.trim...@oracle.com]
> Sent: Thursday, July 01, 2010 11:45
> To: Fred Liu
> Cc: Bob Friesenhahn; 'OpenSolaris ZFS discuss'
> Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
> 
> On 6/30/2010 7:17 PM, Fred Liu wrote:
> > See. Thanks.
> > Does it have the hardware functionality to detect the power outage
> and do force cache flush when the cache is enabled?
> > Any more detailed info about the capacity (farad) of this supercap
> and how long one discharge will be?
> >
> > Thanks.
> >
> > Fred
> >
> >
> 
> I don't think the actual size of the supercap matters. As Bob said,
> it only needs to be big enough to allow all on-board DRAM to
> be flushed out to Flash. How big that should be is easily determined by
> the manufacturer, and they'd be grossly negligent if it wasn't at least
> that size. Any capacity beyond that needed to do a single full flush is
> excess, so I would hazard a guess that the supercap is "just big enough,
> and no more". That is, just enough to power the SSD for the partial
> second or so it takes to flush to flash. I don't think we need to worry
> how big that actually is.

Understood and agreed. It is sort of picky to ask this without the 
manufacturer's help ;-)

> 
> Answering your second question first - my reading of the info is that
> it
> will force a cache flush (if the cache is enabled) upon loss of power
> under any circumstance. That is, it will flush the cache in both a
> controlled power-down (regardless of whether the OS says to do so) and
> in an immediate power loss. All this is in the SSD's firmware.

That is exactly what I expect. Speaking a bit more broadly, cache-flush 
behavior is somewhat ambiguous across HDDs in general: is it OS-controlled, 
firmware-controlled, or both?
At least in this case I have got the answer I expected. But what about 
all (generic) HDDs?

Thanks.

Fred
> 
> -Erik
> 
> 
> > -Original Message-
> > From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
> > Sent: Thursday, July 01, 2010 10:01
> > To: Fred Liu
> > Cc: 'OpenSolaris ZFS discuss'
> > Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
> >
> > On Wed, 30 Jun 2010, Fred Liu wrote:
> >
> >
> >> Any duration limit on the supercap? How long can it sustain the data?
> >>
> > A supercap on a SSD drive only needs to sustain the data until it has
> > been saved (perhaps 10 milliseconds).  It is different than a RAID
> > array battery.
> >
> > Bob
> >
> 
> 
> --
> Erik Trimble
> Java System Support
> Mailstop:  usca22-123
> Phone:  x17195
> Santa Clara, CA
> Timezone: US/Pacific (GMT-0800)
> 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers

2010-06-30 Thread Erik Trimble
On 6/30/2010 7:17 PM, Fred Liu wrote:
> See. Thanks.
> Does it have the hardware functionality to detect the power outage and do 
> force cache flush when the cache is enabled?
> Any more detailed info about the capacity (farad) of this supercap and how 
> long one discharge will be? 
>
> Thanks.
>
> Fred
>
>   

I don't think the actual size of the supercap matters. As Bob said,
it only needs to be big enough to allow all on-board DRAM to
be flushed out to Flash. How big that should be is easily determined by
the manufacturer, and they'd be grossly negligent if it wasn't at least
that size. Any capacity beyond that needed to do a single full flush is
excess, so I would hazard a guess that the supercap is "just big enough,
and no more". That is, just enough to power the SSD for the partial
second or so it takes to flush to flash. I don't think we need to worry
how big that actually is.
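
A back-of-the-envelope illustration (numbers purely illustrative, not vendor
specs): if the flush takes ~0.3 s at ~2 W, that's about 0.6 J of energy. A
0.47 F supercap charged to 5.5 V and usable down to 3 V stores roughly

    0.5 x 0.47 x (5.5^2 - 3^2) =~ 5 J

so even a physically small cap has close to an order of magnitude of headroom.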

Answering your second question first - my reading of the info is that it
will force a cache flush (if the cache is enabled) upon loss of power
under any circumstance. That is, it will flush the cache in both a
controlled power-down (regardless of whether the OS says to do so) and
in an immediate power loss. All this is in the SSD's firmware.

-Erik


> -Original Message-
> From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us] 
> Sent: Thursday, July 01, 2010 10:01
> To: Fred Liu
> Cc: 'OpenSolaris ZFS discuss'
> Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
>
> On Wed, 30 Jun 2010, Fred Liu wrote:
>
>   
>> Any duration limit on the supercap? How long can it sustain the data?
>> 
> A supercap on a SSD drive only needs to sustain the data until it has 
> been saved (perhaps 10 milliseconds).  It is different than a RAID 
> array battery.
>
> Bob
>   


-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?

2010-06-30 Thread Erik Trimble

On 6/30/2010 2:01 PM, valrh...@gmail.com wrote:

Another question on SSDs in terms of performance vs. capacity.

Between $150 and $200, there are at least three SSDs that would fit the rough 
specifications for the L2ARC on my system:

1. Crucial C300, 64 GB: $150: medium performance, medium capacity.
2. OCZ Vertex 2, 50 GB: $180: higher performance, lower capacity. (The Agility 
2 is similar, but $15 cheaper)
3. Corsair Force 60 GB, $195: similar performance, slightly higher capacity 
(more over-provisioning with the same SandForce controller).
4. Intel X25M G2, 80 GB: $200: largest capacity, probably lowest(?) performance.

So which would be the best choice for L2ARC? Is it size, or is it throughput, 
that really matters for this?

Within this range, price doesn't make much difference. Thanks, as always, for 
the guidance.
   


For L2ARC, random read performance is the primary important factor.  
Given what I've seen of the C300's random read performance for NON-4k 
block sizes, and for short queue depths (both of which are highly likely 
for an L2ARC device), my recommendation of the above four is #4, the 
Intel device.  #2 is possibly the best performing for ZIL usage, should 
you choose to use a portion of the device for that purpose.


For what you've said your usage pattern is, I think the Intel X25M is 
the best fit for good performance and size for the dollar.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Use of blocksize (-b) during zfs zvol create, poor performance

2010-06-30 Thread Mike La Spina
Hi Eff,

There are a significant number of variables to work through with dedup and 
compression enabled, so the first suggestion I have is to disable those 
features for now so you're not working with too many elements at once.

With those features set aside, an NTFS cluster operation does not equal a 64k 
raw I/O block, and the ZFS 64k blocksize does not equal one I/O operation. We 
may also need to consider the overall network performance behavior, the iSCSI 
protocol characteristics, and the Windows network stack.

iperf is a good tool to rule that out.

What I primarily suspect is that write I/O operations are not aligned and are 
waiting for I/O completion across multiple vdevs. Alignment is important for 
write I/O optimization, and how the I/O maps onto the software RAID level 
makes a significant difference to the DMU and SPA operations on a specific 
vdev layout. You may also have an issue with write cache operations: by 
default, large I/O calls such as 64K will not use a ZIL cache (slog) vdev, if 
you have one defined, but will be written directly to your array vdevs, which 
also involves a transaction group write operation.

To ensure ZIL log usage for 64k I/Os you can apply the following: 
edit the /etc/system file and add

set zfs:zfs_immediate_write_sz = 131071

A reboot is required for the /etc/system change to take effect.

You have also not indicated what your zpool configuration looks like; that 
would be helpful for the discussion.

It appears that you're using the x4500 as a backup target, which means you 
should (if not already) enable write caching on the COMSTAR LU properties for 
this type of application, e.g.:
stmfadm modify-lu -p wcd=false 600144F02F2280004C1D62010001
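
(wcd stands for "write cache disable", so wcd=false turns the write-back cache
on for that LU.) Afterwards you can confirm the setting with something like:

stmfadm list-lu -v 600144F02F2280004C1D62010001

The verbose output should include a "Writeback Cache" line showing Enabled.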

To help triage the perf issue further, you could post two 'kstat zfs' and two 
'kstat stmf' outputs taken 5 minutes apart, plus a 'zpool iostat -v 30 5', 
which would help visualize the I/O behavior.
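
For example, something along these lines would capture what I'm after (pool
name and output paths are just placeholders):

kstat zfs > /var/tmp/kstat_zfs.1 ; kstat stmf > /var/tmp/kstat_stmf.1
sleep 300
kstat zfs > /var/tmp/kstat_zfs.2 ; kstat stmf > /var/tmp/kstat_stmf.2
zpool iostat -v yourpool 30 5 > /var/tmp/zpool_iostat.out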

Regards,

Mike

http://blog.laspina.ca/
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers

2010-06-30 Thread Fred Liu
See. Thanks.
Does it have the hardware functionality to detect the power outage and do force 
cache flush when the cache is enabled?
Any more detailed info about the capacity (farad) of this supercap and how long 
one discharge will be? 

Thanks.

Fred

-Original Message-
From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us] 
Sent: Thursday, July 01, 2010 10:01
To: Fred Liu
Cc: 'OpenSolaris ZFS discuss'
Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers

On Wed, 30 Jun 2010, Fred Liu wrote:

> Any duration limit on the supercap? How long can it sustain the data?

A supercap on a SSD drive only needs to sustain the data until it has 
been saved (perhaps 10 milliseconds).  It is different than a RAID 
array battery.

Bob
-- 
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers

2010-06-30 Thread Bob Friesenhahn

On Wed, 30 Jun 2010, Fred Liu wrote:


Any duration limit on the supercap? How long can it sustain the data?


A supercap on a SSD drive only needs to sustain the data until it has 
been saved (perhaps 10 milliseconds).  It is different than a RAID 
array battery.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers

2010-06-30 Thread Fred Liu
Any duration limit on the supercap? How long can it sustain the data?

Thanks.

Fred

-Original Message-
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of David Magda
Sent: Saturday, June 26, 2010 21:48
To: Arne Jansen
Cc: 'OpenSolaris ZFS discuss'
Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers


On Jun 26, 2010, at 02:09, Arne Jansen wrote:

> Geoff Nordli wrote:
>> Is this the one
>> (http://www.ocztechnology.com/products/solid-state-drives/2-5--sata-ii/maxim
>> um-performance-enterprise-solid-state-drives/ocz-vertex-2-pro- 
>> series-sata-ii
>> -2-5--ssd-.html) with the built in supercap?
>
> Yes.

Crickey. Who's the genius who thinks of these URLs?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully

2010-06-30 Thread Carson Gaspar

Ragnar Sundblad wrote:


I was referring to the case where zfs has written data to the drive but
still hasn't issued a cache flush, and before the cache flush the drive
is reset. If zfs finally issues a cache flush and then isn't informed
that the drive has been reset, data is lost.

I hope this is not the case, on any SCSI-based protocol or SATA.


The nasty race that occurs is if your system crashes or is powered off
*after* the log has acknowledged the write, but before the bits get
shoved to main pool storage.  This is a data loss situation.


With "log", do you mean the ZIL (with or without a slog device)?
If so, that should not be an issue and is exactly with the ZIL
is for - it will be replayed at the next filesystem attach and the
data will be pushed to the main pool storage. Do I misunderstand you?


See your case above - written, ack'd, but not cache flushed. We're 
talking about the same thing.


--
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs rpool corrupt?????

2010-06-30 Thread Ian Collins

On 07/ 1/10 01:36 AM, Tony MacDoodle wrote:


Hello,

Has anyone encountered the following error message, running Solaris 10 
u8 in an LDom?


bash-3.00# devfsadm
devfsadm: write failed for /dev/.devfsadm_dev.lock: Bad exchange 
descriptor



Not specifically.  But it is clear from what follows that ZFS has 
detected an error in a pool which lacks redundancy.



bash-3.00# zpool status -v rpool
pool: rpool
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: scrub in progress for 0h1m, 22.57% done, 0h5m to go
config:

NAME        STATE     READ WRITE CKSUM
rpool       DEGRADED     0     0    17
  c0d0s0    DEGRADED     0     0    34  too many errors

errors: Permanent errors have been detected in the following files:

//dev/.devfsadm_dev.lock
//var/svc/log/system-tsol-zones:default.log
//var/svc/log/system-labeld:default.log
//var/svc/log/system-filesystem-volfs:default.log


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel Panic on zpool clean

2010-06-30 Thread George
Aha:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6794136

I think I'll try booting from a b134 Live CD and see if that will let me fix 
things.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] b134 pool borked!

2010-06-30 Thread Michael Mattsson
Just in case any stray searches finds it way here, this is what happened to my 
pool: http://phrenetic.to/zfs
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully

2010-06-30 Thread Ragnar Sundblad

On 30 jun 2010, at 22.46, Garrett D'Amore wrote:

> On Wed, 2010-06-30 at 22:28 +0200, Ragnar Sundblad wrote:
> 
>> To be safe, the protocol needs to be able to discover that the devices
>> (host or disk) have been disconnected and reconnected or have been reset
>> and that each side's assumptions about the state of the other have to
>> be invalidated.
>> 
>> I don't know enough about either SAS or SATA to say if they guarantee that
>> you will be notified. But if they don't, they aren't safe for cached writes.
> 
> Generally, ZFS will only notice a removed disk when it is trying to
> write to it -- or when it probes.  ZFS does not necessarily get notified
> on hot device removal -- certainly not immediately.

That should be fine, as long as it is informed on the next access.

>  (I've written some
> code so that *will* notice, even if no write ever goes there... that's
> the topic of another message.)
> 
> The other thing is that disk writes are generally idempotent.  So, if a
> drive was removed between the time an IO was finished but before the
> time the response was returned to the host, it isn't a problem.   When
> the disk is returned, ZFS should automatically retry the I/O.  (In fact,
> ZFS automatically retries failed I/O operations several times before
> finally "failing".)

I was referring to the case where zfs has written data to the drive but
still hasn't issued a cache flush, and before the cache flush the drive
is reset. If zfs finally issues a cache flush and then isn't informed
that the drive has been reset, data is lost.

I hope this is not the case, on any SCSI-based protocol or SATA.

> The nasty race that occurs is if your system crashes or is powered off
> *after* the log has acknowledged the write, but before the bits get
> shoved to main pool storage.  This is a data loss situation.

With "log", do you mean the ZIL (with or without a slog device)?
If so, that should not be an issue and is exactly with the ZIL
is for - it will be replayed at the next filesystem attach and the
data will be pushed to the main pool storage. Do I misunderstand you?

/ragge

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?

2010-06-30 Thread Garrett D'Amore
On Wed, 2010-06-30 at 16:41 -0500, Nicolas Williams wrote:
> On Wed, Jun 30, 2010 at 01:35:31PM -0700, valrh...@gmail.com wrote:
> > Finally, for my purposes, it doesn't seem like a ZIL is necessary? I'm
> > the only user of the fileserver, so there probably won't be more than
> > two or three computers, maximum, accessing stuff (and writing stuff)
> > remotely.
> 
> It depends on what you're doing.
> 
> The perennial complaint about NFS is the synchronous open()/close()
> operations and the fact that archivers (tar, ...) will generally unpack
> archives in a single-threaded manner, which means all those synchronous
> ops punctuate the archiver's performance with pauses.  This is a load
> type for which ZIL devices come in quite handy.  If you write lots of
> small files often and in single-threaded ways _and_ want to guarantee
> you don't lose transactions, then you want a ZIL device.  (The recent
> knob for controlling whether synchronous I/O gets done asynchronously
> would help you if you don't care about losing a few seconds worth of
> writes, assuming that feature makes it into any release of Solaris.)

Btw, that feature will be in the NexentaStor 3.0.4 release (which is
currently in late development/early QA, and should be out soon.)

Archivers are not the only things that act this way, btw.  Databases,
and even something as benign as compiling a large software suite, can
have similar implications where a fast slog device can help.

- Garrett


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?

2010-06-30 Thread Nicolas Williams
On Wed, Jun 30, 2010 at 01:35:31PM -0700, valrh...@gmail.com wrote:
> Finally, for my purposes, it doesn't seem like a ZIL is necessary? I'm
> the only user of the fileserver, so there probably won't be more than
> two or three computers, maximum, accessing stuff (and writing stuff)
> remotely.

It depends on what you're doing.

The perennial complaint about NFS is the synchronous open()/close()
operations and the fact that archivers (tar, ...) will generally unpack
archives in a single-threaded manner, which means all those synchronous
ops punctuate the archiver's performance with pauses.  This is a load
type for which ZIL devices come in quite handy.  If you write lots of
small files often and in single-threaded ways _and_ want to guarantee
you don't lose transactions, then you want a ZIL device.  (The recent
knob for controlling whether synchronous I/O gets done asynchronously
would help you if you don't care about losing a few seconds worth of
writes, assuming that feature makes it into any release of Solaris.)
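
(If that knob ships in the form it has in current development bits, I'd expect
the usage to look roughly like the following; the property name, values, and
dataset here are my assumptions, not a statement about any released build:

zfs set sync=disabled tank/scratch   # never block on synchronous writes; a crash can lose the last few seconds
zfs set sync=standard tank/scratch   # back to normal, POSIX-compliant behaviour
zfs get sync tank/scratch

and unlike the old global zil_disable tunable, it would be per-dataset.)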

> But, from what I can gather, by spending a little under $400, I should
> substantially increase the performance of my system with dedup? Many
> thanks, again, in advance.

If you have deduplicatious data, yes.

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Zpool mirror fail testing - odd resilver behaviour after reconnect

2010-06-30 Thread Matt Connolly
I have an OpenSolaris snv_134 machine with 2 x 1.5TB drives. One is a Samsung 
Silencer; the other is a dreaded Western Digital Green.

I'm testing the mirror for failure by simply yanking out the SATA cable while 
the machine is running. The system never skips a beat, which is great. But the 
reconnect behaviour is vastly different on the two drives.

1. Samsung reconnect. `cfgadm` reported the drive as connected but 
unconfigured. After running `cfgadm -c configure sata1/1`, the drive 
automatically came online in the zpool mirror and resilvered its differences, 
which completed in about 10 seconds. This is excellent.

2. WD Green reconnect. `cfgadm` reported the drive as disconnected. I had to 
use the '-f' option to connect the drive and then configure it:

m...@vault:~$ cfgadm
Ap_Id  Type Receptacle   Occupant Condition
sata1/0sata-portdisconnected unconfigured failed
sata1/1::dsk/c8t1d0disk connectedconfigured   ok
m...@vault:~$ pfexec cfgadm -c connect sata1/0
cfgadm: Insufficient condition
m...@vault:~$ pfexec cfgadm -f -c connect sata1/0
Activate the port: /devices/p...@0,0/pci8086,4...@1f,2:0
This operation will enable activity on the SATA port
Continue (yes/no)? yes
m...@vault:~$ cfgadm
Ap_Id  Type Receptacle   Occupant Condition
sata1/0disk connectedunconfigured unknown
sata1/1::dsk/c8t1d0disk connectedconfigured   ok
m...@vault:~$ pfexec cfgadm -c configure sata1/0
m...@vault:~$ cfgadm
Ap_Id  Type Receptacle   Occupant Condition
sata1/0::dsk/c8t0d0disk connectedconfigured   ok
sata1/1::dsk/c8t1d0disk connectedconfigured   ok


After this point, zpool resilvered the entire 243GB dataset.

I suspect that the automatic connect is simply a firmware problem and yet 
another reason to NOT BUY Western Digital Green drives.

But my real question is: why does zpool want to resilver the entire dataset on 
one drive, but not the other?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?

2010-06-30 Thread valrh...@gmail.com
Another question on SSDs in terms of performance vs. capacity.

Between $150 and $200, there are at least three SSDs that would fit the rough 
specifications for the L2ARC on my system:

1. Crucial C300, 64 GB: $150: medium performance, medium capacity.
2. OCZ Vertex 2, 50 GB: $180: higher performance, lower capacity. (The Agility 
2 is similar, but $15 cheaper)
3. Corsair Force 60 GB, $195: similar performance, slightly higher capacity 
(more over-provisioning with the same SandForce controller).
4. Intel X25M G2, 80 GB: $200: largest capacity, probably lowest(?) performance.

So which would be the best choice for L2ARC? Is it size, or is it throughput, 
that really matters for this?

Within this range, price doesn't make much difference. Thanks, as always, for 
the guidance.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully

2010-06-30 Thread Garrett D'Amore
On Wed, 2010-06-30 at 22:28 +0200, Ragnar Sundblad wrote:

> To be safe, the protocol needs to be able to discover that the devices
> (host or disk) have been disconnected and reconnected or have been reset
> and that each side's assumptions about the state of the other have to
> be invalidated.
> 
> I don't know enough about either SAS or SATA to say if they guarantee that
> you will be notified. But if they don't, they aren't safe for cached writes.

Generally, ZFS will only notice a removed disk when it is trying to
write to it -- or when it probes.  ZFS does not necessarily get notified
on hot device removal -- certainly not immediately.  (I've written some
code so that it *will* notice, even if no write ever goes there... that's
the topic of another message.)

The other thing is that disk writes are generally idempotent.  So, if a
drive was removed between the time an IO was finished but before the
time the response was returned to the host, it isn't a problem.   When
the disk is returned, ZFS should automatically retry the I/O.  (In fact,
ZFS automatically retries failed I/O operations several times before
finally "failing".)

The nasty race that occurs is if your system crashes or is powered off
*after* the log has acknowledged the write, but before the bits get
shoved to main pool storage.  This is a data loss situation.

But assuming you don't take a system crash or some other fault, I would
guess that removal of a log device and reinsertion would not cause any
problems.  (Except for possibly delaying synchronous writes.)

That said, I've not actually *tested* it.

- Garrett

> 
> /ragge
> 
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?

2010-06-30 Thread valrh...@gmail.com
Thanks to everyone for such helpful and detailed answers. Contrary to some of 
the trolls in other threads, I've had a fantastic experience here, and am 
grateful to the community.

Based on the feedback, I'll upgrade my machine to 8 GB of RAM. I only have two 
slots on the motherboard, so I can either add two 2 GB DIMMs to the two I 
have there, or throw those away and start over with 4 GB DIMMs, which is not 
something I'm quite ready to do yet (before this is all working, for instance).

Now, for the SSD, Crucial appears to have their (recommended above) C300 64 GB 
drive for $150, which seems like a good deal. Intel's X25M G2 is $200 for 80 
GB. Does anyone have a strong opinion as to which would work better for the 
L2ARC? I am having a hard time understanding, from the performance numbers 
given, which would be a better choice.

Finally, for my purposes, it doesn't seem like a ZIL is necessary? I'm the only 
user of the fileserver, so there probably won't be more than two or three 
computers, maximum, accessing stuff (and writing stuff) remotely.

But, from what I can gather, by spending a little under $400, I should 
substantially increase the performance of my system with dedup? Many thanks, 
again, in advance.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully

2010-06-30 Thread Ragnar Sundblad

On 12 apr 2010, at 22.32, Carson Gaspar wrote:

> Carson Gaspar wrote:
>> Miles Nordin wrote:
 "re" == Richard Elling  writes:
>>> How do you handle the case when a hotplug SATA drive is powered off
>>> unexpectedly with data in its write cache?  Do you replay the writes, or do 
>>> they go down the ZFS hotplug write hole?
>> If zfs never got a positive response to a cache flush, that data is still in 
>> memory and will be re-written. Unless I greatly misunderstand how ZFS 
>> works...
>> If the drive _lies_ about a cache flush, you're screwed (well, you can 
>> probably roll back a few TXGs...). Don't buy broken drives / bridge chipsets.
> 
> Hrm... thinking about this some more, I'm not sure what happens if the drive 
> comes _back_ after a power loss, quickly enough that ZFS is never told about 
> the disappearance (assuming that can happen without a human cfgadm'ing it 
> back online - I don't know).
> 
> Does anyone who understands the internals better than I do care to take a stab at 
> what happens if:
> 
> - ZFS writes data to /dev/foo
> - /dev/foo looses power and the data from the above write, not yet flushed to 
> rust (say a field tech pulls the wrong drive...)
> - /dev/foo powers back on (field tech quickly goes whoops and plugs it back 
> in)
> 
> In the case of a redundant zpool config, when will ZFS notice the uberblocks 
> are out of sync and repair? If this is a non-redundant zpool, how does the 
> response differ?

To be safe, the protocol needs to be able to discover that the devices
(host or disk) have been disconnected and reconnected or have been reset
and that each side's assumptions about the state of the other have to
be invalidated.

I don't know enough about either SAS or SATA to say if they guarantee that
you will be notified. But if they don't, they aren't safe for cached writes.

/ragge

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Forcing resilver?

2010-06-30 Thread Roy Sigurd Karlsbakk
> This may not work for you, but it worked for me, and I was pleasantly
> surprised. Replace a drive with itself.
> 
> zpool replace tank c0t2d0 c0t2d0

I tried that, but it didn't work. I replaced the drive with a new one, which 
worked, and then I made a new zpool on the old drive with zfs-fuse in Linux, 
destroyed that pool, and put the old drive in the "cold spares" drawer.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. 
It is an elementary imperative for all pedagogues to avoid excessive use of 
idioms of foreign origin. In most cases adequate and relevant synonyms exist 
in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on Ubuntu

2010-06-30 Thread Roy Sigurd Karlsbakk
- Original Message -
> I think zfs on ubuntu currently is a rather bad idea. See test below
> with ubuntu Lucid 10.04 (amd64)

Quick update on this: it seems this is due to a bug in the Linux kernel, where 
it can't deal with partition-table changes on a drive with mounted filesystems. 
I'm not 100% sure about this, but it still looks that way. Testing with disks 
whose filesystems are not mounted shows this works better.
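
For what it's worth, when nothing on the disk is mounted you can usually get
the kernel to pick up a changed partition table with one of these (device name
is just an example):

blockdev --rereadpt /dev/sdb
partprobe /dev/sdb

Both refuse with a "busy" error if any partition on the disk is still mounted,
which fits the behaviour described above.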

Just my two cents

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. 
It is an elementary imperative for all pedagogues to avoid excessive use of 
idioms of foreign origin. In most cases adequate and relevant synonyms exist 
in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Announce: zfsdump

2010-06-30 Thread Asif Iqbal
On Wed, Jun 30, 2010 at 12:54 PM, Edward Ned Harvey
 wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Asif Iqbal
>>
>> Would be nice if I could pipe the zfs send stream to split and then
>> send those split streams over the
>> network to a remote system. It would help send it over to the remote
>> system quicker. Can your tool do that?
>
> Does that make sense?  I assume the network is the bottleneck; the only way
> the multiple streams would go any faster than a single stream would be
> because you're multithreading and hogging all the bandwidth for yourself,
> instead of sharing fairly with the httpd or whatever other server is trying
> to use the bandwidth.

Currently, to speed up the zfs send | zfs recv, I am using mbuffer. It moves
the data a lot faster than using netcat (or ssh) as the transport method.

That is why I thought transporting it the way axel does might work better than
wget: axel lets you open multiple connections, so you get the data several
times faster than with wget.
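
For reference, the mbuffer setup I'm using looks roughly like this (host name,
port, and dataset names are just examples):

# on the receiving host:
mbuffer -s 128k -m 1G -I 9090 | zfs receive -F backup/tank

# on the sending host:
zfs send -R tank@snap | mbuffer -s 128k -m 1G -O recvhost:9090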


>
> If you're talking about streaming to a bunch of separate tape drives (or
> whatever) on a bunch of separate systems because the recipient storage is
> the bottleneck instead of the network ... then "split" probably isn't the
> most useful way to distribute those streams.  Because "split" is serial.
> You would really want to "stripe" your data to all those various
> destinations, so they could all be writing simultaneously.  But this seems
> like a very specialized scenario, that I think is probably very unusual.
>
>



-- 
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Forcing resilver?

2010-06-30 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk
> 
> There was some mix-up with switching of drives and an unexpected
> reboot, so I suddenly have a drive in my pool that is partly
> resilvered. zfs status shows the pool is fine, but after a scrub, it
> shows the drive faulted. I've been told on #opensolaris that making a
> new pool on the drive and then destroying that pool, and then zpool
> replace the drive will help, or moving the drive out and putting
> another filesystem on it, then replacing it in the pool, might help.
> But then, is it possible to forcibly resilver a drive without this
> hassle?

This may not work for you, but it worked for me, and I was pleasantly 
surprised.  Replace a drive with itself.

zpool replace tank c0t2d0 c0t2d0
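
You can then watch the resilver from the zpool status output for that pool,
e.g.:

zpool status -v tank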

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Announce: zfsdump

2010-06-30 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Asif Iqbal
> 
> Would be nice if I could pipe the zfs send stream to split and then
> send those split streams over the
> network to a remote system. It would help send it over to the remote
> system quicker. Can your tool do that?

Does that make sense?  I assume the network is the bottleneck; the only way
the multiple streams would go any faster than a single stream would be
because you're multithreading and hogging all the bandwidth for yourself,
instead of sharing fairly with the httpd or whatever other server is trying
to use the bandwidth.

If you're talking about streaming to a bunch of separate tape drives (or
whatever) on a bunch of separate systems because the recipient storage is
the bottleneck instead of the network ... then "split" probably isn't the
most useful way to distribute those streams.  Because "split" is serial.
You would really want to "stripe" your data to all those various
destinations, so they could all be writing simultaneously.  But this seems
like a very specialized scenario, that I think is probably very unusual.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully

2010-06-30 Thread Ray Van Dolson
On Wed, Jun 30, 2010 at 09:47:15AM -0700, Edward Ned Harvey wrote:
> > From: Arne Jansen [mailto:sensi...@gmx.net]
> > 
> > Edward Ned Harvey wrote:
> > > Due to recent experiences, and discussion on this list, my colleague
> > and
> > > I performed some tests:
> > >
> > > Using solaris 10, fully upgraded.  (zpool 15 is latest, which does
> > not
> > > have log device removal that was introduced in zpool 19)  In any way
> > > possible, you lose an unmirrored log device, and the OS will crash,
> > and
> > > the whole zpool is permanently gone, even after reboots.
> > >
> > 
> > I'm a bit confused. I tried hard, but haven't been able to reproduce
> > this
> > using Sol10U8. I have a mirrored slog device. While putting it
> > under load doing synchronous file creations, we pulled the power cords
> > and unplugged the slog devices. After powering on zfs imported the
> > pool,
> > but prompted to acknowledge the missing slog devices with zpool clear.
> > After that the pool was accessible again. That's exactly how it should
> > be.
> 
> Very interesting.  I did this test some months ago, so I may not recall the
> relevant details, but here are the details I do remember:
> 
> I don't recall if I did this test on osol2009.06, or sol10.
> 
> In Sol10u6 (and I think Sol10u8) the default zpool version is 10, but if you
> apply all your patches, then 15 becomes available.  I am sure that I've
> never upgraded any of my sol10 zpools higher than 10.  So it could be that
> an older zpool version might exhibit the problem, and you might be using a
> newer version.
> 
> In osol2009.06, IIRC, the default is zpool 14, and if you upgrade fully,
> you'll get to something around 24.  So again, it's possible the bad behavior
> went away in zpool 15, or any other number from 11 to 15.
> 
> I'll leave it there for now.  If that doesn't shed any light, I'll try to
> dust out some more of my mental cobwebs.

Has anyone else done any testing with zpool version 15 (on Solaris 10 U8)?
I have a new system coming in shortly and will test myself, but knowing
this is a recoverable scenario would help me rest easier, as I still have an
unmirrored slog setup hanging around.
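
If it helps anyone reproduce this, the rough test I have in mind is something
like the following on a scratch box (device names are placeholders, not a
recommendation for production):

zpool create testpool c1t1d0 log c1t2d0
# generate synchronous writes (e.g. lots of small file creates from an NFS
# client), then yank power to the slog device
zpool status -v testpool
zpool clear testpool      # acknowledge the missing slog if prompted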

Ray
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully

2010-06-30 Thread Edward Ned Harvey
> From: Arne Jansen [mailto:sensi...@gmx.net]
> 
> Edward Ned Harvey wrote:
> > Due to recent experiences, and discussion on this list, my colleague
> and
> > I performed some tests:
> >
> > Using solaris 10, fully upgraded.  (zpool 15 is latest, which does
> not
> > have log device removal that was introduced in zpool 19)  In any way
> > possible, you lose an unmirrored log device, and the OS will crash,
> and
> > the whole zpool is permanently gone, even after reboots.
> >
> 
> I'm a bit confused. I tried hard, but haven't been able to reproduce
> this
> using Sol10U8. I have a mirrored slog device. While putting it
> under load doing synchronous file creations, we pulled the power cords
> and unplugged the slog devices. After powering on zfs imported the
> pool,
> but prompted to acknowledge the missing slog devices with zpool clear.
> After that the pool was accessible again. That's exactly how it should
> be.

Very interesting.  I did this test some months ago, so I may not recall the
relevant details, but here are the details I do remember:

I don't recall if I did this test on osol2009.06, or sol10.

In Sol10u6 (and I think Sol10u8) the default zpool version is 10, but if you
apply all your patches, then 15 becomes available.  I am sure that I've
never upgraded any of my sol10 zpools higher than 10.  So it could be that
an older zpool version might exhibit the problem, and you might be using a
newer version.

In osol2009.06, IIRC, the default is zpool 14, and if you upgrade fully,
you'll get to something around 24.  So again, it's possible the bad behavior
went away in zpool 15, or any other number from 11 to 15.
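
For anyone checking where they stand, the versions are easy to confirm (pool
name is just an example):

zpool upgrade            # lists pools running an older on-disk version
zpool upgrade -v         # lists every version this ZFS build supports
zpool get version tank   # shows the version a specific pool is on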

I'll leave it there for now.  If that doesn't shed any light, I'll try to
dust out some more of my mental cobwebs.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs-discuss Digest, Vol 56, Issue 126

2010-06-30 Thread Bob Friesenhahn
I searched and searched but was not able to find your added text in 
this long quoted message.  Please re-submit using the English language 
in simple ASCII text intended for humans.


Thanks,

Bob

On Wed, 30 Jun 2010, Eric Andersen wrote:



On Jun 28, 2010, at 10:03 AM, zfs-discuss-requ...@opensolaris.org wrote:


Send zfs-discuss mailing list submissions to
zfs-discuss@opensolaris.org

To subscribe or unsubscribe via the World Wide Web, visit
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
or, via email, send a message with subject or body 'help' to
zfs-discuss-requ...@opensolaris.org

You can reach the person managing the list at
zfs-discuss-ow...@opensolaris.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of zfs-discuss digest..."


Today's Topics:

  1. Re: ZFS bug - should I be worried about this? (Gabriele Bulfon)
  2. Re: ZFS bug - should I be worried about this? (Victor Latushkin)
  3. Re: OCZ Vertex 2 Pro performance numbers (Frank Cusack)
  4. Re: ZFS bug - should I be worried about this? (Garrett D'Amore)
  5. Announce: zfsdump (Tristram Scott)
  6. Re: Announce: zfsdump (Brian Kolaci)
  7. Re: zpool import hangs indefinitely (retry post in parts; too
 long?) (Andrew Jones)
  8. Re: Announce: zfsdump (Tristram Scott)
  9. Re: Announce: zfsdump (Brian Kolaci)


--

Message: 1
Date: Mon, 28 Jun 2010 05:16:00 PDT
From: Gabriele Bulfon 
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS bug - should I be worried about this?
Message-ID: <593812734.121277727391600.javamail.tweb...@sf-app1>
Content-Type: text/plain; charset=UTF-8

Yes...they're still running...but being aware that a power failure causing an 
unexpected poweroff may make the pool unreadable is a pain

Yes. Patches should be available.
Or adoption may be lowering a lot...
--
This message posted from opensolaris.org


--

Message: 2
Date: Mon, 28 Jun 2010 18:14:12 +0400
From: Victor Latushkin 
To: Gabriele Bulfon 
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS bug - should I be worried about this?
Message-ID: <4c28ae34.1030...@sun.com>
Content-Type: text/plain; CHARSET=US-ASCII; format=flowed

On 28.06.10 16:16, Gabriele Bulfon wrote:

Yes...they're still running...but being aware that a power failure causing an
unexpected poweroff may make the pool unreadable is a pain


Pool integrity is not affected by this issue.



--

Message: 3
Date: Mon, 28 Jun 2010 07:26:45 -0700
From: Frank Cusack 
To: 'OpenSolaris ZFS discuss' 
Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
Message-ID: <5f1b59775f3ffc0e1781f...@cusack.local>
Content-Type: text/plain; charset=us-ascii; format=flowed

On 6/26/10 9:47 AM -0400 David Magda wrote:

Crickey. Who's the genius who thinks of these URLs?


SEOs


--

Message: 4
Date: Mon, 28 Jun 2010 08:17:21 -0700
From: "Garrett D'Amore" 
To: Gabriele Bulfon 
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS bug - should I be worried about this?
Message-ID: <1277738241.5596.4325.ca...@velocity>
Content-Type: text/plain; charset="UTF-8"

On Mon, 2010-06-28 at 05:16 -0700, Gabriele Bulfon wrote:

Yes...they're still running...but being aware that a power failure causing an 
unexpected poweroff may make the pool unreadable is a pain

Yes. Patches should be available.
Or adoption may be lowering a lot...



I don't have access to the information, but if this problem is the same
one I think it is, then the pool does not become unreadable.  Rather,
its state after such an event represents a *consistent* state from some
point of time *earlier* than that confirmed fsync() (or a write on a
file opened with O_SYNC or O_DSYNC).

For most users, this is not a critical failing.  For users using
databases or requiring transactional integrity for data stored on ZFS,
then yes, this is a very nasty problem indeed.

I suspect that this is the problem I reported earlier in my blog
(http://gdamore.blogspot.com) about certain kernels having O_SYNC and
O_DSYNC problems.  I can't confirm this though, because I don't have
access to the SunSolve database to read the report.

(This is something I'll have to check into fixing... it seems like my
employer ought to have access to that information...)

- Garrett



--

Message: 5
Date: Mon, 28 Jun 2010 08:26:02 PDT
From: Tristram Scott 
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Announce: zfsdump
Message-ID: <311835455.361277738793747.javamail.tweb...@sf-app1>
Content-Type: text/plain; charset=UTF-8

For quite some time I have been using zfs send -R fsn...@snapname | dd 
of=/dev/rmt/1ln to make a tape backup of my zfs file system.  A few weeks back 
the size of the file system grew to larger than would fit on a single DAT72 
tape, and I once ag

Re: [zfs-discuss] zfs-discuss Digest, Vol 56, Issue 126

2010-06-30 Thread Eric Andersen

On Jun 28, 2010, at 10:03 AM, zfs-discuss-requ...@opensolaris.org wrote:

> Send zfs-discuss mailing list submissions to
>   zfs-discuss@opensolaris.org
> 
> To subscribe or unsubscribe via the World Wide Web, visit
>   http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> or, via email, send a message with subject or body 'help' to
>   zfs-discuss-requ...@opensolaris.org
> 
> You can reach the person managing the list at
>   zfs-discuss-ow...@opensolaris.org
> 
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of zfs-discuss digest..."
> 
> 
> Today's Topics:
> 
>   1. Re: ZFS bug - should I be worried about this? (Gabriele Bulfon)
>   2. Re: ZFS bug - should I be worried about this? (Victor Latushkin)
>   3. Re: OCZ Vertex 2 Pro performance numbers (Frank Cusack)
>   4. Re: ZFS bug - should I be worried about this? (Garrett D'Amore)
>   5. Announce: zfsdump (Tristram Scott)
>   6. Re: Announce: zfsdump (Brian Kolaci)
>   7. Re: zpool import hangs indefinitely (retry post in parts; too
>  long?) (Andrew Jones)
>   8. Re: Announce: zfsdump (Tristram Scott)
>   9. Re: Announce: zfsdump (Brian Kolaci)
> 
> 
> --
> 
> Message: 1
> Date: Mon, 28 Jun 2010 05:16:00 PDT
> From: Gabriele Bulfon 
> To: zfs-discuss@opensolaris.org
> Subject: Re: [zfs-discuss] ZFS bug - should I be worried about this?
> Message-ID: <593812734.121277727391600.javamail.tweb...@sf-app1>
> Content-Type: text/plain; charset=UTF-8
> 
> Yes...they're still running...but being aware that a power failure causing an 
> unexpected poweroff may make the pool unreadable is a pain
> 
> Yes. Patches should be available.
> Or adoption may be lowering a lot...
> -- 
> This message posted from opensolaris.org
> 
> 
> --
> 
> Message: 2
> Date: Mon, 28 Jun 2010 18:14:12 +0400
> From: Victor Latushkin 
> To: Gabriele Bulfon 
> Cc: zfs-discuss@opensolaris.org
> Subject: Re: [zfs-discuss] ZFS bug - should I be worried about this?
> Message-ID: <4c28ae34.1030...@sun.com>
> Content-Type: text/plain; CHARSET=US-ASCII; format=flowed
> 
> On 28.06.10 16:16, Gabriele Bulfon wrote:
>> Yes...they're still running...but being aware that a power failure causing an
>> unexpected poweroff may make the pool unreadable is a pain
> 
> Pool integrity is not affected by this issue.
> 
> 
> 
> --
> 
> Message: 3
> Date: Mon, 28 Jun 2010 07:26:45 -0700
> From: Frank Cusack 
> To: 'OpenSolaris ZFS discuss' 
> Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
> Message-ID: <5f1b59775f3ffc0e1781f...@cusack.local>
> Content-Type: text/plain; charset=us-ascii; format=flowed
> 
> On 6/26/10 9:47 AM -0400 David Magda wrote:
>> Crickey. Who's the genius who thinks of these URLs?
> 
> SEOs
> 
> 
> --
> 
> Message: 4
> Date: Mon, 28 Jun 2010 08:17:21 -0700
> From: "Garrett D'Amore" 
> To: Gabriele Bulfon 
> Cc: zfs-discuss@opensolaris.org
> Subject: Re: [zfs-discuss] ZFS bug - should I be worried about this?
> Message-ID: <1277738241.5596.4325.ca...@velocity>
> Content-Type: text/plain; charset="UTF-8"
> 
> On Mon, 2010-06-28 at 05:16 -0700, Gabriele Bulfon wrote:
>> Yes...they're still running...but being aware that a power failure causing 
>> an unexpected poweroff may make the pool unreadable is a pain
>> 
>> Yes. Patches should be available.
>> Or adoption may be lowering a lot...
> 
> 
> I don't have access to the information, but if this problem is the same
> one I think it is, then the pool does not become unreadable.  Rather,
> its state after such an event represents a *consistent* state from some
> point of time *earlier* than that confirmed fsync() (or a write on a
> file opened with O_SYNC or O_DSYNC).
> 
> For most users, this is not a critical failing.  For users using
> databases or requiring transactional integrity for data stored on ZFS,
> then yes, this is a very nasty problem indeed.
> 
> I suspect that this is the problem I reported earlier in my blog
> (http://gdamore.blogspot.com) about certain kernels having O_SYNC and
> O_DSYNC problems.  I can't confirm this though, because I don't have
> access to the SunSolve database to read the report.
> 
> (This is something I'll have to check into fixing... it seems like my
> employer ought to have access to that information...)
> 
>   - Garrett
> 
> 
> 
> --
> 
> Message: 5
> Date: Mon, 28 Jun 2010 08:26:02 PDT
> From: Tristram Scott 
> To: zfs-discuss@opensolaris.org
> Subject: [zfs-discuss] Announce: zfsdump
> Message-ID: <311835455.361277738793747.javamail.tweb...@sf-app1>
> Content-Type: text/plain; charset=UTF-8
> 
> For quite some time I have been using zfs send -R fsn...@snapname | dd 
> of=/dev/rmt/1ln to make a tape backup of my zfs file system.  A few weeks 
> back the size of the file system grew to larger than would fit on a single 
> 

Re: [zfs-discuss] ZFS on Caviar Blue (Hard Drive Recommendations)

2010-06-30 Thread Freddie Cash
On Tue, Jun 29, 2010 at 11:25 AM, Patrick Donnelly  wrote:
> I googled around but couldn't find anything on whether someone has
> good or bad experiences with the Caviar *Blue* drives? I saw in the
> archives Caviar Blacks are *not* recommended for ZFS arrays (excluding
> apparently RE3 and RE4?). Specifically I'm looking to buy Western
> Digital Caviar Blue WD10EALS 1TB drives [1]. Does anyone have any
> experience with these drives?

We use a mix of WD Caviar Blue 500 GB, Caviar Black 500 GB, and RE2
500 GB drives in one of our storage servers without any issues.
Attached to 3Ware 9550SXU and 9650SE RAID controllers, configured as
Single Drive arrays.

There are also 8 WD Caviar Green 1.5 TB drives in there, which are not
very good (even after twiddling the idle timeout setting via wdidle3).
Definitely avoid the Green/GP line of drives.

-- 
Freddie Cash
fjwc...@gmail.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What are requirements for "zpool split" ?

2010-06-30 Thread Cindy Swearingen

Hey Mitch,

The zpool split feature is available in the OpenSolaris release if
you upgrade to build 131.

You can read about the requirements here:

http://hub.opensolaris.org/bin/view/Community+Group+zfs/docs

See the ZFS Admin Guide, page 89-90
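
The short version is that the pool has to be made up of mirrored top-level
vdevs; the split detaches one side of each mirror into a new pool. A minimal
sketch (pool names are examples):

zpool split -n tank tank2   # dry run: show the configuration that would result
zpool split tank tank2      # detach one half of each mirror into "tank2"
zpool import tank2          # the new pool is left exported until you import it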

Thanks,

Cindy

On 06/29/10 13:37, Mitchell Petty wrote:

Hi,

Is "zpool split" available ? If not when will it be ? If it is what
are the prerequisites ?

Thanks In Advance ,
Mitch

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs rpool corrupt?????

2010-06-30 Thread Tony MacDoodle
Hello,

Has anyone encountered the following error message, running Solaris 10 u8 in
an LDom?

bash-3.00# devfsadm
devfsadm: write failed for /dev/.devfsadm_dev.lock: Bad exchange descriptor


bash-3.00# zpool status -v rpool
pool: rpool
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: scrub in progress for 0h1m, 22.57% done, 0h5m to go
config:

NAME        STATE     READ WRITE CKSUM
rpool       DEGRADED     0     0    17
  c0d0s0    DEGRADED     0     0    34  too many errors

errors: Permanent errors have been detected in the following files:

//dev/.devfsadm_dev.lock
//var/svc/log/system-tsol-zones:default.log
//var/svc/log/system-labeld:default.log
//var/svc/log/system-filesystem-volfs:default.log

Thanks
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Permanent errors detected in :<0x13>

2010-06-30 Thread W Brian Leonard
Well, I was doing a ZFS send / receive to back up a large (60 GB) chunk of 
data, which never completed. A zpool clear at that point just hung and I 
had to reboot the system, after which it appeared to come up clean. As 
soon as I tried the backup again I noticed the pool reported the error 
you see below - but the backup did complete, as the pool remained online.


Thanks for your help Cindy,
Brian

Cindy Swearingen wrote:



I reviewed the zpool clear syntax (looking at my own docs) and didn't
remember that a one-device pool probably doesn't need the device
specified. For pools with many devices, you might want to just clear
the errors on a particular device.

USB sticks for pools are problematic. It would be good to know what
caused these errors so we can try to prevent them in the future.

We know that USB devices don't generate/fabricate device IDs, so they
are prone to problems when moving/changing/re-inserting, but without
more info, it's hard to tell what happened.

cs

On 06/29/10 14:13, W Brian Leonard wrote:
Interesting, this time it worked! Does specifying the device to clear 
cause the command to behave differently? I had assumed w/out the 
device specification, the clear would just apply to all devices in 
the pool (which are just the one).


Thanks,
Brian

Cindy Swearingen wrote:

Hi Brian,

Because the pool is still online and the metadata is redundant, maybe
these errors were caused by a brief hiccup from the USB device's
physical connection. You might try:

# zpool clear external c0t0d0p0

Then, run a scrub:

# zpool scrub external

If the above fails, then please identify the Solaris release and what
events preceded this problem.

Thanks,

Cindy




On 06/29/10 11:15, W Brian Leonard wrote:

Hi Cindy,

The scrub didn't help and yes, this is an external USB device.

Thanks,
Brian

Cindy Swearingen wrote:

Hi Brian,

You might try running a scrub on this pool.

Is this an external USB device?

Thanks,

Cindy

On 06/29/10 09:16, Brian Leonard wrote:

Hi,

I have a zpool which is currently reporting that the 
":<0x13>" file is corrupt:


bleon...@opensolaris:~$ pfexec zpool status -xv external
  pool: external
 state: ONLINE
status: One or more devices has experienced an error resulting in 
data

corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise 
restore the

entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
externalONLINE   0 0 0
  c0t0d0p0  ONLINE   0 0 0

errors: Permanent errors have been detected in the following files:

:<0x13>

Otherwise, as you can see, the pool is online. As it's unclear to 
me how to restore the ":<0x13>" file, is my only option 
for correcting this error to destroy and recreate the pool?


Thanks,
Brian






--
W Brian Leonard
Principal Product Manager
860.206.6093
http://blogs.sun.com/observatory

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Permanent errors detected in :<0x13>

2010-06-30 Thread W Brian Leonard
Interesting, this time it worked! Does specifying the device to clear 
cause the command to behave differently? I had assumed that without the device 
specification, the clear would just apply to all devices in the pool 
(which is just the one).


Thanks,
Brian

Cindy Swearingen wrote:

Hi Brian,

Because the pool is still online and the metadata is redundant, maybe
these errors were caused by a brief hiccup from the USB device's
physical connection. You might try:

# zpool clear external c0t0d0p0

Then, run a scrub:

# zpool scrub external

If the above fails, then please identify the Solaris release and what
events preceded this problem.

Thanks,

Cindy




On 06/29/10 11:15, W Brian Leonard wrote:

Hi Cindy,

The scrub didn't help and yes, this is an external USB device.

Thanks,
Brian

Cindy Swearingen wrote:

Hi Brian,

You might try running a scrub on this pool.

Is this an external USB device?

Thanks,

Cindy

On 06/29/10 09:16, Brian Leonard wrote:

Hi,

I have a zpool which is currently reporting that the 
":<0x13>" file is corrupt:


bleon...@opensolaris:~$ pfexec zpool status -xv external
  pool: external
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise 
restore the

entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
externalONLINE   0 0 0
  c0t0d0p0  ONLINE   0 0 0

errors: Permanent errors have been detected in the following files:

:<0x13>

Otherwise, as you can see, the pool is online. As it's unclear to 
me how to restore the ":<0x13>" file, is my only option 
for correcting this error to destroy and recreate the pool?


Thanks,
Brian




--
W Brian Leonard
Principal Product Manager
860.206.6093
http://blogs.sun.com/observatory

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] What are requirements for "zpool split" ?

2010-06-30 Thread Mitchell Petty

Hi,

    Is "zpool split" available ? If not when will it be ? If it is what

are the prerequisites ?

Thanks In Advance ,
Mitch

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS on Caviar Blue (Hard Drive Recommendations)

2010-06-30 Thread Patrick Donnelly
Hi list,

I googled around but couldn't find anything on whether someone has
good or bad experiences with the Caviar *Blue* drives? I saw in the
archives Caviar Blacks are *not* recommended for ZFS arrays (excluding
apparently RE3 and RE4?). Specifically I'm looking to buy Western
Digital Caviar Blue WD10EALS 1TB drives [1]. Does anyone have any
experience with these drives?

If this is the wrong way to go, does anyone have a recommendation for
1TB drives I can get for <= 90$?

[1] http://www.wdc.com/en/products/products.asp?driveid=793

Thanks for any help,

-- 
- Patrick Donnelly
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Permanent errors detected in :<0x13>

2010-06-30 Thread W Brian Leonard

Hi Cindy,

The scrub didn't help and yes, this is an external USB device.

Thanks,
Brian

Cindy Swearingen wrote:

Hi Brian,

You might try running a scrub on this pool.

Is this an external USB device?

Thanks,

Cindy

On 06/29/10 09:16, Brian Leonard wrote:

Hi,

I have a zpool which is currently reporting that the 
":<0x13>" file is corrupt:


bleon...@opensolaris:~$ pfexec zpool status -xv external
  pool: external
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
externalONLINE   0 0 0
  c0t0d0p0  ONLINE   0 0 0

errors: Permanent errors have been detected in the following files:

:<0x13>

Otherwise, as you can see, the pool is online. As it's unclear to me 
how to restore the ":<0x13>" file, is my only option for 
correcting this error to destroy and recreate the pool?


Thanks,
Brian


--
W Brian Leonard
Principal Product Manager
860.206.6093
http://blogs.sun.com/observatory

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel Panic on zpool clean

2010-06-30 Thread George
> Please try 
> 
> zdb -U /dev/null -ebcsv storage2

r...@crypt:~# zdb -U /dev/null -ebcsv storage2
zdb: can't open storage2: No such device or address

If I try

r...@crypt:~# zdb -C storage2

Then it prints what appears to be a valid configuration but then the same error 
message about being unable to find the device (output attached).

George
-- 
This message posted from opensolaris.org

r...@crypt:~# zdb -C storage2
version=14
name='storage2'
state=0
txg=1807366
pool_guid=14701046672203578408
hostid=8522651
hostname='crypt'
vdev_tree
type='root'
id=0
guid=14701046672203578408
children[0]
type='raidz'
id=0
guid=15861342641545291969
nparity=1
metaslab_array=14
metaslab_shift=35
ashift=9
asize=3999672565760
is_log=0
children[0]
type='disk'
id=0
guid=14390766171745861103
path='/dev/dsk/c9t4d2s0'
devid='id1,s...@n600d0230006c8a5f0c3fd863ea736d00/a'

phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1/s...@4,2:a'
whole_disk=1
DTL=301
children[1]
type='disk'
id=1
guid=14806610527738068493
path='/dev/dsk/c9t4d3s0'
devid='id1,s...@n600d0230006c8a5f0c3fd8514ed8d900/a'

phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1/s...@4,3:a'
whole_disk=1
DTL=300
children[2]
type='disk'
id=2
guid=4272121319363331595
path='/dev/dsk/c10t4d2s0'
devid='id1,s...@n600d0230006c8a5f0c3fd84312aa6d00/a'

phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1,1/s...@4,2:a'
whole_disk=1
DTL=299
children[3]
type='disk'
id=3
guid=16286569401176941639
path='/dev/dsk/c10t4d4s0'
devid='id1,s...@n600d0230006c8a5f0c3fd8415c62ae00/a'

phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1,1/s...@4,4:a'
whole_disk=1
DTL=296
children[1]
type='raidz'
id=1
guid=12601468074885676119
nparity=1
metaslab_array=172
metaslab_shift=35
ashift=9
asize=3999672565760
is_log=0
children[0]
type='disk'
id=0
guid=7040280703157905854
path='/dev/dsk/c10t4d0s0'
devid='id1,s...@n600d0230006c8a5f0c3fd83eda0a4a00/a'

phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1,1/s...@4,0:a'
whole_disk=1
DTL=305
children[1]
type='replacing'
id=1
guid=16928413524184799719
whole_disk=0
children[0]
type='disk'
id=0
guid=9102173991259789741
path='/dev/dsk/c9t4d0s0'

devid='id1,s...@n600d0230006c8a5f0c3fd86eee69a300/a'

phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1/s...@4,0:a'
whole_disk=1
DTL=304
children[1]
type='disk'
id=1
guid=16888611779137638814
path='/dev/dsk/c9t4d4s0'

devid='id1,s...@n600d0230006c8a5f0c3fd8612edc7d00/a'

phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1/s...@4,4:a'
whole_disk=1
DTL=321
children[2]
type='disk'
id=2
guid=4025009484028197162
path='/dev/dsk/c10t4d1s0'
devid='id1,s...@n600d0230006c8a5f0c3fd8609d147700/a'

phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1,1/s...@4,1:a'
whole_disk=1
DTL=303
children[3

Re: [zfs-discuss] Kernel Panic on zpool clean

2010-06-30 Thread Victor Latushkin

On Jun 30, 2010, at 10:48 AM, George wrote:

>> I suggest you to try running 'zdb -bcsv storage2' and
>> show the result.
> 
> r...@crypt:/tmp# zdb -bcsv storage2
> zdb: can't open storage2: No such device or address
> 
> then I tried
> 
> r...@crypt:/tmp# zdb -ebcsv storage2
> zdb: can't open storage2: File exists

Please try 

zdb -U /dev/null -ebcsv storage2
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss