Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
On Thu, Aug 2, 2012 at 3:39 PM, Richard Elling wrote:
> On Aug 1, 2012, at 8:30 AM, Nigel W wrote:
>
>> Yes. +1
>>
>> The L2ARC as it is currently implemented is not terribly useful for
>> storing the DDT anyway, because each DDT entry is 376 bytes but the
>> L2ARC reference is 176 bytes. Best case, you get just over double the
>> DDT entries in the L2ARC compared with what you would get into the ARC,
>> but then you also have no ARC left for anything else :(.
>
> You are making the assumption that each DDT table entry consumes one
> metadata update. This is not the case. The DDT is implemented as an AVL
> tree. As per other metadata in ZFS, the data is compressed. So you cannot
> make a direct correlation between the DDT entry size and the effect on
> the stored metadata in disk sectors.
> -- richard

It's compressed even when in the ARC?
Re: [zfs-discuss] unable to import the zpool
Hi, can you post the output of "zpool history"?

Regards

Sent from my iPad

On Aug 2, 2012, at 7:47, Suresh Kumar wrote:
> Hi Hung-sheng,
>
> It is not displaying any output, like the following:
>
> bash-3.2# zpool import -nF tXstpool
> bash-3.2#
>
> Thanks & Regards,
> Suresh.
Re: [zfs-discuss] unable to import the zpool
My experience has always been that ZFS tries hard to keep you from doing something wrong when devices are failing or otherwise unavailable. With mirrors, it will import with a device missing from a mirror vdev. I don't use cache or log devices in my main storage pools, so I've not seen a failure with a "required" device like that missing. But I have seen problems with a raid-z device missing and the pool not coming online.

As Richard says, it would seem there is a cache or log vdev missing, since the status shows only one of the two mirrored devices in that vdev as faulted, yet the pool is still complaining about a missing device. The older OS and ZFS version may in fact misbehave because some error condition is not being handled correctly.

Gregg Wonderly

On Aug 2, 2012, at 4:49 PM, Richard Elling wrote:
> On Aug 1, 2012, at 12:21 AM, Suresh Kumar wrote:
>> Dear ZFS-Users,
>>
>> I am using Solaris x86 10u10. All the devices which belong to my zpool
>> are in the available state, but I am unable to import the zpool.
>>
>> #zpool import tXstpool
>> cannot import 'tXstpool': one or more devices is currently unavailable
>> ==
>> bash-3.2# zpool import
>>   pool: tXstpool
>>     id: 13623426894836622462
>>  state: UNAVAIL
>> status: One or more devices are missing from the system.
>> action: The pool cannot be imported. Attach the missing
>>         devices and try again.
>>    see: http://www.sun.com/msg/ZFS-8000-6X
>> config:
>>
>>         tXstpool                     UNAVAIL  missing device
>>           mirror-0                   DEGRADED
>>             c2t210100E08BB2FC85d0s0  FAULTED  corrupted data
>>             c2t21E08B92FC85d2        ONLINE
>>
>> Additional devices are known to be part of this pool, though their
>> exact configuration cannot be determined.
>
> This message is your clue. The pool is missing a device. In most of the
> cases where I've seen this, it occurs on older ZFS implementations and
> the missing device is an auxiliary device: cache or spare.
> -- richard
Re: [zfs-discuss] encfs on top of zfs
On Jul 31, 2012, at 8:05 PM, opensolarisisdeadlongliveopensolaris wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Richard Elling
>>
>> I believe what you meant to say was "dedup with HDDs sux." If you had
>> used fast SSDs instead of HDDs, you would find dedup to be quite fast.
>> -- richard
>
> Yes, but this is a linear scale.

No, it is definitely NOT a linear scale. Study Amdahl's law a little more carefully.

> Suppose an SSD without dedup is 100x faster than an HDD without dedup.
> And suppose dedup slows down a system by a factor of 10x. Now your SSD
> with dedup is only 10x faster than the HDD without dedup. So "quite
> fast" is a relative term.

Of course it is.

> The SSD with dedup is still faster than the HDD without dedup, but it's
> also slower than the SSD without dedup.

duh. With dedup you are trading IOPS for space. In general, HDDs have lots of space and terrible IOPS. SSDs have less space, but more IOPS. Obviously, as you point out, the best solution is lots of space and lots of IOPS.

> The extent of fibbing I'm doing is this: in reality, an SSD is about
> equally fast as an HDD for sequential operations, and about 100x faster
> for random IO. It just so happens that the dedup performance hit is
> almost purely random IO, so it's right in the sweet spot of what SSDs
> handle well.

In the vast majority of modern systems, there are no sequential I/O workloads. That is a myth propagated by people who still think HDDs can be fast.

> You can't use an overly simplified linear model like I described above -
> in reality, there's a grain of truth in what Richard said, and also a
> grain of truth in what I said. The real truth is somewhere in between
> what he said and what I said.

But closer to my truth :-)

> No, the SSD will not perform as well with dedup as it does without
> dedup. But the "suppose dedup slows down by 10x" that I described above
> is not accurate. Depending on what you're doing, dedup might slow down
> an HDD by 20x, and it might only slow down an SSD by 4x doing the same
> workload. Highly variable, and highly dependent on the specifics of
> your workload.

You are making the assumption that the system is not bandwidth limited. This is a good assumption for the HDD case, because the media bandwidth is much less than the interconnect bandwidth. For SSDs, this assumption is not necessarily true. There are SSDs that are bandwidth constrained on the interconnect, and in those cases your model fails.
-- richard
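To make the non-linearity concrete: treating the 20x and 4x figures above as the dedup penalty on just the random-IO portion of a workload, here is a minimal awk sketch of how Amdahl's law combines them into whole-workload slowdowns. The 70% random fraction is an assumed number, not anything measured in this thread.

    awk 'BEGIN {
      f   = 0.70   # fraction of the workload that is random IO (assumed)
      hdd = 20     # dedup penalty on the random portion, HDD (figure above)
      ssd = 4      # dedup penalty on the random portion, SSD (figure above)
      # Amdahl: overall slowdown = (1 - f) + f * penalty
      printf "HDD overall slowdown: %.1fx\n", (1 - f) + f * hdd
      printf "SSD overall slowdown: %.1fx\n", (1 - f) + f * ssd
    }'

With these inputs the HDD comes out around 14x slower overall and the SSD around 3x, which is why a single "dedup costs Nx" factor cannot be carried from one device type to the other.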
Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
On 2012-Aug-02 18:30:01 +0530, opensolarisisdeadlongliveopensolaris wrote:
> Ok, so the point is, in some cases, somebody might want redundancy on
> a device that has no redundancy. They're willing to pay for it by
> halving their performance.

This isn't quite true - write performance will be at least halved (possibly worse due to additional seeking), but read performance could potentially improve (more copies means, on average, less seeking to get a copy than if there were only one). And non-IO performance is unaffected.

> The only situation I'll acknowledge is
> the laptop situation, and I'll say, present day very few people would
> be willing to pay *that* much for this limited use-case redundancy.

My guess is that, for most people, the overall performance impact would be minimal because disk write performance isn't the limiting factor in most laptop usage scenarios.

> The solution that I as an IT person would recommend and deploy would
> be to run without "copies" and instead cover your bum by doing backups.

You need backups in any case, but backups won't help you if you can't conveniently access them. Before giving a blanket recommendation, you need to consider how the person uses their laptop. Consider the following scenario: you're in the middle of a week-long business trip and your laptop develops a bad sector in an inconvenient spot. Do you:
a) Let ZFS automagically repair the sector thanks to copies=2.
b) Attempt to rebuild your laptop and restore from backups (left securely at home) via the dodgy hotel wifi.

-- 
Peter Jeremy
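For anyone wanting option (a), enabling it is a one-liner. A hedged sketch follows (the dataset name is hypothetical); note that the property only affects blocks written after it is set, so existing data keeps its original copy count:

    # Keep two copies of every block on a single-disk laptop pool.
    # Existing blocks are NOT rewritten; only new writes get two copies.
    zfs set copies=2 rpool/export/home
    zfs get copies rpool/export/home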
Re: [zfs-discuss] unable to import the zpool
On Aug 1, 2012, at 12:21 AM, Suresh Kumar wrote:
> Dear ZFS-Users,
>
> I am using Solaris x86 10u10. All the devices which belong to my zpool
> are in the available state, but I am unable to import the zpool.
>
> #zpool import tXstpool
> cannot import 'tXstpool': one or more devices is currently unavailable
> ==
> bash-3.2# zpool import
>   pool: tXstpool
>     id: 13623426894836622462
>  state: UNAVAIL
> status: One or more devices are missing from the system.
> action: The pool cannot be imported. Attach the missing
>         devices and try again.
>    see: http://www.sun.com/msg/ZFS-8000-6X
> config:
>
>         tXstpool                     UNAVAIL  missing device
>           mirror-0                   DEGRADED
>             c2t210100E08BB2FC85d0s0  FAULTED  corrupted data
>             c2t21E08B92FC85d2        ONLINE
>
> Additional devices are known to be part of this pool, though their
> exact configuration cannot be determined.

This message is your clue. The pool is missing a device. In most of the cases where I've seen this, it occurs on older ZFS implementations and the missing device is an auxiliary device: cache or spare.
-- richard
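If the missing device really is an auxiliary one, it can help to point the import scan explicitly at the directory holding the device nodes. A sketch, using the stock Solaris device path:

    # Rescan /dev/dsk for every importable pool and show their configs
    zpool import -d /dev/dsk
    # Then attempt the import using only the devices found there
    zpool import -d /dev/dsk tXstpool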
Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
On Aug 1, 2012, at 8:30 AM, Nigel W wrote:
> On Wed, Aug 1, 2012 at 8:33 AM, Sašo Kiselkov wrote:
>> On 08/01/2012 04:14 PM, Jim Klimov wrote:
>>> chances are that
>>> some blocks of userdata might be more popular than a DDT block and
>>> would push it out of L2ARC as well...
>>
>> Which is why I plan on investigating implementing some tunable policy
>> module that would allow the administrator to get around this problem.
>> E.g. the administrator dedicates 50G of ARC space to metadata (which
>> includes the DDT) or only to the DDT specifically. My idea is still a
>> bit fuzzy, but it revolves primarily around allocating and policing min
>> and max quotas for a given ARC entry type. I'll start a separate
>> discussion thread for this later on once I have everything organized in
>> my mind about where I plan on taking this.
>
> Yes. +1
>
> The L2ARC as it is currently implemented is not terribly useful for
> storing the DDT anyway, because each DDT entry is 376 bytes but the
> L2ARC reference is 176 bytes. Best case, you get just over double the
> DDT entries in the L2ARC compared with what you would get into the ARC,
> but then you also have no ARC left for anything else :(.

You are making the assumption that each DDT table entry consumes one metadata update. This is not the case. The DDT is implemented as an AVL tree. As per other metadata in ZFS, the data is compressed. So you cannot make a direct correlation between the DDT entry size and the effect on the stored metadata in disk sectors.
-- richard
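For what it's worth, Nigel's "just over double" estimate is simple arithmetic on the two per-entry sizes he quotes. A back-of-the-envelope sketch (the 8 GiB ARC figure is assumed, and both entry sizes vary between releases):

    ARC_BYTES=$((8 * 1024 * 1024 * 1024))    # assume 8 GiB of ARC for DDT
    echo "DDT entries held in ARC directly:  $((ARC_BYTES / 376))"
    echo "DDT entries addressable via L2ARC: $((ARC_BYTES / 176))"
    # 376/176 is roughly 2.1 -- "just over double" -- but now the ARC is
    # full of L2ARC headers, leaving no room for anything else.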
Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
On Aug 1, 2012, at 2:41 PM, Peter Jeremy wrote:
> On 2012-Aug-01 21:00:46 +0530, Nigel W wrote:
>> I think a fantastic idea for dealing with the DDT (and all other
>> metadata for that matter) would be an option to put (a copy of)
>> metadata exclusively on an SSD.
>
> This is on my wishlist as well. I believe ZEVO supports it, so possibly
> it'll be available in ZFS in the near future.

ZEVO does not. The only ZFS vendor I'm aware of with a separate top-level vdev for metadata is Tegile, and it is available today.
-- richard
[zfs-discuss] single-disk pool - Re: Can the ZFS "copies" attribute substitute HW disk redundancy?
On 01/08/12 3:34 PM, opensolarisisdeadlongliveopensolaris wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Jim Klimov
>>
>> Well, there is at least a couple of failure scenarios where copies>1
>> are good:
>>
>> 1) A single-disk pool, as in a laptop. Noise on the bus, media
>> degradation, or any other reason to misread or miswrite a block can
>> result in a failed pool.
>
> How does mac/win/lin handle this situation? (Not counting btrfs.)

Is this a trick question? :)

--Toby

> Such noise might result in a temporarily faulted pool (blue screen of
> death) that is fully recovered after reboot. Meanwhile you're always
> paying for it in terms of performance, and it's all solvable via pool
> redundancy.
Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Jim Klimov
>
> In some of my cases I was "lucky" enough to get a corrupted /sbin/init
> or something like that once, and the box had no other BEs yet, so the
> OS could not do anything reasonable after boot. It is different from a
> "corrupted zpool", but ended in a useless OS image due to one broken
> sector nonetheless.

That's very annoying, but if "copies" could have saved you, then pool redundancy could have also saved you.

> For a single-disk box, "copies" IS the redundancy. ;)

Ok, so the point is, in some cases, somebody might want redundancy on a device that has no redundancy. They're willing to pay for it by halving their performance. The only situation I'll acknowledge is the laptop situation, and I'll say, present day, very few people would be willing to pay *that* much for this limited use-case redundancy. The solution that I as an IT person would recommend and deploy would be to run without "copies" and instead cover your bum by doing backups.
Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Jim Klimov
>
> On 2012-08-01 23:40, opensolarisisdeadlongliveopensolaris wrote:
>> Agreed, ARC/L2ARC help in finding the DDT, but whenever you've got a
>> snapshot destroy (happens every 15 minutes) you've got a lot of entries
>> you need to write. Those are all scattered about the pool... Even if
>> you can find them fast, it's still a bear.
>
> No, these entries you need to update are scattered around your
> SSD (be it ARC or a hypothetical SSD-based copy of metadata
> which I also "campaigned" for some time ago).

If they were scattered around the hypothetical dedicated DDT SSD, I would say: no problem. But in reality, they're scattered in your main pool. DDT writes don't get coalesced. Is this simply because they're sync writes? Or is it because they're metadata, which is even lower level than sync writes? I know, for example, that you can disable the ZIL on your pool, but the system will still flush the buffer after certain operations, such as writing the uberblock. I have not seen the code that flushes the buffer after DDT writes, but I have seen the performance evidence.
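As an aside, "disabling the ZIL" is a per-dataset property on builds that have it (the sync property replaced the old zil_disable tunable). A hedged sketch with a hypothetical dataset name, noting that it trades sync-write safety for speed and does not change metadata or uberblock flushing:

    zfs set sync=disabled tank/scratch   # sync writes treated as async
    zfs get sync tank/scratch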
Re: [zfs-discuss] unable to import the zpool
So zpool import -F and zpool import -f are both not working?

Regards

Sent from my iPad

On Aug 2, 2012, at 7:47, Suresh Kumar wrote:
> Hi Hung-sheng,
>
> It is not displaying any output, like the following:
>
> bash-3.2# zpool import -nF tXstpool
> bash-3.2#
>
> Thanks & Regards,
> Suresh.
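For the archive, the usual escalation order here (each step is riskier than the last; -n makes -F a dry run, as used earlier in this thread):

    zpool import -f tXstpool    # force: override a foreign/stale hostid
    zpool import -nF tXstpool   # dry run: report what recovery would change
    zpool import -F tXstpool    # recovery: rewind to an older consistent txg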
Re: [zfs-discuss] unable to import the zpool
Hi Hung-sheng,

It is not displaying any output, like the following:

bash-3.2# zpool import -nF tXstpool
bash-3.2#

Thanks & Regards,
Suresh.
Re: [zfs-discuss] unable to import the zpool
http://docs.oracle.com/cd/E19963-01/html/821-1448/gbbwl.html

What is the output of zpool import -nF tXstpool?

On 8/2/2012 2:21 AM, Suresh Kumar wrote:
> Hi Hung-sheng,
>
> Thanks for your response. I tried to import the zpool using
> "zpool import -nF tXstpool"; please consider the output below.
>
> bash-3.2# zpool import -nF tXstpool
> bash-3.2#
> bash-3.2# zpool status tXstpool
> cannot open 'tXstpool': no such pool
>
> I got these messages when I ran the command under truss:
>
> truss -aefo /zpool.txt zpool import -F tXstpool
>
> 742 14582: ioctl(3, ZFS_IOC_POOL_STATS, 0x08041F40)     Err#2 ENOENT
> 743 14582: ioctl(3, ZFS_IOC_POOL_TRYIMPORT, 0x08041F90) = 0
> 744 14582: sysinfo(SI_HW_SERIAL, "75706560", 11)        = 9
> 745 14582: ioctl(3, ZFS_IOC_POOL_IMPORT, 0x08041C40)    Err#6 ENXIO
> 746 14582: fstat64(2, 0x08040C70)                       = 0
> 747 14582: write(2, " c a n n o t   i m p o r".., 24)   = 24
> 748 14582: write(2, " : ", 2)                           = 2
> 749 14582: write(2, " o n e   o r   m o r e  ".., 44)   = 44
> 750 14582: write(2, "\n", 1)                            = 1
>
> Thanks & Regards,
> Suresh