Re: [zfs-discuss] Performance drop during scrub?

2010-05-03 Thread Tonmaus
> On Sun, 2 May 2010, Dave Pooser wrote:
> >
> > If my system is going to fail under the stress of a
> scrub, it's going to
> > fail under the stress of a resilver. From my
> perspective, I'm not as scared
> 
> I don't disagree with any of the opinions you stated
> except to point 
> out that resilver will usually hit the (old) hardware
> less severely 
> than scrub.  Resilver does not have to access any of
> the redundant 
> copies of data or metadata, unless they are the only
> remaining good 
> copy.
> 
> Bob

Adding the perspective that scrub could consume my hard disks' life may sound 
like a really good reason to avoid scrub on my system as far as possible, and 
thus avoid the performance issues that come with it in the first place.
I just don't buy this. Sorry, it's too far-fetched. I'd still prefer that the 
original issue be fixed.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance drop during scrub?

2010-05-02 Thread Tonmaus
Hi Bob,
> 
> It is necessary to look at all the factors which
> might result in data 
> loss before deciding what the most effective steps
> are to minimize 
> the probability of loss.
> 
> Bob

I am under the impression that exactly those were the considerations that led both 
the ZFS designers to implement a scrub function in ZFS and the author of the Best 
Practices guide to recommend performing this function frequently. I hear you are 
coming to a different conclusion, and I would be interested in learning what could 
possibly be open to interpretation in this.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance drop during scrub?

2010-04-29 Thread Tonmaus
> In my opinion periodic scrubs are most useful for
> pools based on 
> mirrors, or raidz1, and much less useful for pools
> based on raidz2 or 
> raidz3.  It is useful to run a scrub at least once on
> a well-populated 
> new pool in order to validate the hardware and OS,
> but otherwise, the 
> scrub is most useful for discovering bit-rot in
> singly-redundant 
> pools.
> 
> Bob

Hi,

For one, well-populated pools are rarely new. Second, the Best Practices 
recommendations on scrubbing intervals are based on disk product line 
(enterprise: monthly vs. consumer: weekly), not on redundancy level or pool 
configuration. Obviously, the issue under discussion affects all imaginable 
configurations anyway; it may only vary in degree.
Recommending not to use scrub doesn't even qualify as a workaround, in my 
regard.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance drop during scrub?

2010-04-28 Thread Tonmaus
Hi Eric,

> While there may be some possible optimizations, i'm
> sure everyone
> would love the random performance of mirror vdevs,
> combined with the
> redundancy of raidz3 and the space of a raidz1.
>  However, as in all
> ystems, there are tradeoffs.

I think we can all agree that the topic here is specifically scrub trade-offs. 
My question is whether manageability of the pool, and that includes periodic 
scrubs, is a trade-off as well. It would be very bad news if it were. 
Maintenance functions should be practicable on any supported configuration, if 
possible.
 
> You can choose to bias your workloads so that
> foreground IO takes
> priority over scrub, but then you've got the cases
> where people
> complain that their scrub takes too long.  There may
> be knobs for
> individuals to use, but I don't think overall there's
> a magic answer.

The priority balance only works as long as the I/O is within ZFS. As soon as the 
request is in the pipe of the controller/disk, no further bias will occur, as 
that subsystem is agnostic to ZFS rules. This is where Richard's answer, just 
above if you read this from Jive, kicks in. This leads to the pool being 
basically not operational from a production point of view during a scrub pass. 
From that perspective, any scrub pass exceeding a periodically acceptable 
service window is "too long". In such a situation, a "pause" option for resuming 
scrub passes in the next service window might help. The advantage: such an 
option would be usable on any hardware.
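For what it's worth, the closest thing available today is scheduling scrubs into the 
service window from cron and cancelling whatever is still running when the window 
closes; a minimal sketch, assuming a pool named "tank" (note that "zpool scrub -s" 
cancels the pass outright, so it does not give the resume semantics proposed above):

# crontab entries: start a scrub Friday 22:00, cancel it Monday 05:00
0 22 * * 5  /usr/sbin/zpool scrub tank
0 5  * * 1  /usr/sbin/zpool scrub -s tank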

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance drop during scrub?

2010-04-28 Thread Tonmaus
> Zfs scrub needs to access all written data on all
> disks and is usually 
> disk-seek or disk I/O bound so it is difficult to
> keep it from hogging 
> the disk resources.  A pool based on mirror devices
> will behave much 
> more nicely while being scrubbed than one based on
> RAIDz2.

Experience seconded entirely. I'd like to repeat that I think we need more 
efficient load-balancing functions in order to keep the housekeeping load 
manageable. Detrimental side effects of scrub should not be a decision point 
for choosing certain hardware or redundancy concepts, in my opinion. 

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help:Is zfs-fuse's performance is not good

2010-04-25 Thread Tonmaus
I wonder if this is the right place to ask, as the Filesystem in Userspace (FUSE) 
implementation is a separate project. In Solaris, ZFS runs in the kernel. FUSE 
implementations are slow, no doubt; the same goes for other FUSE implementations, 
such as the one for NTFS.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Oracle to no longer support ZFS on OpenSolaris?

2010-04-20 Thread Tonmaus
Don't copy the netiquette issue you are seeing, as I am talking about nothing 
but an issue in a post on this forum. Why should I contact the OP off record 
about this?
There is no need to read intentions into it either. I just made clear once more what is 
obvious from the board metadata anyhow.
Besides that, if we are having a dispute about netiquette, that highlights the 
potential substance of the topic more than anything else.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Oracle to no longer support ZFS on OpenSolaris?

2010-04-20 Thread Tonmaus
> you talking about and to whom were you
> responding?
 My intention was to respond to the OP, which, judging from what I am seeing in 
the Jive forum, happened as well. Indeed, my concern was the broken link in the 
first post, which would be simple to fix if intended. That not being the case 
increases the smell of FUD.

-Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Oracle to no longer support ZFS on OpenSolaris?

2010-04-20 Thread Tonmaus
Why don't you just fix the apparently broken link to your source, then?

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Setting up ZFS on AHCI disks

2010-04-16 Thread Tonmaus
Your adapter read-outs look quite different from mine. I am on ICH-9, snv_133; 
maybe that's why. But I thought I should ask on that occasion:

- which build?
- do the drives currently support the SATA-2 standard (by model, by jumper settings)?
- could it be that the Areca controller has done something to them 
partition-wise?

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Setting up ZFS on AHCI disks

2010-04-15 Thread Tonmaus
Hi,

are the drives properly configured in cfgadm?
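
For example, something along these lines (device names are illustrative):

$ cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
sata0/0::dsk/c7t0d0            disk         connected    configured   ok
sata0/1                        sata-port    empty        unconfigured ok
$ cfgadm -c configure sata0/1      # attach a port that shows up unconfigured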

Cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Why would zfs have too many errors when underlying raid array is fine?

2010-04-15 Thread Tonmaus
My understanding of "passthrough disk" from the Areca documentation is that 
single drives are exempted from the RAID controller regime and that the port 
will behave just like a plain HBA port.
Now, on my Areca controller (R.I.P.) that mode always created the biggest havoc 
with ZFS/OpenSolaris, including zpool states just like yours. That was on an 
older firmware, though.
12x RAID-0 was only marginally better than pass-through.

What I maybe did not mention is that we tried Ubuntu/dmraid on the same HW 
for an afternoon, but there the initialisation of the RAID crashed with a 
reproducible kernel panic. 

I think I mentioned it before: the only thing that worked decently was putting 
the whole controller in JBOD mode.

Yes, it is an expensive way of providing a bunch of SATA ports... in my case it 
wasn't that bad, as I got a 1170 for approx. 400 Euros, but it was still too 
expensive given the performance under ZFS, so I swapped it, against a full 
refund, for a pair of LSIs.

Regards,
Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?

2010-04-15 Thread Tonmaus
> > I would be really interested how you got past this
> >
> http://defect.opensolaris.org/bz/show_bug.cgi?id=11371
> > which I was so badly bitten by that I considered
> giving up on OpenSolaris.
> 
> 
> I don't get random hangs in normal use; so I haven't
> done anything to "get
> past" this.
> 
> I DO get hangs when funny stuff goes on, which may
> well be related to that
> problem (at least they require a reboot).  Hmmm; I
> get hangs sometimes
> when trying to send a full replication stream to an
> external backup drive,
> and I have to reboot to recover from them.  I can
> live with this, in the
> short term.  But now I'm feeling hopeful that they're
> fixed in what I'm
> likely to be upgrading to next.

That sounds as if the only difference was probably the amount of data 
transferred on your system and mine. We are working with media files here, each 
multiple gigabytes, hence the varying mileage, I assume.

As far as 2010.x is concerned, my expectations come from past experience with the last 
release. I will test 2010 maybe even more rigorously before I jump to it. 
"Technical" stability, as you put it before, is basically the same for dev and 
release builds, both from a phenomenon and a consequence perspective, in an 
OpenSolaris environment.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?

2010-04-15 Thread Tonmaus
8 hot-swap bays is not too many. The rest looks like a cakewalk for OSol. But 
with this HW you can't go for 2009.06 anyhow, as ICH-10 won't be recognized. (I 
tried this on X58.)

I have a 2U enclosure as well (12-bay), but I'd opt for at least 3U next time, 
as there are too many restrictions for low-profile add-in cards, let alone bays, bays, 
bays...

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?

2010-04-14 Thread Tonmaus
> 
> On Wed, April 14, 2010 08:52, Tonmaus wrote:
> > safe to say: 2009.06 (b111) is unusable for the
> purpose, ans CIFS is dead
> > in this build.
> 
> That's strange; I run it every day (my home Windows
> "My Documents" folder
> and all my photos are on 2009.06).
> 
> 
> -bash-3.2$ cat /etc/release
> OpenSolaris 2009.06 snv_111b
>  X86
> Copyright 2009 Sun Microsystems, Inc.  All
> Rights Reserved.
> Use is subject to license
>  terms.
>  Assembled 07 May 2009


I would be really interested in how you got past this 
http://defect.opensolaris.org/bz/show_bug.cgi?id=11371
which I was so badly bitten by that I considered giving up on OpenSolaris.

>  not sure if this is best choice. I'd like to
>  hear from others as well.
> Well, it's technically not a stable build.
> 
> I'm holding off to see what 2010.$Spring ends up
> being; I'll convert to
> that unless it turns into a disaster.
> 
> Is it possible to switch to b132 now, for example?  I
> don't think the old
> builds are available after the next one comes out; I
> haven't been able to
> find them.

There are methods to upgrade to any dev build via pkg. I can't tell you off the 
top of my head, but I have done it successfully.

I wouldn't know why to go to b132 instead of b133, though. b129 seems to be an 
option as well.
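
Roughly, what I did was point the image at the dev repository and image-update; 
treat the exact repository URL and the need for pfexec as assumptions to verify for 
your own setup:

pfexec pkg set-publisher -O http://pkg.opensolaris.org/dev opensolaris.org
pfexec pkg image-update
# a new boot environment is created; activate and boot into it to get the dev build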

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] casesensitivity mixed and CIFS

2010-04-14 Thread Tonmaus
Was b130 also the version that created the dataset?

-Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?

2010-04-14 Thread Tonmaus
It is safe to say that 2009.06 (b111) is unusable for the purpose, as CIFS is dead in 
this build.

I am using b133, but I am not sure if it is the best choice. I'd like to hear 
from others as well.

-Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Why would zfs have too many errors when underlying raid array is fine?

2010-04-12 Thread Tonmaus
Upgrading the firmware is a good idea, as there are other issues with Areca 
controllers that have only been solved recently; e.g. 1.42 is probably still 
affected by a problem with SCSI labels that may cause problems importing a pool.

-Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Why would zfs have too many errors when underlying raid array is fine?

2010-04-12 Thread Tonmaus
Hi,

> I started off my setting up all the disks to be
> pass-through disks, and tried to make a raidz2 array
> using all the disks. It would work for a while, then
> suddenly every disk in the array would have too many
> errors and the system would fail.

I had exactly the same experience with my Areca controller. Actually, I 
couldn't get it to work unless I put the whole controller in JBOD mode. Neither 
12 x "RAID-0 arrays" with single disks nor pass-through was workable. I had 
kernel panics and pool corruption all over the place, sometimes with, sometimes 
without additional corruption messages from the Areca panel. I am not sure if 
this relates to the rest of your problem, though.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Areca ARC-1680 on OpenSolaris 2009.06?

2010-04-11 Thread Tonmaus
That was a while back when I was shopping for my own HBAs. There were 
compatibility warnings all over the place with some Adaptec controllers and LSI 
SAS expanders.

AFAIK, even the 106x needs to be operated in IT mode to work properly with SAS 
expanders. IT mode disables all RAID functions of the 106x.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Areca ARC-1680 on OpenSolaris 2009.06?

2010-04-10 Thread Tonmaus
As far as I have read, that problem has been reported to be a compatibility 
problem between the Adaptec controller and the expander chipset, e.g. LSI SASx, 
which is also on the mentioned Chenbro expander. There is no problem with the 106x 
chipset and SAS expanders that I know of.

For people sceptical about expanders: quite a few of the Areca cards actually 
have expander chips on board. I don't know about the 1680 specifically, though.

Cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Areca ARC-1680 on OpenSolaris 2009.06?

2010-04-09 Thread Tonmaus
Hi David,

why not just use a couple of SAS expanders? 

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to destroy iscsi dataset?

2010-03-31 Thread Tonmaus
Hi,

even if you didn't specify so below (both the COMSTAR and the legacy target services 
show as inactive), I assume that you have been using COMSTAR, right?
In that case, the questions are (a check sequence is sketched after the list):

- is there still a view on the targets? (check stmfadm)
- is there still a LU mapped? (check sbdadm)
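
A minimal check/teardown sequence, assuming COMSTAR; the GUID and the zvol name 
below are just illustrative placeholders:

sbdadm list-lu                                   # find the GUID of the LU backed by the zvol
stmfadm list-view -l 600144f0c0c1de0000000000000000a1
stmfadm remove-view -l 600144f0c0c1de0000000000000000a1 -a
sbdadm delete-lu 600144f0c0c1de0000000000000000a1
zfs destroy pool/iscsivol                        # should succeed once the LU is gone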

cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What about this status report

2010-03-29 Thread Tonmaus
Both are driver modules for storage adapters.
Properties can be reviewed in the documentation:
ahci: http://docs.sun.com/app/docs/doc/816-5177/ahci-7d?a=view
mpt: http://docs.sun.com/app/docs/doc/816-5177/mpt-7d?a=view
ahci has a man page on b133 as well.
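
To see which of the two drivers a given controller is actually bound to, something 
like this works (device IDs are illustrative, output abridged):

$ prtconf -D | grep 'driver name'
    pci8086,2922, instance #0 (driver name: ahci)
    pci1000,3150, instance #0 (driver name: mpt)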

cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What about this status report

2010-03-28 Thread Tonmaus
Yes, basically working here. All fine under ahci, some problems under mpt 
(smartctl says that the WD1002FBYS doesn't allow storing SMART events, which I 
think is probably nonsense).

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Usage of hot spares and hardware allocation capabilities.

2010-03-20 Thread Tonmaus
save. 
Bottom line: you will have to find out. 

As far as the "warning" is concerned: migrating a whole pool is not the same thing 
as swapping slots within a pool. I.e., if you pull more than the allowed number 
of disks (the failure resilience) from your pool at the same time while the pool is 
hot, you will simply destroy the pool.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Usage of hot spares and hardware allocation capabilities.

2010-03-20 Thread Tonmaus
> So, is there a
> sleep/hibernation/standby mode that the hot spares
> operate in or are they on all the time regardless of
> whether they are in use or not?

This depends on the power-save options of your hardware, not on ZFS. Arguably, 
there is less wear on the heads for a hot spare. I guess that many modern disks 
will park the heads after a certain time, or even spin down, unless the 
controller prevents that. The question is whether the disk comes back fast enough 
when required; your bets are on the controller supporting that properly. As it 
seems, there is little focus on that matter at Sun and among community members. 
At least my own investigations into how best to make use of power-save options like 
most SoHo NAS boxes offer returned only dire results. 
 
> Usually the hot spare is on a not so well-performing
> SAS/SATA controller,

There is no room for "not so well-performing" controllers in my servers. I 
would not allow wasting PCIe slots or backplanes on anything that doesn't live 
up to spec (my requirements). That being said, JBOD HBAs are the ones that 
perform best with ZFS, and those happen to be not very expensive. Additionally, 
I avoid a checkerboard of components, striving to keep things as simple as 
possible. 


> To be more general; are the hard drives in the pool
> "hard coded" to their SAS/SATA channels or can I swap
> their connections arbitrarily if I would want to do
> that? Will zfs automatically identify the association
> of each drive of a given pool or tank and
> automatically reallocate them to put the
> pool/tank/filesystem back in place?

This works very well, given that your controller properly supports it; a quick 
test procedure is sketched after the list below. I tried that on an Areca 1170 a 
couple of weeks ago, with interesting results that turned out to be an Areca 
firmware flaw. You may find the thread on this list. I would recommend that you 
do such tests when implementing your array, before going into production with it. 
Analogous aspects may apply to

- hot-swapping
- S.M.A.R.T.
- replacing failing components or changing the configuration
- transferring a whole array to another host
(list is not comprehensive)
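
The test I had in mind is simply exporting the pool, shuffling the cabling, and 
importing again; ZFS re-identifies the members by their on-disk labels, not by the 
controller port (pool name as in my own setup):

zpool export daten
# ...re-cable / move the disks to different ports...
zpool import daten
zpool status daten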

I think at this moment you have two choices to be sure that all "advertised" 
ZFS features will be available in your system:
- learning it the hard way by trial and error
- using Sun hardware, or another turnkey solution that offers ZFS, such as 
NexentaStor

A popular approach is following along the rails of what is being used by Sun, a 
prominent example being the LSI 106x SAS HBAs in "IT" mode.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-19 Thread Tonmaus
> 
> > > sata
> > > disks don't understand the prioritisation, so

> 
> Er, the point was exactly that there is no
> discrimination, once the
> request is handed to the disk. 

So, are you saying that SCSI drives do understand prioritisation (i.e. TCQ supports 
the schedule from ZFS) while SATA/NCQ drives don't, or does it just boil down 
to what Richard told us, SATA disks being too slow?

> If the
> internal-to-disk queue is
> enough to keep the heads saturated / seek bound, then
> a new
> high-priority-in-the-kernel request will get to the
> disk sooner, but
> may languish once there.  

Thanks. That makes sense to me.


> 
> You can shorten the number of outstanding IO's per
> vdev for the pool
> overall, or preferably the number scrub will generate
> (to avoid
> penalising all IO).  

That sounds like a meaningful approach to addressing bottlenecks caused by 
zpool scrub to me.

>The tunables for each of these
> should be found
> readily, probably in the Evil Tuning Guide.

I think I should try to digest the Evil Tuning Guide at some point with respect 
to this topic. Thanks for pointing me in a direction. Maybe what you have 
suggested above (shortening the number of I/Os issued by scrub) is already 
possible? If not, I think it would be a meaningful improvement to request.
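
For the record, the knobs I have in mind look roughly like the following; the 
tunable names (zfs_vdev_max_pending for the per-vdev queue depth, zfs_scrub_limit 
for scrub I/Os) and their defaults differ between builds, so please verify them 
against the Evil Tuning Guide for your build before relying on them:

# check the current per-vdev queue depth
echo zfs_vdev_max_pending/D | mdb -k
# lower it on the running system
echo zfs_vdev_max_pending/W0t5 | mdb -kw
# or persist it across reboots via /etc/system:
#   set zfs:zfs_vdev_max_pending = 5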

> Disks with write cache effectively do this [command cueing] for
> writes, by pretending
> they complete immediately, but reads would block the
> channel until
> satisfied.  (This is all for ATA which lacked this,
> before NCQ. SCSI
> has had these capabilities for a long time).

As scrub is about reads, are you saying that this is still a problem with 
SATA/NCQ drives, or not? I am unsure what you mean at this point.

> > > limiting the number of concurrent IO's handed to
> the disk to try
> > > and avoid saturating the heads.
> > 
> > Indeed, that was what I had in mind. With the
> addition that I think
> > it is as well necessary to avoid saturating other
> components, such
> > as CPU.  
> 
> Less important, since prioritisation can be applied
> there too, but
> potentially also an issue.  Perhaps you want to keep
> the cpu fan
> speed/noise down for a home server, even if the scrub
> runs longer.

Well, the only thing that was really remarkable while scrubbing was CPU load 
constantly near 100%. I still think that is at least contributing to the 
collapse of concurrent payload. I.e., it's all about services that take place 
in the kernel: CIFS, ZFS, iSCSI. Mostly, it is about concurrent load within ZFS 
itself. That means an implicit trade-off while a file is being served over 
CIFS, for instance.

> 
> AHCI should be fine.  In practice if you see actv > 1
> (with a small
> margin for sampling error) then ncq is working.

OK, and how is that with respect to mpt? My assertion that mpt will support NCQ 
is mainly based on the marketing information provided by LSI that these 
controllers offer NCQ support with SATA drives. How (with which tool) do I get at 
this "actv" parameter?
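
As far as I can tell, "actv" is the per-device column reported by iostat in 
extended mode, e.g. (numbers are illustrative):

$ iostat -xn 2
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  350.0    0.0 44800.0    0.0  0.0  4.8    0.0   13.7   0  98 c10t2d0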

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-18 Thread Tonmaus
Hello Dan,

Thank you very much for this interesting reply.

> roughly speaking, reading through the filesystem does
> the least work
> possible to return the data. A scrub does the most
> work possible to
> check the disks (and returns none of the data).

Thanks for the clarification. That's what I had thought.

> 
> For the OP:  scrub issues low-priority IO (and the
> details of how much
> and how low have changed a few times along the
> version trail).

Is there any documentation about this, besides source code?

> However, that prioritisation applies only within the
> kernel; sata disks
> don't understand the prioritisation, so once the
> requests are with the
> disk they can still saturate out other IOs that made
> it to the front
> of the kernel's queue faster. 

I am not sure what you are hinting at. I initially thought about TCQ vs. NCQ 
when I read this, but I am not sure which detail of TCQ would allow for I/O 
discrimination that NCQ doesn't have. All I know about command queueing is that 
it is about optimising DMA strategies and the handling of the currently issued 
I/O requests with respect to what to do first in order to return all data in 
the least possible time. (??)

> If you're looking for
> something to
> tune, you may want to look at limiting the number of
> concurrent IO's
> handed to the disk to try and avoid saturating the
> heads.

Indeed, that was what I had in mind, with the addition that I think it is also 
necessary to avoid saturating other components, such as the CPU.
 
> 
> You also want to confirm that your disks are on an
> NCQ-capable
> controller (eg sata rather than cmdk) otherwise they
> will be severely
> limited to processing one request at a time, at least
> for reads if you
> have write-cache on (they will be saturated at the
> stop-and-wait
> channel, long before the heads). 

I have two systems here: a production system on LSI SAS (mpt) controllers, and 
another one on ICH-9 (ahci). The disks are SATA-2. The plan was that this 
combination would have NCQ support. On the other hand, do you know of a method 
to verify that it is functioning?

Best regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-18 Thread Tonmaus
> On that
> occasion: does anybody know if ZFS reads all parities
> during a scrub?
> 
> Yes
> 
> > Wouldn't it be sufficient for stale corruption
> detection to read only one parity set unless an error
> occurs there?
> 
> No, because the parity itself is not verified.

Aha. Well, my understanding was that a scrub basically means reading all data 
and comparing it with the parities, which means that these have to be recomputed. 
Is that correct?

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-17 Thread Tonmaus
Hi,

I got a message from you off-list that doesn't show up in the thread even after 
hours. Since you mentioned the aspect I'd like to respond to here as well, I'll do 
it from here:

> Third, as for ZFS scrub prioritization, Richard
> answered your question about that.  He said it is
> low priority and can be tuned lower.  However, he was
> answering within the context of an 11 disk RAIDZ2
> with slow disks  His exact words were:
> 
> 
> This could be tuned lower, but your storage
> is slow and *any* I/O activity will be
> noticed.

Richard told us two times that scrub already is as low in priority as can be. 
From another message:

"Scrub is already the lowest priority. Would you like it to be lower?"

=

As far as the comparison between "slow" and "fast" storage goes: I have 
understood Richard's message to be that with storage providing better random 
I/O, ZFS priority scheduling will perform significantly better and cause less 
degradation of concurrent load. While I am even inclined to buy that, nobody 
will be able to tell me how a certain system will behave until it has been tested, 
and to what degree concurrent scrubbing will still be possible.
Another thing: people are talking a lot about narrow vdevs and mirrors. 
However, when you need to build a 200 TB pool you end up with a lot of disks in 
the first place, and you will need at least double failure resilience for such a 
pool. If one were to do that with mirrors, ending up with approx. 600 TB gross to 
provide 200 TB net capacity is definitely NOT an option.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-16 Thread Tonmaus
> Are you sure that you didn't also enable
> something which 
> does consume lots of CPU such as enabling some sort
> of compression, 
> sha256 checksums, or deduplication?

None of them is active on that pool or in any existing file system. Maybe the 
issue is particular to RAIDZ2, which is comparatively recent. On that occasion: 
does anybody know whether ZFS reads all parities during a scrub? Wouldn't it be 
sufficient for stale-corruption detection to read only one parity set unless an 
error occurs there?

> The main concern that one should have is I/O
> bandwidth rather than CPU 
> consumption since "software" based RAID must handle
> the work using the 
> system's CPU rather than expecting it to be done by
> some other CPU. 
> There are more I/Os and (in the case of mirroring)
> more data 
> transferred.

What I am trying to say is that the CPU may become the bottleneck for I/O in the 
case of parity-secured stripe sets. Mirrors and simple stripe sets have almost 
zero impact on the CPU; those are at least my observations so far. Moreover, x86 
processors are not optimized for that kind of work as much as, e.g., an Areca 
controller with a dedicated XOR chip is in its targeted field.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Posible newbie question about space between zpool and zfs file systems

2010-03-16 Thread Tonmaus
The reason why there is not more uproar is that cost per data unit is dwindling 
while the gap resulting from this marketing trick is increasing. I remember a 
case a German broadcaster filed against a system integrator in the age of the 4 
GB SCSI drive. This was in the mid-90s.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-16 Thread Tonmaus
> If CPU is maxed out then that usually indicates some
> severe problem 
> with choice of hardware or a misbehaving device
> driver.  Modern 
> systems have an abundance of CPU.

AFAICS the CPU load is only high while scrubbing a double-parity pool. I have 
no indication of any technical misbehaviour, with the exception of dismal 
concurrent performance.

What I can't get past is the notion that even if I *had* a storage 
configuration with 20 times more I/O capacity, it would still max out any CPU I 
could buy, even a better one than the single L5410 I am currently running. I have 
seen CPU performance being a pain point on any "software"-based array I have used 
so far: from SOHO NAS boxes (the usual Thecus stuff) to NetApp 3200 filers, all 
showed a noticeable performance drop once parity configurations were employed.
Performance of the L5410 is abundant for the typical operation of my system, 
btw. It can easily saturate the dual 1000 Mbit NICs for iSCSI and CIFS 
services. I am slightly reluctant to buy a second L5410 just to provide more 
headroom during maintenance operations, as it would be idle otherwise, 
consuming power.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-16 Thread Tonmaus
Hello,

> In following this discussion, I get the feeling that
> you and Richard are somewhat talking past each
> other.

Talking past each other is a problem I have noted and remarked on earlier. I have 
to admit I got frustrated about the discussion narrowing down to a 
certain perspective that was quite the opposite of my own observations and of what 
I had initially described. It may be that I have been harsher than I should have 
been; please accept my apology.
I was trying from the outset to obtain a perspective on the matter that is 
independent of an actual configuration. I firmly believe that the scrub 
function is more meaningful if it can be applied in a variety of 
implementations.
I think, however, that the insight that there seem to be no specific scrub 
management functions is transferable from a commodity implementation to an 
enterprise configuration.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Posible newbie question about space between zpool and zfs file systems

2010-03-16 Thread Tonmaus
> Has there been a consideration by anyone to do a
> class-action lawsuit 
> for false advertising on this?  I know they now have
> to include the "1GB 
> = 1,000,000,000 bytes" thing in their specs and
> somewhere on the box, 
> but just because I say "1 L = 0.9 metric liters"
> somewhere on the box, 
> it shouldn't mean that I should be able to avertise
> in huge letters "2 L 
> bottle of Coke" on the outside of the package...

If I am not completely mistaken, 1,000^n/1,024^n converges to 0 as n goes to 
infinity. That is certainly an unwarranted facilitation of Kryder's law for 
very large storage devices.
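
A quick illustration of how the gap widens with each prefix step (approximate values):

$ echo '(1000/1024)^1' | bc -l     # KB vs KiB, approx. 0.977
$ echo '(1000/1024)^4' | bc -l     # TB vs TiB, approx. 0.909
$ echo '(1000/1024)^5' | bc -l     # PB vs PiB, approx. 0.888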

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-16 Thread Tonmaus
Hi Richard,

> > - scrubbing the same pool, configured as raidz1
> didn't max out CPU which is no surprise (haha, slow
> storage...) the notable part is that it didn't slow
> down payload that much either.
> 
> raidz creates more, smaller writes than a mirror or
> simple stripe. If the disks are slow,
> then the IOPS will be lower and the scrub takes
> longer, but the I/O scheduler can
> manage the queue better (disks are slower).

This wasn't mirror vs. raidz but raidz1 vs. raidz2, where the latter maxes 
out the CPU and the former maxes out physical disk I/O. Concurrent payload 
degradation isn't as extreme on raidz1 pools, as it seems. Hence the CPU 
theory that you still seem reluctant to follow.


> There are several
> bugs/RFEs along these lines, something like:
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bu
> g_id=6743992

Thanks for pointing at this. As it seems, it has been a problem for a couple of 
years already. Obviously, the opinion is shared that this is a management problem, 
not a HW issue.

As a project manager I will soon have to take a purchase decision for an 
archival storage system (A/V media), and one of the options we are looking into 
is a SAMFS/QFS solution including tiers on disk with ZFS. I will have to make up 
my mind whether the pool sizes we are looking into (typically we will need 150-200 
TB) are really manageable under the current circumstances once we include 
zfs scrub in the picture. From what I have learned here it rather 
looks as if there will be an extra challenge, if not even a problem, for the 
system integrator. That's unfortunate.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Posible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Tonmaus
> My guess is unit conversion and rounding. Your pool
>  has 11 base 10 TB, 
>  which is 10.2445 base 2 TiB.
>  
> Likewise your fs has 9 base 10 TB, which is 8.3819
>  base 2 TiB.
> Not quite.  
> 
> 11 x 10^12 =~ 10.004 x (1024^4).
> 
> So, the 'zpool list' is right on, at "10T" available.

Duh! I completely forgot about this. Thanks for the heads-up.

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iScsi storage

2010-03-15 Thread Tonmaus
> Being an iscsi
> target, this volume was mounted as a single iscsi
> disk from the solaris host, and prepared as a zfs
> pool consisting of this single iscsi target. ZFS best
> practices, tell me that to be safe in case of
> corruption, pools should always be mirrors or raidz
> on 2 or more disks. In this case, I considered all
> safe, because the mirror and raid was managed by the
> storage machine. 

As far as I understand the Best Practices, redundancy needs to be within ZFS in 
order to provide full protection. So actually the Best Practices guide says that 
your scenario is rather one to be avoided. 
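
In other words, the safer layout would export two (or more) LUNs and let ZFS hold 
the redundancy itself; a minimal sketch with illustrative device names:

zpool create tank mirror c2t0d0 c3t0d0     # two iSCSI LUNs, ideally from independent back-ends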

Regards,
Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Posible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Tonmaus
Hi Cindy,
trying to reproduce this 

> For a RAIDZ pool, the zpool list command identifies
> the "inflated" space
> for the storage pool, which is the physical available
> space without an
> accounting for redundancy overhead.
> 
> The zfs list command identifies how much actual pool
> space is available
> to the file systems.

I am lacking 1 TB on my pool:

u...@filemeister:~$ zpool list daten
NAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
daten    10T  3,71T  6,29T  37%  1.00x  ONLINE  -
u...@filemeister:~$ zpool status daten
  pool: daten
 state: ONLINE
 scrub: none requested
config:

NAME  STATE READ WRITE CKSUM
daten ONLINE   0 0 0
  raidz2-0ONLINE   0 0 0
c10t2d0   ONLINE   0 0 0
c10t3d0   ONLINE   0 0 0
c10t4d0   ONLINE   0 0 0
c10t5d0   ONLINE   0 0 0
c10t6d0   ONLINE   0 0 0
c10t7d0   ONLINE   0 0 0
c10t8d0   ONLINE   0 0 0
c10t9d0   ONLINE   0 0 0
c11t18d0  ONLINE   0 0 0
c11t19d0  ONLINE   0 0 0
c11t20d0  ONLINE   0 0 0
spares
  c11t21d0AVAIL

errors: No known data errors
u...@filemeister:~$ zfs list daten
NAME    USED  AVAIL  REFER  MOUNTPOINT
daten  3,01T  4,98T   110M  /daten

I am counting 11 disks of 1 TB each in a raidz2 pool. This is 11 TB gross 
capacity and 9 TB net. zpool is however stating 10 TB and zfs is stating 8 TB. 
The difference between net and gross is correct, but where is the capacity of 
the 11th disk going?
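
(For reference, as the follow-up in this thread points out, the "missing" terabyte is 
just the decimal-to-binary conversion; a quick check:)

$ echo 'scale=3; 11*10^12/1024^4' | bc     # 11 marketing TB = approx. 10.004 TiB -> "10T" in zpool list
$ echo 'scale=3; 9*10^12/1024^4' | bc      # 9 TB net = approx. 8.185 TiB -> approx. "8T" in zfs list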

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-14 Thread Tonmaus
Hello again,

I am still concerned if my points are being well taken.

> If you are concerned that a
> single 200TB pool would take a long
> time to scrub, then use more pools and scrub in
> parallel.

The main concern is not scrub time. Scrub time could be weeks if scrub just 
would behave. You may imagine that there are applications where segmentation is 
a pain point, too.

>  The scrub will queue no more
> han 10 I/Os at one time to a device, so devices which
> can handle concurrent I/O
> are not consumed entirely by scrub I/O. This could be
> tuned lower, but your storage 
> is slow and *any* I/O activity will be noticed.

There are a couple of things I maybe don't understand, then.

- zpool iostat is reporting more than 1k operations while scrubbing
- throughput is as high as it can be until the CPU maxes out
- the nominal I/O capacity of a single device is still around 90 IOPS; how can 10 
outstanding I/Os already bring down payload?
- scrubbing the same pool configured as raidz1 didn't max out the CPU, which is no 
surprise (haha, slow storage...); the notable part is that it didn't slow down 
payload that much either
- scrub is obviously fine with data being added or deleted during a pass, so it 
should be possible to pause and resume a pass, shouldn't it?

My conclusion from these observations is that not only disk speed counts here; 
other bottlenecks may strike as well. Solving the issue with the wallet is 
one way, solving it by configuring parameters is another. So, is there a 
lever for scrub I/O priority, or not? Is there a possibility to pause a scrub 
pass and resume it?

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-14 Thread Tonmaus
Hi Richard,

thanks for the answer. I think I am aware of the properties of my configuration 
and how it will scale. Let me please stress that this is not the point of the 
discussion.
The target of this discussion should rather be whether scrubbing can co-exist with 
payload, or whether we are thrown back to scrubbing in the after-hours.
So, do I have to conclude that ZFS is not able to make good decisions about 
load prioritisation on commodity hardware and that there are no further knobs 
available to tweak scrub load impact, or are there other options? 

I am thinking about managing pools with a hundred times the capacity of 
mine (currently there are 3,7 TB on disk, and it takes 2,5 h to scrub them on 
the double-parity pool) that practically would be un-scrub-able. (Yes, 
enterprise HW is faster, but enterprise service windows are much more narrow as 
well... you can't move around or offline 200 TB of live data for days only 
because you need to scrub the disks, can you?)

The only idea I could think of myself is to exchange individual drives in a 
round-robin fashion all the time and use resilver instead of full scrubs. But 
actually I don't like that idea anymore on second thought.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-14 Thread Tonmaus
Hi Richard,

these are 
- 11x WD1002fbys (7200rpm SATA drives) in 1 raidz2 group
- 4 GB RAM
- 1 CPU L5410
- snv_133 (where the current array was created as well)

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-13 Thread Tonmaus
Dear zfs fellows,

during a specific test I got the impression that scrub may have quite an 
impact on other I/O. CIFS throughput is down to 7 MB/s from 100 MB/s while 
scrubbing on my main NAS. That is not a surprise, as a scrub of my raidz2 pool 
maxes out the CPU on that machine (1 Xeon L5410). 
I am running scrubs during weekends, so this is not a problem. I am asking 
myself however what will happen on larger pools where a scrub pass will take 
days to weeks. Obviously, zfs file systems are much more scalable than CPU 
power ever will be.
Hence, I see a requirement to manage scrub activity so that trade-offs 
can be made to maintain availability and performance of the pool. Does anybody 
know how this is done?

Thanks in advance for any hints,

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Intel SASUC8I - worth every penny

2010-03-12 Thread Tonmaus
> On Mar 11, 2010, at 10:02 PM, Tonmaus wrote:

> All of the other potential disk controllers line up
> ahead of it.  For example,
> you will see controller numbers assigned for your CD,
> floppy, USB, SD, CF etc.
>  -- richard

Hi Richard,
thanks for the explanation. Actually, I started to worry about controller 
numbers when I installed LSI cards that were replacing an Areca 1170. The Areca 
took place 9, and the LSI cards started from 10. Could it be that the BIOS 
caches configuration data, leading to this? What is, btw, the proper method to 
configure white-box hardware to achieve more convenient read-outs?

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Intel SASUC8I - worth every penny

2010-03-11 Thread Tonmaus
Hi,
thanks for sharing.
Is your LSI card running in IT or IR mode? I had some issues getting all drives 
connected in IR mode, which is the factory default of the LSI-branded cards.
I am also curious why your controller shows up as "c11". Does anybody know more 
about the way this is enumerated? I have two LSI controllers; one is "c10", 
the other "c11". Why can't controllers count from 1?

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to verify ecc for ram is active and enabled?

2010-03-11 Thread Tonmaus
> I'd really like to understand what OS does with
> respect to ECC. 

In information technology, ECC (Error-Correcting Code; the Wikipedia article is 
worth reading) normally protects point-to-point "channels". Hence, this is 
entirely a "hardware" thing here.

Regards,
Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to verify ecc for ram is active and enabled?

2010-03-11 Thread Tonmaus
> Is the nature of the scrub that it walks through
> memory doing read/write/read and looking at the ECC
> reply in hardware?

I think ZFS has no specific mechanisms with respect to RAM integrity. It simply 
counts on a healthy and robust foundation for every component in the machine. 
As far as I understand, it's just a good idea to have ECC RAM once a certain 
amount of data will inevitably go through a certain path. Servers 
controlling PB of data are certainly a case for ECC memory, in my regard.

-Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fishworks 2010Q1 and dedup bug?

2010-03-05 Thread Tonmaus
Hi,

so, what would be a critical test size in your opinion? Are there any other 
side conditions?

For instance, I am not using any snapshots and have also turned off automatic 
snapshots, because I was bitten by system hangs while destroying datasets with 
live snapshots.
I am also aware that Fishworks probably isn't on the same code level as the 
current dev build.

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Question about multiple RAIDZ vdevs using slices on the same disk

2010-03-05 Thread Tonmaus
Hi,

> > In your case, there are two other aspects:
> > - if you pool small devices as JBODS below a vdev
> > member, no superordinate parity will help you when
> > you loose a member of the underlying JBOD. The
> whole
> > pool will just be broken, and you will loose a
> good
> > part of your data.
> 
> No, that's not correct. The first option of pooling
> smaller disks into larger, logical devices via SVM
> would allow me to theoretically lose up to
> [b]eight[/b] disks while still having a live zpool
> (in the case where I lose 2 logical devices comprised
> of four 500GB drives each; this would only kill two
> actual RAIDZ2 members).

You are right; I was wrong with the JBOD observation. In the worst case, the 
array still can't tolerate more than 2 disk failures if the failures are spread 
across different 2 TB building blocks.

> Using slices, I'd be able to lose up to [b]five[/b]
> disks (in the case where I'd lose one 2TB disk
> (affecting all four vdevs) and four 500GB disks, one
> from each vdev).

As a single 2 TB disk causes a failure in every group in scenario 2, the 
worst case here is likewise "3 disks and you are out". This circumstance reduces 
the options for playing with grouping to not less than 4 groups with that setup. 

Consequently, the capacity spent on redundancy in both scenarios is 4 TB (with no 
hot spare).

Doesn't that all point to option 1 as the better choice? The performance 
will be much better, since slicing the 2 TB drives will obviously leave you with 
basically un-cached I/O for these members, dominating the rest of the array.

One more thing about SVM is unclear to me: if one of the smaller disks goes, 
from the ZFS perspective the whole JBOD has to be resilvered. But what will the 
interactions be between fixing the JBOD in SVM and resilvering in ZFS?
Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fishworks 2010Q1 and dedup bug?

2010-03-05 Thread Tonmaus
Hi,

I have tried out what dedup does on a test dataset that I filled with 372 GB 
of partly redundant data. I used snv_133. All in all, it was successful: 
the net data volume was only 120 GB. Destruction of the dataset took a while in 
the end, but without compromising anything else.

After this successful test I am planning to use dedup productively soon.
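
For reference, the test boiled down to something like this (dataset name is 
illustrative):

zfs create -o dedup=on daten/deduptest     # enable dedup for the test dataset
# ...copy the 372 GB of test data in...
zpool list daten                           # the DEDUP column shows the pool-wide ratio
zdb -DD daten                              # detailed dedup table (DDT) statistics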

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Question about multiple RAIDZ vdevs using slices on the same disk

2010-03-04 Thread Tonmaus
Hi,

the corners I am basing my previous idea on you can find here:
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#RAIDZ_Configuration_Requirements_and_Recommendations
I can already confirm some of the recommendations from personal practice, first 
and foremost this sentence: "The recommended number of disks per group is 
between 3 and 9. If you have more disks, use multiple groups."
One example:
I am running 11+1 disks in a single group now. I recently changed the 
configuration from raidz to raidz2, and scrub performance dropped 
from 500 MB/s to approx. 200 MB/s due to the imposition of the second parity. I am 
sure that if I had chosen two raidz groups, the performance would have been 
even better than with the original config, while I could still lose two drives in 
the pool as long as the losses didn't occur within a single group. 
The bottom line is that as the number of stripes in a group increases, the 
performance, especially random I/O, converges toward the performance of a 
single group member.
The only reason why I am sticking with the single-group configuration myself is 
that performance is "good enough" for what I am doing for now, and that "11 is 
not so far from 9".

In your case, there are two other aspects:
- if you pool small devices as JBODs below a vdev member, no parity will help 
you when you lose a member of the underlying JBOD. 
- if you use slices as vdev members, performance will drop dramatically.

Regards,

tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Question about multiple RAIDZ vdevs using slices on the same disk

2010-03-03 Thread Tonmaus
Hi,

Following the ZFS Best Practices guide, my understanding is that neither choice 
is very good. There is maybe a third choice, that is:

pool
  vdev 1
    disk
    disk
    ...
    disk
  ...
  vdev n
    disk
    disk
    ...
    disk

whereas the vdevs will add up in capacity. As far as I understand the option to 
use a parity protected stripe set (i.e. raidz) would be on the vdev layer. As 
far as I understand the smallest disk will limit the capacity of the vdev, not 
of the pool, so that the size should be constant within a pool. Potential hot 
spares would be universally usable for any vdev if they match the size of the 
largest member of any vdev. (i.e. 2 GB).
The benefit of that solution are that a physical disk device failure will not 
affect more than one vdev, and that IO will scale across vdevs as much as 
capacity. The drawback is that the per-vdev redundancy has a price in capacity.
I hope I am correct - I am a newbie like you.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool status output confusing

2010-02-18 Thread Tonmaus
Hello Cindy,

I have received my LSI controllers and swapped the Areca out for them. The 
result is stunning:
1. exported pool (in this strange state I reported here)
2. changed controller and re-ordered the drives as before posting this matter 
(c-b-a back to a-b-c)
3. Booted Osol
4. imported pool

Result: everything but the previously inactive spare drive was immediately 
discovered and imported. I am really impressed. The problem was clearly related 
to the Areca controller.
(I should say that the whole procedure wasn't quite as simple as 1, 2, 3, 4, as 
I had to solve quite a lot of hardware-related issues, such as flashing the IT 
firmware over the IR type in order to get all drives hooked up correctly, but 
that's another greenhorn story.)
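
In zpool terms the relevant steps were nothing more than this (pool name is a 
placeholder):

 # zpool export tank
   (shut down, swap the controller, re-cable the drives, boot)
 # zpool import             lists importable pools found by scanning labels
 # zpool import tank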

Best ,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disk Issues

2010-02-15 Thread Tonmaus
Hi,

If I may - you mentioned that you use ICH10 over ahci. As far as I know, ICH10 
is not officially supported by the ahci module. I have tried it myself on 
various ICH10 systems without success. OSOL wouldn't even install on pre-130 
builds, and I haven't tried since.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Tonmaus
Hi again,

thanks for the answer. Another thing that came to mind: you mentioned that you 
mixed the disks among the controllers. Does that mean you mixed them among 
pools as well? Unsurprisingly, the WD20EADS is slower than the Hitachi, which 
is a fixed 7200 rpm drive. I wonder what impact that would have if you used 
them in vdevs of the same pool.

Cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Tonmaus
Hi Arnaud,

which type of controller is this?

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-02-04 Thread Tonmaus
Hi Simon

> I.e. you'll have to manually intervene
> if a consumer drive causes the system to hang, and
> replace it, whereas the RAID edition drives will
> probably report the error quickly and then ZFS will
> rewrite the data elsewhere, and thus maybe not kick
> the drive.

IMHO the relevant aspects are whether ZFS is able to give an accurate account 
of cache flush status and to realize when a drive is not responsive. That 
being said, I have not seen a specific report of ZFS kicking green drives 
randomly or in a pattern, the way the poor SoHo storage enclosure users report 
all the time.

> 
> So it sounds preferable to have TLER in operation, if
> one can find a consumer-priced drive that allows it,
> or just take the hit and go with whatever non-TLER
> drive you choose and expect to have to manually
> intervene if a drive plays up. OK for home user where
> he is not too affected, but not good for businesses
> which need to have something recovered quickly.

One point about TLER is that two error-correction schemes compete when you run 
a consumer drive on an active RAID controller that has its own mechanisms. 
When you run ZFS on a RAID controller, contrary to the best practice 
recommendations, an analogous question arises. On the other hand, if you run a 
green consumer drive on a dumb HBA, I wouldn't know what is wrong with that in 
the first place.
As for manual interventions, the only one I am aware of would be to re-attach 
a single drive. Not an option if you are really affected like those miserable 
Thecus N7000 users who see the entire array of only a handful of drives drop 
out within hours - over and over again - or don't even get to finish 
formatting the stripe set.
The dire consequences of the rumoured TLER problems lead me to believe that 
there would be many more, and quite specific, reports in this place if this 
were a systematic issue with ZFS. Other than that, we are operating outside 
supported specs when running consumer-level drives in large arrays. That, at 
least, is the perspective of Seagate and WD.
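
For what it's worth, on drives and platforms that expose SCT Error Recovery 
Control, the TLER/ERC timeouts can at least be inspected and sometimes set - a 
hedged sketch only, assuming a sufficiently recent smartmontools build and a 
drive that supports SCT ERC; the device path is purely illustrative:

 # smartctl -l scterc /dev/rdsk/c7t1d0s0          read current ERC timeouts
 # smartctl -l scterc,70,70 /dev/rdsk/c7t1d0s0    set read/write ERC to 7.0 s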

> 
> > That all rather points to singular issues with
> > firmware bugs or similar than to a systematic
> issue,
> > doesn't it?
> 
> I'm not sure. Some people in the WDC threads seem to
> report problems with pauses during media streaming
> etc. 

This was again for SoHo storage enclosures - not for ZFS, right?

>  when the
> 32MB+ cache is empty, then it loads another 32MB into
> cache etc and so on? 

I am not sure any current disk has such simplistic cache management that it 
relies on completely cycling the buffer content, let alone for reads that 
belong to a single file (a disk is basically agnostic of files). Moreover, 
such buffer management would be completely useless for a striped array. I 
don't know in detail what a disk cache does either, but I am afraid that 
direction is probably not helpful for understanding certain phenomena people 
have reported.

I think we are currently seeing quite a large number of evolutions in disk 
storage, where many established assumptions are being abandoned while 
backwards compatibility is not always taken care of. SAS 6G (will my 
controller really work in a PCIe 1.1 slot?) and 4k sectors are certainly only 
the most prominent examples. In such times it is probably truer than ever that 
falling back to established technologies pays off, including biting the bullet 
of a cost premium on occasion.

Best regards

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-02-03 Thread Tonmaus
Hi Simon,

they are the new revision.
I got the impression as well that the complaints you reported were mainly 
related to embedded Linux systems, probably running LVM / md (Thecus, QNAP, 
). Other reports I had seen related to typical HW RAIDs. I don't think the 
situation is comparable to ZFS.
I have also followed some TLER-related threads here. I am not sure there was 
ever a clear assertion whether consumer-drive error correction affects a ZFS 
pool or not. Statistically, we should have a lot of "restrictive TLER settings 
helped me solve my ZFS pool issues" success reports here if it did. That all 
rather points to singular issues with firmware bugs or similar than to a 
systematic issue, doesn't it?

Cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool status output confusing

2010-02-02 Thread Tonmaus
Hi James,

am I right to understand that, in a nutshell, the problem is that if page 80/83 
information is present but corrupt/inaccurate/forged (name it as you want), ZFS 
will not fall back to the GUID?

regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool status output confusing

2010-02-02 Thread Tonmaus
Thanks. That fixed it.

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-02-02 Thread Tonmaus
Hi Simon,

I am running 5 WD20EADS in a raidz1 + spare on an ahci controller without any 
problems I could relate to TLER or head parking.

Cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool status output confusing

2010-02-02 Thread Tonmaus
Good morning Cindy,

> Hi,
> 
> Testing how ZFS reacts to a failed disk can be
> difficult to anticipate
> because some systems don't react well when you remove
> a disk.
I am in the process of finding that out for my systems. That's why I am doing 
these tests. 
> On an
> x4500, for example, you have to unconfigure a disk
> before you can remove
> it.
I have had similar experiences already with disks attached via ahci. Still, 
zpool status won't recognize immediately that they have been removed, or 
sometimes not at all. But that's stuff for another thread.
> 
> Before removing a disk, I would consult your h/w docs
> to see what the
> recommended process is for removing components.
Spec-wise, all drives, backplanes, controllers and their drivers I am using 
should support hotplug. Still, ZFS seems to have difficulties.
> 
> Swapping disks between the main pool and the spare
> pool isn't an
> accurate test of a disk failure and a spare kicking
> in.

That's correct. You may want to note that it wasn't the subject of my test 
procedure. I just intentionally mixed up some disks.

> 
> If you want to test a spare in a ZFS storage pool
> kicking in, then yank 
> a disk from the main pool (after reviewing your h/w
> docs) and observe 
> the spare behavior.
I am aware of that procedure. Thanks. 

> If a disk fails in real time, I
> doubt it will be
> when the pool is exported and the system is shutdown.

Agreed. Once again: the export, reboot, import sequence was specifically 
followed to eliminate any side effects of hotplug behaviour.

> 
> In general, ZFS pools don't need to be exported to
> replace failed disks.
> I've seen unpredictable behavior when
> devices/controllers change on live 
> pools. I would review the doc pointer I provided for
> recommended disk
> replacement practices.
> 
> I can't comment on the autoreplace behavior with a
> pool exported and
> a swap of disks. Maybe someone else can. The point of
> the autoreplace
> feature is to allow you to take a new replacement
> disk and automatically
> replace a failed disk without having to use the zpool
> replace command.
> Its not a way to swap existing disks in the same
> pool.

The interesting point here is finding out whether one would be able to, for 
example, replace a controller with a different type in case of a hardware 
failure, or even just move the physical disks to a different enclosure for any 
imaginable reason. Once again, the naive assumption was that ZFS would 
automatically find the members of a previously exported pool by the 
information (metadata) present on each of the pool members (disks, vdevs, 
files, whatever).
The situation now, after the scrub has finished, is that the pool reports no 
"known data errors", but still with the dubious listing of the same device 
c7t11d0 as an available spare and an online pool member at the same time. The 
status persists through another export/import cycle (this time without an 
intermediate reboot).
The next steps for me will be to swap the controller for an mpt-driven type 
and rebuild the pool from scratch. Then I may repeat the test.
Thanks so far for your support. I have learned a lot.

Regards,

Sebastian
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool status output confusing

2010-02-02 Thread Tonmaus
If I run

 # zdb -l /dev/dsk/c#t#d#

the result is "failed to unpack label" for any disk attached to controllers 
running on ahci or arcmsr controllers.  
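
(A likely culprit, assuming the disks were given whole to ZFS - in which case 
ZFS writes an EFI label and keeps its data on slice 0 - is that zdb needs to be 
pointed at that slice rather than at the bare device, e.g.:

 # zdb -l /dev/dsk/c7t1d0s0
)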

Cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool status output confusing

2010-02-01 Thread Tonmaus
Hi again,

> Follow recommended practices for replacing devices in
> a live pool.

Fair enough. On the other hand, I guess it has become clear that the pool went 
offline as part of the procedure. That was partly because I am not sure about 
the hotplug capabilities of the controller, and partly because I wanted to 
simulate an incident that would force me to shut down the machine. I also 
assumed that a controlled procedure of atomic, legal steps (export, reboot, 
import) should avoid unexpected gotchas.

> 
> In general, ZFS can handle controller/device changes
> if the driver
> generates or fabricates device IDs. You can view
> device IDs with this
> command:
> 
> # zdb -l /dev/dsk/cvtxdysz
> 
> If you are unsure what impact device changes will
> have your pool, then
> export the pool first. If you see the device ID has
> changed when the
> pool is exported (use prtconf -v to view device IDs
> while the pool is
> exported) with the hardware change, then the
> resulting pool behavior is
> unknown.

That's interesting. I understand I should do this to get a better idea what may 
happen before ripping the drives from the respective slots. Now: in case of an 
enclosure transfer or controller change, how do I find out if the receiving 
configuration will be able to handle it? The test obviously will tell about the 
IDs the sending configuration has produced. What layer will interpret the IDs, 
driver or ZFS? Are the IDs written to disk? 
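
A check sequence along the lines you suggest would presumably look something 
like this (pool name and disk are placeholders):

 # zpool export tank
 # prtconf -v | grep -i devid        device IDs as fabricated by the drivers
 # zdb -l /dev/dsk/c7t1d0s0          pool/vdev GUIDs as written in the label
 # zpool import tank

and then repeating the prtconf/zdb step on the receiving configuration before 
importing there.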

The reason I am doing this is to find out what I need to observe with respect 
to failover strategies for controllers, mainboards, etc. for the hardware that 
I am using - which is, naturally, non-Sun.

Regards,

Sebastian
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool status output confusing

2010-02-01 Thread Tonmaus
Hi Cindys,

> I'm still
> not sure if you  physically swapped c7t11d0 for c7t9d0 or if c7t9d0 is
> still connected  and part of your pool. 

The latter is not the case according to zpool status; the former definitely is. 
format reports the drive as present and correctly labelled.

> ZFS has recommended ways for swapping disks so if the  pool is exported, the 
> system 
> shutdown and then disks are swapped, then  the behavior is unpredictable and 
> ZFS is 
> understandably confused about what happened.
> It might work for some hardware, but in general, ZFS should be notified of 
> the device changes.

For the record, ZFS seems to be only marginally confused: The pool showed no 
errors after the import; the rest remains to be seen after scrub is done. I 
can't see what would be wrong with a clean export/import. And the results of 
the drive swap are part of the plan to find out what impact the HW has on the 
transfer of this pool.


> 
> You might experiment with the autoreplace pool
> property. Enabling this
> property will allow you to replace disks without
> using the zpool replace 
> command. If autoreplace is enabled, then physically
> swapping out an
> active disk in the pool with a spare disk that is is
> also connected to
> the pool without using zpool replace is a good
> approach.

Does this still apply if I did a clean export before the swap?
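
(For reference, a quick way to check and flip the property in question, 
assuming a pool named "tank":

 # zpool get autoreplace tank
 # zpool set autoreplace=on tank
)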

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool status output confusing

2010-02-01 Thread Tonmaus
> Hi--
> 
> Were you trying to swap out a drive in your pool's
> raidz1 VDEV
> with a spare device? Was that your original
> intention?

Not really. I just wanted to see what happens if the physical controller port 
changes, i.e. what practical relevance it would have whether or not I put the 
disks back in the same order after moving them from one enclosure to another. 
I simulated that principle by just swapping 3 of 10 drives from position ABC 
to CAB. The naive assumption was that the pool would just import normally.
I have checked: all resources are available as before; t0 - t11 are attached 
to the system. The odd thing still is: t9 was a member of the pool - where is 
it? And I thought a spare could only be either 'online' in a pool or 
'available', not both at the same time.

Does it make more sense now?

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool status output confusing

2010-02-01 Thread Tonmaus
Hi all,

this is what I get from 'zpool status pool' after swapping 3 of 10 members of a 
zpool for testing purposes.

u...@zfs2:~$ zpool status pool
  pool: pool
 state: ONLINE
 scrub: scrub in progress for 0h8m, 4,70% done, 2h51m to go
config:

NAME STATE READ WRITE CKSUM
pool   ONLINE   0 0 0
  raidz1-0   ONLINE   0 0 0
c7t1d0   ONLINE   0 0 0
c7t2d0   ONLINE   0 0 0
c7t3d0   ONLINE   0 0 0
c7t4d0   ONLINE   0 0 0
c7t5d0   ONLINE   0 0 0
c7t6d0   ONLINE   0 0 0
c7t7d0   ONLINE   0 0 0
c7t11d0  ONLINE   0 0 0
c7t8d0   ONLINE   0 0 0
c7t10d0  ONLINE   0 0 0
spares
  c7t11d0AVAIL

errors: No known data errors

Observe that disk t11 is listed both as a member of the pool and as an 
available spare.
The procedure was 'zpool export pool' > shutdown > swap drives > boot > 'zpool 
import pool', without a hitch. As you can see, a scrub is running for peace of 
mind...

Ideas? TIA.

Cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-01-28 Thread Tonmaus
Hi James

> I do not think that you are reading the data
> correctly.
> 
> The issues that we have seen via this list and
> storage-discuss
> have implicated downrev firmware on cards, and the
> various different
> disk drives that people choose to attach to those
> cards.
Thanks for pointing that out. I have indeed noticed such reports, but I didn't 
see any specific plans or acknowledgements that these issues would be taken on 
in mpt. Thus the question is whether these reports justify the assumption that 
there is anything wrong with mpt in general.

> 
> The use of SAS expanders with mpt-based cards is
> *not* an issue.
> The use of MPxIO with mpt-based cards is *not* an
> issue.

I didn't mean to deny any of mpt's core features. I just saw a couple of 
reports that involved SAS expanders, specifically those based on LSI silicon, 
filed under the "mpt problem" umbrella, and I understood that these issues 
were quite obstinate.

> Personally, I'm quite happy with the LSISAS3081E that
> I have
> installed in my system, with the attached 320Gb
> consumer-grade
> SATA2 disks.
> 

Excellent. That's encouraging. I am planning a similar configuration, with WD 
RE3 1 TB disks though.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-01-28 Thread Tonmaus
> Thanks for your answer.
> 
> I asked primarily because of the mpt timeout issues I
> saw on the list.

Hi Arnaud,

I am looking into the LSI SAS 3081 as well. My current understanding of the mpt 
issues is that the "sticky" part of these problems is rather related to 
multipath features, that is, using port multipliers or SAS expanders. Avoiding 
these, one should be fine.
I am quite a newbie though, just judging from what I read here.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs destroy hangs machine if snapshot exists- workaround found

2010-01-27 Thread Tonmaus
> This sounds like yet another instance of
> 
> 6910767 deleting large holey objects hangs other I/Os
> 
> I have a module based on 130 that includes this fix
> if you would like to try it.
> 
> -tim

Hi Tim,

6910767 seems to be about ZVOLs. The dataset here was not a ZVOL. I had a 1.4 
TB ZVOL on the same pool that also wasn't easy to kill. It hung the machine as 
well - but only once: it was gone after a forced reboot.

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss