Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-17 Thread Stephan Budach

On 17.04.16 at 14:07, Stephan Budach wrote:

Hi all,

I am running a scrub on an SSD-only zpool on r018. This zpool consists 
of 16 iSCSI targets, which are served from two other OmniOS boxes - 
currently still running r016 - over 10GbE connections.


This zpool serves as an NFS share for my Oracle VM cluster and it 
delivers reasonable performance. Even while the scrub is running, I 
can get approx. 1200 MB/s throughput when dd'ing a vdisk from the ZFS 
dataset to /dev/null.


However, the running scrub is only progressing like this:

root@zfsha02gh79:/root# zpool status ssdTank
  pool: ssdTank
 state: ONLINE
  scan: scrub in progress since Sat Apr 16 23:37:52 2016
    68,5G scanned out of 1,36T at 1,36M/s, 276h17m to go
    0 repaired, 4,92% done
config:

NAME STATE READ WRITE CKSUM
ssdTank ONLINE 0 0 0
  mirror-0 ONLINE 0 0 0
    c3t600144F090D0961356B8A76C0001d0 ONLINE 0 0 0
    c3t600144F090D0961356B8A93C0009d0 ONLINE 0 0 0
  mirror-1 ONLINE 0 0 0
    c3t600144F090D0961356B8A7BE0002d0 ONLINE 0 0 0
    c3t600144F090D0961356B8A948000Ad0 ONLINE 0 0 0
  mirror-2 ONLINE 0 0 0
    c3t600144F090D0961356B8A7F10003d0 ONLINE 0 0 0
    c3t600144F090D0961356B8A958000Bd0 ONLINE 0 0 0
  mirror-3 ONLINE 0 0 0
    c3t600144F090D0961356B8A7FC0004d0 ONLINE 0 0 0
    c3t600144F090D0961356B8A964000Cd0 ONLINE 0 0 0
  mirror-4 ONLINE 0 0 0
    c3t600144F090D0961356B8A8210005d0 ONLINE 0 0 0
    c3t600144F090D0961356B8A96E000Dd0 ONLINE 0 0 0
  mirror-5 ONLINE 0 0 0
    c3t600144F090D0961356B8A82E0006d0 ONLINE 0 0 0
    c3t600144F090D0961356B8A978000Ed0 ONLINE 0 0 0
  mirror-6 ONLINE 0 0 0
    c3t600144F090D0961356B8A83B0007d0 ONLINE 0 0 0
    c3t600144F090D0961356B8A983000Fd0 ONLINE 0 0 0
  mirror-7 ONLINE 0 0 0
    c3t600144F090D0961356B8A84A0008d0 ONLINE 0 0 0
    c3t600144F090D0961356B8A98E0010d0 ONLINE 0 0 0

errors: No known data errors

These are all 800 GB Intel S3710s, and I can't figure out why the 
scrub is moving so slowly.

Anything I can look at specifically?

Thanks,
Stephan

Well… searching the net somewhat more thoroughly, I came across an 
archived discussion dealing with a similar issue. Somewhere 
down the conversation, this parameter was suggested:


echo "zfs_scrub_delay/W0" | mdb -kw

I just tried that, and although the calculated speed climbs rather 
slowly, iostat now shows approx. 380 MB/s read from the devices, 
which works out to about 24 MB/s per device (8 mirrors x 2 devices).


Being curious, I issued an echo "zfs_scrub_delay/W1" | mdb -kw to see 
what would happen, and that command immediately dropped the rate on each 
device to 1.4 MB/s…
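For reference, a minimal sketch of checking the live value with mdb and putting 
it back afterwards; the /D read form and the stock value of 4 ticks are 
assumptions taken from illumos dsl_scan.c rather than anything verified on this 
box:

# read the current value in decimal without changing anything
echo "zfs_scrub_delay/D" | mdb -k

# drop the per-I/O delay for the duration of the scrub, then restore the
# (assumed) default of 4 ticks
echo "zfs_scrub_delay/W0" | mdb -kw
echo "zfs_scrub_delay/W0t4" | mdb -kw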


What is the rationale behind that? Who wants to wait weeks for a 
scrub to finish? Usually I also have znapzend running, creating 
snapshots on a regular basis. Wouldn't that hurt scrub performance even 
more?


Cheers,
Stephan
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-17 Thread Dale Ghent

> On Apr 17, 2016, at 9:07 AM, Stephan Budach  wrote:
> 
> Well… searching the net somewhat more thoroughly, I came across an 
> archived discussion dealing with a similar issue. Somewhere down the 
> conversation, this parameter was suggested:
> 
> echo "zfs_scrub_delay/W0" | mdb -kw
> 
> I just tried that, and although the calculated speed climbs rather 
> slowly, iostat now shows approx. 380 MB/s read from the devices, which 
> works out to about 24 MB/s per device (8 mirrors x 2 devices).
> 
> Being curious, I issued an echo "zfs_scrub_delay/W1" | mdb -kw to see 
> what would happen, and that command immediately dropped the rate on each 
> device to 1.4 MB/s…
> 
> What is the rationale behind that? Who wants to wait weeks for a 
> scrub to finish? Usually I also have znapzend running, creating snapshots 
> on a regular basis. Wouldn't that hurt scrub performance even more?

zfs_scrub_delay is described here:

http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/dsl_scan.c#63

How busy are your disks if you subtract the IO caused by a scrub? Are you doing 
these scrubs with your VMs causing normal IO as well?
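As a quick first pass at that question from the initiator side, the sketch 
below simply samples per-LUN utilization with iostat; the device-name filter 
reuses the GUID prefix visible in the pool listing above and is otherwise an 
assumption:

# extended per-device stats, 10-second samples, six rounds;
# %b (percent busy) and actv (queued I/Os) hint at how much headroom is left
iostat -xn 10 6 | egrep 'device|c3t600144F090D09613'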

Scrubbing, overall, is treated as a background maintenance process. As such, it 
is designed to not interfere with "production IO" requests. It used to be that 
scrubs ran as fast as disk IO and bus bandwidth would allow, which in turn 
severely impacted the IO performance of running applications, and in some cases 
this would cause problems for production or user services.  The scrub delay 
setting which you've discovered is the main governor of this scrub throttle 
code[1], and by setting it to 0, you are effectively removing the delay it 
imposes on itself to allow non-scrub/resilvering IO requests to finish.

The solution in your case is specific to yourself and how you operate your 
servers and services. Can you accept degraded application IO while a scrub or 
resilver is running? Can you not? Maybe only during certain times?

/dale

[1] 
http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/dsl_scan.c#1841


___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-17 Thread Stephan Budach

On 17.04.16 at 20:42, Dale Ghent wrote:

On Apr 17, 2016, at 9:07 AM, Stephan Budach  wrote:

Well… searching the net somewhat more thoroughly, I came across an archived 
discussion dealing with a similar issue. Somewhere down the 
conversation, this parameter was suggested:

echo "zfs_scrub_delay/W0" | mdb -kw

I just tried that, and although the calculated speed climbs rather slowly, 
iostat now shows approx. 380 MB/s read from the devices, which works out to 
about 24 MB/s per device (8 mirrors x 2 devices).

Being curious, I issued an echo "zfs_scrub_delay/W1" | mdb -kw to see what would 
happen, and that command immediately dropped the rate on each device to 1.4 MB/s…

What is the rationale behind that? Who wants to wait weeks for a scrub to 
finish? Usually I also have znapzend running, creating snapshots on a 
regular basis. Wouldn't that hurt scrub performance even more?

zfs_scrub_delay is described here:

http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/dsl_scan.c#63

How busy are your disks if you subtract the IO caused by a scrub? Are you doing 
these scrubs with your VMs causing normal IO as well?

Scrubbing, overall, is treated as a background maintenance process. As such, it is 
designed to not interfere with "production IO" requests. It used to be that 
scrubs ran as fast as disk IO and bus bandwidth would allow, which in turn severely 
impacted the IO performance of running applications, and in some cases this would cause 
problems for production or user services.  The scrub delay setting which you've 
discovered is the main governor of this scrub throttle code[1], and by setting it to 0, 
you are effectively removing the delay it imposes on itself to allow 
non-scrub/resilvering IO requests to finish.

The solution in your case is specific to yourself and how you operate your 
servers and services. Can you accept degraded application IO while a scrub or 
resilver is running? Can you not? Maybe only during certain times?

/dale

[1] 
http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/dsl_scan.c#1841
I do get the notion of this, but if the increase from 0 to 1 reduces the 
throughput from 24 MB/s to 1 MB/s, that seems way overboard to me. Having 
to wait a couple of hours when running with 0, as opposed to days (up 
to 10) when running at 1 - on a 1.3 TB zpool - doesn't seem to be the 
right choice. If this tunable offered some more room for choice, that 
would be great, but it obviously doesn't.


It's the weekend and my VMs aren't exactly hogging their disks, so there 
was plenty of I/O available… I'd wish for more granular control over 
this setting.


Anyway, the scrub finished a couple of hours later, and of course I can 
always set this tunable to 0 should I need to.


Thanks,
Stephan
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-19 Thread wuffers
You might want to check this old thread:

http://lists.omniti.com/pipermail/omnios-discuss/2014-July/002927.html

Richard Elling had some interesting insights on how the scrub works:

"So I think the pool is not scheduling scrub I/Os very well. You can
increase the number of
scrub I/Os in the scheduler by adjusting the zfs_vdev_scrub_max_active
tunable. The
default is 2, but you'll have to consider that a share (in the stock market
sense) where
the active sync reads and writes are getting 10 each. You can try bumping
up the value
and see what happens over some time, perhaps 10 minutes or so -- too short
of a time
and you won't get a good feeling for the impact (try this in off-peak time).
echo zfs_vdev_scrub_max_active/W0t5 | mdb -kw
will change the value from 2 to 5, increasing its share of the total I/O
workload.

You can see the progress of scan (scrubs do scan) workload by looking at
the ZFS
debug messages.
echo ::zfs_dbgmsg | mdb -k
These will look mysterious... they are. But the interesting bits are about
how many blocks
are visited in some amount of time (txg sync interval). Ideally, this will
change as you
adjust zfs_vdev_scrub_max_active."

I had to increase my zfs_vdev_scrub_max_active parameter beyond 5, but
it sounds like the default setting for that tunable is no longer
adequate for today's high-performance systems.
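One caveat: a value poked in with mdb only lasts until the next reboot. A
sketch for making it stick, assuming the usual /etc/system mechanism for ZFS
tunables applies to this one as well:

# verify the live value after the mdb write
echo "zfs_vdev_scrub_max_active/D" | mdb -k

# /etc/system entry, picked up at the next boot
set zfs:zfs_vdev_scrub_max_active = 5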

On Sun, Apr 17, 2016 at 4:07 PM, Stephan Budach wrote:

> On 17.04.16 at 20:42, Dale Ghent wrote:
>
> On Apr 17, 2016, at 9:07 AM, Stephan Budach  wrote:
>>>
>>> Well… searching the net somewhat more thoroughly, I came across an
>>> archived discussion dealing with a similar issue. Somewhere down
>>> the conversation, this parameter was suggested:
>>>
>>> echo "zfs_scrub_delay/W0" | mdb -kw
>>>
>>> I just tried that, and although the calculated speed climbs rather
>>> slowly, iostat now shows approx. 380 MB/s read from the devices, which
>>> works out to about 24 MB/s per device (8 mirrors x 2 devices).
>>>
>>> Being curious, I issued an echo "zfs_scrub_delay/W1" | mdb -kw to see
>>> what would happen, and that command immediately dropped the rate on each
>>> device to 1.4 MB/s…
>>>
>>> What is the rationale behind that? Who wants to wait weeks for a
>>> scrub to finish? Usually I also have znapzend running, creating
>>> snapshots on a regular basis. Wouldn't that hurt scrub performance even
>>> more?
>>>
>> zfs_scrub_delay is described here:
>>
>>
>> http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/dsl_scan.c#63
>>
>> How busy are your disks if you subtract the IO caused by a scrub? Are you
>> doing these scrubs with your VMs causing normal IO as well?
>>
>> Scrubbing, overall, is treated as a background maintenance process. As
>> such, it is designed to not interfere with "production IO" requests. It
>> used to be that scrubs ran as fast as disk IO and bus bandwidth would
>> allow, which in turn severely impacted the IO performance of running
>> applications, and in some cases this would cause problems for production or
>> user services.  The scrub delay setting which you've discovered is the main
>> governor of this scrub throttle code[1], and by setting it to 0, you are
>> effectively removing the delay it imposes on itself to allow
>> non-scrub/resilvering IO requests to finish.
>>
>> The solution in your case is specific to yourself and how you operate
>> your servers and services. Can you accept degraded application IO while a
>> scrub or resilver is running? Can you not? Maybe only during certain times?
>>
>> /dale
>>
>> [1]
>> http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/dsl_scan.c#1841
>>
> I do get the notion of this, but if the increase from 0 to 1 reduces the
> throughput from 24 MB/s to 1 MB/s, that seems way overboard to me. Having to
> wait a couple of hours when running with 0, as opposed to days (up to
> 10) when running at 1 - on a 1.3 TB zpool - doesn't seem to be the right
> choice. If this tunable offered some more room for choice, that would be
> great, but it obviously doesn't.
>
> It's the weekend and my VMs aren't exactly hogging their disks, so there
> was plenty of I/O available… I'd wish for more granular control over this
> setting.
>
> Anyway, the scrub finished a couple of hours later, and of course I can
> always set this tunable to 0 should I need to.
>
> Thanks,
>
> Stephan
> ___
> OmniOS-discuss mailing list
> OmniOS-discuss@lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-21 Thread Stephan Budach

On 19.04.16 at 23:31, wuffers wrote:

You might want to check this old thread:

http://lists.omniti.com/pipermail/omnios-discuss/2014-July/002927.html

Richard Elling had some interesting insights on how the scrub works:

"So I think the pool is not scheduling scrub I/Os very well. You can 
increase the number of
scrub I/Os in the scheduler by adjusting the zfs_vdev_scrub_max_active 
tunable. The
default is 2, but you'll have to consider that a share (in the stock 
market sense) where
the active sync reads and writes are getting 10 each. You can try 
bumping up the value
and see what happens over some time, perhaps 10 minutes or so -- too 
short of a time
and you won't get a good feeling for the impact (try this in off-peak 
time).

echo zfs_vdev_scrub_max_active/W0t5 | mdb -kw
will change the value from 2 to 5, increasing its share of the total 
I/O workload.


You can see the progress of scan (scrubs do scan) workload by looking 
at the ZFS

debug messages.
echo ::zfs_dbgmsg | mdb -k
These will look mysterious... they are. But the interesting bits are 
about how many blocks
are visited in some amount of time (txg sync interval). Ideally, this 
will change as you

adjust zfs_vdev_scrub_max_active."

I had to increase my zfs_vdev_scrub_max_active parameter beyond 
5, but it sounds like the default setting for that tunable is no 
longer adequate for today's high-performance systems.


On Sun, Apr 17, 2016 at 4:07 PM, Stephan Budach wrote:


On 17.04.16 at 20:42, Dale Ghent wrote:

On Apr 17, 2016, at 9:07 AM, Stephan Budach <stephan.bud...@jvm.de> wrote:

Well… searching the net somewhat more thoroughly, I
came across an archived discussion dealing with a
similar issue. Somewhere down the conversation, this
parameter was suggested:

echo "zfs_scrub_delay/W0" | mdb -kw

I just tried that, and although the calculated speed
climbs rather slowly, iostat now shows approx. 380 MB/s
read from the devices, which works out to about 24 MB/s
per device (8 mirrors x 2 devices).

Being curious, I issued an echo "zfs_scrub_delay/W1" | mdb
-kw to see what would happen, and that command immediately
dropped the rate on each device to 1.4 MB/s…

What is the rationale behind that? Who wants to wait
weeks for a scrub to finish? Usually I also have znapzend
running, creating snapshots on a regular basis.
Wouldn't that hurt scrub performance even more?

zfs_scrub_delay is described here:


http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/dsl_scan.c#63

How busy are your disks if you subtract the IO caused by a
scrub? Are you doing these scrubs with your VMs causing normal
IO as well?

Scrubbing, overall, is treated as a background maintenance
process. As such, it is designed to not interfere with
"production IO" requests. It used to be that scrubs ran as
fast as disk IO and bus bandwidth would allow, which in turn
severely impacted the IO performance of running applications,
and in some cases this would cause problems for production or
user services.  The scrub delay setting which you've
discovered is the main governor of this scrub throttle
code[1], and by setting it to 0, you are effectively removing
the delay it imposes on itself to allow non-scrub/resilvering
IO requests to finish.

The solution in your case is specific to yourself and how you
operate your servers and services. Can you accept degraded
application IO while a scrub or resilver is running? Can you
not? Maybe only during certain times?

/dale

[1]

http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/fs/zfs/dsl_scan.c#1841

I do get the notion of this, but if the increase from 0 to 1
reduces the throughput from 24 MB/s to 1 MB/s, that seems way
overboard to me. Having to wait a couple of hours when running
with 0, as opposed to days (up to 10) when running at 1 - on a 1.3
TB zpool - doesn't seem to be the right choice. If this tunable
offered some more room for choice, that would be great, but it
obviously doesn't.

It's the weekend and my VMs aren't exactly hogging their disks, so
there was plenty of I/O available… I'd wish for more granular
control over this setting.

Anyway, the scrub finished a couple of hours later, and of course
I can always set this tunable to 0 should I need to.

Thanks,

Stephan



Interesting read - and it surely works. If you set the tunable before 
you start the scrub, you can immediately see the throughput being much 
higher than with the standard setting. Now, what would b

Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-21 Thread Chris Siebenmann
[About ZFS scrub tunables:]
> Interesting read - and it surely works. If you set the tunable before
> you start the scrub, you can immediately see the throughput being much
> higher than with the standard setting. [...]

 It's perhaps worth noting here that the scrub rate shown in 'zpool
status' is a cumulative one, i.e. the average scrub rate since the scrub
started. As far as I know, the only way to get the current scrub rate is
to run 'zpool status' twice with some time in between and then look at how
much progress the scrub's made during that time.

 As such, increasing the scrub speed in the middle of what had been a
slow scrub up to that point probably won't make a massive or immediate
difference in the reported scrub rate. You should see it rising over
time, especially if you drastically speeded it up, but it's not any sort
of instant jump.

(You can always monitor iostat, but that mixes in other pool IO. There's
probably something clever that can be done with DTrace.)
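A crude way to see the current rather than the cumulative rate is to sample the
progress line twice and compare by hand; the sketch below deliberately avoids
parsing, since the units and decimal separator of the 'scanned' figure vary
with pool size and locale (ssdTank is the pool from earlier in the thread):

# snapshot the cumulative progress, wait a minute, snapshot it again
zpool status ssdTank | grep scanned
sleep 60
zpool status ssdTank | grep scanned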

 This may already be obvious and well known to people, but I figured
I'd mention it just in case.

- cks
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-21 Thread Richard Elling

> On Apr 21, 2016, at 7:47 AM, Chris Siebenmann  wrote:
> 
> [About ZFS scrub tunables:]
>> Interesting read - and it surely works. If you set the tunable before
>> you start the scrub, you can immediately see the throughput being much
>> higher than with the standard setting. [...]
> 
> It's perhaps worth noting here that the scrub rate shown in 'zpool
> status' is a cumulative one, i.e. the average scrub rate since the scrub
> started. As far as I know, the only way to get the current scrub rate is
> to run 'zpool status' twice with some time in between and then look at how
> much progress the scrub's made during that time.

Scrub rate measured in IOPS or bandwidth is not useful. Neither is a reflection
of the work being performed in ZFS or by the drives.

> 
> As such, increasing the scrub speed in the middle of what had been a
> slow scrub up to that point probably won't make a massive or immediate
> difference in the reported scrub rate. You should see it rising over
> time, especially if you drastically speeded it up, but it's not any sort
> of instant jump.
> 
> (You can always monitor iostat, but that mixes in other pool IO. There's
> probably something clever that can be done with DTrace.)

I've got some dtrace that will show progress. However, it is only marginally
useful when you've got multiple datasets.

> 
> This may already be obvious and well known to people, but I figured
> I'd mention it just in case.

People fret about scrubs and resilvers, when they really shouldn't. In ZFS
accessing data also checks and does recovery, so anything they regularly
access will be unaffected by the subsequent scan. Over the years, I've tried
several ways to approach teaching people about failures and scrubs/resilvers,
but with limited success: some people just like to be afraid... Hollywood makes
a lot of money on them :-)
 -- richard


___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-22 Thread Stephan Budach

On 21.04.16 at 18:36, Richard Elling wrote:

On Apr 21, 2016, at 7:47 AM, Chris Siebenmann  wrote:

[About ZFS scrub tunables:]

Interesting read - and it surely works. If you set the tunable before
you start the scrub, you can immediately see the throughput being much
higher than with the standard setting. [...]

It's perhaps worth noting here that the scrub rate shown in 'zpool
status' is a cumulative one, i.e. the average scrub rate since the scrub
started. As far as I know, the only way to get the current scrub rate is
to run 'zpool status' twice with some time in between and then look at how
much progress the scrub's made during that time.

Scrub rate measured in IOPS or bandwidth is not useful. Neither is a reflection
of the work being performed in ZFS or by the drives.


As such, increasing the scrub speed in the middle of what had been a
slow scrub up to that point probably won't make a massive or immediate
difference in the reported scrub rate. You should see it rising over
time, especially if you drastically speeded it up, but it's not any sort
of instant jump.

(You can always monitor iostat, but that mixes in other pool IO. There's
probably something clever that can be done with DTrace.)

I've got some dtrace that will show progress. However, it is only marginally
useful when you've got multiple datasets.


This may already be obvious and well known to people, but I figured
I'd mention it just in case.

People fret about scrubs and resilvers, when they really shouldn't. In ZFS
accessing data also checks and does recovery, so anything they regularly
access will be unaffected by the subsequent scan. Over the years, I've tried
several ways to approach teaching people about failures and scrubs/resilvers,
but with limited success: some people just like to be afraid... Hollywood makes
a lot of money on them :-)
  -- richard


No… not afraid, but I actually do think that I can judge whether or not 
I want to speed scrubs up and trade some performance for that. As 
long as I can do that, I am fine with it. And the same applies to 
resilvers, I guess. If you need to resilver one half of a mirrored 
zpool, most people will want that to run as fast as feasible, won't they?


Thanks,
Stephan

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-22 Thread Richard Elling

> On Apr 22, 2016, at 5:00 AM, Stephan Budach  wrote:
> 
> On 21.04.16 at 18:36, Richard Elling wrote:
>>> On Apr 21, 2016, at 7:47 AM, Chris Siebenmann  wrote:
>>> 
>>> [About ZFS scrub tunables:]
>>>> Interesting read - and it surely works. If you set the tunable before
>>>> you start the scrub, you can immediately see the throughput being much
>>>> higher than with the standard setting. [...]
>>> It's perhaps worth noting here that the scrub rate shown in 'zpool
>>> status' is a cumulative one, i.e. the average scrub rate since the scrub
>>> started. As far as I know, the only way to get the current scrub rate is
>>> to run 'zpool status' twice with some time in between and then look at how
>>> much progress the scrub's made during that time.
>> Scrub rate measured in IOPS or bandwidth is not useful. Neither is a
>> reflection of the work being performed in ZFS or by the drives.
>> 
>>> As such, increasing the scrub speed in the middle of what had been a
>>> slow scrub up to that point probably won't make a massive or immediate
>>> difference in the reported scrub rate. You should see it rising over
>>> time, especially if you drastically speeded it up, but it's not any sort
>>> of instant jump.
>>> 
>>> (You can always monitor iostat, but that mixes in other pool IO. There's
>>> probably something clever that can be done with DTrace.)
>> I've got some dtrace that will show progress. However, it is only marginally
>> useful when you've got multiple datasets.
>> 
>>> This may already be obvious and well known to people, but I figured
>>> I'd mention it just in case.
>> People fret about scrubs and resilvers, when they really shouldn't. In ZFS
>> accessing data also checks and does recovery, so anything they regularly
>> access will be unaffected by the subsequent scan. Over the years, I've tried
>> several ways to approach teaching people about failures and scrubs/resilvers,
>> but with limited success: some people just like to be afraid... Hollywood 
>> makes
>> a lot of money on them :-)
>>  -- richard
>> 
>> 
> No… not afraid, but I actually do think that I can judge whether or not I 
> want to speed scrubs up and trade some performance for that. As long as I 
> can do that, I am fine with it. And the same applies to resilvers, I guess.

For current OmniOS the priority scheduler can be adjusted using mdb to change
the priority for scrubs vs other types of I/O. There is no userland interface.
See Adam's blog for more details.
http://dtrace.org/blogs/ahl/2014/08/31/openzfs-tuning/ 
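For completeness, a sketch of what such an adjustment looks like with the
per-class limits from vdev_queue.c; the variable names and the scrub-class
defaults of 1/2 reflect illumos-gate at the time and should be checked against
the running build:

# inspect the current scrub-class limits (decimal)
echo "zfs_vdev_scrub_min_active/D" | mdb -k
echo "zfs_vdev_scrub_max_active/D" | mdb -k

# give scrub I/O a larger share relative to the sync read/write classes
echo "zfs_vdev_scrub_max_active/W0t5" | mdb -kw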


If you're running Solaris 11 or pre-2015 OmniOS, then the old write throttle is 
impossible to control and you'll chase your tail trying to balance 
scrubs/resilvers against any other workload. From a control theory 
perspective, it is unstable.

> If you need to resilver one half of a mirrored zpool, most people will want 
> that to run as fast as feasible, won't they?

It depends. I've had customers on both sides of the fence and one customer for 
whom we cron'ed the priority changes to match their peak. Suffice to say, 
nobody seems to want resilvers to dominate real work.
 -- richard

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-22 Thread Richard Elling

> On Apr 22, 2016, at 10:28 AM, Dan McDonald  wrote:
> 
> 
>> On Apr 22, 2016, at 1:13 PM, Richard Elling wrote:
>> 
>> If you're running Solaris 11 or pre-2015 OmniOS, then the old write throttle
>> is impossible to control and you'll chase your tail trying to balance
>> scrubs/resilvers against any other workload. From a control theory
>> perspective, it is unstable.
> 
> pre-2015 can be clarified a bit: r151014 and later have the modern ZFS write 
> throttle. Now I know that Stephan is running later versions of OmniOS, so 
> you can be guaranteed it's the modern write throttle.
> 
> Furthermore, anyone running any OmniOS EARLIER than r151014 is not 
> supportable, and any pre-014 release is not supported.

Thanks for the clarification Dan!
 -- richard

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-22 Thread Dan McDonald

> On Apr 22, 2016, at 1:13 PM, Richard Elling wrote:
> 
> If you're running Solaris 11 or pre-2015 OmniOS, then the old write throttle
> is impossible to control and you'll chase your tail trying to balance
> scrubs/resilvers against any other workload. From a control theory
> perspective, it is unstable.

pre-2015 can be clarified a bit: r151014 and later have the modern ZFS write 
throttle. Now I know that Stephan is running later versions of OmniOS, so you 
can be guaranteed it's the modern write throttle.

Furthermore, anyone running any OmniOS EARLIER than r151014 is not supportable, 
and any pre-014 release is not supported.

Dan

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Slow scrub on SSD-only pool

2016-04-22 Thread Stephan Budach

On 22.04.16 at 19:28, Dan McDonald wrote:

On Apr 22, 2016, at 1:13 PM, Richard Elling wrote:

If you're running Solaris 11 or pre-2015 OmniOS, then the old write throttle is 
impossible to control and you'll chase your tail trying to balance 
scrubs/resilvers against any other workload. From a control theory 
perspective, it is unstable.

pre-2015 can be clarified a bit: r151014 and later have the modern ZFS write 
throttle. Now I know that Stephan is running later versions of OmniOS, so you 
can be guaranteed it's the modern write throttle.

Furthermore, anyone running any OmniOS EARLIER than r151014 is not supportable, 
and any pre-014 release is not supported.

Dan

…and I am actually fine with the new controls/tunables, so there's 
absolutely no fuss here. ;) Plus, I actually understood how both work, 
which is a plus…


Cheers,
Stephan
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss