Re: priocscan vs fcfs

2015-12-04 Thread Petar Bogdanovic
On Thu, Dec 03, 2015 at 07:51:28PM -0500, Thor Lancelot Simon wrote:
> 
> What strategy are you using to sort inside RAIDframe?  This is a property
> of the RAID set; you can see it with raidctl I believe for autoconfigured
> sets.
> 
> I wouldn't be terribly surprised to see bad interactions between
> priocscan and RAIDframe's scan or cscan strategies.  What about the "fifo"
> strategy?

It has been the fifo queuing method (queue size: 100) since 2010, when I
created the set (based on the NetBSD RAIDframe guide).
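
For reference, the queue method is the "START queue" section of the RAIDframe
config file; following the guide's example it would have looked something like
this (a from-memory sketch, not my actual file):

   START queue
   fifo 100

and raidctl -G raid0 should print that section back for a configured set, if
I read raidctl(8) right.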


> It's important, though, to understand that there are *very* few pure bulk
> read applications in this world (except single-stream video) and very
> few pure bulk write applications except database logs.  That means that
> single-stream pure read or write tests are really pretty awful predictors
> of disk performance for real workloads.

I also tried a 5-stream dd test, based on one of your previous mails:

https://mail-index.netbsd.org/netbsd-users/2014/12/01/msg015503.html

All five streams got ~10MB/s each (bs=16k), more or less consistent with
the bonnie++ output.

I should retry that with priocscan.


> The "priocscan" strategy, in particular, limits pure read/pure write
> performance *by design* in order to achieve lower latency under real
> world mixed workloads.

Intuitively, I'd always expect some penalty from any kind of queueing, but
for a simple sequential write, the numbers are just way off.  These
disks can sustain 100MB/s seq. reads and I assume it's not much worse
when writing sequentially (unfortunately I can't test that now), and
since RAID 1 simply mirrors each write to both members, a single disk's
streaming rate is roughly the ceiling anyway.  So 35MB/s seems broken
and even 55MB/s kind of low.




Re: priocscan vs fcfs

2015-12-04 Thread Petar Bogdanovic
On Fri, Dec 04, 2015 at 10:33:15AM +, Stephen Borrill wrote:
> 
> Am I missing something here? Your figures suggest that Input (i.e. reading)
> is pretty much the same, but it is Output (i.e. writing) that has higher
> throughput

You're right, I mixed it up.  Write throughput is what I meant.




Re: priocscan vs fcfs

2015-12-04 Thread Stephen Borrill

On Thu, 3 Dec 2015, Petar Bogdanovic wrote:

> On Wed, Dec 02, 2015 at 08:23:28PM +, Michael van Elst wrote:
> >
> > That's probably why setting the queues all to fcfs is the best
> > for you.
>
> Not as dramatic as Emile's numbers but significantly higher read
> throughput:

Am I missing something here?  Your figures suggest that Input (i.e.
reading) is pretty much the same, but it is Output (i.e. writing) that
has higher throughput.




Re: priocscan vs fcfs

2015-12-04 Thread Petar Bogdanovic
On Fri, Dec 04, 2015 at 12:13:06PM +0100, Petar Bogdanovic wrote:
> 
> I also tried a 5-stream dd test, based on one of your previous mails:
> 
> https://mail-index.netbsd.org/netbsd-users/2014/12/01/msg015503.html
> 
> All five streams got ~10MB/s each (bs=16k), more or less consistent with
> the bonnie++ output.
> 
> I should retry that with priocscan.


~5.5MB/s per stream:

$ for dev in $(sysctl -n hw.disknames); do sudo dkctl $dev strategy priocscan; done
/dev/rcd0d: fcfs -> priocscan
/dev/rwd0d: fcfs -> priocscan
/dev/rwd1d: fcfs -> priocscan
/dev/rraid0d: fcfs -> priocscan

$ date ; for i in 0 1 2 3 4; do dd if=/dev/zero of=test$i bs=16k count=1048576 & done ; wait ; date
Fri Dec  4 12:19:07 CET 2015
[1] 9484
[2] 9716
[3] 6843
[4] 9129
[5] 9250
1048576+0 records in
1048576+0 records out
17179869184 bytes transferred in 3047.215 secs (5637892 bytes/sec)
1048576+0 records in
1048576+0 records out
17179869184 bytes transferred in 3069.532 secs (5596901 bytes/sec)
1048576+0 records in
1048576+0 records out
17179869184 bytes transferred in 3082.440 secs (5573464 bytes/sec)
1048576+0 records in
1048576+0 records out
17179869184 bytes transferred in 3092.244 secs (5555793 bytes/sec)
1048576+0 records in
1048576+0 records out
17179869184 bytes transferred in 3103.526 secs (5535596 bytes/sec)
Fri Dec  4 13:10:50 CET 2015
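
(For comparison: that is roughly 5 x 5.5 MB/s = ~27 MB/s aggregate with
priocscan, versus about 5 x 10 MB/s = ~50 MB/s aggregate in the earlier run,
which was presumably fcfs given the "fcfs -> priocscan" transitions above.
Both aggregates are in the same ballpark as the corresponding bonnie++
block-write numbers.)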




Re: priocscan vs fcfs

2015-12-03 Thread Petar Bogdanovic
On Wed, Dec 02, 2015 at 08:23:28PM +, Michael van Elst wrote:
> 
> That's probably why setting the queues all to fcfs is the best
> for you.

Not as dramatic as Emile's numbers but significantly higher read
throughput:

# for i in disksort fcfs priocscan; do
for j in wd0 wd1 raid0; do dkctl $j strategy $i; done;
bonnie++ -d /tmp -m $i -s 4g -n 0 -u build -f -D; done

/dev/rwd0d: disksort -> disksort
/dev/rwd1d: disksort -> disksort
/dev/rraid0d: disksort -> disksort
(...)
Version  1.97       --Sequential Output--       --Sequential Input- --Random-
Concurrency   1    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
disksort        4G           33550  13 25694   8           64922  13 238.2  11
Latency                       1023ms     309ms              161ms     4111ms

/dev/rwd0d: disksort -> fcfs
/dev/rwd1d: disksort -> fcfs
/dev/rraid0d: disksort -> fcfs
(...)
Version  1.97       --Sequential Output--       --Sequential Input- --Random-
Concurrency   1    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
fcfs            4G           55254  22 27584   9           64438  13 244.5   9
Latency                        844ms     331ms              233ms     6095ms

/dev/rwd0d: fcfs -> priocscan
/dev/rwd1d: fcfs -> priocscan
/dev/rraid0d: fcfs -> priocscan
(...)
Version  1.97       --Sequential Output--       --Sequential Input- --Random-
Concurrency   1    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
priocscan       4G           32649  13 17028   6           60478  13 247.6   8
Latency                       1983ms    1167ms              236ms     6198ms


That's on a simple RAID1 setup:

# dmesg | grep SAM
wd0: 
wd1: 

# mount | grep raid
/dev/raid0a on / type ffs (log, NFS exported, local)




Re: priocscan vs fcfs

2015-12-03 Thread Thor Lancelot Simon
On Thu, Dec 03, 2015 at 07:43:10PM +0100, Petar Bogdanovic wrote:
> On Wed, Dec 02, 2015 at 08:23:28PM +, Michael van Elst wrote:
> > 
> > That's probably why setting the queues all to fcfs is the best
> > for you.
> 
> Not as dramatic as Emile's numbers but significantly higher read
> throughput:

What strategy are you using to sort inside RAIDframe?  This is a property
of the RAID set; you can see it with raidctl I believe for autoconfigured
sets.

I wouldn't be terribly surprised to see bad interactions between
priocscan and RAIDframe's scan or cscan strategies.  What about the "fifo"
strategy?

It's important, though, to understand that there are *very* few pure bulk
read applications in this world (except single-stream video) and very
few pure bulk write applications except database logs.  That means that
single-stream pure read or write tests are really pretty awful predictors
of disk performance for real workloads.

The "priocscan" strategy, in particular, limits pure read/pure write
performance *by design* in order to achieve lower latency under real
world mixed workloads.  I would not be so quick to discard it -- though I
would not use it on something like an SSD, and I am skeptical it would
perform well when stacked on one of RAIDframe's elevator-sort variants.

Thor


priocscan vs fcfs

2015-12-02 Thread Emile `iMil' Heitor


Hi,

I've never been really happy with my NetBSD RAIDframe NAS; I never really got
the speed I was supposed to, even with the right alignment, RAID layout, etc.

Today I dug into `dkctl(8)' while checking whether the cache was enabled for
read and write, and I came across the "strategy" command.
Long story short, changing from the priocscan to the fcfs strategy multiplied
my NAS's write speed by 6!  I changed the strategy for all disk members:

# dkctl wd0 strategy fcfs
# dkctl wd1 strategy fcfs
# dkctl wd2 strategy fcfs

and also for the RAIDframe:

# dkctl raid0 strategy fcfs
/dev/rraid0d: priocscan -> fcfs

as changing it only for the disk members was apparently counter-productive.
And there we go, from a 40/50MB/s write average to a stunning 200 to 300MB/s,
which is more like what the disks can theoretically do.
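
(Side note: to just check what a device is currently using, dkctl's strategy
subcommand with no strategy name should print the current one, e.g.
"dkctl raid0 strategy", if I read dkctl(8) correctly.)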

Could anyone with some background on these strategies explain what's behind the
curtain? I couldn't really find precise documentation on this matter...

Thanks,


Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \



Re: priocscan vs fcfs

2015-12-02 Thread Michael van Elst
i...@home.imil.net ("Emile `iMil' Heitor") writes:

>as changing it only for the disk members was apparently counter-productive.
>And there we go, from a 40/50MB/s write average to a stunning 200 to 300MB/s,
>which is more like what the disks can theoretically do.

>Could anyone with some background on these strategies explain what's behind the
>curtain? I couldn't really find precise documentation on this matter...


disksort
- execute requests in order of increasing block numbers, then
  continue with the lowest block number. I.e. do a one-way
  scan over the disk.

fcfs
- execute requests in the order they were issued.

priocscan
- The filesystem tags each I/O request (buffer) with a priority.
  Priorities are mapped to multiple queues, the highest priority
  queue is executed first, each queue is executed in block number
  order up to a fixed number of requests (burst) to prevent the
  lower priority queues from starving.
  

"block number order" can be just "cylinder number order" depending
on the disk driver.

buffer priority is time-noncritical (low), time-limited (medium)
and time-critical (high).

Usually synchronous operations and the journal are time-critical,
everything else is time-limited and time-noncritical isn't used.
Accessing the raw device is also time-critical.
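
To make the priocscan idea concrete, here is a toy C sketch of a burst-limited
set of priority queues.  It only illustrates the principle described above and
is NOT the actual NetBSD bufq_priocscan code; the queue count, the burst size
and the request mix in main() are invented for the example:

/*
 * Toy model of burst-limited priority queues, for illustration only.
 */
#include <stdio.h>
#include <stdlib.h>

#define NPRIO   3       /* 0 = time-critical ... 2 = time-noncritical */
#define BURST   2       /* requests served from one queue before yielding */
#define MAXREQ  16

struct queue {
	long blkno[MAXREQ];
	int n;              /* number of queued requests */
	int head;           /* index of the next request to serve */
};

static int
cmp_blkno(const void *a, const void *b)
{
	long x = *(const long *)a, y = *(const long *)b;

	return (x > y) - (x < y);
}

static void
enqueue(struct queue *q, long blkno)
{
	q->blkno[q->n++] = blkno;
}

static void
drain(struct queue q[NPRIO])
{
	int budget[NPRIO], p, pending;

	for (p = 0; p < NPRIO; p++) {
		/* each queue is served in ascending block order */
		qsort(q[p].blkno, q[p].n, sizeof(long), cmp_blkno);
		budget[p] = BURST;
	}

	for (;;) {
		/* highest-priority queue with work and burst budget left */
		for (p = 0; p < NPRIO; p++)
			if (q[p].head < q[p].n && budget[p] > 0)
				break;

		if (p == NPRIO) {
			/* all budgets spent: refill them and go again */
			pending = 0;
			for (p = 0; p < NPRIO; p++) {
				budget[p] = BURST;
				pending += q[p].n - q[p].head;
			}
			if (pending == 0)
				return;
			continue;
		}

		printf("serve prio %d, block %ld\n", p, q[p].blkno[q[p].head]);
		q[p].head++;
		budget[p]--;
	}
}

int
main(void)
{
	struct queue q[NPRIO] = { 0 };
	long sync_blocks[] = { 30, 10, 50, 20, 60, 40 };
	size_t i;
	long b;

	/* a few synchronous/journal-style writes: time-critical */
	for (i = 0; i < sizeof(sync_blocks) / sizeof(sync_blocks[0]); i++)
		enqueue(&q[0], sync_blocks[i]);

	/* one bulk sequential write: time-limited */
	for (b = 1000; b < 1004; b++)
		enqueue(&q[1], b);

	drain(q);
	return 0;
}

With that workload the output interleaves two time-critical requests, then two
bulk requests, and so on; neither queue can monopolize the device, which is
the starvation-avoidance effect of the burst limit.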


So, disksort is the traditional method for filesystems on dumb
disk devices. fcfs can be good for smart disks with caches
and their own queuing and for streaming a raw disk. Priocscan
tries to optimize for concurrent filesystem accesses better
than disksort.

fcfs is also the "neutral" queue for drivers stacking on top of
each other. The queue sorting should really only be done
at one level.

But raidframe is more complicated because it does its own queuing
and sorting outside of this schema, in particular when it has to
read-modify-write stripe sets for small I/O.

That's probably why setting the queues all to fcfs is the best
for you.

-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: priocscan vs fcfs

2015-12-02 Thread Emile `iMil' Heitor

On Wed, 2 Dec 2015, Michael van Elst wrote:


fcfs is also the "neutral" queue for drivers stacking on top of
each other. The queue sorting should really only be done
at one level.

But raidframe is more complicated because it does its own queuing
and sorting outside of this schema, in particular when it has to
read-modify-write stripe sets for small I/O.

That's probably why setting the queues all to fcfs is the best
for you.


Thanks a lot for this clear analysis Michael.

I can now confirm the results I witnessed earlier.  I've run a couple of
benchmarks, including bonnie++ and iozone; the latter shows a ratio of about
5x in favor of the fcfs strategy for every type of operation.  For those
interested, the iozone spreadsheet output is available here (OOo / LibreOffice):

https://home.imil.net/tmp/coruscant-iozone-priocscan.ods
https://home.imil.net/tmp/coruscant-iozone-fsfc.ods

For each subset, the first column is the amount of data written (from 64K to
4M) and the first row is the block size.


Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \