Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2012-01-02 Thread Christoph Hellwig
On Mon, Jan 02, 2012 at 05:18:13PM +0100, Paolo Bonzini wrote:
> >I tried a few times, and the only constant measurable
> >thing was that it regressed performance when used for rotating devices
> >in a few benchmarks.
> 
> Were you trying with cache=none or writeback?  For cache=none,
> that's exactly what I'd expect.  cache=writeback could be more
> interesting...

cache=none - cache=writeback isn't something people should ever use
except for read-only or extremely read-mostly workloads.
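
(To keep the cache-mode terminology above concrete: roughly, cache=none opens the
host image with O_DIRECT and bypasses the host page cache, while cache=writeback
uses plain buffered I/O. A minimal, hypothetical sketch of that mapping follows;
open_image() is not QEMU code, just an illustration of the distinction.)

#define _GNU_SOURCE             /* O_DIRECT is a GNU extension flag */
#include <fcntl.h>

/* Hypothetical helper, not QEMU code: cache=none bypasses the host page
 * cache via O_DIRECT; cache=writeback relies on the host page cache,
 * which is why it mostly helps read-heavy workloads. */
static int open_image(const char *path, int cache_none)
{
        int flags = O_RDWR;

        if (cache_none)
                flags |= O_DIRECT;      /* cache=none: no host page cache */

        return open(path, flags);       /* cache=writeback: buffered I/O */
}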



Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2012-01-02 Thread Avi Kivity
On 01/02/2012 06:18 PM, Christoph Hellwig wrote:
> On Sun, Jan 01, 2012 at 04:45:42PM +, Stefan Hajnoczi wrote:
> > win.  The fact that you added batching suggests there is some benefit
> > to what the request-based code path does.  So find out what's good
> > about the request-based code path and how to get the best of both
> > worlds.
>
> Batching is pretty much always a winner.  The maximum bio size is small
> enough that we'll frequently see multiple contiguous bios.

Maybe the maximum bio size should be increased then; not that I disagree
with your conclusion.

> Because of
> that the MD layer for example uses the same kind of batching.  I've tried
> to make this more general by passing a bio list to ->make_request and
> make the on-stack plugging work on bios, but in the timeslice I had
> available for that I didn't manage to actually make it work.
>
>
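
(The "on-stack plugging" mentioned in the quote is the blk_start_plug()/blk_finish_plug()
pair used by I/O submitters. A hedged sketch of a submitter batching under a plug
follows; example_submit_many() is a placeholder name, and at the time of this thread
the plug collected requests rather than bios, which is exactly the limitation
described above.)

#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/fs.h>

/* Hedged sketch of on-stack plugging from the submitter's side; the
 * plug lets the block layer accumulate work and flush it in one go. */
static void example_submit_many(struct bio **bios, int nr)
{
	struct blk_plug plug;
	int i;

	blk_start_plug(&plug);		/* start collecting on the stack */
	for (i = 0; i < nr; i++)
		submit_bio(READ, bios[i]);
	blk_finish_plug(&plug);		/* flush the accumulated batch */
}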

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2012-01-02 Thread Paolo Bonzini

On 01/02/2012 05:15 PM, Christoph Hellwig wrote:

> > When QEMU uses O_DIRECT, the guest should not use QUEUE_FLAG_NONROT
> > unless it is active for the host disk as well.  (In doubt, as is the
> > case for remote hosts accessed over NFS, I would also avoid NONROT
> > and allow more coalescing).
> 
> Do we have any benchmark numbers where QUEUE_FLAG_NONROT makes a
> difference?


Not that I know of.


> I tried a few times, and the only constant measurable
> thing was that it regressed performance when used for rotating devices
> in a few benchmarks.


Were you trying with cache=none or writeback?  For cache=none, that's 
exactly what I'd expect.  cache=writeback could be more interesting...


Paolo


Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2012-01-02 Thread Christoph Hellwig
On Sun, Jan 01, 2012 at 04:45:42PM +, Stefan Hajnoczi wrote:
> win.  The fact that you added batching suggests there is some benefit
> to what the request-based code path does.  So find out what's good
> about the request-based code path and how to get the best of both
> worlds.

Batching is pretty much always a winner.  The maximum bio size is small
enough that we'll frequently see multiple contiguous bios.  Because of
that the MD layer for example uses the same kind of batching.  I've tried
to make this more general by passing a bio list to ->make_request and
make the on-stack plugging work on bios, but in the timeslice I had
available for that I didn't manage to actually make it work.
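
(To make the batching idea concrete, a hedged sketch follows of a bio-based driver
that queues incoming bios on a bio_list and issues one device notification per
batch. The example_* names are placeholders; this is not the MD code or the
virtio-blk patch, just the general shape of the technique.)

#include <linux/bio.h>
#include <linux/spinlock.h>

/* Placeholder device structure and hooks; example_dispatch_bio() and
 * example_kick() stand in for whatever maps a bio onto the hardware
 * and rings its doorbell. */
struct example_dev {
	spinlock_t	lock;
	struct bio_list	pending;	/* bios collected for one batch */
};

void example_dispatch_bio(struct example_dev *dev, struct bio *bio);
void example_kick(struct example_dev *dev);

/* Called from the driver's ->make_request path: just queue the bio. */
static void example_queue_bio(struct example_dev *dev, struct bio *bio)
{
	unsigned long flags;

	spin_lock_irqsave(&dev->lock, flags);
	bio_list_add(&dev->pending, bio);
	spin_unlock_irqrestore(&dev->lock, flags);
}

/* Drain the whole batch and notify the device once per batch instead
 * of once per bio. */
static void example_submit_batch(struct example_dev *dev)
{
	struct bio *bio;
	unsigned long flags;

	spin_lock_irqsave(&dev->lock, flags);
	while ((bio = bio_list_pop(&dev->pending)))
		example_dispatch_bio(dev, bio);
	spin_unlock_irqrestore(&dev->lock, flags);

	example_kick(dev);
}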



Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2012-01-02 Thread Christoph Hellwig
On Mon, Jan 02, 2012 at 05:12:00PM +0100, Paolo Bonzini wrote:
> On 01/01/2012 05:45 PM, Stefan Hajnoczi wrote:
> >By the way, drivers for solid-state devices can set QUEUE_FLAG_NONROT
> >to hint that seek time optimizations may be sub-optimal.  NBD and
> >other virtual/pseudo device drivers set this flag.  Should virtio-blk
> >set it and how does it affect performance?
> 
> By itself it is not a good idea in general.
> 
> When QEMU uses O_DIRECT, the guest should not use QUEUE_FLAG_NONROT
> unless it is active for the host disk as well.  (In doubt, as is the
> case for remote hosts accessed over NFS, I would also avoid NONROT
> and allow more coalescing).

Do we have any benchmark numbers where QUEUE_FLAG_NONROT makes a
difference?  I tried a few times, and the only constant measurable
thing was that it regressed performance when used for rotating devices
in a few benchmarks.



Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2012-01-02 Thread Paolo Bonzini

On 01/01/2012 05:45 PM, Stefan Hajnoczi wrote:

> By the way, drivers for solid-state devices can set QUEUE_FLAG_NONROT
> to hint that seek time optimizations may be sub-optimal.  NBD and
> other virtual/pseudo device drivers set this flag.  Should virtio-blk
> set it and how does it affect performance?


By itself it is not a good idea in general.

When QEMU uses O_DIRECT, the guest should not use QUEUE_FLAG_NONROT 
unless it is active for the host disk as well.  (In doubt, as is the 
case for remote hosts accessed over NFS, I would also avoid NONROT and 
allow more coalescing).


When QEMU doesn't use O_DIRECT, instead, using QUEUE_FLAG_NONROT and 
leaving optimizations to the host may make some sense.


In Xen, the back-end driver is bio-based, so the scenario is like QEMU 
with O_DIRECT.  I remember seeing worse performance when switching the 
front-end to either QUEUE_FLAG_NONROT or the noop scheduler.  This was 
with RHEL5 (2.6.18), but it might still be true in more recent kernels, 
modulo benchmarking of course.  Still, the current in-tree xen-blkfront 
driver does use QUEUE_FLAG_NONROT unconditionally, more precisely its 
synonym QUEUE_FLAG_VIRT.


Still, if benchmarking confirms this theory, QEMU could expose a hint 
via a feature bit.  The default could be simply "use QUEUE_FLAG_NONROT 
iff not using O_DIRECT", or it could be more complicated with help from 
sysfs.
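
(A hedged sketch of what such a hint could look like on the guest side follows.
VIRTIO_BLK_F_NONROT and its bit number are made up here for illustration only;
they are not part of the virtio specification or of QEMU.)

#include <linux/virtio.h>
#include <linux/virtio_config.h>
#include <linux/blkdev.h>

/* Hypothetical feature bit: "host-side storage is effectively
 * non-rotational".  The number is a placeholder and would need to be
 * allocated in the virtio-blk spec. */
#define VIRTIO_BLK_F_NONROT	20

static void example_set_rotational_hint(struct virtio_device *vdev,
					struct request_queue *q)
{
	/* Only disable seek-time optimizations when the host says the
	 * backing storage (or its cache mode) makes them pointless. */
	if (virtio_has_feature(vdev, VIRTIO_BLK_F_NONROT))
		queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
}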


Paolo


Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2012-01-01 Thread Dor Laor

On 01/01/2012 06:45 PM, Stefan Hajnoczi wrote:

> On Thu, Dec 22, 2011 at 11:41 PM, Minchan Kim  wrote:
> 
> > On Thu, Dec 22, 2011 at 12:57:40PM +, Stefan Hajnoczi wrote:
> > 
> > > On Wed, Dec 21, 2011 at 1:00 AM, Minchan Kim  wrote:
> > > If you're stumped by the performance perhaps compare blktraces of the
> > > request approach vs the bio approach.  We're probably performing I/O
> > > more CPU-efficiently but the I/O pattern itself is worse.
> > 
> > You mean the I/O scheduler has many techniques to produce a good I/O pattern?
> > That's what I want to discuss in this RFC.
> > 
> > I guess the request layer has many techniques, proven over a long time,
> > to do I/O well, but the BIO-based driver ignores them just to reduce locking
> > overhead. Of course, we can add such techniques to the BIO-based driver, like
> > the custom batching in this series. But it needs lots of work, is really
> > duplication, and will be a maintenance problem.
> > 
> > I would like to hear opinions on whether this direction is good or bad.
> 
> This series is a good platform for performance analysis but not
> something that should be merged IMO.  As you said it duplicates work
> that I/O schedulers and the request-based block layer do.  If other
> drivers start taking this approach too then the duplication will be
> proliferated.
> 
> The value of this series is that you have a prototype to benchmark and
> understand the bottlenecks in virtio-blk and the block layer better.
> The results do not show that bypassing the I/O scheduler is always a
> win.  The fact that you added batching suggests there is some benefit
> to what the request-based code path does.  So find out what's good
> about the request-based code path and how to get the best of both
> worlds.
> 
> By the way, drivers for solid-state devices can set QUEUE_FLAG_NONROT
> to hint that seek time optimizations may be sub-optimal.  NBD and
> other virtual/pseudo device drivers set this flag.  Should virtio-blk
> set it and how does it affect performance?


Seems logical to me. If the underlying backing storage on the host is an
SSD or some fast remote SAN server, we need such a flag. Even in the case
of standard local storage, the host will still do the seek-time
optimizations, so there is no need to do them twice.




> Stefan


Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2012-01-01 Thread Stefan Hajnoczi
On Thu, Dec 22, 2011 at 11:41 PM, Minchan Kim  wrote:
> On Thu, Dec 22, 2011 at 12:57:40PM +, Stefan Hajnoczi wrote:
>> On Wed, Dec 21, 2011 at 1:00 AM, Minchan Kim  wrote:
>> If you're stumped by the performance perhaps compare blktraces of the
>> request approach vs the bio approach.  We're probably performing I/O
>> more CPU-efficiently but the I/O pattern itself is worse.
>
> You mean the I/O scheduler has many techniques to produce a good I/O pattern?
> That's what I want to discuss in this RFC.
>
> I guess the request layer has many techniques, proven over a long time,
> to do I/O well, but the BIO-based driver ignores them just to reduce locking
> overhead. Of course, we can add such techniques to the BIO-based driver, like
> the custom batching in this series. But it needs lots of work, is really
> duplication, and will be a maintenance problem.
>
> I would like to hear opinions on whether this direction is good or bad.

This series is a good platform for performance analysis but not
something that should be merged IMO.  As you said it duplicates work
that I/O schedulers and the request-based block layer do.  If other
drivers start taking this approach too then the duplication will be
proliferated.

The value of this series is that you have a prototype to benchmark and
understand the bottlenecks in virtio-blk and the block layer better.
The results do not show that bypassing the I/O scheduler is always a
win.  The fact that you added batching suggests there is some benefit
to what the request-based code path does.  So find out what's good
about the request-based code path and how to get the best of both
worlds.

By the way, drivers for solid-state devices can set QUEUE_FLAG_NONROT
to hint that seek time optimizations may be sub-optimal.  NBD and
other virtual/pseudo device drivers set this flag.  Should virtio-blk
set it and how does it affect performance?
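
(For reference, setting the flag is a one-liner in a driver's queue setup. A
minimal sketch using the block-layer API of that era follows; it is not the
actual NBD or virtio-blk source.)

#include <linux/blkdev.h>

static void example_mark_nonrotational(struct request_queue *q)
{
	/* Tell the block layer and I/O schedulers not to bother with
	 * seek-time optimizations; visible to userspace as
	 * /sys/block/<dev>/queue/rotational reading 0. */
	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
}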

Stefan


Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2011-12-22 Thread Minchan Kim
On Thu, Dec 22, 2011 at 12:57:40PM +, Stefan Hajnoczi wrote:
> On Wed, Dec 21, 2011 at 1:00 AM, Minchan Kim  wrote:
> > This patch is a follow-up to Christoph Hellwig's work
> > [RFC: ->make_request support for virtio-blk].
> > http://thread.gmane.org/gmane.linux.kernel/1199763
> >
> > Quote from hch
> > "This patchset allows the virtio-blk driver to support much higher IOP
> > rates which can be driven out of modern PCI-e flash devices.  At this
> > point it really is just a RFC due to various issues."
> 
> Basic question to make sure I understood this series: does this patch
> bypass the guest I/O scheduler (but then you added custom batching
> code into virtio_blk.c)?

Right.

> 
> If you're stumped by the performance perhaps compare blktraces of the
> request approach vs the bio approach.  We're probably performing I/O
> more CPU-efficiently but the I/O pattern itself is worse.

You mean the I/O scheduler has many techniques to produce a good I/O pattern?
That's what I want to discuss in this RFC.

I guess the request layer has many techniques, proven over a long time,
to do I/O well, but the BIO-based driver ignores them just to reduce locking
overhead. Of course, we can add such techniques to the BIO-based driver, like
the custom batching in this series. But it needs lots of work, is really
duplication, and will be a maintenance problem.

I would like to hear opinions on whether this direction is good or bad.

> 
> Stefan

-- 
Kind regards,
Minchan Kim


Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2011-12-22 Thread Minchan Kim
On Thu, Dec 22, 2011 at 10:45:06AM -0500, Vivek Goyal wrote:
> On Thu, Dec 22, 2011 at 10:05:38AM +0900, Minchan Kim wrote:
> 
> [..]
> > > Maybe using deadline or noop in the guest is better to benchmark against
> > > PCI-E based flash.
> > 
> > Good suggestion.
> > I tested it with deadline on the guest side.
> > 
> > The result is not good.
> > Although the gap is within noise, Batch BIO's random performance regressed
> > compared to CFQ.
> > 
> >      Request                  Batch BIO
> >      (MB/s)   stddev          (MB/s)   stddev
> > w    787.030  31.494     w    748.714  68.490
> > rw   216.044  29.734     rw   216.977  40.635
> > r    771.765  3.327      r    771.107  4.299
> > rr   280.096  25.135     rr   258.067  43.916
> > 
> > I did a small test with only Batch BIO, with deadline and cfq,
> > to see the I/O scheduler's effect.
> > I think the result is very strange: deadline: 149MB/s, CFQ: 87MB/s.
> > Because the Batch BIO patch uses make_request_fn instead of request_fn,
> > I think we should not be affected by the I/O scheduler (I mean we issue I/O
> > before the I/O scheduler handles it).
> > 
> > What do you think about it?
> > Do I miss something?
> 
> This indeed is very strange. In case of bio based drivers, changing IO
> scheduler on the queue should not change anything.
> 
> Try running blktrace on the vda device and see if you notice something
> odd.
> 
> Also you seem to be reporting contradicting results for batch bio.
> 
> Initially you mention that random IO is regressing with deadline as
> compared to CFQ. (It dropped from 325.976 MB/s to 258.067 MB/s).
> 
> In this second test you are reporting that CFQ performs badly as
> compared to deadline. (149MB/s vs 87MB/s).

The first test is with 64 threads on 512M and the second test is with only
1 thread on 128M, to finish quickly.
So I think the result is very strange.

This is the data from the deadline test.
In summary, the data fluctuates a lot.

Sometimes it's very stable, but sometimes the data fluctuates like this.
I don't know whether it's an aio-stress problem or a Fusion I/O problem.

1)
[root@RHEL-6 ~]# ./aio-stress -c 1 -t 1 -s 128 -r 8 -O -o 3 -d 512 /dev/vda
num_thread 1
adding stage random read
starting with random read
file size 128MB, record size 8KB, depth 512, ios per iteration 8
max io_submit 8, buffer alignment set to 4KB
threads 1 files 1 contexts 1 context offset 2MB verification off
Running single thread version 
random read on /dev/vda (151.71 MB/s) 128.00 MB in 0.84s
thread 0 random read totals (151.55 MB/s) 128.00 MB in 0.84s

2)
[root@RHEL-6 ~]# ./aio-stress -c 1 -t 1 -s 128 -r 8 -O -o 3 -d 512 /dev/vda
num_thread 1
adding stage random read
starting with random read
file size 128MB, record size 8KB, depth 512, ios per iteration 8
max io_submit 8, buffer alignment set to 4KB
threads 1 files 1 contexts 1 context offset 2MB verification off
Running single thread version 
random read on /dev/vda (201.04 MB/s) 128.00 MB in 0.64s
thread 0 random read totals (200.76 MB/s) 128.00 MB in 0.64s

3)
[root@RHEL-6 ~]# ./aio-stress -c 1 -t 1 -s 128 -r 8 -O -o 3 -d 512 /dev/vda
num_thread 1
adding stage random read
starting with random read
file size 128MB, record size 8KB, depth 512, ios per iteration 8
max io_submit 8, buffer alignment set to 4KB
threads 1 files 1 contexts 1 context offset 2MB verification off
Running single thread version 
random read on /dev/vda (135.31 MB/s) 128.00 MB in 0.95s
thread 0 random read totals (135.19 MB/s) 128.00 MB in 0.95s

4)
[root@RHEL-6 ~]# ./aio-stress -c 1 -t 1 -s 128 -r 8 -O -o 3 -d 512 /dev/vda
num_thread 1
adding stage random read
starting with random read
file size 128MB, record size 8KB, depth 512, ios per iteration 8
max io_submit 8, buffer alignment set to 4KB
threads 1 files 1 contexts 1 context offset 2MB verification off
Running single thread version 
random read on /dev/vda (116.93 MB/s) 128.00 MB in 1.09s
thread 0 random read totals (116.82 MB/s) 128.00 MB in 1.10s

5)
[root@RHEL-6 ~]# ./aio-stress -c 1 -t 1 -s 128 -r 8 -O -o 3 -d 512 /dev/vda
num_thread 1
adding stage random read
starting with random read
file size 128MB, record size 8KB, depth 512, ios per iteration 8
max io_submit 8, buffer alignment set to 4KB
threads 1 files 1 contexts 1 context offset 2MB verification off
Running single thread version 
random read on /dev/vda (130.79 MB/s) 128.00 MB in 0.98s
thread 0 random read totals (130.65 MB/s) 128.00 MB in 0.98s

> 
> Two contradicting results?

Hmm, I need to test in /tmp/ to prevent it.

> 
> Thanks
> Vivek

-- 
Kind regards,
Minchan Kim


Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2011-12-22 Thread Vivek Goyal
On Thu, Dec 22, 2011 at 10:05:38AM +0900, Minchan Kim wrote:

[..]
> > Maybe using deadline or noop in the guest is better to benchmark against
> > PCI-E based flash.
> 
> Good suggestion.
> I tested it with deadline on the guest side.
> 
> The result is not good.
> Although the gap is within noise, Batch BIO's random performance regressed
> compared to CFQ.
> 
>      Request                  Batch BIO
>      (MB/s)   stddev          (MB/s)   stddev
> w    787.030  31.494     w    748.714  68.490
> rw   216.044  29.734     rw   216.977  40.635
> r    771.765  3.327      r    771.107  4.299
> rr   280.096  25.135     rr   258.067  43.916
> 
> I did a small test with only Batch BIO, with deadline and cfq,
> to see the I/O scheduler's effect.
> I think the result is very strange: deadline: 149MB/s, CFQ: 87MB/s.
> Because the Batch BIO patch uses make_request_fn instead of request_fn,
> I think we should not be affected by the I/O scheduler (I mean we issue I/O
> before the I/O scheduler handles it).
> 
> What do you think about it?
> Do I miss something?

This indeed is very strange. In case of bio based drivers, changing IO
scheduler on the queue should not change anything.

Try running blktrace on the vda device and see if you notice something
odd.

Also you seem to be reporting contradicting results for batch bio.

Initially you mention that random IO is regressing with deadline as
compared to CFQ. (It dropped from 325.976 MB/s to 258.067 MB/s).

In this second test you are reporting that CFQ performs badly as
compared to deadline. (149MB/s vs 87MB/s).

Two contradicting results?

Thanks
Vivek


Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2011-12-22 Thread Stefan Hajnoczi
On Wed, Dec 21, 2011 at 1:00 AM, Minchan Kim  wrote:
> This patch is a follow-up to Christoph Hellwig's work
> [RFC: ->make_request support for virtio-blk].
> http://thread.gmane.org/gmane.linux.kernel/1199763
>
> Quote from hch
> "This patchset allows the virtio-blk driver to support much higher IOP
> rates which can be driven out of modern PCI-e flash devices.  At this
> point it really is just a RFC due to various issues."

Basic question to make sure I understood this series: does this patch
bypass the guest I/O scheduler (but then you added custom batching
code into virtio_blk.c)?

If you're stumped by the performance perhaps compare blktraces of the
request approach vs the bio approach.  We're probably performing I/O
more CPU-efficiently but the I/O pattern itself is worse.

Stefan


Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2011-12-21 Thread Minchan Kim
Hi Vivek,

On Wed, Dec 21, 2011 at 02:11:17PM -0500, Vivek Goyal wrote:
> On Wed, Dec 21, 2011 at 10:00:48AM +0900, Minchan Kim wrote:
> > This patch is a follow-up to Christoph Hellwig's work
> > [RFC: ->make_request support for virtio-blk].
> > http://thread.gmane.org/gmane.linux.kernel/1199763
> > 
> > Quote from hch
> > "This patchset allows the virtio-blk driver to support much higher IOP
> > rates which can be driven out of modern PCI-e flash devices.  At this
> > point it really is just a RFC due to various issues."
> > 
> > I fixed a race bug and added batch I/O to enhance sequential I/O, plus
> > FLUSH/FUA emulation.
> > 
> > I tested this patch on a Fusion I/O device with aio-stress.
> > The results are as follows.
> > 
> > Benchmark : aio-stress (64 threads, test file size 512M, 8K per I/O,
> > O_DIRECT write)
> > Environment: 8 socket - 8 core, 2533.372Hz, Fusion IO 320G storage
> > Test repeated 20 times
> > Guest I/O scheduler : CFQ
> > Host I/O scheduler : NOOP
> 
> Maybe using deadline or noop in the guest is better to benchmark against
> PCI-E based flash.

Good suggestion.
I tested it with deadline on the guest side.

The result is not good.
Although the gap is within noise, Batch BIO's random performance regressed
compared to CFQ.

     Request                  Batch BIO
     (MB/s)   stddev          (MB/s)   stddev
w    787.030  31.494     w    748.714  68.490
rw   216.044  29.734     rw   216.977  40.635
r    771.765  3.327      r    771.107  4.299
rr   280.096  25.135     rr   258.067  43.916

I did a small test with only Batch BIO, with deadline and cfq,
to see the I/O scheduler's effect.
I think the result is very strange: deadline: 149MB/s, CFQ: 87MB/s.
Because the Batch BIO patch uses make_request_fn instead of request_fn,
I think we should not be affected by the I/O scheduler (I mean we issue I/O
before the I/O scheduler handles it).

What do you think about it?
Do I miss something?
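
(To illustrate why the guest I/O scheduler should be out of the picture here: a
hedged sketch of how a bio-based driver is wired up, using the block-layer API
of roughly that era with the void-returning make_request_fn. All example_*
names are placeholders, not the actual patch; the point is that bios go
straight from the make_request hook to the driver and never pass through the
elevator that request_fn-based drivers rely on.)

#include <linux/blkdev.h>
#include <linux/bio.h>
#include <linux/gfp.h>

/* Placeholder for the driver's own submission path. */
void example_submit_bio_to_device(void *dev, struct bio *bio);

static void example_make_request(struct request_queue *q, struct bio *bio)
{
	/* No elevator merging or sorting happens here; the I/O scheduler
	 * configured on the queue is simply never consulted. */
	example_submit_bio_to_device(q->queuedata, bio);
}

static struct request_queue *example_init_queue(void *dev)
{
	struct request_queue *q = blk_alloc_queue(GFP_KERNEL);

	if (!q)
		return NULL;

	q->queuedata = dev;
	blk_queue_make_request(q, example_make_request);	/* bio-based */
	return q;
}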


1) deadline
[root@RHEL-6 ~]# ./aio-stress -c 1 -t 1 -s 128 -r 8 -O -o 3 -d 512 /dev/vda
num_thread 1
adding stage random read
starting with random read
file size 128MB, record size 8KB, depth 512, ios per iteration 8
max io_submit 8, buffer alignment set to 4KB
threads 1 files 1 contexts 1 context offset 2MB verification off
Running single thread version 
random read on /dev/vda (149.40 MB/s) 128.00 MB in 0.86s
thread 0 random read totals (149.22 MB/s) 128.00 MB in 0.86s


2) cfq
[root@RHEL-6 ~]# ./aio-stress -c 1 -t 1 -s 128 -r 8 -O -o 3 -d 512 /dev/vda
num_thread 1
adding stage random read
starting with random read
file size 128MB, record size 8KB, depth 512, ios per iteration 8
max io_submit 8, buffer alignment set to 4KB
threads 1 files 1 contexts 1 context offset 2MB verification off
Running single thread version 
random read on /dev/vda (87.21 MB/s) 128.00 MB in 1.47s
thread 0 random read totals (87.15 MB/s) 128.00 MB in 1.47s


> 
> Thanks
> Vivek

-- 
Kind regards,
Minchan Kim


Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2011-12-21 Thread Vivek Goyal
On Wed, Dec 21, 2011 at 10:00:48AM +0900, Minchan Kim wrote:
> This patch is a follow-up to Christoph Hellwig's work
> [RFC: ->make_request support for virtio-blk].
> http://thread.gmane.org/gmane.linux.kernel/1199763
> 
> Quote from hch
> "This patchset allows the virtio-blk driver to support much higher IOP
> rates which can be driven out of modern PCI-e flash devices.  At this
> point it really is just a RFC due to various issues."
> 
> I fixed a race bug and added batch I/O to enhance sequential I/O, plus
> FLUSH/FUA emulation.
> 
> I tested this patch on a Fusion I/O device with aio-stress.
> The results are as follows.
> 
> Benchmark : aio-stress (64 threads, test file size 512M, 8K per I/O,
> O_DIRECT write)
> Environment: 8 socket - 8 core, 2533.372Hz, Fusion IO 320G storage
> Test repeated 20 times
> Guest I/O scheduler : CFQ
> Host I/O scheduler : NOOP

Maybe using deadline or noop in the guest is better to benchmark against
PCI-E based flash.

Thanks
Vivek


Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2011-12-21 Thread Minchan Kim
Hi Sasha!

On Wed, Dec 21, 2011 at 10:28:52AM +0200, Sasha Levin wrote:
> On Wed, 2011-12-21 at 10:00 +0900, Minchan Kim wrote:
> > This patch is a follow-up to Christoph Hellwig's work
> > [RFC: ->make_request support for virtio-blk].
> > http://thread.gmane.org/gmane.linux.kernel/1199763
> > 
> > Quote from hch
> > "This patchset allows the virtio-blk driver to support much higher IOP
> > rates which can be driven out of modern PCI-e flash devices.  At this
> > point it really is just a RFC due to various issues."
> > 
> > I fixed a race bug and added batch I/O to enhance sequential I/O, plus
> > FLUSH/FUA emulation.
> > 
> > I tested this patch on a Fusion I/O device with aio-stress.
> > The results are as follows.
> > 
> > Benchmark : aio-stress (64 threads, test file size 512M, 8K per I/O,
> > O_DIRECT write)
> > Environment: 8 socket - 8 core, 2533.372Hz, Fusion IO 320G storage
> > Test repeated 20 times
> > Guest I/O scheduler : CFQ
> > Host I/O scheduler : NOOP
> > 
> >          Request              BIO (patch 1-4)      BIO-batch (patch 1-6)
> >          (MB/s)   stddev      (MB/s)   stddev      (MB/s)   stddev
> > w        737.820  4.063       613.735  31.605      730.288  24.854
> > rw       208.754  20.450      314.630  37.352      317.831  41.719
> > r        770.974  2.340       347.483  51.370      750.324  8.280
> > rr       250.391  16.910      350.053  29.986      325.976  24.846
> > 
> > This patch enhances random I/O performance compared to the request-based
> > I/O path.
> > It's still an RFC, so any comments and reviews are welcome.
> 
> I did a benchmark against a /dev/shm device instead of actual storage
> to get rid of any artifacts caused by the storage itself, and
> saw that while there was a nice improvement across the board, the hit
> against sequential read and write was quite significant.

Hmm, it seems the sequential bandwidth test is bad but the IOPS test is still good.
I don't know how it is possible that IOPS is better but bandwidth is worse.
Anyway, the sequential write bandwidth drop seems severe, but I think it could be
better if the test generated a lot of lock overhead on vblk->lock, because this
patch started from that lock overhead.

Thanks for the testing, Sasha!

> 
> I ran the tests with fio running in KVM tool against a 2G file located
> in /dev/shm. Here is a summary of the results:
> 
> Before:
> write_iops_seq
>   write: io=1409.8MB, bw=144217KB/s, iops=36054 , runt= 10010msec
> write_bw_seq
>   write: io=7700.0MB, bw=1323.5MB/s, iops=1323 , runt=  5818msec
> read_iops_seq
>   read : io=1453.7MB, bw=148672KB/s, iops=37168 , runt= 10012msec
> read_bw_seq
>   read : io=7700.0MB, bw=1882.7MB/s, iops=1882 , runt=  4090msec
> write_iops_rand
>   write: io=1266.4MB, bw=129479KB/s, iops=32369 , runt= 10015msec
> write_bw_rand
>   write: io=7539.0MB, bw=1106.1MB/s, iops=1106 , runt=  6811msec
> read_iops_rand
>   read : io=1373.3MB, bw=140475KB/s, iops=35118 , runt= 10010msec
> read_bw_rand
>   read : io=7539.0MB, bw=1314.4MB/s, iops=1314 , runt=  5736msec
> readwrite_iops_seq
>   read : io=726172KB, bw=72292KB/s, iops=18072 , runt= 10045msec
>   write: io=726460KB, bw=72321KB/s, iops=18080 , runt= 10045msec
> readwrite_bw_seq
>   read : io=3856.0MB, bw=779574KB/s, iops=761 , runt=  5065msec
>   write: io=3844.0MB, bw=777148KB/s, iops=758 , runt=  5065msec
> readwrite_iops_rand
>   read : io=701780KB, bw=70094KB/s, iops=17523 , runt= 10012msec
>   write: io=706120KB, bw=70527KB/s, iops=17631 , runt= 10012msec
> readwrite_bw_rand
>   read : io=3705.0MB, bw=601446KB/s, iops=587 , runt=  6308msec
>   write: io=3834.0MB, bw=622387KB/s, iops=607 , runt=  6308msec
> 
> After:
> write_iops_seq
>   write: io=1591.4MB, bw=162626KB/s, iops=40656 , runt= 10020msec
> write_bw_seq
>   write: io=7700.0MB, bw=1276.4MB/s, iops=1276 , runt=  6033msec
> read_iops_seq
>   read : io=1615.7MB, bw=164680KB/s, iops=41170 , runt= 10046msec
> read_bw_seq
>   read : io=7700.0MB, bw=1407.1MB/s, iops=1407 , runt=  5469msec
> write_iops_rand
>   write: io=1243.1MB, bw=126304KB/s, iops=31575 , runt= 10085msec
> write_bw_rand
>   write: io=7539.0MB, bw=1206.3MB/s, iops=1206 , runt=  6250msec
> read_iops_rand
>   read : io=1533.1MB, bw=156795KB/s, iops=39198 , runt= 10018msec
> read_bw_rand
>   read : io=7539.0MB, bw=1413.7MB/s, iops=1413 , runt=  5333msec
> readwrite_iops_seq
>   read : io=819124KB, bw=81790KB/s, iops=20447 , runt= 10015msec
>   write: io=823136KB, bw=82190KB/s, iops=20547 , runt= 10015msec
> readwrite_bw_seq
>   read : io=3913.0MB, bw=704946KB/s, iops=688 , runt=  5684msec
>   write: io=3787.0MB, bw=682246KB/s, iops=666 , runt=  5684msec
> readwrite_iops_rand
>   read : io=802148KB, bw=80159KB/s, iops=20039 , runt= 10007msec
>   write: io=801192KB, bw=80063KB/s, iops=20015 , runt= 10007msec
> readwrite_bw_rand
>   read : io=3731.0MB, bw=677762KB/s, iops=661 , runt=  5637msec
>   write: io=3808.0MB, bw=691750KB/s, iops=675 , runt=  5637msec
> 
> -- 
> 
> Sasha.
> 
> 

-- 
Kind regards,
Minchan Kim

Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2011-12-20 Thread Sasha Levin
On Wed, 2011-12-21 at 10:00 +0900, Minchan Kim wrote:
> This patch is a follow-up to Christoph Hellwig's work
> [RFC: ->make_request support for virtio-blk].
> http://thread.gmane.org/gmane.linux.kernel/1199763
> 
> Quote from hch
> "This patchset allows the virtio-blk driver to support much higher IOP
> rates which can be driven out of modern PCI-e flash devices.  At this
> point it really is just a RFC due to various issues."
> 
> I fixed a race bug and added batch I/O to enhance sequential I/O, plus
> FLUSH/FUA emulation.
> 
> I tested this patch on a Fusion I/O device with aio-stress.
> The results are as follows.
> 
> Benchmark : aio-stress (64 threads, test file size 512M, 8K per I/O,
> O_DIRECT write)
> Environment: 8 socket - 8 core, 2533.372Hz, Fusion IO 320G storage
> Test repeated 20 times
> Guest I/O scheduler : CFQ
> Host I/O scheduler : NOOP
> 
>          Request              BIO (patch 1-4)      BIO-batch (patch 1-6)
>          (MB/s)   stddev      (MB/s)   stddev      (MB/s)   stddev
> w        737.820  4.063       613.735  31.605      730.288  24.854
> rw       208.754  20.450      314.630  37.352      317.831  41.719
> r        770.974  2.340       347.483  51.370      750.324  8.280
> rr       250.391  16.910      350.053  29.986      325.976  24.846
> 
> This patch enhances random I/O performance compared to the request-based I/O path.
> It's still an RFC, so any comments and reviews are welcome.

I did a benchmark against a /dev/shm device instead of actual storage
to get rid of any artifacts caused by the storage itself, and
saw that while there was a nice improvement across the board, the hit
against sequential read and write was quite significant.

I ran the tests with fio running in KVM tool against a 2G file located
in /dev/shm. Here is a summary of the results:

Before:
write_iops_seq
  write: io=1409.8MB, bw=144217KB/s, iops=36054 , runt= 10010msec
write_bw_seq
  write: io=7700.0MB, bw=1323.5MB/s, iops=1323 , runt=  5818msec
read_iops_seq
  read : io=1453.7MB, bw=148672KB/s, iops=37168 , runt= 10012msec
read_bw_seq
  read : io=7700.0MB, bw=1882.7MB/s, iops=1882 , runt=  4090msec
write_iops_rand
  write: io=1266.4MB, bw=129479KB/s, iops=32369 , runt= 10015msec
write_bw_rand
  write: io=7539.0MB, bw=1106.1MB/s, iops=1106 , runt=  6811msec
read_iops_rand
  read : io=1373.3MB, bw=140475KB/s, iops=35118 , runt= 10010msec
read_bw_rand
  read : io=7539.0MB, bw=1314.4MB/s, iops=1314 , runt=  5736msec
readwrite_iops_seq
  read : io=726172KB, bw=72292KB/s, iops=18072 , runt= 10045msec
  write: io=726460KB, bw=72321KB/s, iops=18080 , runt= 10045msec
readwrite_bw_seq
  read : io=3856.0MB, bw=779574KB/s, iops=761 , runt=  5065msec
  write: io=3844.0MB, bw=777148KB/s, iops=758 , runt=  5065msec
readwrite_iops_rand
  read : io=701780KB, bw=70094KB/s, iops=17523 , runt= 10012msec
  write: io=706120KB, bw=70527KB/s, iops=17631 , runt= 10012msec
readwrite_bw_rand
  read : io=3705.0MB, bw=601446KB/s, iops=587 , runt=  6308msec
  write: io=3834.0MB, bw=622387KB/s, iops=607 , runt=  6308msec

After:
write_iops_seq
  write: io=1591.4MB, bw=162626KB/s, iops=40656 , runt= 10020msec
write_bw_seq
  write: io=7700.0MB, bw=1276.4MB/s, iops=1276 , runt=  6033msec
read_iops_seq
  read : io=1615.7MB, bw=164680KB/s, iops=41170 , runt= 10046msec
read_bw_seq
  read : io=7700.0MB, bw=1407.1MB/s, iops=1407 , runt=  5469msec
write_iops_rand
  write: io=1243.1MB, bw=126304KB/s, iops=31575 , runt= 10085msec
write_bw_rand
  write: io=7539.0MB, bw=1206.3MB/s, iops=1206 , runt=  6250msec
read_iops_rand
  read : io=1533.1MB, bw=156795KB/s, iops=39198 , runt= 10018msec
read_bw_rand
  read : io=7539.0MB, bw=1413.7MB/s, iops=1413 , runt=  5333msec
readwrite_iops_seq
  read : io=819124KB, bw=81790KB/s, iops=20447 , runt= 10015msec
  write: io=823136KB, bw=82190KB/s, iops=20547 , runt= 10015msec
readwrite_bw_seq
  read : io=3913.0MB, bw=704946KB/s, iops=688 , runt=  5684msec
  write: io=3787.0MB, bw=682246KB/s, iops=666 , runt=  5684msec
readwrite_iops_rand
  read : io=802148KB, bw=80159KB/s, iops=20039 , runt= 10007msec
  write: io=801192KB, bw=80063KB/s, iops=20015 , runt= 10007msec
readwrite_bw_rand
  read : io=3731.0MB, bw=677762KB/s, iops=661 , runt=  5637msec
  write: io=3808.0MB, bw=691750KB/s, iops=675 , runt=  5637msec

-- 

Sasha.




Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2011-12-20 Thread Minchan Kim
Hi Rusty,

On Wed, Dec 21, 2011 at 03:38:03PM +1030, Rusty Russell wrote:
> On Wed, 21 Dec 2011 10:00:48 +0900, Minchan Kim  wrote:
> > This patch is a follow-up to Christoph Hellwig's work
> > [RFC: ->make_request support for virtio-blk].
> > http://thread.gmane.org/gmane.linux.kernel/1199763
> > 
> > Quote from hch
> > "This patchset allows the virtio-blk driver to support much higher IOP
> > rates which can be driven out of modern PCI-e flash devices.  At this
> > point it really is just a RFC due to various issues."
> > 
> > I fixed a race bug and added batch I/O to enhance sequential I/O, plus
> > FLUSH/FUA emulation.
> > 
> > I tested this patch on a Fusion I/O device with aio-stress.
> > The results are as follows.
> > 
> > Benchmark : aio-stress (64 threads, test file size 512M, 8K per I/O,
> > O_DIRECT write)
> > Environment: 8 socket - 8 core, 2533.372Hz, Fusion IO 320G storage
> > Test repeated 20 times
> > Guest I/O scheduler : CFQ
> > Host I/O scheduler : NOOP
> > 
> >          Request              BIO (patch 1-4)      BIO-batch (patch 1-6)
> >          (MB/s)   stddev      (MB/s)   stddev      (MB/s)   stddev
> > w        737.820  4.063       613.735  31.605      730.288  24.854
> > rw       208.754  20.450      314.630  37.352      317.831  41.719
> > r        770.974  2.340       347.483  51.370      750.324  8.280
> > rr       250.391  16.910      350.053  29.986      325.976  24.846
> 
> So, you dropped w and r down 2%, but rw and rr up 40%.
> 
> If I knew what the various rows were, I'd have something intelligent to
> say, I'm sure :)

Sorry for missing that.

w: Sequential Write
rw: Random Write
r: Sequential Read
rr: Random Read

> 
> I can find the source to aio-stress, but no obvious clues.
> 
> Help!
> Rusty.

-- 
Kind regards,
Minchan Kim


Re: [PATCH 0/6][RFC] virtio-blk: Change I/O path from request to BIO

2011-12-20 Thread Rusty Russell
On Wed, 21 Dec 2011 10:00:48 +0900, Minchan Kim  wrote:
> This patch is a follow-up to Christoph Hellwig's work
> [RFC: ->make_request support for virtio-blk].
> http://thread.gmane.org/gmane.linux.kernel/1199763
> 
> Quote from hch
> "This patchset allows the virtio-blk driver to support much higher IOP
> rates which can be driven out of modern PCI-e flash devices.  At this
> point it really is just a RFC due to various issues."
> 
> I fixed a race bug and added batch I/O to enhance sequential I/O, plus
> FLUSH/FUA emulation.
> 
> I tested this patch on a Fusion I/O device with aio-stress.
> The results are as follows.
> 
> Benchmark : aio-stress (64 threads, test file size 512M, 8K per I/O,
> O_DIRECT write)
> Environment: 8 socket - 8 core, 2533.372Hz, Fusion IO 320G storage
> Test repeated 20 times
> Guest I/O scheduler : CFQ
> Host I/O scheduler : NOOP
> 
>          Request              BIO (patch 1-4)      BIO-batch (patch 1-6)
>          (MB/s)   stddev      (MB/s)   stddev      (MB/s)   stddev
> w        737.820  4.063       613.735  31.605      730.288  24.854
> rw       208.754  20.450      314.630  37.352      317.831  41.719
> r        770.974  2.340       347.483  51.370      750.324  8.280
> rr       250.391  16.910      350.053  29.986      325.976  24.846

So, you dropped w and r down 2%, but rw and rr up 40%.

If I knew what the various rows were, I'd have something intelligent to
say, I'm sure :)

I can find the source to aio-stress, but no obvious clues.

Help!
Rusty.