Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Wed, Feb 28 2007, Ingo Molnar wrote:
> * Jens Axboe <[EMAIL PROTECTED]> wrote:
>
> >  Engine      Depth   Batch   Bw (KiB/sec)
> >  libaio        2       8       21,125
> >  syslet        2       8       19,610
>
> i'd like to do something more about this to be more in line with libaio - if nothing else then for the bragging rights ;-) It seems to me that a drop of ~7% in throughput cannot be explained with any CPU overhead, it must be some sort of queueing + IO scheduling effect - right?

syslet shows a slightly higher overhead, but nothing that will account for any bandwidth change in this test. The box is obviously mostly idle when running this test; it's not very CPU consuming.

The IO pattern issued is not the same, since libaio would commit IO [0..7], then [8..15] and so on, where syslet would expose [0,8,16,24,32,40,48,56] and then [1,9,17,25,33,41,49,57] etc. If iodepth_batch is set to 1 you'd get a closer match wrt the IO pattern, but at a higher cost (more system calls, and 8 times as many pending async threads). That gets it to 20,253KiB/s here, with ~1000x as many context switches.

So in short, it's harder to compare with real storage, as access patterns don't translate very easily.

--
Jens Axboe
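To make the two submission orders above concrete, the toy C program below simply prints the block ordering each engine would hand to the IO scheduler under the behaviour Jens describes. It is only an illustration: the 64-block extent and the batch/stride of 8 are assumptions chosen to match his example, not anything measured in the thread.

#include <stdio.h>

#define NR_BLOCKS 64  /* assumed extent, matching the [0..63] example above */
#define BATCH      8  /* assumed batch size / number of pending async threads */

int main(void)
{
	int i, j;

	/* libaio-style: commit consecutive batches of 8 blocks at a time */
	printf("libaio: ");
	for (i = 0; i < NR_BLOCKS; i += BATCH)
		printf("[%d..%d] ", i, i + BATCH - 1);
	printf("\n");

	/*
	 * syslet-style: 8 pending async contexts, each striding through the
	 * range, so the scheduler sees [0,8,16,...], then [1,9,17,...], etc.
	 */
	printf("syslet: ");
	for (i = 0; i < BATCH; i++) {
		printf("[");
		for (j = i; j < NR_BLOCKS; j += BATCH)
			printf("%d%s", j, j + BATCH < NR_BLOCKS ? "," : "]");
		printf(" ");
	}
	printf("\n");
	return 0;
}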
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
* Jens Axboe <[EMAIL PROTECTED]> wrote:

>  Engine      Depth   Batch   Bw (KiB/sec)
>  libaio        2       8       21,125
>  syslet        2       8       19,610

i'd like to do something more about this to be more in line with libaio - if nothing else then for the bragging rights ;-) It seems to me that a drop of ~7% in throughput cannot be explained with any CPU overhead, it must be some sort of queueing + IO scheduling effect - right?

	Ingo
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Tue, Feb 27 2007, Suparna Bhattacharya wrote:
> > It's not bad for such a high depth/batch setting, but I still wonder why our results are so different. I'll look around for an x86 box with some TCQ/NCQ enabled storage attached for testing. Can you pass me your command line or job file (whatever you use) so we are on the same page?
>
> Sure - I used variations of the following job file (e.g. engine=syslet-rw, iodepth=2).
>
> Also the io scheduler on my system is set to Anticipatory by default. FWIW it is a 4 way SMP (PIII, 700MHz)
>
> ; aio-stress -l -O -o3 <1GB file>
> [global]
> ioengine=libaio
> buffered=0
> rw=randread
> bs=64k
> size=1024m
> directory=/kdump/suparna
>
> [testfile2]
> iodepth=64
> iodepth_batch=8

Ok, now that I can run this on more than x86, I gave it a spin on a box with a little more potent storage. This is a Core 2 quad, the disks are a 7200rpm SATA drive (with NCQ) and a 15krpm SCSI disk. The IO scheduler is deadline.

SATA disk:

Engine      Depth   Batch   Bw (KiB/sec)
libaio       64       8       17,486
syslet       64       8       17,357
libaio        2       8       17,625
syslet        2       8       16,526
sync          1       1        7,529

SCSI disk:

Engine      Depth   Batch   Bw (KiB/sec)
libaio       64       8       20,723
syslet       64       8       20,742
libaio        2       8       21,125
syslet        2       8       19,610
sync          1       1       16,659

> > > Engine      Depth   Batch     Bw (KiB/sec)
> > > libaio       64     default     17,429
> > > syslet       64     default     16,155
> > > libaio        2     default     15,494
> > > syslet        2     default      7,971
> >
> > If iodepth_batch isn't set, the syslet queued io will be serialized and
>
> I see, so then this particular setting is not very meaningful

Not if you want to take advantage of hw queuing, as in this random workload. fio being a test tool, it's important to be able to control as many aspects of what happens as possible. That means you can also do things that you do not want to do in real life; having a pending list of 2 serialized requests is indeed one of them. It also means you pretty much have to know what you are doing when testing little details like this.

--
Jens Axboe
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Tue, Feb 27 2007, Evgeniy Polyakov wrote:
> On Tue, Feb 27, 2007 at 07:45:41PM +0100, Jens Axboe ([EMAIL PROTECTED]) wrote:
> > > Deadline shows this:
> > >
> > > sync:
> > > READ: io=1,024MiB, aggrb=38,212KiB/s, minb=38,212KiB/s, maxb=38,212KiB/s, mint=28099msec, maxt=28099msec
> > >
> > > libaio:
> > > READ: io=1,024MiB, aggrb=37,933KiB/s, minb=37,933KiB/s, maxb=37,933KiB/s, mint=28306msec, maxt=28306msec
> > >
> > > syslet-rw:
> > > READ: io=1,024MiB, aggrb=34,759KiB/s, minb=34,759KiB/s, maxb=34,759KiB/s, mint=30891msec, maxt=30891msec
> > >
> > > There were about 10k async schedulings.
> >
> > I think the issue here is pretty simple - when fio gets a queue-full condition (it reaches the depth you set, 32), it commits the queued IO and starts queuing again. Since that'll likely block, it'll get issued by another process. So you suddenly have a nice sequence of reads from one process (pending, only one is actually committed since it's serialized), and then a read further down the line that goes behind those you already committed. The result is seeky, where it should have been sequential.
> >
> > Do you get expected results if you set iodepth_low=1? That'll make fio drain the queue before building it up again, which should get you a sequential access pattern with syslets.
>
> With such a change results should be better - not only because the seek is removed with a sequential read, but also the number of working threads decreases with time - until the queue is filled again.

Yep, although it probably doesn't matter for such a low bandwidth test anyway.

> So, syslet-rw has increased to 37mb/sec out of 39/sync and 38/libaio, the latter two did not change.

I wonder why all three aren't doing 39mb/sec flat here, it's a pretty trivial case...

> With iodepth of 10k, I get the same performance for libaio and syslets - about 36mb/sec, it does not depend on iodepth_low being set to 1 or default (full).

Yep, the larger the iodepth, the less costly a seek on new queue buildup gets. So that is as expected.

> So syslets have small problems with a small iodepth - performance is about 34mb/sec and then increases to 36 as iodepth grows, while libaio decreases from 38 down to 36 mb/sec.

Using your job file and fio HEAD (which forces iodepth_low=1 for syslet if iodepth_low isn't specified), I get:

Engine      Depth   Bw (KiB/sec)
--------------------------------
syslet        1       37163
syslet       32       37197
syslet        1       36577
libaio        1       37144
libaio       32       37159
libaio        1       36463
sync          1       37154

Results are highly stable. Note that this test case isn't totally fair, since libaio isn't really async when you do buffered file IO.

--
Jens Axboe
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Tue, Feb 27, 2007 at 07:45:41PM +0100, Jens Axboe ([EMAIL PROTECTED]) wrote:
> > Deadline shows this:
> >
> > sync:
> > READ: io=1,024MiB, aggrb=38,212KiB/s, minb=38,212KiB/s, maxb=38,212KiB/s, mint=28099msec, maxt=28099msec
> >
> > libaio:
> > READ: io=1,024MiB, aggrb=37,933KiB/s, minb=37,933KiB/s, maxb=37,933KiB/s, mint=28306msec, maxt=28306msec
> >
> > syslet-rw:
> > READ: io=1,024MiB, aggrb=34,759KiB/s, minb=34,759KiB/s, maxb=34,759KiB/s, mint=30891msec, maxt=30891msec
> >
> > There were about 10k async schedulings.
>
> I think the issue here is pretty simple - when fio gets a queue-full condition (it reaches the depth you set, 32), it commits the queued IO and starts queuing again. Since that'll likely block, it'll get issued by another process. So you suddenly have a nice sequence of reads from one process (pending, only one is actually committed since it's serialized), and then a read further down the line that goes behind those you already committed. The result is seeky, where it should have been sequential.
>
> Do you get expected results if you set iodepth_low=1? That'll make fio drain the queue before building it up again, which should get you a sequential access pattern with syslets.

With such a change results should be better - not only because the seek is removed with a sequential read, but also the number of working threads decreases with time - until the queue is filled again.

So, syslet-rw has increased to 37mb/sec out of 39/sync and 38/libaio, the latter two did not change.

With iodepth of 10k, I get the same performance for libaio and syslets - about 36mb/sec, it does not depend on iodepth_low being set to 1 or default (full).

So syslets have small problems with a small iodepth - performance is about 34mb/sec and then increases to 36 as iodepth grows, while libaio decreases from 38 down to 36 mb/sec.

iodepth_low=1 helps syslets to reach 37mb/sec with iodepth=32; with 3200 and 10k it does not play any role.

> --
> Jens Axboe

--
	Evgeniy Polyakov
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Tue, Feb 27 2007, Avi Kivity wrote:
> Ingo Molnar wrote:
> > * Avi Kivity <[EMAIL PROTECTED]> wrote:
> >
> >> But what about cpu usage? At these low levels, the cpu is probably underutilized. It would be interesting to measure cpu time per I/O request (or, alternatively, use an I/O subsystem that can saturate the processors).
> >
> > yeah - that's what testing on ramdisk (Jens') or on a loopback block device (mine) approximates to a certain degree.
>
> Ramdisks or fully cached loopback return immediately, so cache thrashing effects don't show up.
>
> Maybe a device mapper delay target or nbd + O_DIRECT can insert delays to make the workload more disk-like.

Take a look at scsi-debug, it can do at least some of that.

--
Jens Axboe
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Tue, Feb 27 2007, Evgeniy Polyakov wrote:
> On Tue, Feb 27, 2007 at 12:29:08PM +0100, Jens Axboe ([EMAIL PROTECTED]) wrote:
> > On Tue, Feb 27 2007, Evgeniy Polyakov wrote:
> > > My two coins:
> > > # cat job
> > > [global]
> > > bs=8k
> > > size=1g
> > > direct=0
> > > ioengine=sync
> > > iodepth=32
> > > rw=read
> > >
> > > [file]
> > > filename=/home/user/test
> > >
> > > sync:
> > > READ: io=1,024MiB, aggrb=39,329KiB/s, minb=39,329KiB/s, maxb=39,329KiB/s, mint=27301msec, maxt=27301msec
> > >
> > > libaio:
> > > READ: io=1,024MiB, aggrb=39,435KiB/s, minb=39,435KiB/s, maxb=39,435KiB/s, mint=27228msec, maxt=27228msec
> > >
> > > syslet-rw:
> > > READ: io=1,024MiB, aggrb=29,567KiB/s, minb=29,567KiB/s, maxb=29,567KiB/s, mint=36315msec, maxt=36315msec
> > >
> > > During the syslet-rw test about 9500 async schedulings happened.
> > > I use fio-git-20070226150114.tar.gz
> >
> > That looks pretty pathetic :-). What IO scheduler did you use? syslets will confuse CFQ currently, so you want to compare with using eg deadline or as. That is one of the downsides of this approach.
>
> Deadline shows this:
>
> sync:
> READ: io=1,024MiB, aggrb=38,212KiB/s, minb=38,212KiB/s, maxb=38,212KiB/s, mint=28099msec, maxt=28099msec
>
> libaio:
> READ: io=1,024MiB, aggrb=37,933KiB/s, minb=37,933KiB/s, maxb=37,933KiB/s, mint=28306msec, maxt=28306msec
>
> syslet-rw:
> READ: io=1,024MiB, aggrb=34,759KiB/s, minb=34,759KiB/s, maxb=34,759KiB/s, mint=30891msec, maxt=30891msec
>
> There were about 10k async schedulings.

I think the issue here is pretty simple - when fio gets a queue-full condition (it reaches the depth you set, 32), it commits the queued IO and starts queuing again. Since that'll likely block, it'll get issued by another process. So you suddenly have a nice sequence of reads from one process (pending, only one is actually committed since it's serialized), and then a read further down the line that goes behind those you already committed. The result is seeky, where it should have been sequential.

Do you get expected results if you set iodepth_low=1? That'll make fio drain the queue before building it up again, which should get you a sequential access pattern with syslets.

--
Jens Axboe
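For readers not familiar with the fio options under discussion, the fill-to-depth-then-drain behaviour can be pictured as a plain submit/reap loop. The sketch below is not fio's code - it is a minimal illustration using libaio, with the depth of 32, block size of 8k and file path taken from Evgeniy's job file and everything else (number of passes, lack of O_DIRECT) invented for the example. Draining the queue completely before refilling is roughly what iodepth_low=1 asks fio to do.

#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DEPTH  32         /* iodepth=32 from the job file above */
#define BS     (8 * 1024) /* bs=8k from the job file above */
#define WAVES  16         /* arbitrary: how many full queues to push */

int main(int argc, char **argv)
{
	io_context_t ctx;
	struct iocb iocbs[DEPTH], *iocbp[DEPTH];
	struct io_event events[DEPTH];
	void *bufs[DEPTH];
	long long off = 0;
	int fd, i, w;

	memset(&ctx, 0, sizeof(ctx));
	/* the default path is only an example, matching the job file */
	fd = open(argc > 1 ? argv[1] : "/home/user/test", O_RDONLY);
	if (fd < 0 || io_setup(DEPTH, &ctx)) {
		perror("setup");
		return 1;
	}
	for (i = 0; i < DEPTH; i++)
		if (!(bufs[i] = malloc(BS)))
			return 1;

	for (w = 0; w < WAVES; w++) {
		/* fill the queue up to the full iodepth... */
		for (i = 0; i < DEPTH; i++) {
			io_prep_pread(&iocbs[i], fd, bufs[i], BS, off);
			off += BS;
			iocbp[i] = &iocbs[i];
		}
		if (io_submit(ctx, DEPTH, iocbp) != DEPTH)
			return 1;
		/*
		 * ...then drain it completely before building it up again,
		 * roughly what iodepth_low=1 means: the next wave starts
		 * from an empty queue instead of trickling in behind
		 * requests that were already committed.
		 */
		if (io_getevents(ctx, DEPTH, DEPTH, events, NULL) != DEPTH)
			return 1;
	}
	io_destroy(ctx);
	close(fd);
	return 0;
}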
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
Ingo Molnar wrote:
> > Maybe a device mapper delay target or nbd + O_DIRECT can insert delays to make the workload more disk-like.
>
> yeah. I'll hack a small timeout into loopback requests i think. But then real disk-platter effects are left out ... so it all comes down to eventually having to try it on real disks too :)

Having a random component in the timeout may increase realism.

Hundred-disk boxes are noisy, though the blinkenlights are nice :)

--
error compiling committee.c: too many arguments to function
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
* Avi Kivity <[EMAIL PROTECTED]> wrote:

> > yeah - that's what testing on ramdisk (Jens') or on a loopback block device (mine) approximates to a certain degree.
>
> Ramdisks or fully cached loopback return immediately, so cache thrashing effects don't show up.

even fully cached loopback schedules the loopback kernel thread - but i agree that it's inaccurate: hence the 'approximates to a certain degree'.

> Maybe a device mapper delay target or nbd + O_DIRECT can insert delays to make the workload more disk-like.

yeah. I'll hack a small timeout into loopback requests i think. But then real disk-platter effects are left out ... so it all comes down to eventually having to try it on real disks too :)

	Ingo
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
Ingo Molnar wrote:
> * Avi Kivity <[EMAIL PROTECTED]> wrote:
>
> > But what about cpu usage? At these low levels, the cpu is probably underutilized. It would be interesting to measure cpu time per I/O request (or, alternatively, use an I/O subsystem that can saturate the processors).
>
> yeah - that's what testing on ramdisk (Jens') or on a loopback block device (mine) approximates to a certain degree.

Ramdisks or fully cached loopback return immediately, so cache thrashing effects don't show up.

Maybe a device mapper delay target or nbd + O_DIRECT can insert delays to make the workload more disk-like.

--
error compiling committee.c: too many arguments to function
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
* Avi Kivity <[EMAIL PROTECTED]> wrote:

> But what about cpu usage? At these low levels, the cpu is probably underutilized. It would be interesting to measure cpu time per I/O request (or, alternatively, use an I/O subsystem that can saturate the processors).

yeah - that's what testing on ramdisk (Jens') or on a loopback block device (mine) approximates to a certain degree.

	Ingo
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Mon, February 26, 2007 15:45, Jens Axboe wrote:
> Test case is doing random reads from /dev/sdb, in chunks of 64kb:
>
> Engine      Depth   Processes   Bw (KiB/sec)
> libaio       200       100         2813
> syslet       200       100         3944
> libaio         2         1         2793
> syslet         2         1         3854
> sync (*)       2         1         2866

If you have time, could you please add Threadlet results? Considering how awful the syslet API is, and how nice the threadlet one, it would be great if it turned out that Syslets aren't worth it. If they are, a better API/ABI should be thought of.

Greetings,

Indan
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
Suparna Bhattacharya wrote:
> I tried the latest fio code with syslet v4, and my results are a little different - have yet to figure out why or what to make of it. I hope I have all the right pieces now.
>
> This is an ext2 filesystem, SCSI AIC7xxx.
>
> I used an iodepth_batch size of 8 to limit the number of ios in a single io_submit (thanks for adding that parameter to fio !), like we did in aio-stress.
>
> Engine      Depth   Batch   Bw (KiB/sec)
> libaio       64       8       17,226
> syslet       64       8       17,620
> libaio        2       8       18,552
> syslet        2       8       14,935
>
> Which is not bad, actually.
>
> If I do not specify the iodepth_batch (i.e. default to depth), then the difference becomes more pronounced at higher depths. However, I doubt whether anyone would be using such high batch sizes in practice ...
>
> Engine      Depth   Batch     Bw (KiB/sec)
> libaio       64     default     17,429
> syslet       64     default     16,155
> libaio        2     default     15,494
> syslet        2     default      7,971

But what about cpu usage? At these low levels, the cpu is probably underutilized. It would be interesting to measure cpu time per I/O request (or, alternatively, use an I/O subsystem that can saturate the processors).

--
error compiling committee.c: too many arguments to function
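The per-request CPU cost Avi is asking about can be approximated from userspace by sampling process CPU time around the I/O loop and dividing by the number of requests. The fragment below is only a sketch of that idea; reading /dev/zero merely stands in for whatever engine is actually under test, and the request count is arbitrary.

#include <fcntl.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

/* user + system CPU time consumed by this process so far, in seconds */
static double cpu_seconds(void)
{
	struct rusage ru;

	getrusage(RUSAGE_SELF, &ru);
	return ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6 +
	       ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
}

int main(void)
{
	const unsigned long nr = 100000;  /* arbitrary request count */
	char buf[4096];
	double before, after;
	unsigned long i;
	int fd;

	/* /dev/zero is only a stand-in for the real I/O path being measured */
	fd = open("/dev/zero", O_RDONLY);
	if (fd < 0)
		return 1;

	before = cpu_seconds();
	for (i = 0; i < nr; i++)
		if (read(fd, buf, sizeof(buf)) < 0)
			return 1;
	after = cpu_seconds();

	printf("%.2f usec of CPU per request\n", (after - before) * 1e6 / nr);
	close(fd);
	return 0;
}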
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Tue, Feb 27, 2007 at 10:42:11AM +0100, Jens Axboe wrote:
> On Tue, Feb 27 2007, Suparna Bhattacharya wrote:
> > On Mon, Feb 26, 2007 at 03:45:48PM +0100, Jens Axboe wrote:
> > > On Mon, Feb 26 2007, Suparna Bhattacharya wrote:
> > > > On Mon, Feb 26, 2007 at 02:57:36PM +0100, Jens Axboe wrote:
> > > > >
> > > > > Some more results, using a larger number of processes and io depths. A repeat of the tests from friday, with added depth 2 for syslet and libaio:
> > > > >
> > > > > Engine      Depth   Processes   Bw (MiB/sec)
> > > > > libaio        1         1          602
> > > > > syslet        1         1          759
> > > > > sync          1         1          776
> > > > > libaio       32         1          832
> > > > > syslet       32         1          898
> > > > > libaio        2         1          581
> > > > > syslet        2         1          609
> > > > >
> > > > > syslet still on top. Measuring O_DIRECT reads (of 4kb size) on ramfs with 100 processes each with a depth of 200, reading a per-process private file of 10mb (need to fit in my ram...) 10 times each. IOW, doing 10,000MiB of IO in total:
> > > >
> > > > But, why ramfs ? Don't we want to exercise the case where O_DIRECT actually blocks ? Or am I missing something here ?
> > >
> > > Just overhead numbers for that test case, lets try something like your described job.
> > >
> > > Test case is doing random reads from /dev/sdb, in chunks of 64kb:
> > >
> > > Engine      Depth   Processes   Bw (KiB/sec)
> > > libaio       200       100         2813
> > > syslet       200       100         3944
> > > libaio         2         1         2793
> > > syslet         2         1         3854
> > > sync (*)       2         1         2866
> > >
> > > deadline was used for IO scheduling, to minimize impact. Not sure why syslet actually does so much better here, looking at vmstat the rate is steady and all runs are basically 50/50 idle/wait. One difference is that the submission itself takes a long time on libaio, since the io_submit() will block on request allocation. The generated IO pattern from each process is the same for all runs. The drive is a lousy sata that doesn't even do queuing, FWIW.
> >
> > I tried the latest fio code with syslet v4, and my results are a little different - have yet to figure out why or what to make of it. I hope I have all the right pieces now.
> >
> > This is an ext2 filesystem, SCSI AIC7xxx.
> >
> > I used an iodepth_batch size of 8 to limit the number of ios in a single io_submit (thanks for adding that parameter to fio !), like we did in aio-stress.
> >
> > Engine      Depth   Batch   Bw (KiB/sec)
> > libaio       64       8       17,226
> > syslet       64       8       17,620
> > libaio        2       8       18,552
> > syslet        2       8       14,935
> >
> > Which is not bad, actually.
>
> It's not bad for such a high depth/batch setting, but I still wonder why our results are so different. I'll look around for an x86 box with some TCQ/NCQ enabled storage attached for testing. Can you pass me your command line or job file (whatever you use) so we are on the same page?

Sure - I used variations of the following job file (e.g. engine=syslet-rw, iodepth=2).

Also the io scheduler on my system is set to Anticipatory by default. FWIW it is a 4 way SMP (PIII, 700MHz)

; aio-stress -l -O -o3 <1GB file>
[global]
ioengine=libaio
buffered=0
rw=randread
bs=64k
size=1024m
directory=/kdump/suparna

[testfile2]
iodepth=64
iodepth_batch=8

> > If I do not specify the iodepth_batch (i.e. default to depth), then the difference becomes more pronounced at higher depths. However, I doubt whether anyone would be using such high batch sizes in practice ...
> >
> > Engine      Depth   Batch     Bw (KiB/sec)
> > libaio       64     default     17,429
> > syslet       64     default     16,155
> > libaio        2     default     15,494
> > syslet        2     default      7,971
>
> If iodepth_batch isn't set, the syslet queued io will be serialized and

I see, so then this particular setting is not very meaningful

> not take advantage of queueing. How does the job file perform with ioengine=sync?

Just tried it now : 9,027KiB/s

> > Often times it is the application tuning that makes all the difference, so am not really sure how much to read into these results. That's always been the hard part of async io ...
>
> Yes I agree, it's handy to get an overview though.

True, at least some of this helps us gain a little more understanding about the boundaries and how to tune it to be most effective.

Regards
Suparna

--
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Tue, Feb 27, 2007 at 12:29:08PM +0100, Jens Axboe ([EMAIL PROTECTED]) wrote:
> On Tue, Feb 27 2007, Evgeniy Polyakov wrote:
> > My two coins:
> > # cat job
> > [global]
> > bs=8k
> > size=1g
> > direct=0
> > ioengine=sync
> > iodepth=32
> > rw=read
> >
> > [file]
> > filename=/home/user/test
> >
> > sync:
> > READ: io=1,024MiB, aggrb=39,329KiB/s, minb=39,329KiB/s, maxb=39,329KiB/s, mint=27301msec, maxt=27301msec
> >
> > libaio:
> > READ: io=1,024MiB, aggrb=39,435KiB/s, minb=39,435KiB/s, maxb=39,435KiB/s, mint=27228msec, maxt=27228msec
> >
> > syslet-rw:
> > READ: io=1,024MiB, aggrb=29,567KiB/s, minb=29,567KiB/s, maxb=29,567KiB/s, mint=36315msec, maxt=36315msec
> >
> > During the syslet-rw test about 9500 async schedulings happened.
> > I use fio-git-20070226150114.tar.gz
>
> That looks pretty pathetic :-). What IO scheduler did you use? syslets will confuse CFQ currently, so you want to compare with using eg deadline or as. That is one of the downsides of this approach.

Deadline shows this:

sync:
READ: io=1,024MiB, aggrb=38,212KiB/s, minb=38,212KiB/s, maxb=38,212KiB/s, mint=28099msec, maxt=28099msec

libaio:
READ: io=1,024MiB, aggrb=37,933KiB/s, minb=37,933KiB/s, maxb=37,933KiB/s, mint=28306msec, maxt=28306msec

syslet-rw:
READ: io=1,024MiB, aggrb=34,759KiB/s, minb=34,759KiB/s, maxb=34,759KiB/s, mint=30891msec, maxt=30891msec

There were about 10k async schedulings.

--
	Evgeniy Polyakov
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Tue, Feb 27 2007, Evgeniy Polyakov wrote:
> My two coins:
> # cat job
> [global]
> bs=8k
> size=1g
> direct=0
> ioengine=sync
> iodepth=32
> rw=read
>
> [file]
> filename=/home/user/test
>
> sync:
> READ: io=1,024MiB, aggrb=39,329KiB/s, minb=39,329KiB/s, maxb=39,329KiB/s, mint=27301msec, maxt=27301msec
>
> libaio:
> READ: io=1,024MiB, aggrb=39,435KiB/s, minb=39,435KiB/s, maxb=39,435KiB/s, mint=27228msec, maxt=27228msec
>
> syslet-rw:
> READ: io=1,024MiB, aggrb=29,567KiB/s, minb=29,567KiB/s, maxb=29,567KiB/s, mint=36315msec, maxt=36315msec
>
> During the syslet-rw test about 9500 async schedulings happened.
> I use fio-git-20070226150114.tar.gz

That looks pretty pathetic :-). What IO scheduler did you use? syslets will confuse CFQ currently, so you want to compare with using eg deadline or as. That is one of the downsides of this approach.

I'll try your test as soon as this bisect series is done.

> P.S. Jens, fio_latest.tar.gz has wrong permissions, it can not be opened.

Oh thanks, indeed. It was disabled symlinks that broke it. Fixed now.

--
Jens Axboe
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
My two coins:

# cat job
[global]
bs=8k
size=1g
direct=0
ioengine=sync
iodepth=32
rw=read

[file]
filename=/home/user/test

sync:
READ: io=1,024MiB, aggrb=39,329KiB/s, minb=39,329KiB/s, maxb=39,329KiB/s, mint=27301msec, maxt=27301msec

libaio:
READ: io=1,024MiB, aggrb=39,435KiB/s, minb=39,435KiB/s, maxb=39,435KiB/s, mint=27228msec, maxt=27228msec

syslet-rw:
READ: io=1,024MiB, aggrb=29,567KiB/s, minb=29,567KiB/s, maxb=29,567KiB/s, mint=36315msec, maxt=36315msec

During the syslet-rw test about 9500 async schedulings happened. I use fio-git-20070226150114.tar.gz

P.S. Jens, fio_latest.tar.gz has wrong permissions, it can not be opened.

--
	Evgeniy Polyakov
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Tue, Feb 27 2007, Suparna Bhattacharya wrote:
> On Mon, Feb 26, 2007 at 03:45:48PM +0100, Jens Axboe wrote:
> > On Mon, Feb 26 2007, Suparna Bhattacharya wrote:
> > > On Mon, Feb 26, 2007 at 02:57:36PM +0100, Jens Axboe wrote:
> > > >
> > > > Some more results, using a larger number of processes and io depths. A repeat of the tests from friday, with added depth 2 for syslet and libaio:
> > > >
> > > > Engine      Depth   Processes   Bw (MiB/sec)
> > > > libaio        1         1          602
> > > > syslet        1         1          759
> > > > sync          1         1          776
> > > > libaio       32         1          832
> > > > syslet       32         1          898
> > > > libaio        2         1          581
> > > > syslet        2         1          609
> > > >
> > > > syslet still on top. Measuring O_DIRECT reads (of 4kb size) on ramfs with 100 processes each with a depth of 200, reading a per-process private file of 10mb (need to fit in my ram...) 10 times each. IOW, doing 10,000MiB of IO in total:
> > >
> > > But, why ramfs ? Don't we want to exercise the case where O_DIRECT actually blocks ? Or am I missing something here ?
> >
> > Just overhead numbers for that test case, lets try something like your described job.
> >
> > Test case is doing random reads from /dev/sdb, in chunks of 64kb:
> >
> > Engine      Depth   Processes   Bw (KiB/sec)
> > libaio       200       100         2813
> > syslet       200       100         3944
> > libaio         2         1         2793
> > syslet         2         1         3854
> > sync (*)       2         1         2866
> >
> > deadline was used for IO scheduling, to minimize impact. Not sure why syslet actually does so much better here, looking at vmstat the rate is steady and all runs are basically 50/50 idle/wait. One difference is that the submission itself takes a long time on libaio, since the io_submit() will block on request allocation. The generated IO pattern from each process is the same for all runs. The drive is a lousy sata that doesn't even do queuing, FWIW.
>
> I tried the latest fio code with syslet v4, and my results are a little different - have yet to figure out why or what to make of it. I hope I have all the right pieces now.
>
> This is an ext2 filesystem, SCSI AIC7xxx.
>
> I used an iodepth_batch size of 8 to limit the number of ios in a single io_submit (thanks for adding that parameter to fio !), like we did in aio-stress.
>
> Engine      Depth   Batch   Bw (KiB/sec)
> libaio       64       8       17,226
> syslet       64       8       17,620
> libaio        2       8       18,552
> syslet        2       8       14,935
>
> Which is not bad, actually.

It's not bad for such a high depth/batch setting, but I still wonder why our results are so different. I'll look around for an x86 box with some TCQ/NCQ enabled storage attached for testing. Can you pass me your command line or job file (whatever you use) so we are on the same page?

> If I do not specify the iodepth_batch (i.e. default to depth), then the difference becomes more pronounced at higher depths. However, I doubt whether anyone would be using such high batch sizes in practice ...
>
> Engine      Depth   Batch     Bw (KiB/sec)
> libaio       64     default     17,429
> syslet       64     default     16,155
> libaio        2     default     15,494
> syslet        2     default      7,971

If iodepth_batch isn't set, the syslet queued io will be serialized and not take advantage of queueing. How does the job file perform with ioengine=sync?

> Often times it is the application tuning that makes all the difference, so am not really sure how much to read into these results. That's always been the hard part of async io ...

Yes I agree, it's handy to get an overview though.

--
Jens Axboe
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
On Tue, Feb 27 2007, Suparna Bhattacharya wrote: On Mon, Feb 26, 2007 at 03:45:48PM +0100, Jens Axboe wrote: On Mon, Feb 26 2007, Suparna Bhattacharya wrote: On Mon, Feb 26, 2007 at 02:57:36PM +0100, Jens Axboe wrote: Some more results, using a larger number of processes and io depths. A repeat of the tests from friday, with added depth 2 for syslet and libaio: Engine Depth Processes Bw (MiB/sec) libaio1 1602 syslet1 1759 sync 1 1776 libaio 32 1832 syslet 32 1898 libaio2 1581 syslet2 1609 syslet still on top. Measuring O_DIRECT reads (of 4kb size) on ramfs with 100 processes each with a depth of 200, reading a per-process private file of 10mb (need to fit in my ram...) 10 times each. IOW, doing 10,000MiB of IO in total: But, why ramfs ? Don't we want to exercise the case where O_DIRECT actually blocks ? Or am I missing something here ? Just overhead numbers for that test case, lets try something like your described job. Test case is doing random reads from /dev/sdb, in chunks of 64kb: Engine Depth Processes Bw (KiB/sec) libaio 200 1002813 syslet 200 1003944 libaio 2 12793 syslet 2 13854 sync (*) 2 12866 deadline was used for IO scheduling, to minimize impact. Not sure why syslet actually does so much better here, looing at vmstat the rate is steady and all runs are basically 50/50 idle/wait. One difference is that the submission itself takes a long time on libaio, since the io_submit() will block on request allocation. The generated IO pattern from each process is the same for all runs. The drive is a lousy sata that doesn't even do queuing, FWIW. I tried the latest fio code with syslet v4, and my results are a little different - have yet to figure out why or what to make of it. I hope I have all the right pieces now. This is an ext2 filesystem, SCSI AIC7xxx. I used an iodepth_batch size of 8 to limit the number of ios in a single io_submit (thanks for adding that parameter to fio !), like we did in aio-stress. Engine Depth Batch Bw (KiB/sec) libaio64 817,226 syslet64 817,620 libaio2 818,552 syslet2 814,935 Which is not bad, actually. It's not bad for such a high depth/batch setting, but I still wonder why are results are so different. I'll look around for an x86 box with some TCQ/NCQ enabled storage attached for testing. Can you pass me your command line or job file (whatever you use) so we are on the same page? If I do not specify the iodepth_batch (i.e. default to depth), then the difference becomes more pronounced at higher depths. However, I doubt whether anyone would be using such high batch sizes in practice ... Engine Depth Batch Bw (KiB/sec) libaio64 default 17,429 syslet64 default 16,155 libaio2 default 15,494 syslet2 default 7,971 If iodepth_batch isn't set, the syslet queued io will be serialized and not take advantage of queueing. How does the job file perform with ioengine=sync? Often times it is the application tuning that makes all the difference, so am not really sure how much to read into these results. That's always been the hard part of async io ... Yes I agree, it's handy to get an overview though. -- Jens Axboe - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
My two coins: # cat job [global] bs=8k size=1g direct=0 ioengine=sync iodepth=32 rw=read [file] filename=/home/user/test sync: READ: io=1,024MiB, aggrb=39,329KiB/s, minb=39,329KiB/s, maxb=39,329KiB/s, mint=27301msec, maxt=27301msec libaio: READ: io=1,024MiB, aggrb=39,435KiB/s, minb=39,435KiB/s, maxb=39,435KiB/s, mint=27228msec, maxt=27228msec syslet-rw: READ: io=1,024MiB, aggrb=29,567KiB/s, minb=29,567KiB/s, maxb=29,567KiB/s, mint=36315msec, maxt=36315msec During syslet-rw test about 9500 async schduledes happend. I use fio-git-20070226150114.tar.gz P.S. Jens, fio_latest.tar.gz has wrong permissions, it can not be opened. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
On Tue, Feb 27 2007, Evgeniy Polyakov wrote: My two coins: # cat job [global] bs=8k size=1g direct=0 ioengine=sync iodepth=32 rw=read [file] filename=/home/user/test sync: READ: io=1,024MiB, aggrb=39,329KiB/s, minb=39,329KiB/s, maxb=39,329KiB/s, mint=27301msec, maxt=27301msec libaio: READ: io=1,024MiB, aggrb=39,435KiB/s, minb=39,435KiB/s, maxb=39,435KiB/s, mint=27228msec, maxt=27228msec syslet-rw: READ: io=1,024MiB, aggrb=29,567KiB/s, minb=29,567KiB/s, maxb=29,567KiB/s, mint=36315msec, maxt=36315msec During syslet-rw test about 9500 async schduledes happend. I use fio-git-20070226150114.tar.gz That looks pretty pathetic :-). What IO scheduler did you use? syslets will confuse CFQ currently, so you want to compare with using eg deadline or as. That is one of the downsides of this approach. I'll try your test as soon as this bisect series is done. P.S. Jens, fio_latest.tar.gz has wrong permissions, it can not be opened. Oh thanks, indeed. It was disabled symlinks that broke it. Fixed now. -- Jens Axboe - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
On Tue, Feb 27, 2007 at 12:29:08PM +0100, Jens Axboe ([EMAIL PROTECTED]) wrote: On Tue, Feb 27 2007, Evgeniy Polyakov wrote: My two coins: # cat job [global] bs=8k size=1g direct=0 ioengine=sync iodepth=32 rw=read [file] filename=/home/user/test sync: READ: io=1,024MiB, aggrb=39,329KiB/s, minb=39,329KiB/s, maxb=39,329KiB/s, mint=27301msec, maxt=27301msec libaio: READ: io=1,024MiB, aggrb=39,435KiB/s, minb=39,435KiB/s, maxb=39,435KiB/s, mint=27228msec, maxt=27228msec syslet-rw: READ: io=1,024MiB, aggrb=29,567KiB/s, minb=29,567KiB/s, maxb=29,567KiB/s, mint=36315msec, maxt=36315msec During syslet-rw test about 9500 async schduledes happend. I use fio-git-20070226150114.tar.gz That looks pretty pathetic :-). What IO scheduler did you use? syslets will confuse CFQ currently, so you want to compare with using eg deadline or as. That is one of the downsides of this approach. Deadline shows this: sync: READ: io=1,024MiB, aggrb=38,212KiB/s, minb=38,212KiB/s, maxb=38,212KiB/s, mint=28099msec, maxt=28099msec libaio: READ: io=1,024MiB, aggrb=37,933KiB/s, minb=37,933KiB/s, maxb=37,933KiB/s, mint=28306msec, maxt=28306msec syslet-rw: READ: io=1,024MiB, aggrb=34,759KiB/s, minb=34,759KiB/s, maxb=34,759KiB/s, mint=30891msec, maxt=30891msec There were about 10k async schedulings. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
On Tue, Feb 27, 2007 at 10:42:11AM +0100, Jens Axboe wrote: On Tue, Feb 27 2007, Suparna Bhattacharya wrote: On Mon, Feb 26, 2007 at 03:45:48PM +0100, Jens Axboe wrote: On Mon, Feb 26 2007, Suparna Bhattacharya wrote: On Mon, Feb 26, 2007 at 02:57:36PM +0100, Jens Axboe wrote: Some more results, using a larger number of processes and io depths. A repeat of the tests from friday, with added depth 2 for syslet and libaio: Engine Depth Processes Bw (MiB/sec) libaio1 1602 syslet1 1759 sync 1 1776 libaio 32 1832 syslet 32 1898 libaio2 1581 syslet2 1609 syslet still on top. Measuring O_DIRECT reads (of 4kb size) on ramfs with 100 processes each with a depth of 200, reading a per-process private file of 10mb (need to fit in my ram...) 10 times each. IOW, doing 10,000MiB of IO in total: But, why ramfs ? Don't we want to exercise the case where O_DIRECT actually blocks ? Or am I missing something here ? Just overhead numbers for that test case, lets try something like your described job. Test case is doing random reads from /dev/sdb, in chunks of 64kb: Engine Depth Processes Bw (KiB/sec) libaio 200 1002813 syslet 200 1003944 libaio 2 12793 syslet 2 13854 sync (*) 2 12866 deadline was used for IO scheduling, to minimize impact. Not sure why syslet actually does so much better here, looing at vmstat the rate is steady and all runs are basically 50/50 idle/wait. One difference is that the submission itself takes a long time on libaio, since the io_submit() will block on request allocation. The generated IO pattern from each process is the same for all runs. The drive is a lousy sata that doesn't even do queuing, FWIW. I tried the latest fio code with syslet v4, and my results are a little different - have yet to figure out why or what to make of it. I hope I have all the right pieces now. This is an ext2 filesystem, SCSI AIC7xxx. I used an iodepth_batch size of 8 to limit the number of ios in a single io_submit (thanks for adding that parameter to fio !), like we did in aio-stress. Engine Depth BatchBw (KiB/sec) libaio 64 817,226 syslet 64 817,620 libaio 2 818,552 syslet 2 814,935 Which is not bad, actually. It's not bad for such a high depth/batch setting, but I still wonder why are results are so different. I'll look around for an x86 box with some TCQ/NCQ enabled storage attached for testing. Can you pass me your command line or job file (whatever you use) so we are on the same page? Sure - I used variations of the following job file (e.g. engine=syslet-rw, iodepth=2). Also the io scheduler on my system is set to Anticipatory by default. FWIW it is a 4 way SMP (PIII, 700MHz) ; aio-stress -l -O -o3 1GB file [global] ioengine=libaio buffered=0 rw=randread bs=64k size=1024m directory=/kdump/suparna [testfile2] iodepth=64 iodepth_batch=8 If I do not specify the iodepth_batch (i.e. default to depth), then the difference becomes more pronounced at higher depths. However, I doubt whether anyone would be using such high batch sizes in practice ... Engine Depth BatchBw (KiB/sec) libaio 64 default 17,429 syslet 64 default 16,155 libaio 2 default 15,494 syslet 2 default 7,971 If iodepth_batch isn't set, the syslet queued io will be serialized and I see, so then this particular setting is not very meaningful not take advantage of queueing. How does the job file perform with ioengine=sync? Just tried it now : 9,027KiB/s Often times it is the application tuning that makes all the difference, so am not really sure how much to read into these results. That's always been the hard part of async io ... 
Yes I agree, it's handy to get an overview though. True, at least some of this helps us gain a little more understanding about the boundaries and how to tune it to be most effective. Regards Suparna -- Jens Axboe -- Suparna Bhattacharya ([EMAIL
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
Suparna Bhattacharya wrote: I tried the latest fio code with syslet v4, and my results are a little different - have yet to figure out why or what to make of it. I hope I have all the right pieces now. This is an ext2 filesystem, SCSI AIC7xxx. I used an iodepth_batch size of 8 to limit the number of ios in a single io_submit (thanks for adding that parameter to fio !), like we did in aio-stress. Engine Depth BatchBw (KiB/sec) libaio 64 817,226 syslet 64 817,620 libaio 2 818,552 syslet 2 814,935 Which is not bad, actually. If I do not specify the iodepth_batch (i.e. default to depth), then the difference becomes more pronounced at higher depths. However, I doubt whether anyone would be using such high batch sizes in practice ... Engine Depth BatchBw (KiB/sec) libaio 64 default 17,429 syslet 64 default 16,155 libaio 2 default 15,494 syslet 2 default 7,971 But what about cpu usage? At these low levels, the cpu is probably underutilized. It would be interesting to measure cpu time per I/O request (or, alternatively, use an I/O subsystem that can saturate the processors). -- error compiling committee.c: too many arguments to function - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
On Mon, February 26, 2007 15:45, Jens Axboe wrote: Test case is doing random reads from /dev/sdb, in chunks of 64kb: Engine Depth Processes Bw (KiB/sec) libaio 200 1002813 syslet 200 1003944 libaio 2 12793 syslet 2 13854 sync (*) 2 12866 If you have time, could you please add Threadlet results? Considering how awful the syslet API is, and how nice the threadlet one, it would be great if it turned out that Syslets aren't worth it. If they are, a better API/ABI should be thought of. Greetings, Indan - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
* Avi Kivity [EMAIL PROTECTED] wrote: But what about cpu usage? At these low levels, the cpu is probably underutilized. It would be interesting to measure cpu time per I/O request (or, alternatively, use an I/O subsystem that can saturate the processors). yeah - that's what testing on ramdisk (Jens') or on a loopback block device (mine) approximates to a certain degree. Ingo - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
Ingo Molnar wrote: * Avi Kivity [EMAIL PROTECTED] wrote: But what about cpu usage? At these low levels, the cpu is probably underutilized. It would be interesting to measure cpu time per I/O request (or, alternatively, use an I/O subsystem that can saturate the processors). yeah - that's what testing on ramdisk (Jens') or on a loopback block device (mine) approximates to a certain degree. Ramdisks or fully cached loopback return immediately, so cache thrashing effects don't show up. Maybe a device mapper delay target or nbd + O_DIRECT can insert delays to make the workload more disk-like. -- error compiling committee.c: too many arguments to function - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
* Avi Kivity [EMAIL PROTECTED] wrote: yeah - that's what testing on ramdisk (Jens') or on a loopback block device (mine) approximates to a certain degree. Ramdisks or fully cached loopback return immediately, so cache thrashing effects don't show up. even fully cached loopback schedules the loopback kernel thread - but i agree that it's inaccurate: hence the 'approximates to a certain degree'. Maybe a device mapper delay target or nbd + O_DIRECT can insert delays to make the workload more disk-like. yeah. I'll hack a small timeout into loopback requests i think. But then real disk-platter effects are left out ... so it all comes down to eventually having to try it on real disks too :) Ingo - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
Ingo Molnar wrote: Maybe a device mapper delay target or nbd + O_DIRECT can insert delays to make the workload more disk-like. yeah. I'll hack a small timeout into loopback requests i think. But then real disk-platter effects are left out ... so it all comes down to eventually having to try it on real disks too :) Having a random component in the timeout may increase realism. hundred-disk boxes are noisy, though the blinkenlights are nice :) -- error compiling committee.c: too many arguments to function - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
On Tue, Feb 27 2007, Evgeniy Polyakov wrote: On Tue, Feb 27, 2007 at 12:29:08PM +0100, Jens Axboe ([EMAIL PROTECTED]) wrote: On Tue, Feb 27 2007, Evgeniy Polyakov wrote: My two coins: # cat job [global] bs=8k size=1g direct=0 ioengine=sync iodepth=32 rw=read [file] filename=/home/user/test sync: READ: io=1,024MiB, aggrb=39,329KiB/s, minb=39,329KiB/s, maxb=39,329KiB/s, mint=27301msec, maxt=27301msec libaio: READ: io=1,024MiB, aggrb=39,435KiB/s, minb=39,435KiB/s, maxb=39,435KiB/s, mint=27228msec, maxt=27228msec syslet-rw: READ: io=1,024MiB, aggrb=29,567KiB/s, minb=29,567KiB/s, maxb=29,567KiB/s, mint=36315msec, maxt=36315msec During syslet-rw test about 9500 async schduledes happend. I use fio-git-20070226150114.tar.gz That looks pretty pathetic :-). What IO scheduler did you use? syslets will confuse CFQ currently, so you want to compare with using eg deadline or as. That is one of the downsides of this approach. Deadline shows this: sync: READ: io=1,024MiB, aggrb=38,212KiB/s, minb=38,212KiB/s, maxb=38,212KiB/s, mint=28099msec, maxt=28099msec libaio: READ: io=1,024MiB, aggrb=37,933KiB/s, minb=37,933KiB/s, maxb=37,933KiB/s, mint=28306msec, maxt=28306msec syslet-rw: READ: io=1,024MiB, aggrb=34,759KiB/s, minb=34,759KiB/s, maxb=34,759KiB/s, mint=30891msec, maxt=30891msec There were about 10k async schedulings. I think the issue here is pretty simple - when fio gets a queue full like condition (it reaches the depth you set, 32), it commits them and starts queuing again. Since that'll likely block, it'll get issued by another process. So you suddenly have a nice sequence of reads from one process (pending, only one is actually committed since it's serialized), and then a read further down the line that goes behind those you already committed. Then result is seeky, where it should have been sequential. Do you get expected results if you set iodepth_low=1? That'll make fio drain the queue before building it up again, should get you a sequential access pattern with syslets. -- Jens Axboe - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
On Tue, Feb 27 2007, Avi Kivity wrote: Ingo Molnar wrote: * Avi Kivity [EMAIL PROTECTED] wrote: But what about cpu usage? At these low levels, the cpu is probably underutilized. It would be interesting to measure cpu time per I/O request (or, alternatively, use an I/O subsystem that can saturate the processors). yeah - that's what testing on ramdisk (Jens') or on a loopback block device (mine) approximates to a certain degree. Ramdisks or fully cached loopback return immediately, so cache thrashing effects don't show up. Maybe a device mapper delay target or nbd + O_DIRECT can insert delays to make the workload more disk-like. Take a look at scsi-debug, it can do at least some of that. -- Jens Axboe - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, Threadlets, generic AIO support, v3)
On Tue, Feb 27, 2007 at 07:45:41PM +0100, Jens Axboe ([EMAIL PROTECTED]) wrote: Deadline shows this: sync: READ: io=1,024MiB, aggrb=38,212KiB/s, minb=38,212KiB/s, maxb=38,212KiB/s, mint=28099msec, maxt=28099msec libaio: READ: io=1,024MiB, aggrb=37,933KiB/s, minb=37,933KiB/s, maxb=37,933KiB/s, mint=28306msec, maxt=28306msec syslet-rw: READ: io=1,024MiB, aggrb=34,759KiB/s, minb=34,759KiB/s, maxb=34,759KiB/s, mint=30891msec, maxt=30891msec There were about 10k async schedulings. I think the issue here is pretty simple - when fio gets a queue full like condition (it reaches the depth you set, 32), it commits them and starts queuing again. Since that'll likely block, it'll get issued by another process. So you suddenly have a nice sequence of reads from one process (pending, only one is actually committed since it's serialized), and then a read further down the line that goes behind those you already committed. Then result is seeky, where it should have been sequential. Do you get expected results if you set iodepth_low=1? That'll make fio drain the queue before building it up again, should get you a sequential access pattern with syslets. With such a change results should be better - not only because seek is removed with sequential read, but also number of working threads decreases with time - until queue is filled again. So, syslet-rw has increased to 37mb/sec out of 39/sync and 38/libaio, the latter two did not changed. With iodepth of 10k, I get the same performance for libaio and syslets - about 36mb/sec, it does not depend on iodepth_low being set to 1 or default (full). So syslets have small problems with small number of iodepth - its performance is about 34mb/sec and then increases to 36 with iodepth grow. While libaio decreases from 38 down to 36 mb/sec. iodepth_low=1 helps syslets to have 37mb/sec with iodepth=32, with 3200 and 10k it does not play any role. -- Jens Axboe -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Tue, Feb 27 2007, Evgeniy Polyakov wrote:
> On Tue, Feb 27, 2007 at 07:45:41PM +0100, Jens Axboe ([EMAIL PROTECTED]) wrote:
> > > Deadline shows this:
> > >
> > > sync:
> > > READ: io=1,024MiB, aggrb=38,212KiB/s, minb=38,212KiB/s, maxb=38,212KiB/s, mint=28099msec, maxt=28099msec
> > > libaio:
> > > READ: io=1,024MiB, aggrb=37,933KiB/s, minb=37,933KiB/s, maxb=37,933KiB/s, mint=28306msec, maxt=28306msec
> > > syslet-rw:
> > > READ: io=1,024MiB, aggrb=34,759KiB/s, minb=34,759KiB/s, maxb=34,759KiB/s, mint=30891msec, maxt=30891msec
> > >
> > > There were about 10k async schedulings.
> >
> > I think the issue here is pretty simple - when fio gets a queue-full-like condition (it reaches the depth you set, 32), it commits them and starts queuing again. Since that'll likely block, it'll get issued by another process. So you suddenly have a nice sequence of reads from one process (pending, only one is actually committed since it's serialized), and then a read further down the line that goes behind those you already committed. The result is seeky, where it should have been sequential.
> >
> > Do you get expected results if you set iodepth_low=1? That'll make fio drain the queue before building it up again, should get you a sequential access pattern with syslets.
>
> With such a change results should be better - not only because the seek is removed with sequential read, but also the number of working threads decreases with time - until the queue is filled again.

Yep, although it probably doesn't matter for such a low bandwidth test anyway.

> So, syslet-rw has increased to 37mb/sec out of 39/sync and 38/libaio; the latter two did not change.

I wonder why all three aren't doing 39mb/sec flat here, it's a pretty trivial case...

> With iodepth of 10k, I get the same performance for libaio and syslets - about 36mb/sec, it does not depend on iodepth_low being set to 1 or default (full).

Yep, the larger the iodepth, the less costly a seek on new queue buildup gets. So that is as expected.

> So syslets have small problems with small iodepth - performance is about 34mb/sec and then increases to 36 as iodepth grows, while libaio decreases from 38 down to 36 mb/sec.

Using your job file and fio HEAD (which forces iodepth_low=1 for syslet if iodepth_low isn't specified), I get:

Engine          Depth     Bw (kb/sec)
-------------------------------------
syslet              1       37163
syslet             32       37197
syslet              1       36577
libaio              1       37144
libaio             32       37159
libaio              1       36463
sync                1       37154

Results are highly stable. Note that this test case isn't totally fair, since libaio isn't really async when you do buffered file IO.

-- 
Jens Axboe
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Mon, Feb 26, 2007 at 03:45:48PM +0100, Jens Axboe wrote:
> On Mon, Feb 26 2007, Suparna Bhattacharya wrote:
> > On Mon, Feb 26, 2007 at 02:57:36PM +0100, Jens Axboe wrote:
> > >
> > > Some more results, using a larger number of processes and io depths. A repeat of the tests from friday, with added depth 2 for syslet and libaio:
> > >
> > > Engine          Depth   Processes       Bw (MiB/sec)
> > > ----------------------------------------------------
> > > libaio              1       1            602
> > > syslet              1       1            759
> > > sync                1       1            776
> > > libaio             32       1            832
> > > syslet             32       1            898
> > > libaio              2       1            581
> > > syslet              2       1            609
> > >
> > > syslet still on top. Measuring O_DIRECT reads (of 4kb size) on ramfs with 100 processes each with a depth of 200, reading a per-process private file of 10mb (need to fit in my ram...) 10 times each. IOW, doing 10,000MiB of IO in total:
> >
> > But, why ramfs ? Don't we want to exercise the case where O_DIRECT actually blocks ? Or am I missing something here ?
>
> Just overhead numbers for that test case, lets try something like your described job.
>
> Test case is doing random reads from /dev/sdb, in chunks of 64kb:
>
> Engine          Depth   Processes       Bw (KiB/sec)
> ----------------------------------------------------
> libaio            200     100            2813
> syslet            200     100            3944
> libaio              2       1            2793
> syslet              2       1            3854
> sync (*)            2       1            2866
>
> deadline was used for IO scheduling, to minimize impact. Not sure why syslet actually does so much better here, looking at vmstat the rate is steady and all runs are basically 50/50 idle/wait. One difference is that the submission itself takes a long time on libaio, since the io_submit() will block on request allocation. The generated IO pattern from each process is the same for all runs. The drive is a lousy sata that doesn't even do queuing, FWIW.

I tried the latest fio code with syslet v4, and my results are a little different - have yet to figure out why or what to make of it. I hope I have all the right pieces now. This is an ext2 filesystem, SCSI AIC7xxx.

I used an iodepth_batch size of 8 to limit the number of ios in a single io_submit (thanks for adding that parameter to fio !), like we did in aio-stress.

Engine          Depth   Batch   Bw (KiB/sec)
--------------------------------------------
libaio             64       8     17,226
syslet             64       8     17,620
libaio              2       8     18,552
syslet              2       8     14,935

Which is not bad, actually. If I do not specify the iodepth_batch (i.e. default to depth), then the difference becomes more pronounced at higher depths. However, I doubt whether anyone would be using such high batch sizes in practice ...

Engine          Depth   Batch     Bw (KiB/sec)
----------------------------------------------
libaio             64   default     17,429
syslet             64   default     16,155
libaio              2   default     15,494
syslet              2   default      7,971

Often times it is the application tuning that makes all the difference, so am not really sure how much to read into these results. That's always been the hard part of async io ...

Regards
Suparna

> [*] Just for comparison, the depth is obviously really 1 at the kernel side since it's sync.
>
> -- 
> Jens Axboe

-- 
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
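For reference, the two knobs being compared here look roughly like this in a job file. This is a sketch modelled on the job file quoted further down in the thread, not Suparna's exact file, and the directory is a placeholder:

    [global]
    ; use ioengine=syslet-rw for the syslet rows
    ioengine=libaio
    buffered=0
    rw=randread
    bs=64k
    size=1024m
    ; keep up to 64 I/Os in flight, but submit at most 8 per io_submit()
    iodepth=64
    iodepth_batch=8

    [testfile]
    directory=/test

Leaving iodepth_batch out makes fio submit the full iodepth in one batch, which is the "default" case in the second table above.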
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Mon, 26 Feb 2007, Jens Axboe wrote:
>
> Some more results, using a larger number of processes and io depths. A repeat of the tests from friday, with added depth 2 for syslet and libaio:
>
> Engine          Depth   Processes       Bw (MiB/sec)
> ----------------------------------------------------
> libaio              1       1            602
> syslet              1       1            759
> sync                1       1            776
> libaio             32       1            832
> syslet             32       1            898
> libaio              2       1            581
> syslet              2       1            609

That looks great! IMO there may be a little higher cost associated with the syslets thread switches, that we currently do not perform 100% cleanly, but results look nevertheless awesome.

- Davide
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Mon, Feb 26 2007, Suparna Bhattacharya wrote:
> On Mon, Feb 26, 2007 at 02:57:36PM +0100, Jens Axboe wrote:
> >
> > Some more results, using a larger number of processes and io depths. A repeat of the tests from friday, with added depth 2 for syslet and libaio:
> >
> > Engine          Depth   Processes       Bw (MiB/sec)
> > ----------------------------------------------------
> > libaio              1       1            602
> > syslet              1       1            759
> > sync                1       1            776
> > libaio             32       1            832
> > syslet             32       1            898
> > libaio              2       1            581
> > syslet              2       1            609
> >
> > syslet still on top. Measuring O_DIRECT reads (of 4kb size) on ramfs with 100 processes each with a depth of 200, reading a per-process private file of 10mb (need to fit in my ram...) 10 times each. IOW, doing 10,000MiB of IO in total:
>
> But, why ramfs ? Don't we want to exercise the case where O_DIRECT actually blocks ? Or am I missing something here ?

Just overhead numbers for that test case, lets try something like your described job.

Test case is doing random reads from /dev/sdb, in chunks of 64kb:

Engine          Depth   Processes       Bw (KiB/sec)
----------------------------------------------------
libaio            200     100            2813
syslet            200     100            3944
libaio              2       1            2793
syslet              2       1            3854
sync (*)            2       1            2866

deadline was used for IO scheduling, to minimize impact. Not sure why syslet actually does so much better here, looking at vmstat the rate is steady and all runs are basically 50/50 idle/wait. One difference is that the submission itself takes a long time on libaio, since the io_submit() will block on request allocation. The generated IO pattern from each process is the same for all runs. The drive is a lousy sata that doesn't even do queuing, FWIW.

[*] Just for comparison, the depth is obviously really 1 at the kernel side since it's sync.

-- 
Jens Axboe
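The scheduler choice is a runtime toggle, so comparing CFQ against deadline on a given disk only takes something like this (sdb being a placeholder for whichever disk is under test):

    cat /sys/block/sdb/queue/scheduler        # prints e.g. "noop anticipatory deadline [cfq]"
    echo deadline > /sys/block/sdb/queue/scheduler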
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
* Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:

> > syslet still on top. Measuring O_DIRECT reads (of 4kb size) on ramfs with 100 processes each with a depth of 200, reading a per-process private file of 10mb (need to fit in my ram...) 10 times each. IOW, doing 10,000MiB of IO in total:
>
> But, why ramfs ? Don't we want to exercise the case where O_DIRECT actually blocks ? Or am I missing something here ?

ramfs is just the easiest way to measure the pure CPU overhead of a workload without real IO delays (and resulting idle time) getting in the way. It's certainly not the same thing, but still it's pretty useful most of the time. I used a different method, loopback block device, and got similar results.

[ Real IO shows similar results as well, but is a lot more noisy and hence harder to interpret (and thus easier to get wrong). ]

	Ingo
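Setting up the kind of loopback block device Ingo mentions takes only a couple of commands; this is a generic sketch (backing file name, size and loop device number are arbitrary):

    dd if=/dev/zero of=/tmp/fio-backing bs=1M count=1024   # 1GB backing file
    losetup /dev/loop0 /tmp/fio-backing                    # expose it as a block device
    # ... point the fio job's filename= at /dev/loop0 ...
    losetup -d /dev/loop0                                  # tear it down afterwards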
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Mon, Feb 26, 2007 at 02:57:36PM +0100, Jens Axboe wrote:
>
> Some more results, using a larger number of processes and io depths. A repeat of the tests from friday, with added depth 2 for syslet and libaio:
>
> Engine          Depth   Processes       Bw (MiB/sec)
> ----------------------------------------------------
> libaio              1       1            602
> syslet              1       1            759
> sync                1       1            776
> libaio             32       1            832
> syslet             32       1            898
> libaio              2       1            581
> syslet              2       1            609
>
> syslet still on top. Measuring O_DIRECT reads (of 4kb size) on ramfs with 100 processes each with a depth of 200, reading a per-process private file of 10mb (need to fit in my ram...) 10 times each. IOW, doing 10,000MiB of IO in total:

But, why ramfs ? Don't we want to exercise the case where O_DIRECT actually blocks ? Or am I missing something here ?

Regards
Suparna

> Engine          Depth   Processes       Bw (MiB/sec)
> ----------------------------------------------------
> libaio            200     100            1488
> syslet            200     100            1714
>
> Results are stable to within approx +/- 10MiB/sec. The syslet case completes a whole second faster than libaio (~6 vs ~7 seconds). Testing was done with fio HEAD eb7c8ae27bc301b77490b3586dd5ccab7c95880a, and it uses the v4 patch series.
>
> Engine          Depth   Processes       Bw (MiB/sec)
> ----------------------------------------------------
> libaio            200     100            1488
> syslet            200     100            1714
> sync              200     100            1843
>
> -- 
> Jens Axboe

-- 
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
Some more results, using a larger number of processes and io depths. A repeat of the tests from friday, with added depth 2 for syslet and libaio:

Engine          Depth   Processes       Bw (MiB/sec)
----------------------------------------------------
libaio              1       1            602
syslet              1       1            759
sync                1       1            776
libaio             32       1            832
syslet             32       1            898
libaio              2       1            581
syslet              2       1            609

syslet still on top. Measuring O_DIRECT reads (of 4kb size) on ramfs with 100 processes each with a depth of 200, reading a per-process private file of 10mb (need to fit in my ram...) 10 times each. IOW, doing 10,000MiB of IO in total:

Engine          Depth   Processes       Bw (MiB/sec)
----------------------------------------------------
libaio            200     100            1488
syslet            200     100            1714

Results are stable to within approx +/- 10MiB/sec. The syslet case completes a whole second faster than libaio (~6 vs ~7 seconds). Testing was done with fio HEAD eb7c8ae27bc301b77490b3586dd5ccab7c95880a, and it uses the v4 patch series.

Engine          Depth   Processes       Bw (MiB/sec)
----------------------------------------------------
libaio            200     100            1488
syslet            200     100            1714
sync              200     100            1843

-- 
Jens Axboe
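The thread does not show the exact job file used for this 100-process run, but in terms of real fio options it would look something along these lines; treat the details as an illustrative sketch rather than the file Jens ran:

    [global]
    ; switched between syslet-rw, libaio and sync for the different rows
    ioengine=syslet-rw
    direct=1
    rw=read
    bs=4k
    size=10m
    loops=10
    iodepth=200
    directory=/ramfs

    [readers]
    ; 100 processes, each reading its own private 10MB file 10 times
    numjobs=100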
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Fri, Feb 23 2007, Joel Becker wrote:
> On Fri, Feb 23, 2007 at 01:52:47PM +0100, Jens Axboe wrote:
> > Results:
> >
> > Engine          Depth   Bw (MiB/sec)
> > ------------------------------------
> > libaio              1       441
> > syslet              1       574
> > sync                1       589
> > libaio             32       613
> > syslet             32       681
>
> Can we get runs with large I/Os, large I/O depths, and most importantly tons of processes? I can absolutely believe that syslets would compete well with one process on the system. But with 1000 processes doing 1000s of blocking I/Os, I'd really be interested to see how that plays out.

Sure, I'll add this to the testing list for monday.

-- 
Jens Axboe
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Fri, Feb 23, 2007 at 01:52:47PM +0100, Jens Axboe wrote:
> Results:
>
> Engine          Depth   Bw (MiB/sec)
> ------------------------------------
> libaio              1       441
> syslet              1       574
> sync                1       589
> libaio             32       613
> syslet             32       681

Can we get runs with large I/Os, large I/O depths, and most importantly tons of processes? I can absolutely believe that syslets would compete well with one process on the system. But with 1000 processes doing 1000s of blocking I/Os, I'd really be interested to see how that plays out.

Joel

-- 
Joel's Second Law:
	If a code change requires additional user setup, it is wrong.

Joel Becker
Principal Software Developer
Oracle
E-mail: [EMAIL PROTECTED]
Phone: (650) 506-8127
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Fri, Feb 23 2007, Suparna Bhattacharya wrote:
> On Fri, Feb 23, 2007 at 05:25:08PM +0100, Jens Axboe wrote:
> > On Fri, Feb 23 2007, Suparna Bhattacharya wrote:
> > > On Fri, Feb 23, 2007 at 03:58:26PM +0100, Ingo Molnar wrote:
> > > >
> > > > * Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > As a really crude (and not very realistic) example of the potential impact of large numbers of outstanding IOs, I tried some quick direct IO comparisons using fio:
> > > > >
> > > > > [global]
> > > > > ioengine=syslet-rw
> > > > > buffered=0
> > > > > rw=randread
> > > > > bs=64k
> > > > > size=1024m
> > > > > iodepth=64
> > > >
> > > > could you please try those iodepth=2 tests with the latest fio-testing branch of fio as well? Jens wrote a new, smarter syslet plugin for FIO. You'll need the v3 syslet kernel plus:
> > > >
> > > > git-clone git://git.kernel.dk/data/git/fio.git
> > > > cd fio
> > > > git-checkout syslet-testing
> > > >
> > > > my expectation is that it should behave better with iodepth=2 (although i havent tried that yet).
> > >
> > > I picked up the fio snapshot from 22nd Feb (fio-git-2007012513.tar.gz) and used the v3 syslet patches from your web-site.
> > >
> > > Do I still need to get something more recent ?
> >
> > Yes, you need to test the syslet+testing branch that Ingo referenced. Your test above is not totally fair right now, since you are doing significantly less system calls with libaio. So to compare apples with apples, try the syslet-testing branch. If you can't get it because of firewall problems, check http://brick.kernel.dk/snaps/ for the latest fio snapshot. If it has the syslet-testing branch, then that is recent enough.
>
> I have a feeling this is getting to be a little more bleeding edge than I had anticipated :), so will just hold off for a bit until this crystallizes a bit.

Fair enough, I'll try your test with a huge number of pending requests and see how it fares.

-- 
Jens Axboe
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Fri, Feb 23, 2007 at 05:25:08PM +0100, Jens Axboe wrote:
> On Fri, Feb 23 2007, Suparna Bhattacharya wrote:
> > On Fri, Feb 23, 2007 at 03:58:26PM +0100, Ingo Molnar wrote:
> > >
> > > * Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
> > >
> > > > As a really crude (and not very realistic) example of the potential impact of large numbers of outstanding IOs, I tried some quick direct IO comparisons using fio:
> > > >
> > > > [global]
> > > > ioengine=syslet-rw
> > > > buffered=0
> > > > rw=randread
> > > > bs=64k
> > > > size=1024m
> > > > iodepth=64
> > >
> > > could you please try those iodepth=2 tests with the latest fio-testing branch of fio as well? Jens wrote a new, smarter syslet plugin for FIO. You'll need the v3 syslet kernel plus:
> > >
> > > git-clone git://git.kernel.dk/data/git/fio.git
> > > cd fio
> > > git-checkout syslet-testing
> > >
> > > my expectation is that it should behave better with iodepth=2 (although i havent tried that yet).
> >
> > I picked up the fio snapshot from 22nd Feb (fio-git-2007012513.tar.gz) and used the v3 syslet patches from your web-site.
> >
> > Do I still need to get something more recent ?
>
> Yes, you need to test the syslet+testing branch that Ingo referenced. Your test above is not totally fair right now, since you are doing significantly less system calls with libaio. So to compare apples with apples, try the syslet-testing branch. If you can't get it because of firewall problems, check http://brick.kernel.dk/snaps/ for the latest fio snapshot. If it has the syslet-testing branch, then that is recent enough.

I have a feeling this is getting to be a little more bleeding edge than I had anticipated :), so will just hold off for a bit until this crystallizes a bit.

Regards
Suparna

> -- 
> Jens Axboe

-- 
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
* Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:

> > my expectation is that it should behave better with iodepth=2 (although i havent tried that yet).
>
> I picked up the fio snapshot from 22nd Feb (fio-git-2007012513.tar.gz) and used the v3 syslet patches from your web-site.
>
> Do I still need to get something more recent ?

yeah, there's something more recent. Please do this:

	yum install git
	git-clone git://git.kernel.dk/data/git/fio.git
	cd fio
	git-branch syslet-testing
	git-checkout

this should give you the latest version of the v3 based FIO code. It's one generation newer than the one you tried. I mean the snapshot you used is meanwhile a /whole/ day old, so it's truly ancient stuff! ;-)

	Ingo
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Fri, Feb 23 2007, Suparna Bhattacharya wrote:
> On Fri, Feb 23, 2007 at 03:58:26PM +0100, Ingo Molnar wrote:
> >
> > * Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
> >
> > > As a really crude (and not very realistic) example of the potential impact of large numbers of outstanding IOs, I tried some quick direct IO comparisons using fio:
> > >
> > > [global]
> > > ioengine=syslet-rw
> > > buffered=0
> > > rw=randread
> > > bs=64k
> > > size=1024m
> > > iodepth=64
> >
> > could you please try those iodepth=2 tests with the latest fio-testing branch of fio as well? Jens wrote a new, smarter syslet plugin for FIO. You'll need the v3 syslet kernel plus:
> >
> > git-clone git://git.kernel.dk/data/git/fio.git
> > cd fio
> > git-checkout syslet-testing
> >
> > my expectation is that it should behave better with iodepth=2 (although i havent tried that yet).
>
> I picked up the fio snapshot from 22nd Feb (fio-git-2007012513.tar.gz) and used the v3 syslet patches from your web-site.
>
> Do I still need to get something more recent ?

Yes, you need to test the syslet+testing branch that Ingo referenced. Your test above is not totally fair right now, since you are doing significantly less system calls with libaio. So to compare apples with apples, try the syslet-testing branch. If you can't get it because of firewall problems, check http://brick.kernel.dk/snaps/ for the latest fio snapshot. If it has the syslet-testing branch, then that is recent enough.

-- 
Jens Axboe
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Fri, Feb 23, 2007 at 03:58:26PM +0100, Ingo Molnar wrote:
>
> * Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:
>
> > As a really crude (and not very realistic) example of the potential impact of large numbers of outstanding IOs, I tried some quick direct IO comparisons using fio:
> >
> > [global]
> > ioengine=syslet-rw
> > buffered=0
> > rw=randread
> > bs=64k
> > size=1024m
> > iodepth=64
>
> could you please try those iodepth=2 tests with the latest fio-testing branch of fio as well? Jens wrote a new, smarter syslet plugin for FIO. You'll need the v3 syslet kernel plus:
>
> git-clone git://git.kernel.dk/data/git/fio.git
> cd fio
> git-checkout syslet-testing
>
> my expectation is that it should behave better with iodepth=2 (although i havent tried that yet).

I picked up the fio snapshot from 22nd Feb (fio-git-2007012513.tar.gz) and used the v3 syslet patches from your web-site.

Do I still need to get something more recent ?

Regards
Suparna

> Ingo

-- 
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
* Suparna Bhattacharya <[EMAIL PROTECTED]> wrote:

> As a really crude (and not very realistic) example of the potential impact of large numbers of outstanding IOs, I tried some quick direct IO comparisons using fio:
>
> [global]
> ioengine=syslet-rw
> buffered=0
> rw=randread
> bs=64k
> size=1024m
> iodepth=64

could you please try those iodepth=2 tests with the latest fio-testing branch of fio as well? Jens wrote a new, smarter syslet plugin for FIO. You'll need the v3 syslet kernel plus:

	git-clone git://git.kernel.dk/data/git/fio.git
	cd fio
	git-checkout syslet-testing

my expectation is that it should behave better with iodepth=2 (although i havent tried that yet).

	Ingo
Re: A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Fri, Feb 23, 2007 at 01:52:47PM +0100, Jens Axboe wrote:
> On Wed, Feb 21 2007, Ingo Molnar wrote:
> > this is the v3 release of the syslet/threadlet subsystem:
> >
> >    http://redhat.com/~mingo/syslet-patches/
>
> [snip]
>
> Ingo, some testing of the experimental syslet queueing stuff, in the syslet-testing branch of fio.
>
> Fio job file:
>
> [global]
> bs=8k
> size=1g
> direct=0
> ioengine=syslet-rw
> iodepth=32
> rw=read
>
> [file]
> filename=/ramfs/testfile
>
> Only changes between runs was changing ioengine and iodepth as indicated in the table below.
>
> Results:
>
> Engine          Depth   Bw (MiB/sec)
> ------------------------------------
> libaio              1       441
> syslet              1       574
> sync                1       589
> libaio             32       613
> syslet             32       681
>
> Results are stable to within +/- 1MiB/sec. So you can see that syslet are still a bit slower than sync for depth 1, but beats the pants off libaio for equal depths. Note that this is buffered IO, I'll be out for the weekend, but I'll hack some direct IO testing up next week to compare "real" queuing.
>
> Just a quick microbenchmark to gauge current overhead...

This is just ramfs, to gauge pure overheads, is that correct ?

BTW, I'm not surprised at Ingo's initial results of syslet vs libaio overheads, for aio-stress/fio type streaming io runs, because these cases do not involve large numbers of outstanding ios. So the overhead of thread creation with syslets is amortized across the entire run of io submissions because of the reuse of already created async threads. While in the libaio case there is a setup and teardown of kiocbs per request.

What I have been concerned about instead in the past when considering thread based AIO implementations is the resource (memory) consumption impact on overall system performance and adaptability to varying loads. It is nice that we can avoid that for the cached cases, but for the general blocking cases, it is still not clear to me whether we have addressed this well enough yet. I used to think that even the kiocb was too heavyweight for its purpose ... especially in terms of scaling to larger loads.

As a really crude (and not very realistic) example of the potential impact of large numbers of outstanding IOs, I tried some quick direct IO comparisons using fio:

[global]
ioengine=syslet-rw
buffered=0
rw=randread
bs=64k
size=1024m
iodepth=64

Engine          Depth   Bw (MiB/sec)
------------------------------------
libaio             64      17.323
syslet             64      17.524
libaio              2      15.226
syslet              2      11.015

Regards
Suparna

-- 
Suparna Bhattacharya ([EMAIL PROTECTED])
Linux Technology Center
IBM Software Lab, India
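One crude way to put a number on the thread-switch and submission overheads discussed here is simply to watch scheduling activity while a run is in flight; nothing fio-specific is needed:

    vmstat 1              # the "cs" column shows context switches/sec, "wa"/"id" the iowait/idle split
    grep ctxt /proc/stat  # total context switches since boot; diff it before and after a run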
A quick fio test (was Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3)
On Wed, Feb 21 2007, Ingo Molnar wrote:
> this is the v3 release of the syslet/threadlet subsystem:
>
>    http://redhat.com/~mingo/syslet-patches/

[snip]

Ingo, some testing of the experimental syslet queueing stuff, in the syslet-testing branch of fio.

Fio job file:

[global]
bs=8k
size=1g
direct=0
ioengine=syslet-rw
iodepth=32
rw=read

[file]
filename=/ramfs/testfile

Only changes between runs was changing ioengine and iodepth as indicated in the table below.

Results:

Engine          Depth   Bw (MiB/sec)
------------------------------------
libaio              1       441
syslet              1       574
sync                1       589
libaio             32       613
syslet             32       681

Results are stable to within +/- 1MiB/sec. So you can see that syslet are still a bit slower than sync for depth 1, but beats the pants off libaio for equal depths. Note that this is buffered IO, I'll be out for the weekend, but I'll hack some direct IO testing up next week to compare "real" queuing.

Just a quick microbenchmark to gauge current overhead...

-- 
Jens Axboe
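Running a job file like the one above is just a matter of saving it and pointing fio at it (the file name here is arbitrary):

    fio job.fio

with the ioengine= and iodepth= lines edited between runs, as Jens describes.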