Re: Drive performance bottleneck

2005-02-05 Thread Nuno Silva
Andrew Morton wrote:
Lincoln Dale <[EMAIL PROTECTED]> wrote:
sg_dd uses a window into a kernel DMA window.  as such, two of the four 
memory accesses are cut out (1. DMA from HBA to RAM, 2. userspace 
accessing data).
1.6GB/s / 2 = 800MB/s -- or roughly what Ian was seeing with sg_dd.

Right.  That's a fancy way of saying "cheating" ;)
But from the oprofile output it appears to me that there is plenty of CPU
capacity left over.  Maybe I'm misreading it due to oprofile adding in the
SMP factor (25% CPU on a 4-way means we've exhausted CPU capacity).
sg_dd is lying or /dev/sg* is broken. Try that sg_dd test on any 
single drive and you'll get 20 times the performance the drive can 
actually achieve:

puma:/tmp/dd# time sg_dd if=/dev/sg1 of=/dev/null bs=64k count=1000000 
time=1
Reducing read to 64 blocks per loop
time to transfer data was 69.784784 secs, 939.12 MB/sec
1000000+0 records in
1000000+0 records out

This is a single sata drive. I'm lucky, am I not?  ;-)
Regards,
Nuno Silva


Re: Drive performance bottleneck

2005-02-04 Thread Andrew Morton
Lincoln Dale <[EMAIL PROTECTED]> wrote:
>
> sg_dd uses a window into a kernel DMA window.  as such, two of the four 
> memory accesses are cut out (1. DMA from HBA to RAM, 2. userspace 
> accessing data).
> 1.6GB/s / 2 = 800MB/s -- or roughly what Ian was seeing with sg_dd.

Right.  That's a fancy way of saying "cheating" ;)

But from the oprofile output it appears to me that there is plenty of CPU
capacity left over.  Maybe I'm misreading it due to oprofile adding in the
SMP factor (25% CPU on a 4-way means we've exhausted CPU capacity).

> DIRECT_IO should achieve similar numbers to sg_dd, but perhaps not quite as 
> efficiently.

Probably so.

There are various tools in
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz which are
more useful than dd, btw.  `odread' and `odwrite' are useful for this sort
of thing.




Re: Drive performance bottleneck

2005-02-04 Thread Lincoln Dale
At 08:32 PM 4/02/2005, Andrew Morton wrote:
Something funny is happening here - it looks like there's plenty of CPU
capacity left over.
[..]
Could you monitor the CPU load during the various tests?  If the `dd'
workload isn't pegging the CPU then it could be that there's something
wrong with the I/O submission patterns.
as an educated guess, i'd say that the workload is running out of memory 
bandwidth ..

let's say the RAM is single-channel DDR400.  peak bandwidth = 3.2GB/s (400 x 
10^6 transfers/sec x 64 bits / 8 bits per byte).  it's fair to say that peak 
bandwidth is a pretty rare thing to achieve with SDRAM given real-world access 
patterns -- let's take a conservative "it'll be 50% efficient" -- so DDR400 
realistic peak = 1.6GB/s.

as far as memory-accesses go, a standard user-space read() from disk 
results in 4 memory-accesses (1. DMA from HBA to RAM, 2. read in 
copy_to_user(), 3. write in copy_to_user(), 4. userspace accessing that data).
1.6GB/s / 4 = 400MB/s -- or roughly what Ian was seeing.

sg_dd uses a window into a kernel DMA window.  as such, two of the four 
memory accesses are cut out (1. DMA from HBA to RAM, 2. userspace 
accessing data).
1.6GB/s / 2 = 800MB/s -- or roughly what Ian was seeing with sg_dd.

DIRECT_IO should achieve similar numbers to sg_dd, but perhaps not quite as 
efficiently.

cheers,
lincoln.
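
To put numbers on that estimate, here is a quick back-of-the-envelope sketch
in C (the DDR400 figures and the 50% efficiency factor are Lincoln's
assumptions above, not measurements):

/* bwbudget.c -- memory-bandwidth budget: buffered read() vs sg/O_DIRECT */
#include <stdio.h>

int main(void)
{
        double peak   = 400e6 * 8;      /* DDR400: 400 MT/s x 8 bytes = 3.2 GB/s */
        double usable = peak * 0.50;    /* assume ~50% efficiency -> 1.6 GB/s    */

        /* buffered read(): DMA to RAM, read + write in copy_to_user(),
         * then userspace touching the data = 4 passes over every byte */
        printf("buffered read() : %.0f MB/s\n", usable / 4 / 1e6);

        /* sg_dd / O_DIRECT: DMA to RAM + userspace access = 2 passes */
        printf("sg_dd / O_DIRECT: %.0f MB/s\n", usable / 2 / 1e6);
        return 0;
}

This prints 400 and 800 MB/s, matching the two figures Ian reported.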


Re: Drive performance bottleneck

2005-02-04 Thread Andy Isaacson
On Thu, Feb 03, 2005 at 07:03:48PM +, Paulo Marques wrote:
> FYI there was a patch running around last April that made a new option 
> for "dd" to make it use O_DIRECT. You can get it here:
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=108135935629589&w=2
> 
> Unfortunately this hasn't made it into coreutils.

Follow down the thread and you'll see that it was merged to coreutils
CVS (message-id <[EMAIL PROTECTED]>), but there apparently
hasn't been a coreutils release since then.  Nor has the patch been
added to the debian package.

-andy


Re: Drive performance bottleneck

2005-02-04 Thread Andrew Morton
Ian Godin <[EMAIL PROTECTED]> wrote:
>
> 
>I am trying to get very fast disk drive performance and I am seeing 
> some interesting bottlenecks.  We are trying to get 800 MB/sec or more 
> (yes, that is megabytes per second).  We are currently using 
> PCI-Express with a 16 drive raid card (SATA drives).  We have achieved 
> that speed, but only through the SG (SCSI generic) driver.  This is 
> running the stock 2.6.10 kernel.  And the device is not mounted as a 
> file system.  I also set the read ahead size on the device to 16KB 
> (which speeds things up a lot):
> ...
> samples  %        symbol name
> 848185    8.3510  __copy_to_user_ll
> 772172    7.6026  do_anonymous_page
> 701579    6.9076  _spin_lock_irq
> 579024    5.7009  __copy_user_intel
> 361634    3.5606  _spin_lock
> 343018    3.3773  _spin_lock_irqsave
> 307462    3.0272  kmap_atomic
> 193327    1.9035  page_fault

Something funny is happening here - it looks like there's plenty of CPU
capacity left over.

It's odd that you're getting a lot of pagefaults in this test but not with
the sg_dd test, too.  I wonder why dd is getting so many pagefaults?  (I
recall that sg_dd did something cheaty, but I forget what it was).

Could you monitor the CPU load during the various tests?  If the `dd'
workload isn't pegging the CPU then it could be that there's something
wrong with the I/O submission patterns.
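
One way to watch CPU load alongside the dd runs is a small /proc/stat
sampler like the sketch below (vmstat or top give the same information;
this assumes the 2.6-style aggregate "cpu" line with at least seven fields):

/* cpuload.c -- print aggregate CPU busy% once a second (rough sketch) */
#include <stdio.h>
#include <unistd.h>

static int read_cpu(unsigned long long *busy, unsigned long long *total)
{
        unsigned long long u, n, s, i, w, irq, sirq;
        FILE *f = fopen("/proc/stat", "r");
        int ok = 0;

        if (f) {
                if (fscanf(f, "cpu %llu %llu %llu %llu %llu %llu %llu",
                           &u, &n, &s, &i, &w, &irq, &sirq) == 7) {
                        *busy  = u + n + s + irq + sirq;    /* iowait counts as idle */
                        *total = u + n + s + i + w + irq + sirq;
                        ok = 1;
                }
                fclose(f);
        }
        return ok;
}

int main(void)
{
        unsigned long long b0, t0, b1, t1;

        if (!read_cpu(&b0, &t0))
                return 1;
        for (;;) {
                sleep(1);
                if (!read_cpu(&b1, &t1))
                        return 1;
                if (t1 > t0)
                        printf("cpu busy: %3.0f%%\n",
                               100.0 * (b1 - b0) / (t1 - t0));
                b0 = b1;
                t0 = t1;
        }
}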


Re: Drive performance bottleneck

2005-02-03 Thread Paulo Marques
Ian Godin wrote:
[...]
  Definitely have been able to repeat that here, so the SG driver 
definitely appears to be broken.  At least I'm glad I am not going 
insane, I was starting to wonder :)

  I'll run some more tests with O_DIRECT and such things, see if I can 
figure out what the REAL max speed is.
FYI there was a patch running around last April that made a new option 
for "dd" to make it use O_DIRECT. You can get it here:

http://marc.theaimsgroup.com/?l=linux-kernel&m=108135935629589&w=2
Unfortunately this hasn't made it into coreutils. IIRC there were issues 
about dd being multi-platform and the way O_DIRECT was done in other 
systems.

Anyway, you can patch dd yourself and have a tool for debugging with 
O_DIRECT. I hope this helps,

--
Paulo Marques - www.grupopie.com
All that is necessary for the triumph of evil is that good men do nothing.
Edmund Burke (1729 - 1797)


Re: Drive performance bottleneck

2005-02-03 Thread Nuno Silva
Ian Godin wrote:
  I am trying to get very fast disk drive performance and I am seeing 
some interesting bottlenecks.  We are trying to get 800 MB/sec or more 
(yes, that is megabytes per second).  We are currently using PCI-Express 
with a 16 drive raid card (SATA drives).  We have achieved that speed, 
but only through the SG (SCSI generic) driver.  This is running the 
stock 2.6.10 kernel.  And the device is not mounted as a file system.  I 
also set the read ahead size on the device to 16KB (which speeds things 
up a lot):
I was trying to reproduce but got distracted by this:
(use page down, if you just want to see the odd result)
puma:/tmp/dd# sg_map
/dev/sg0  /dev/sda
/dev/sg1  /dev/sdb
/dev/sg2  /dev/scd0
/dev/sg3  /dev/sdc
puma:/tmp/dd# time sg_dd if=/dev/sg1 of=/tmp/dd/sg1 bs=64k count=1000
Reducing read to 64 blocks per loop
1000+0 records in
1000+0 records out
real    0m0.187s
user    0m0.001s
sys     0m0.141s
puma:/tmp/dd# time dd if=/dev/sdb of=/tmp/dd/sdb bs=64k count=1000
1000+0 records in
1000+0 records out
65536000 bytes transferred in 1.203468 seconds (54455956 bytes/sec)
real    0m1.219s
user    0m0.001s
sys     0m0.138s
puma:/tmp/dd# ls -l
total 128000
-rw-r--r--  1 root root 65536000 Feb  3 17:16 sdb
-rw-r--r--  1 root root 65536000 Feb  3 17:16 sg1
puma:/tmp/dd# md5sum *
ec31224970ddd3fb74501c8e68327e7b  sdb
60d4689227d60e6122f1ffe0ec1b2ad7  sg1
^
See? dd from sdb is not the same as sg1! Is this supposed to happen?
About the 900MB/sec:
This same sg1 (= sdb, which is a single hitachi sata hdd) performs like 
this:

puma:/tmp/dd# time sg_dd if=/dev/sg1 of=/dev/null bs=64k count=1000000 
time=1
Reducing read to 64 blocks per loop
time to transfer data was 69.784784 secs, 939.12 MB/sec
1000000+0 records in
1000000+0 records out

real    1m9.787s
user    0m0.063s
sys     0m58.115s
I can assure you that this drive can't do more than 60MB/sec sustained.
My only conclusion is that sg (or sg_dd) is broken? ;)
Peace,
Nuno Silva


Re: Drive performance bottleneck

2005-02-03 Thread Ian Godin
On Feb 3, 2005, at 9:40 AM, Nuno Silva wrote:
Ian Godin wrote:
  I am trying to get very fast disk drive performance and I am seeing 
some interesting bottlenecks.  We are trying to get 800 MB/sec or 
more (yes, that is megabytes per second).  We are currently using 
PCI-Express with a 16 drive raid card (SATA drives).  We have 
achieved that speed, but only through the SG (SCSI generic) driver.  
This is running the stock 2.6.10 kernel.  And the device is not 
mounted as a file system.  I also set the read ahead size on the 
device to 16KB (which speeds things up a lot):
I was trying to reproduce but got distracted by this:
(use page down, if you just want to see the odd result)
puma:/tmp/dd# sg_map
/dev/sg0  /dev/sda
/dev/sg1  /dev/sdb
/dev/sg2  /dev/scd0
/dev/sg3  /dev/sdc
puma:/tmp/dd# time sg_dd if=/dev/sg1 of=/tmp/dd/sg1 bs=64k count=1000
Reducing read to 64 blocks per loop
1000+0 records in
1000+0 records out
real    0m0.187s
user    0m0.001s
sys     0m0.141s
puma:/tmp/dd# time dd if=/dev/sdb of=/tmp/dd/sdb bs=64k count=1000
1000+0 records in
1000+0 records out
65536000 bytes transferred in 1.203468 seconds (54455956 bytes/sec)
real    0m1.219s
user    0m0.001s
sys     0m0.138s
puma:/tmp/dd# ls -l
total 128000
-rw-r--r--  1 root root 65536000 Feb  3 17:16 sdb
-rw-r--r--  1 root root 65536000 Feb  3 17:16 sg1
puma:/tmp/dd# md5sum *
ec31224970ddd3fb74501c8e68327e7b  sdb
60d4689227d60e6122f1ffe0ec1b2ad7  sg1
^
See? dd from sdb is not the same as sg1! Is this supposed to happen?
About the 900MB/sec:
This same sg1 (= sdb, which is a single hitachi sata hdd) performs 
like this:

puma:/tmp/dd# time sg_dd if=/dev/sg1 of=/dev/null bs=64k count=1000000 
time=1
Reducing read to 64 blocks per loop
time to transfer data was 69.784784 secs, 939.12 MB/sec
1000000+0 records in
1000000+0 records out

real    1m9.787s
user    0m0.063s
sys     0m58.115s
I can assure you that this drive can't do more than 60MB/sec sustained.
My only conclusion is that sg (or sg_dd) is broken? ;)
Peace,
Nuno Silva
  Definitely have been able to repeat that here, so the SG driver 
definitely appears to be broken.  At least I'm glad I am not going 
insane, I was starting to wonder :)

  I'll run some more tests with O_DIRECT and such things, see if I can 
figure out what the REAL max speed is.

 Thanks for the help everyone,
   Ian.


Re: Drive performance bottleneck

2005-02-03 Thread Ian Godin
On Feb 2, 2005, at 7:56 PM, Bernd Eckenfels wrote:
In article <[EMAIL PROTECTED]> you 
wrote:
 Below is an oprofile (truncated) of (the same) dd running on 
/dev/sdb.
do  you also have the oprofile of the sg_dd handy?
Greetings
Bernd
  Just ran it on the sg_dd (using /dev/sg1):
CPU: P4 / Xeon, speed 3402.13 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not 
stopped) with a unit mask of 0x01 (mandatory) count 100000

samples  %        symbol name
2145136  89.7474  __copy_to_user_ll
64720 2.7077  lock_kernel
17883 0.7482  mark_offset_tsc
16212 0.6783  page_address
8091  0.3385  schedule
5380  0.2251  timer_interrupt
5314  0.2223  _spin_lock
5311  0.2222  mwait_idle
4034  0.1688  sysenter_past_esp
3569  0.1493  do_anonymous_page
3530  0.1477  apic_timer_interrupt
3368  0.1409  _spin_lock_irqsave
3150  0.1318  sg_read_xfer
3043  0.1273  kmem_cache_alloc
2590  0.1084  find_busiest_group
2553  0.1068  scheduler_tick
2319  0.0970  __copy_from_user_ll
2299  0.0962  sg_ioctl
2131  0.0892  irq_entries_start
1905  0.0797  sys_ioctl
1773  0.0742  copy_page_range
1678  0.0702  fget
1648  0.0689  __switch_to
1648  0.0689  scsi_block_when_processing_errors
1632  0.0683  sg_start_req
1569  0.0656  increment_tail
1511  0.0632  try_to_wake_up
1506  0.0630  update_one_process
1454  0.0608  fput
1397  0.0584  do_gettimeofday
1396  0.0584  zap_pte_range
1371  0.0574  recalc_task_prio
1357  0.0568  sched_clock
1352  0.0566  sg_link_reserve
1282  0.0536  __elv_add_request
1229  0.0514  kmap_atomic
1195  0.0500  page_fault

  Oddly enough it appears to also be copying to user space...
 Ian.


Re: Drive performance bottleneck

2005-02-03 Thread Jens Axboe
On Wed, Feb 02 2005, Ian Godin wrote:
> 
>   I am trying to get very fast disk drive performance and I am seeing 
> some interesting bottlenecks.  We are trying to get 800 MB/sec or more 
> (yes, that is megabytes per second).  We are currently using 
> PCI-Express with a 16 drive raid card (SATA drives).  We have achieved 
> that speed, but only through the SG (SCSI generic) driver.  This is 
> running the stock 2.6.10 kernel.  And the device is not mounted as a 
> file system.  I also set the read ahead size on the device to 16KB 
> (which speeds things up a lot):
> 
> blockdev --setra 16834 /dev/sdb
> 
> So here are the results:
> 
> $ time dd if=/dev/sdb of=/dev/null bs=64k count=1000000
> 1000000+0 records in
> 1000000+0 records out
> 0.27user 86.19system 2:40.68elapsed 53%CPU (0avgtext+0avgdata 
> 0maxresident)k
> 0inputs+0outputs (0major+177minor)pagefaults 0swaps
> 
> 64k * 1000000 / 160.68 = 398.3 MB/sec
> 
> Using sg_dd just to make sure it works the same:
> 
> $ time sg_dd if=/dev/sdb of=/dev/null bs=64k count=1000000
> 1000000+0 records in
> 1000000+0 records out
> 0.05user 144.27system 2:41.55elapsed 89%CPU (0avgtext+0avgdata 
> 0maxresident)k
> 0inputs+0outputs (17major+5375minor)pagefaults 0swaps
> 
>   Pretty much the same speed.  Now using the SG device (sg1 is tied to 
> sdb):
> 
> $ time sg_dd if=/dev/sg1 of=/dev/null bs=64k count=1000000
> Reducing read to 16 blocks per loop
> 1000000+0 records in
> 1000000+0 records out
> 0.22user 66.21system 1:10.23elapsed 94%CPU (0avgtext+0avgdata 
> 0maxresident)k
> 0inputs+0outputs (0major+2327minor)pagefaults 0swaps
> 
> 64k * 1000000 / 70.23 = 911.3 MB/sec
> 
>   Now that's more like the speeds we expected.  I understand that the 
> SG device uses direct I/O and/or mmap memory from the kernel.  What I 
> cannot believe is that there is that much overhead in going through the 
> page buffer/cache system in Linux.

It's not going through the page cache that is the problem, it's the
copying to user space. Have you tried using O_DIRECT? What kind of
speeds are you getting with that?

-- 
Jens Axboe
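
For reference, a minimal O_DIRECT read loop looks something like the sketch
below (this is not the ext3-tools odread mentioned elsewhere in the thread;
O_DIRECT needs the buffer and transfer size suitably aligned, so 4096-byte
alignment and 1 MB reads are used here):

/* odread_sketch.c -- read a block device with O_DIRECT and count the bytes */
#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK (1024 * 1024)     /* 1 MB per read, a multiple of the sector size */

int main(int argc, char **argv)
{
        void *buf;
        ssize_t n;
        long long total = 0;
        int fd;

        if (argc != 2) {
                fprintf(stderr, "usage: %s <device>\n", argv[0]);
                return 1;
        }
        fd = open(argv[1], O_RDONLY | O_DIRECT);
        if (fd < 0) {
                perror("open");
                return 1;
        }
        if (posix_memalign(&buf, 4096, CHUNK)) {
                fprintf(stderr, "posix_memalign failed\n");
                return 1;
        }
        while ((n = read(fd, buf, CHUNK)) > 0)
                total += n;
        if (n < 0)
                perror("read");
        printf("read %lld bytes\n", total);
        close(fd);
        free(buf);
        return 0;
}

Timing that against /dev/sdb should show how much of the gap really is the
copy_to_user() overhead rather than the page cache itself.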



Re: Drive performance bottleneck

2005-02-02 Thread Bernd Eckenfels
In article <[EMAIL PROTECTED]> you wrote:
>  Below is an oprofile (truncated) of (the same) dd running on /dev/sdb.

do  you also have the oprofile of the sg_dd handy?

Greetings
Bernd


Drive performance bottleneck

2005-02-02 Thread Ian Godin
  I am trying to get very fast disk drive performance and I am seeing 
some interesting bottlenecks.  We are trying to get 800 MB/sec or more 
(yes, that is megabytes per second).  We are currently using 
PCI-Express with a 16 drive raid card (SATA drives).  We have achieved 
that speed, but only through the SG (SCSI generic) driver.  This is 
running the stock 2.6.10 kernel.  And the device is not mounted as a 
file system.  I also set the read ahead size on the device to 16KB 
(which speeds things up a lot):

blockdev --setra 16834 /dev/sdb
So here are the results:
$ time dd if=/dev/sdb of=/dev/null bs=64k count=1000000
1000000+0 records in
1000000+0 records out
0.27user 86.19system 2:40.68elapsed 53%CPU (0avgtext+0avgdata 
0maxresident)k
0inputs+0outputs (0major+177minor)pagefaults 0swaps

64k * 1000000 / 160.68 = 398.3 MB/sec
Using sg_dd just to make sure it works the same:
$ time sg_dd if=/dev/sdb of=/dev/null bs=64k count=1000000
1000000+0 records in
1000000+0 records out
0.05user 144.27system 2:41.55elapsed 89%CPU (0avgtext+0avgdata 
0maxresident)k
0inputs+0outputs (17major+5375minor)pagefaults 0swaps

  Pretty much the same speed.  Now using the SG device (sg1 is tied to 
sdb):

$ time sg_dd if=/dev/sg1 of=/dev/null bs=64k count=1000000
Reducing read to 16 blocks per loop
1000000+0 records in
1000000+0 records out
0.22user 66.21system 1:10.23elapsed 94%CPU (0avgtext+0avgdata 
0maxresident)k
0inputs+0outputs (0major+2327minor)pagefaults 0swaps

64k * 1000000 / 70.23 = 911.3 MB/sec
  Now that's more like the speeds we expected.  I understand that the 
SG device uses direct I/O and/or mmap memory from the kernel.  What I 
cannot believe is that there is that much overhead in going through the 
page buffer/cache system in Linux.
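
(For comparison, the sg path issues the SCSI READ itself and the data ends
up in the caller's buffer without going through the page cache.  A rough
sketch of one 64 KB READ(10) via the SG_IO ioctl -- sense/status handling
mostly omitted, 512-byte logical blocks assumed:)

/* sgread_sketch.c -- one READ(10) from /dev/sgN into a user buffer */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

int main(int argc, char **argv)
{
        unsigned char cdb[10] = { 0x28 };       /* READ(10), LBA 0 */
        unsigned char sense[32];
        static unsigned char buf[64 * 1024];
        struct sg_io_hdr io;
        int fd, blocks = sizeof(buf) / 512;

        if (argc != 2 || (fd = open(argv[1], O_RDONLY)) < 0) {
                fprintf(stderr, "usage: %s /dev/sgN\n", argv[0]);
                return 1;
        }
        cdb[7] = blocks >> 8;                   /* transfer length (blocks) */
        cdb[8] = blocks & 0xff;

        memset(&io, 0, sizeof(io));
        io.interface_id    = 'S';
        io.dxfer_direction = SG_DXFER_FROM_DEV;
        io.cmd_len         = sizeof(cdb);
        io.cmdp            = cdb;
        io.dxfer_len       = sizeof(buf);
        io.dxferp          = buf;               /* data lands in this buffer */
        io.mx_sb_len       = sizeof(sense);
        io.sbp             = sense;
        io.timeout         = 20000;             /* milliseconds */

        if (ioctl(fd, SG_IO, &io) < 0)
                perror("SG_IO");
        else
                printf("status 0x%x, %d bytes transferred\n",
                       io.status, (int)io.dxfer_len - io.resid);
        close(fd);
        return 0;
}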

  We also tried going through a file system (various ones, JFS, XFS, 
Reiser, Ext3).  They all seem to bottleneck at around 400MB/sec, much 
like /dev/sdb does.  We also have a "real" SCSI raid system which also 
bottlenecks right at 400 MB/sec.  Under Windows (XP) both of these 
systems run at 650 (SCSI) or 800 (SATA) MB/sec.

  Other variations I've tried: setting the read ahead to larger or 
smaller number (1, 2, 4, 8, 16, 32, 64 KB)... 8 or 16 seems to be 
optimal.  Using different block sizes in the dd command (again 1, 2, 4, 
8, 16, 32, 64).  16, 32, 64 are pretty much identical and fastest.

 Below is an oprofile (truncated) of (the same) dd running on /dev/sdb.
  So is the overhead really that high?  Hopefully there's a bottleneck 
in there that no one has come across yet, and it can be optimized.  
Anyone else trying to pull close to 1GB/sec from disk? :)  The kernel 
has changed a lot since the last time I really worked with it (2.2), so 
any suggestions are appreciated.

Ian Godin
Senior Software Developer
DTS/Lowry Digital Images
---
CPU: P4 / Xeon, speed 3402.13 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not 
stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        symbol name
848185    8.3510  __copy_to_user_ll
772172    7.6026  do_anonymous_page
701579    6.9076  _spin_lock_irq
579024    5.7009  __copy_user_intel
361634    3.5606  _spin_lock
343018    3.3773  _spin_lock_irqsave
307462    3.0272  kmap_atomic
193327    1.9035  page_fault
181040    1.7825  schedule
174502    1.7181  radix_tree_delete
158967    1.5652  end_buffer_async_read
124124    1.2221  free_hot_cold_page
119057    1.1722  sysenter_past_esp
117384    1.1557  shrink_list
112762    1.1102  buffered_rmqueue
105490    1.0386  smp_call_function
101568    1.0000  kmem_cache_alloc
97404 0.9590  kmem_cache_free
95826 0.9435  __rmqueue
95443 0.9397  __copy_from_user_ll
93181 0.9174  free_pages_bulk
92732 0.9130  release_pages
86912 0.8557  shrink_cache
85896 0.8457  block_read_full_page
79629 0.7840  free_block
78304 0.7710  mempool_free
72264 0.7115  create_empty_buffers
71303 0.7020  do_syslog
70769 0.6968  emit_log_char
66413 0.6539  mark_offset_tsc
64333 0.6334  vprintk
63468 0.6249  file_read_actor
63292 0.6232  add_to_page_cache
62281 0.6132  unlock_page
61655 0.6070  _spin_unlock_irqrestore
59486 0.5857  find_get_page
58901 0.5799  drop_buffers
58775 0.5787  do_generic_mapping_read
55070 0.5422  __wake_up_bit
48681 0.4793  __end_that_request_first
47121 0.4639  bad_range
47102 0.4638  submit_bh
45009 0.4431  journal_add_journal_head
41270 0.4063  __alloc_pages
41247 0.4061  page_waitqueue
39520 0.3891  generic_file_buffered_write
38520 0.3793  __pagevec_lru_add
38142 0.3755  do_select
38105 0.3752  do_mpage_readpage
37020 0.3645  vsnprintf
36541 0.3598  __clear_page_buffers
35932 0.3538  journal_put_journal_head
35769 0.3522  radix_tree_lookup
35636 0.3509  bio_put
34904 0.3437  jfs_get_blocks
34865 0.3433  mark_page_accessed
33686 0.3317  bio_alloc
33273 0.3276  
