Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-10 Thread Andrew Morton
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote: > > Let me work on the readv/writev support (unless someone beat me to it). Please also move it to the address_space_operations level. Yes, there are performance benefits from simply omitting the LFS checks, the mmap consistency fixes, etc. But they'r

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-10 Thread Chen, Kenneth W
Andrew Morton wrote on Thursday, March 10, 2005 12:31 PM > > > Fine-grained alignment is probably too hard, and it should fall back to > > > __blockdev_direct_IO(). > > > > > > Does it do the right thing with a request which is non-page-aligned, but > > > 512-byte aligned? > > > > > > readv
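
The alignment question quoted above (a request that is 512-byte aligned but not page-aligned) can be illustrated with a small user-space sketch. It is not taken from the patch under discussion, which the truncated snippet does not show. The names `dio_align`, `is_aligned` and `classify_request` are hypothetical, and the page size is passed in as an assumption. The sketch only classifies a request; whether a given class would go to a simplified path or fall back to __blockdev_direct_IO() is the kernel's decision.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification, for illustration only. */
enum dio_align {
    DIO_UNALIGNED,      /* not even 512-byte aligned */
    DIO_SECTOR_ALIGNED, /* 512-byte aligned, but not page-aligned */
    DIO_PAGE_ALIGNED    /* buffer, offset and length all page-aligned */
};

/* Return nonzero if value is a multiple of alignment (a power of two). */
static int is_aligned(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value & (alignment - 1)) == 0;
}

/*
 * Classify a direct I/O request by the coarsest alignment that the
 * user buffer address, the file offset and the length all satisfy.
 * page_size is assumed to be a power of two and at least 512.
 */
static enum dio_align classify_request(uintptr_t buf, uint64_t offset,
                                       size_t len, size_t page_size)
{
    const uint64_t sector = 512;

    if (is_aligned(buf, page_size) && is_aligned(offset, page_size) &&
        is_aligned(len, page_size))
        return DIO_PAGE_ALIGNED;

    if (is_aligned(buf, sector) && is_aligned(offset, sector) &&
        is_aligned(len, sector))
        return DIO_SECTOR_ALIGNED;

    return DIO_UNALIGNED;
}
```

With a 4096-byte page, a buffer at 0x10200 with offset 512 and length 1024 is classified as DIO_SECTOR_ALIGNED, which is the case the question asks about.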

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-10 Thread Andrew Morton
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote: > > Losing 6% just from Linux kernel is a huge deal for this type of benchmark. > People work for days to implement features which might give sub percentage > gain. Making Software run faster is not easy, but making software run slower > apparently i

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-10 Thread Chen, Kenneth W
Andrew Morton wrote on Wednesday, March 09, 2005 8:10 PM > > 2.6.9 kernel is 6% slower compare to distributor's 2.4 kernel (RHEL3). > > Roughly > > 2% came from storage driver (I'm not allowed to say anything beyond that, > > there > > is a fix though). > > The codepaths are indeed longer in 2.6

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andrew Morton
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote: > > > Did you generate a kernel profile? > > Top 40 kernel hot functions, percentage is normalized to kernel utilization. > > _spin_unlock_irqrestore 23.54% > _spin_unlock_irq 19.27% Cripes. Is that with CONFIG_PRE

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Jesse Barnes
On Wednesday, March 9, 2005 3:23 pm, Andi Kleen wrote: > "Chen, Kenneth W" <[EMAIL PROTECTED]> writes: > > Just to clarify here, these data need to be taken at grain of salt. A > > high count in _spin_unlock_* functions do not automatically points to > > lock contention. It's one of the blind spot

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andrew Vasquez
On Wed, 09 Mar 2005, Chen, Kenneth W wrote: > Andrew Morton wrote Wednesday, March 09, 2005 6:26 PM > > What does "1/3 of the total benchmark performance regression" mean? One > > third of 0.1% isn't very impressive. You haven't told us anything at all > > about the magnitude of this regression.

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread David Lang
On Wed, 9 Mar 2005, Chen, Kenneth W wrote: Also, I'm rather peeved that we're hearing about this regression now rather than two years ago. And mystified as to why yours is the only group which has reported it. 2.6.X kernel has never been faster than the 2.4 kernel (RHEL3). At one point of time,

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andrew Morton
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote: > > Andrew Morton wrote Wednesday, March 09, 2005 6:26 PM > > What does "1/3 of the total benchmark performance regression" mean? One > > third of 0.1% isn't very impressive. You haven't told us anything at all > > about the magnitude of this regressio

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andrew Morton
David Lang <[EMAIL PROTECTED]> wrote: > > (I've seen a 50% > performance hit on 2.4 with just a thousand or two threads compared to > 2.6) Was that 2.4 kernel a vendor kernel with the O(1) scheduler?

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Chen, Kenneth W
Andrew Morton wrote Wednesday, March 09, 2005 6:26 PM > What does "1/3 of the total benchmark performance regression" mean? One > third of 0.1% isn't very impressive. You haven't told us anything at all > about the magnitude of this regression. 2.6.9 kernel is 6% slower compare to distributor's

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andrew Morton
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote: > > This is all real: real benchmark running on real hardware, with real > result showing large performance regression. Nothing synthetic here. > Ken, could you *please* be more complete, more organized and more specific? What does "1/3 of the total

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andrew Morton
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote: > > Andrew Morton wrote on Wednesday, March 09, 2005 2:45 PM > > > > > > > Did you generate a kernel profile? > > > > > > Top 40 kernel hot functions, percentage is normalized to kernel > utilization. > > > > > > _spin_unlock_irqrestore

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Chen, Kenneth W
Chen, Kenneth W wrote on Wednesday, March 09, 2005 5:45 PM > Andrew Morton wrote on Wednesday, March 09, 2005 5:34 PM > > What are these percentages? Total CPU time? The direct-io stuff doesn't > > look too bad. It's surprising that tweaking the direct-io submission code > > makes much differenc

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Chen, Kenneth W
Andrew Morton wrote on Wednesday, March 09, 2005 2:45 PM > > > > > Did you generate a kernel profile? > > > > Top 40 kernel hot functions, percentage is normalized to kernel > > utilization. > > > > _spin_unlock_irqrestore 23.54% > > _spin_unlock_irq 19.27% > > Crip

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Chen, Kenneth W
Andrew Morton wrote on Wednesday, March 09, 2005 5:34 PM > What are these percentages? Total CPU time? The direct-io stuff doesn't > look too bad. It's surprising that tweaking the direct-io submission code > makes much difference. Percentage is relative to total kernel time. There are three D

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Chen, Kenneth W
For people who is dying to see some q-tool profile, here is one. It's not a vanilla 2.6.9 kernel, but with patches in raw device to get around the DIO performance problem. - Ken Flat profile of CPU_CYCLES in hist#0: Each histogram sample counts as 255.337u seconds % time self cumul

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Chen, Kenneth W
Jesse Barnes wrote on Wednesday, March 09, 2005 3:53 PM > > "Chen, Kenneth W" <[EMAIL PROTECTED]> writes: > > > Just to clarify here, these data need to be taken at grain of salt. A > > > high count in _spin_unlock_* functions do not automatically points to > > > lock contention. It's one of the b

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Chen, Kenneth W
Andi Kleen wrote on Wednesday, March 09, 2005 3:23 PM > > Just to clarify here, these data need to be taken at grain of salt. A > > high count in _spin_unlock_* functions do not automatically points to > > lock contention. It's one of the blind spot syndrome with timer based > > profile on ia64.

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andi Kleen
"Chen, Kenneth W" <[EMAIL PROTECTED]> writes: > > Just to clarify here, these data need to be taken at grain of salt. A > high count in _spin_unlock_* functions do not automatically points to > lock contention. It's one of the blind spot syndrome with timer based > profile on ia64. There are some

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Chen, Kenneth W
Chen, Kenneth W wrote on Wednesday, March 09, 2005 1:59 PM > > Did you generate a kernel profile? > > Top 40 kernel hot functions, percentage is normalized to kernel utilization. > > _spin_unlock_irqrestore 23.54% > _spin_unlock_irq 19.27% > > > Profile with

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Chen, Kenneth W
Andrew Morton wrote on Wednesday, March 09, 2005 12:05 PM > "Chen, Kenneth W" <[EMAIL PROTECTED]> wrote: > > Let me answer the questions in reverse order. We started with running > > industry standard transaction processing database benchmark on 2.6 kernel, > > on real hardware (4P smp, 64 GB memo

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Andrew Morton
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote: > > Andrew Morton wrote on Tuesday, March 08, 2005 10:28 PM > > But before doing anything else, please bench this on real hardware, > > see if it is worth pursuing. > > Let me answer the questions in reverse order. We started with running > industry st

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-09 Thread Chen, Kenneth W
Andrew Morton wrote on Tuesday, March 08, 2005 10:28 PM > But before doing anything else, please bench this on real hardware, > see if it is worth pursuing. Let me answer the questions in reverse order. We started with running industry standard transaction processing database benchmark on 2.6 ker

Re: Direct io on block device has performance regression on 2.6.x kernel

2005-03-08 Thread Andrew Morton
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote: > > Direct I/O on block device running 2.6.X kernel is a lot SLOWER > than running on a 2.4 Kernel! > A little bit slower, it appears. It used to be faster. > ... > > synchronous I/O AIO >

RE: Direct io on block device has performance regression on 2.6.x kernel - fix sync I/O path

2005-03-08 Thread Chen, Kenneth W
Christoph Hellwig wrote on Tuesday, March 08, 2005 6:20 PM > this is not the blockdevice, but the obsolete raw device driver. Please > benchmark and if nessecary fix the blockdevice O_DIRECT codepath insted > as the raw driver is slowly going away. From performance perspective, can raw device be

Re: Direct io on block device has performance regression on 2.6.x kernel - fix sync I/O path

2005-03-08 Thread Christoph Hellwig
> --- linux-2.6.9/drivers/char/raw.c 2004-10-18 14:54:37.0 -0700 > +++ linux-2.6.9.ken/drivers/char/raw.c 2005-03-08 17:22:07.0 -0800 this is not the blockdevice, but the obsolete raw device driver. Please benchmark and if nessecary fix the blockdevice O_DIRECT codepa

RE: Direct io on block device has performance regression on 2.6.x kernel

2005-03-08 Thread Chen, Kenneth W
OK, last one in the series: user level test programs that stress the kernel I/O stack. Pretty dull stuff. - Ken diff -Nur zero/aio_null.c blknull_test/aio_null.c --- zero/aio_null.c 1969-12-31 16:00:00.0 -0800 +++ blknull_test/aio_null.c 2005-03-08 00:46:17.0 -0800 @@ -