Andrew Morton wrote on Wednesday, March 09, 2005 12:05 PM
> "Chen, Kenneth W" <[EMAIL PROTECTED]> wrote:
> > Let me answer the questions in reverse order.  We started with running
> > industry standard transaction processing database benchmark on 2.6 kernel,
> > on real hardware (4P smp, 64 GB memory, 450 disks) running industry
> > standard db application.  What we measured is that with best tuning done
> > to the system, 2.6 kernel has a huge performance regression relative to
> > its predecessor 2.4 kernel (a kernel from RHEL3, 2.4.21 based).
>
> That's news to me.  I thought we were doing OK with big database stuff.
> Surely lots of people have been testing such things.

There are different levels of "big" stuff.  We used to work on a 32-way NUMA
box, but other show-stopper issues popped up before we got to the I/O stack.
The good thing that came out of that work is the removal of the global unplug
lock.


> > And yes, it is all worth pursuing, the two patches on raw device recuperate
> > 1/3 of the total benchmark performance regression.
>
> On a real disk driver?  hm, I'm wrong then.
>

Yes, on a real disk driver (QLogic Fibre Channel) and with real 15K rpm disks.


> Did you generate a kernel profile?

Top 40 kernel hot functions; percentages are normalized to kernel utilization.

_spin_unlock_irqrestore         23.54%
_spin_unlock_irq                19.27%
__blockdev_direct_IO             3.57%
follow_hugetlb_page              1.84%
e1000_clean                      1.38%
kmem_cache_alloc                 1.31%
put_page                         1.29%
__generic_file_aio_read          1.18%
e1000_intr                       1.07%
schedule                         1.01%
dio_bio_complete                 0.97%
mempool_alloc                    0.96%
kmem_cache_free                  0.90%
__end_that_request_first         0.88%
__copy_user                      0.82%
kfree                            0.77%
generic_make_request             0.73%
_spin_lock                       0.73%
kref_put                         0.73%
vfs_read                         0.68%
update_atime                     0.68%
scsi_dispatch_cmd                0.67%
fget_light                       0.66%
put_io_context                   0.60%
_spin_lock_irqsave               0.58%
scsi_finish_command              0.58%
generic_file_aio_write_nolock    0.57%
inode_times_differ               0.55%
break_fault                      0.53%
__do_softirq                     0.48%
aio_read_evt                     0.48%
try_atomic_semop                 0.44%
sys_pread64                      0.43%
__bio_add_page                   0.43%
__mod_timer                      0.42%
bio_alloc                        0.41%
scsi_decide_disposition          0.40%
e1000_clean_rx_irq               0.39%
find_vma                         0.38%
dnotify_parent                   0.38%
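
For anyone who wants to tally how much of this profile is spent in the
spin lock primitives, here is a minimal sketch in Python.  It is only an
illustration: the sample rows are copied from the table above, and the
helper names (parse_profile, spinlock_share) and the rule of grouping by
the "_spin_" prefix are assumptions made for this example, not part of any
profiling tool.

```python
# Sample rows copied from the profile above: function name, % of kernel time.
PROFILE = """\
_spin_unlock_irqrestore 23.54%
_spin_unlock_irq 19.27%
__blockdev_direct_IO 3.57%
_spin_lock 0.73%
_spin_lock_irqsave 0.58%
"""


def parse_profile(text):
    """Return {function: percent} from lines of the form 'name  NN.NN%'."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank or malformed lines rather than guessing at their meaning.
        if len(parts) != 2 or not parts[1].endswith("%"):
            continue
        rows[parts[0]] = float(parts[1].rstrip("%"))
    return rows


def spinlock_share(rows):
    """Sum the entries whose name starts with '_spin_' (hypothetical grouping)."""
    total = sum(pct for name, pct in rows.items() if name.startswith("_spin_"))
    return round(total, 2)


if __name__ == "__main__":
    print(spinlock_share(parse_profile(PROFILE)))
```

With the four spin lock rows from the table, this comes to roughly 44% of
kernel time, nearly all of it in the two unlock paths.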


Profile with spin locks inlined, so that it is easier to see which functions
have the lock contention; again, top 40 hot functions:

scsi_request_fn                 7.54%
finish_task_switch              6.25%
__blockdev_direct_IO            4.97%
__make_request                  3.87%
scsi_end_request                3.54%
dio_bio_end_io                  2.70%
follow_hugetlb_page             2.39%
__wake_up                       2.37%
aio_complete                    1.82%
kmem_cache_alloc                1.68%
__mod_timer                     1.63%
e1000_clean                     1.57%
__generic_file_aio_read         1.42%
mempool_alloc                   1.37%
put_page                        1.35%
e1000_intr                      1.31%
schedule                        1.25%
dio_bio_complete                1.20%
scsi_device_unbusy              1.07%
kmem_cache_free                 1.06%
__copy_user                     1.04%
scsi_dispatch_cmd               1.04%
__end_that_request_first        1.04%
generic_make_request            1.02%
kfree                           0.94%
__aio_get_req                   0.93%
sys_pread64                     0.83%
get_request                     0.79%
put_io_context                  0.76%
dnotify_parent                  0.73%
vfs_read                        0.73%
update_atime                    0.73%
finished_one_bio                0.63%
generic_file_aio_write_nolock   0.63%
scsi_put_command                0.62%
break_fault                     0.62%
e1000_xmit_frame                0.62%
aio_read_evt                    0.59%
scsi_io_completion              0.59%
inode_times_differ              0.58%


