When you say "local box", do you mean the Openfiler box or the Windows box? I have run local tests on the Windows box without issue but haven't tried the Openfiler box. I will try disabling jumbo frames. I have already tried without NAPI and flow control, with no success.
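To keep track of the runs, here is a minimal Python sketch of the "turn everything off, then enable the settings one by one" approach suggested in the quoted message below. The setting names and the `isolation_plan` helper are hypothetical labels for illustration, not Openfiler or driver options, and it assumes the cumulative reading (each run keeps the earlier settings on). It only lists the runs; it does not touch any network configuration.

```python
from typing import Dict, Iterator, List

# Hypothetical labels for the settings mentioned in this thread.
TWEAKS: List[str] = ["jumbo_frames", "flow_control", "napi"]


def isolation_plan(tweaks: List[str]) -> Iterator[Dict[str, bool]]:
    """Yield a baseline run with every tweak off, then re-enable them cumulatively."""
    enabled: List[str] = []
    # Run 0: everything disabled.
    yield {name: False for name in tweaks}
    for name in tweaks:
        enabled.append(name)
        # Each later run turns on one more tweak and keeps the earlier ones on.
        yield {t: (t in enabled) for t in tweaks}


if __name__ == "__main__":
    for run, config in enumerate(isolation_plan(TWEAKS)):
        on = [name for name, state in config.items() if state] or ["none"]
        print(f"run {run}: enabled = {', '.join(on)}")
```

If the soft lockup first shows up at a given run, the setting added in that run (or its interaction with the ones already on) would be the first suspect.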
-----Original Message-----
From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 28 November 2006 10:27 a.m.
To: Dave Watkins
Cc: [email protected]
Subject: Re: [OF-users] iSCSI bug?

Dave Watkins wrote:
> I'll start with a better description and get onto working with the new
> packages
>
> I'm running iometer on the Windows Server 2003 x64 box against an iSCSI
> volume using the MS iSCSI initiator (2.02).
>
> Using iometer and selecting any of the 0% read, 0% random access
> specifications will return expected performance numbers, but stopping
> that test, and changing the access specification to any 100% read, 0%
> random test will generate the errors. Larger block sizes _seem_ to make
> it happen more frequently so I have created a 256k block size test for
> the above access specifications. I have been using 16 and 64 outstanding
> I/O's but anything above zero seems to show the error.
>
> Networking on both ends is via Intel e1000 cards and jumbo frames are
> enabled, and so is flow control, NAPI is also enabled on the Openfiler
> box. All other network settings are default.
>

Try without all the tweaking and then enable them one by one. Also have
you successfully run the benchmarks on the local box without going
through iSCSI?

R.

> Dave
>
> -----Original Message-----
> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, 28 November 2006 1:30 a.m.
> To: Dave Watkins
> Cc: [email protected]
> Subject: Re: [OF-users] iSCSI bug?
>
> Dave Watkins wrote:
>
>> Still the same, both with bonding enabled and disabled unfortunately
>>
> First:
>
> try adding "nosoftlockup" to the grub boot options and then run the
> benchmarks again.
>
> Next:
>
> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-kernel-r78.ccs
> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-r78.ccs
>
> (kernel and userland)
>
> same as before (--replace-files)
>
> Finally:
>
> Also a bit more detail about your test set-up (components, parameters,
> triggers etc) would be great.
>
> R.
>
>> -----Original Message-----
>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>> Sent: Monday, 27 November 2006 2:52 p.m.
>> To: Rafiu Fakunle
>> Cc: Dave Watkins; [email protected]
>> Subject: Re: [OF-users] iSCSI bug?
>>
>> Rafiu Fakunle wrote:
>>
>>> Dave Watkins wrote:
>>>
>>>> Ok, UP is fine. To be sure it wasn't the e1000 driver I also tried using
>>>> only the Broadcom NIC's as well. Under UP there is no error, under SMP
>>>> the error reoccurs even with e1000 not loaded and no bonding.
>>>>
>>>> Hope this helps
>>>>
>>> Immensely. I'm just doing up a changeset for you.
>>>
>> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-kernel-0.4.14.ccs
>>
>> conary update iscsi_trgt-kernel-0.4.14.ccs --replace-files
>>
>> Then test again with 2.6.17.14-0.3.smp.x86_64 (with and without bonding)
>>
>> Thx,
>>
>> R.
>>
>>> R.
>>>
>>>> Dave
>>>>
>>>> -----Original Message-----
>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>> Sent: Monday, 27 November 2006 1:15 p.m.
>>>> To: Dave Watkins
>>>> Cc: [email protected]
>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>
>>>> OK, and UP without trunking?
>>>>
>>>> R.
>>>>
>>>> Dave Watkins wrote:
>>>>
>>>>> With or without trunking seem to generate the same problem
>>>>>
>>>>> Without trunking I got
>>>>> BUG: soft lockup detected on CPU#0!
>>>>>
>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>> <ffffffff80289151>{update_process_times+66}
>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>> <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>> <ffffffff80224b87>{tcp_sendmsg+0}
>>>>> <ffffffff80413bba>{inet_ioctl+0}
>>>>> <ffffffff88141216>{:iscsi_trgt:is_data_available+62}
>>>>> <ffffffff881419e7>{:iscsi_trgt:istd+1460}
>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff80231a7d>{kthread+200}
>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff802319b5>{kthread+0}
>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>> BUG: soft lockup detected on CPU#0!
>>>>>
>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>> <ffffffff80289151>{update_process_times+66}
>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>> <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>> <ffffffff80224b87>{tcp_sendmsg+0}
>>>>> <ffffffff881411c0>{:iscsi_trgt:nthread_wakeup+35}
>>>>> <ffffffff881411b3>{:iscsi_trgt:nthread_wakeup+22}
>>>>> <ffffffff8814219a>{:iscsi_trgt:istd+3431}
>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff80231a7d>{kthread+200}
>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff802319b5>{kthread+0}
>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>
>>>>> Re-enabling trunking again and I get
>>>>> BUG: soft lockup detected on CPU#0!
>>>>>
>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>> <ffffffff80289151>{update_process_times+66}
>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>> <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>> <ffffffff80254356>{tcp_ioctl+0}
>>>>> <ffffffff8020af50>{__might_sleep+30}
>>>>> <ffffffff802326d7>{lock_sock+28}
>>>>> <ffffffff80263257>{_spin_lock_bh+9}
>>>>> <ffffffff8022fd23>{release_sock+15}
>>>>> <ffffffff802543a2>{tcp_ioctl+76}
>>>>> <ffffffff80413c44>{inet_ioctl+138}
>>>>> <ffffffff88141216>{:iscsi_trgt:is_data_available+62}
>>>>> <ffffffff8814125a>{:iscsi_trgt:do_recv+41}
>>>>> <ffffffff8023081f>{qdisc_restart+24}
>>>>> <ffffffff8022eaa6>{dev_queue_xmit+510}
>>>>> <ffffffff8807c266>{:bonding:bond_dev_queue_xmit+489}
>>>>> <ffffffff8023277e>{lock_sock+195}
>>>>> <ffffffff8807fd96>{:bonding:bond_xmit_roundrobin+154}
>>>>> <ffffffff80232136>{__tcp_push_pending_frames+1367}
>>>>> <ffffffff8022fd23>{release_sock+15}
>>>>> <ffffffff80225551>{tcp_sendmsg+2506}
>>>>> <ffffffff80236f84>{do_sock_write+199}
>>>>> <ffffffff803dbac1>{sock_writev+220}
>>>>> <ffffffff8025db21>{cache_alloc_refill+237}
>>>>> <ffffffff80220d80>{tcp_transmit_skb+1579}
>>>>> <ffffffff80408067>{tcp_retransmit_skb+1352}
>>>>> <ffffffff80254356>{tcp_ioctl+0}
>>>>> <ffffffff8024f5a4>{finish_wait+52}
>>>>> <ffffffff803e0d10>{sk_stream_wait_memory+458}
>>>>> <ffffffff80291608>{autoremove_wake_function+0}
>>>>> <ffffffff80291608>{autoremove_wake_function+0}
>>>>> <ffffffff8022fd23>{release_sock+15}
>>>>> <ffffffff80246a25>{try_to_wake_up+955}
>>>>> <ffffffff88141609>{:iscsi_trgt:istd+470}
>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff80231a7d>{kthread+200}
>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff802319b5>{kthread+0}
>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>
>>>>> Without trunking though the write performance after this doesn't seem to
>>>>> be affected (still at about 80-90MB rather than down at less than 10MB)
>>>>>
>>>>> -----Original Message-----
>>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>>> Sent: Monday, 27 November 2006 12:27 p.m.
>>>>> To: Dave Watkins
>>>>> Cc: [email protected]
>>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>>
>>>>> Dave Watkins wrote:
>>>>>
>>>>>> Sorry about that, I remembered as soon as I sent it that I hadn't
>>>>>> included version. It's x86_64 version 2.2 (did a conary updateall from
>>>>>> 2.1 beta. Uname -r gives 2.6.17.14-0.3.smp.x86_64.
>>>>>>
>>>>>> I'll try with a UP kernel although it will take some time as I have to
>>>>>> rebuild the e1000 module from the UP kernel sources.
>>>>>>
>>>>> Try without the network trunking anyway in the meantime. Would be an
>>>>> interesting test.
>>>>>
>>>>> R.
>>>>>
>>>>>> I'll let you know
>>>>>> if I can reproduce on the UP kernel.
>>>>>>
>>>>>> I don't think it's related to that ticket as they are all writes anyway
>>>>>> and they only see the problem on large files.
>>>>>>
>>>>>> Dave
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>>>> Sent: Monday, 27 November 2006 11:40 a.m.
>>>>>> To: Dave Watkins
>>>>>> Cc: [email protected]
>>>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>>>
>>>>>> Hi Dave,
>>>>>>
>>>>>> Excellent test and bug report.
>>>>>>
>>>>>> I wonder whether it may be related to this:
>>>>>>
>>>>>> https://project.openfiler.com/tracker/ticket/435
>>>>>>
>>>>>> Can you try to reproduce with a UP kernel pls.
>>>>>>
>>>>>> Also I need the output of `uname -r`
>>>>>>
>>>>>> Thx,
>>>>>>
>>>>>> R.
>>>>>>
>>>>>> FTR: this is running r58 from IET svn
>>>>>>
>>>>>> Dave Watkins wrote:
>>>>>>
>>>>>>> Hi All
>>>>>>>
>>>>>>> I think I've found a bug in the iscsi target software in my
>>>>>>> benchmarking/testing.
>>>>>>>
>>>>>>> Some background on the hardware first in case it may be related.
>>>>>>> Dual core/dual opteron with 2GB of ram
>>>>>>> 3ware 8006 2 port raid card for OS drives
>>>>>>> 3ware 9550SX card for data drives
>>>>>>> Dual GB Broadcom on-board NIC's teamed into bond0 (management)
>>>>>>> Quad port Intel PCI-E GB NIC with all 4 ports teamed into bond1 (main
>>>>>>> iscsi data network)
>>>>>>> 4 x 250GB WD SATA HDD's in RAID5
>>>>>>>
>>>>>>> Of note here is that I have had to replace the e1000 driver with the
>>>>>>> latest from Intel to support the quad port card
>>>>>>>
>>>>>>> I have made some volumes and mounted them on various windows servers and
>>>>>>> have been using iobench to tune performance of the system. When using a
>>>>>>> read only test pattern I see this
>>>>>>>
>>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>>
>>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>>> <ffffffff80289151>{update_process_times+66}
>>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>>> <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>>> <ffffffff88141486>{:iscsi_trgt:istd+83}
>>>>>>> <ffffffff88141476>{:iscsi_trgt:istd+67}
>>>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>> <ffffffff802319b5>{kthread+0}
>>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>>
>>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>>> <ffffffff80289151>{update_process_times+66}
>>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>>> <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>>> <ffffffff802631ec>{_spin_unlock_irqrestore+8}
>>>>>>> <ffffffff80246a25>{try_to_wake_up+955}
>>>>>>> <ffffffff881411cc>{:iscsi_trgt:nthread_wakeup+47}
>>>>>>> <ffffffff8814219a>{:iscsi_trgt:istd+3431}
>>>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>> <ffffffff802319b5>{kthread+0}
>>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>>> Doing write only based patterns this doesn't come up. After this
>>>>>>> performance of the system dives (from about 110MB/sec of iscsi
>>>>>>> performance to about 10MB/sec).
>>>>>>>
>>>>>>> This is fairly reproducible here so if you need anymore information just
>>>>>>> ask.
>>>>>>>
>>>>>>> Dave
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Openfiler-users mailing list
>>>>>>> [email protected]
>>>>>>> https://lists.openfiler.com/mailman/listinfo/openfiler-users
