Unfortunately, it doesn't look like it's going to compile cleanly on an Openfiler platform. I'm going to see if I can recreate the problem just by moving a lot of data around on the iSCSI volumes.
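A minimal sketch of what such a data-moving test could look like, assuming Python is available on the machine that has the iSCSI volume mounted. It imitates the pattern described further down the thread (100% read, 0% random, 256k blocks), but with a single outstanding I/O. The function names and the command-line handling are illustrative only and are not part of iometer or Openfiler.

```python
import os
import sys
import time

BLOCK_SIZE = 256 * 1024  # 256k blocks, matching the access specification in the thread


def write_test_file(path, size_bytes, block_size=BLOCK_SIZE):
    """Create a file of size_bytes filled with random data to read back later."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(block_size, remaining)
            f.write(os.urandom(n))
            remaining -= n


def sequential_read(path, block_size=BLOCK_SIZE):
    """Read path from start to end in fixed-size blocks.

    Returns (total_bytes_read, elapsed_seconds).
    """
    total = 0
    start = time.monotonic()
    # buffering=0 turns off Python's own buffering; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.monotonic() - start


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: seqread.py <existing file on the iSCSI volume>")
    else:
        # Repeat the read pass a few times and report rough throughput.
        for run in range(10):
            nbytes, secs = sequential_read(sys.argv[1])
            mb = nbytes / (1024 * 1024)
            print(f"run {run}: {mb / max(secs, 1e-9):.1f} MB/s")
```

For the reads to reach the target at all, the test file would need to be larger than the RAM of the machine doing the reading; otherwise repeat passes are likely to be served from its local cache.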
Dave

-----Original Message-----
From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 28 November 2006 10:49 a.m.
To: Dave Watkins
Cc: [email protected]
Subject: Re: [OF-users] iSCSI bug?

Dave Watkins wrote:
> When you say "local box" do you mean the openfiler box

Yes please.

> or the windows
> box? I have run local tests on the windows box without issue but haven't
> tried the openfiler box. I will try disabling jumbo frames, I have tried
> without NAPI and flow control with no success
>
> -----Original Message-----
> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, 28 November 2006 10:27 a.m.
> To: Dave Watkins
> Cc: [email protected]
> Subject: Re: [OF-users] iSCSI bug?
>
> Dave Watkins wrote:
>
>> I'll start with a better description and get onto working with the new
>> packages
>>
>> I'm running iometer on the Windows Server 2003 x64 box against an iSCSI
>> volume using the MS iSCSI initiator (2.02).
>>
>> Using iometer and selecting any of the 0% read, 0% random access
>> specifications will return expected performance numbers, but stopping
>> that test, and changing the access specification to any 100% read, 0%
>> random test will generate the errors. Larger block sizes _seem_ to make
>> it happen more frequently so I have created a 256k block size test for
>> the above access specifications. I have been using 16 and 64 outstanding
>> I/O's but anything above zero seems to show the error.
>>
>> Networking on both ends is via Intel e1000 cards and jumbo frames are
>> enabled, and so is flow control, NAPI is also enabled on the Openfiler
>> box. All other network settings are default.
>
> Try without all the tweaking and then enable them one by one.
>
> Also have you successfully run the benchmarks on the local box without
> going through iSCSI?
>
> R.
>
>> Dave
>>
>> -----Original Message-----
>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>> Sent: Tuesday, 28 November 2006 1:30 a.m.
>> To: Dave Watkins
>> Cc: [email protected]
>> Subject: Re: [OF-users] iSCSI bug?
>>
>> Dave Watkins wrote:
>>
>>> Still the same, both with bonding enabled and disabled unfortunately
>>
>> First:
>>
>> try adding "nosoftlockup" to the grub boot options and then run the
>> benchmarks again.
>>
>> Next:
>>
>> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-kernel-r78.ccs
>> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-r78.ccs
>>
>> (kernel and userland)
>>
>> same as before (--replace-files)
>>
>> Finally:
>>
>> Also a bit more detail about your test set-up (components, parameters,
>> triggers etc) would be great.
>>
>> R.
>>
>>> -----Original Message-----
>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>> Sent: Monday, 27 November 2006 2:52 p.m.
>>> To: Rafiu Fakunle
>>> Cc: Dave Watkins; [email protected]
>>> Subject: Re: [OF-users] iSCSI bug?
>>>
>>> Rafiu Fakunle wrote:
>>>
>>>> Dave Watkins wrote:
>>>>
>>>>> Ok, UP is fine. To be sure it wasn't the e1000 driver I also tried using
>>>>> only the Broadcom NIC's as well. Under UP there is no error, under SMP
>>>>> the error reoccurs even with e1000 not loaded and no bonding.
>>>>>
>>>>> Hope this helps
>>>>
>>>> Immensely. I'm just doing up a changeset for you.
>>>
>>> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-kernel-0.4.14.ccs
>>>
>>> conary update iscsi_trgt-kernel-0.4.14.ccs --replace-files
>>>
>>> Then test again with 2.6.17.14-0.3.smp.x86_64 (with and without bonding)
>>>
>>> Thx,
>>>
>>> R.
>>>
>>>> R.
>>>>
>>>>> Dave
>>>>>
>>>>> -----Original Message-----
>>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>>> Sent: Monday, 27 November 2006 1:15 p.m.
>>>>> To: Dave Watkins
>>>>> Cc: [email protected]
>>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>>
>>>>> OK, and UP without trunking?
>>>>>
>>>>> R.
>>>>>
>>>>> Dave Watkins wrote:
>>>>>
>>>>>> With or without trunking seem to generate the same problem
>>>>>>
>>>>>> Without trunking I got
>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>
>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>> <ffffffff80289151>{update_process_times+66}
>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>> <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>> <ffffffff80224b87>{tcp_sendmsg+0}
>>>>>> <ffffffff80413bba>{inet_ioctl+0}
>>>>>> <ffffffff88141216>{:iscsi_trgt:is_data_available+62}
>>>>>> <ffffffff881419e7>{:iscsi_trgt:istd+1460}
>>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff802319b5>{kthread+0}
>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>
>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>> <ffffffff80289151>{update_process_times+66}
>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>> <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>> <ffffffff80224b87>{tcp_sendmsg+0}
>>>>>> <ffffffff881411c0>{:iscsi_trgt:nthread_wakeup+35}
>>>>>> <ffffffff881411b3>{:iscsi_trgt:nthread_wakeup+22}
>>>>>> <ffffffff8814219a>{:iscsi_trgt:istd+3431}
>>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff802319b5>{kthread+0}
>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>>
>>>>>> Re-enabling trunking again and I get
>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>
>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>> <ffffffff80289151>{update_process_times+66}
>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>> <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>> <ffffffff80254356>{tcp_ioctl+0}
>>>>>> <ffffffff8020af50>{__might_sleep+30}
>>>>>> <ffffffff802326d7>{lock_sock+28}
>>>>>> <ffffffff80263257>{_spin_lock_bh+9}
>>>>>> <ffffffff8022fd23>{release_sock+15}
>>>>>> <ffffffff802543a2>{tcp_ioctl+76}
>>>>>> <ffffffff80413c44>{inet_ioctl+138}
>>>>>> <ffffffff88141216>{:iscsi_trgt:is_data_available+62}
>>>>>> <ffffffff8814125a>{:iscsi_trgt:do_recv+41}
>>>>>> <ffffffff8023081f>{qdisc_restart+24}
>>>>>> <ffffffff8022eaa6>{dev_queue_xmit+510}
>>>>>> <ffffffff8807c266>{:bonding:bond_dev_queue_xmit+489}
>>>>>> <ffffffff8023277e>{lock_sock+195}
>>>>>> <ffffffff8807fd96>{:bonding:bond_xmit_roundrobin+154}
>>>>>> <ffffffff80232136>{__tcp_push_pending_frames+1367}
>>>>>> <ffffffff8022fd23>{release_sock+15}
>>>>>> <ffffffff80225551>{tcp_sendmsg+2506}
>>>>>> <ffffffff80236f84>{do_sock_write+199}
>>>>>> <ffffffff803dbac1>{sock_writev+220}
>>>>>> <ffffffff8025db21>{cache_alloc_refill+237}
>>>>>> <ffffffff80220d80>{tcp_transmit_skb+1579}
>>>>>> <ffffffff80408067>{tcp_retransmit_skb+1352}
>>>>>> <ffffffff80254356>{tcp_ioctl+0}
>>>>>> <ffffffff8024f5a4>{finish_wait+52}
>>>>>> <ffffffff803e0d10>{sk_stream_wait_memory+458}
>>>>>> <ffffffff80291608>{autoremove_wake_function+0}
>>>>>> <ffffffff80291608>{autoremove_wake_function+0}
>>>>>> <ffffffff8022fd23>{release_sock+15}
>>>>>> <ffffffff80246a25>{try_to_wake_up+955}
>>>>>> <ffffffff88141609>{:iscsi_trgt:istd+470}
>>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff802319b5>{kthread+0}
>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>>
>>>>>> Without trunking though the write performance after this doesn't seem to
>>>>>> be affected (still at about 80-90MB rather than down at less than 10MB)
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>>>> Sent: Monday, 27 November 2006 12:27 p.m.
>>>>>> To: Dave Watkins
>>>>>> Cc: [email protected]
>>>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>>>
>>>>>> Dave Watkins wrote:
>>>>>>
>>>>>>> Sorry about that, I remembered as soon as I sent it that I hadn't
>>>>>>> included version. It's x86_64 version 2.2 (did a conary updateall from
>>>>>>> 2.1 beta). Uname -r gives 2.6.17.14-0.3.smp.x86_64.
>>>>>>>
>>>>>>> I'll try with a UP kernel although it will take some time as I have to
>>>>>>> rebuild the e1000 module from the UP kernel sources.
>>>>>>
>>>>>> Try without the network trunking anyway in the meantime. Would be an
>>>>>> interesting test.
>>>>>>
>>>>>> R.
>>>>>>
>>>>>>> I'll let you know
>>>>>>> if I can reproduce on the UP kernel.
>>>>>>>
>>>>>>> I don't think it's related to that ticket as they are all writes anyway
>>>>>>> and they only see the problem on large files.
>>>>>>>
>>>>>>> Dave
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>>>>> Sent: Monday, 27 November 2006 11:40 a.m.
>>>>>>> To: Dave Watkins
>>>>>>> Cc: [email protected]
>>>>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>>>>
>>>>>>> Hi Dave,
>>>>>>>
>>>>>>> Excellent test and bug report.
>>>>>>>
>>>>>>> I wonder whether it may be related to this:
>>>>>>>
>>>>>>> https://project.openfiler.com/tracker/ticket/435
>>>>>>>
>>>>>>> Can you try to reproduce with a UP kernel pls.
>>>>>>>
>>>>>>> Also I need the output of `uname -r`
>>>>>>>
>>>>>>> Thx,
>>>>>>>
>>>>>>> R.
>>>>>>>
>>>>>>> FTR: this is running r58 from IET svn
>>>>>>>
>>>>>>> Dave Watkins wrote:
>>>>>>>
>>>>>>>> Hi All
>>>>>>>>
>>>>>>>> I think I've found a bug in the iscsi target software in my
>>>>>>>> benchmarking/testing.
>>>>>>>>
>>>>>>>> Some background on the hardware first in case it may be related.
>>>>>>>> Dual core/dual opteron with 2GB of ram
>>>>>>>> 3ware 8006 2 port raid card for OS drives
>>>>>>>> 3ware 9550SX card for data drives
>>>>>>>> Dual GB Broadcom on-board NIC's teamed into bond0 (management)
>>>>>>>> Quad port Intel PCI-E GB NIC with all 4 ports teamed into bond1 (main
>>>>>>>> iscsi data network)
>>>>>>>> 4 x 250GB WD SATA HDD's in RAID5
>>>>>>>>
>>>>>>>> Of note here is that I have had to replace the e1000 driver with the
>>>>>>>> latest from Intel to support the quad port card
>>>>>>>>
>>>>>>>> I have made some volumes and mounted them on various windows servers and
>>>>>>>> have been using iobench to tune performance of the system.
>>>>>>>> When using a read only test pattern I see this
>>>>>>>>
>>>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>>>
>>>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>>>> <ffffffff80289151>{update_process_times+66}
>>>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>>>> <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>>>> <ffffffff88141486>{:iscsi_trgt:istd+83}
>>>>>>>> <ffffffff88141476>{:iscsi_trgt:istd+67}
>>>>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>> <ffffffff802319b5>{kthread+0}
>>>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>>>
>>>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>>>> <ffffffff80289151>{update_process_times+66}
>>>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>>>> <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>>>> <ffffffff802631ec>{_spin_unlock_irqrestore+8}
>>>>>>>> <ffffffff80246a25>{try_to_wake_up+955}
>>>>>>>> <ffffffff881411cc>{:iscsi_trgt:nthread_wakeup+47}
>>>>>>>> <ffffffff8814219a>{:iscsi_trgt:istd+3431}
>>>>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>> <ffffffff802319b5>{kthread+0}
>>>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>>>>
>>>>>>>> Doing write only based patterns this doesn't come up. After this
>>>>>>>> performance of the system dives (from about 110MB/sec of iscsi
>>>>>>>> performance to about 10MB/sec).
>>>>>>>>
>>>>>>>> This is fairly reproducible here so if you need any more information just
>>>>>>>> ask.
>>>>>>>>
>>>>>>>> Dave
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Openfiler-users mailing list
>>>>>>>> [email protected]
>>>>>>>> https://lists.openfiler.com/mailman/listinfo/openfiler-users
