When you say "local box" do you mean the Openfiler box or the Windows
box? I have run local tests on the Windows box without issue but haven't
tried the Openfiler box. I will try disabling jumbo frames; I have already
tried disabling NAPI and flow control with no success.
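
For reference, this is roughly what I'll run on the Openfiler side to
drop jumbo frames and flow control (eth2 here is just an example name
for the iSCSI-facing interface):

    ifconfig eth2 mtu 1500          # back to the standard MTU
    ethtool -A eth2 rx off tx off   # disable pause-frame flow control

(NAPI for e1000 is a compile-time option, so turning that off means a
module rebuild rather than a runtime switch.)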

-----Original Message-----
From: Rafiu Fakunle [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, 28 November 2006 10:27 a.m.
To: Dave Watkins
Cc: [email protected]
Subject: Re: [OF-users] iSCSI bug?

Dave Watkins wrote:
> I'll start with a better description and get onto working with the new
> packages
>
> I'm running iometer on the Windows Server 2003 x64 box against an iSCSI
> volume using the MS iSCSI initiator (2.02).
>
> Using iometer and selecting any of the 0% read, 0% random access
> specifications will return expected performance numbers, but stopping
> that test and changing the access specification to any 100% read, 0%
> random test will generate the errors. Larger block sizes _seem_ to make
> it happen more frequently, so I have created a 256k block size test for
> the above access specifications. I have been using 16 and 64 outstanding
> I/Os, but anything above zero seems to show the error.
>
> Networking on both ends is via Intel e1000 cards and jumbo frames are
> enabled, and so is flow control, NAPI is also enabled on the Openfiler
> box. All other network settings are default.
>   

Try without any of the tweaks first and then re-enable them one by one.

Also, have you successfully run the benchmarks on the local box without 
going through iSCSI?
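
(A quick way to sanity-check that, for example, is a sequential read
straight off the block device backing the iSCSI volume -- the device
name below is only a placeholder:

    dd if=/dev/sdb of=/dev/null bs=1M count=4096

and compare the throughput to what you see over iSCSI.)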


R.

> Dave
>
> -----Original Message-----
> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, 28 November 2006 1:30 a.m.
> To: Dave Watkins
> Cc: [email protected]
> Subject: Re: [OF-users] iSCSI bug?
>
> Dave Watkins wrote:
>   
>> Still the same, both with bonding enabled and disabled, unfortunately.
>
> First:
>
> try adding "nosoftlockup" to the grub boot options and then run the 
> benchmarks again.
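>
> (For example, the kernel line for your entry in /boot/grub/grub.conf
> would end up looking something like this -- the root= value is just a
> placeholder, keep whatever your entry already has:
>
>     kernel /vmlinuz-2.6.17.14-0.3.smp.x86_64 ro root=LABEL=/ nosoftlockup
>
> then reboot for it to take effect.)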
>
> Next:
>
> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-kernel-r78.ccs
> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-r78.ccs
>
> (kernel and userland)
>
> same as before (--replace-files)
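>
> (Spelled out, that would presumably be something along the lines of:
>
>     conary update iscsi_trgt-kernel-r78.ccs iscsi_trgt-r78.ccs --replace-files
>
> run against the two downloaded changeset files.)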
>
> Finally:
>
> Also, a bit more detail about your test set-up (components, parameters,
> triggers etc.) would be great.
>
> R.
>   
>> -----Original Message-----
>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED] 
>> Sent: Monday, 27 November 2006 2:52 p.m.
>> To: Rafiu Fakunle
>> Cc: Dave Watkins; [email protected]
>> Subject: Re: [OF-users] iSCSI bug?
>>
>> Rafiu Fakunle wrote:
>>> Dave Watkins wrote:
>>>> Ok, UP is fine. To be sure it wasn't the e1000 driver I also tried using
>>>> only the Broadcom NICs as well. Under UP there is no error; under SMP
>>>> the error reoccurs even with e1000 not loaded and no bonding.
>>>>
>>>> Hope this helps
>>> Immensely. I'm just doing up a changeset for you.
>>
>> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-kernel-0.4.14.ccs
>>
>> conary update iscsi_trgt-kernel-0.4.14.ccs --replace-files
>>
>> Then test again with 2.6.17.14-0.3.smp.x86_64 (with and without
>> bonding).
>>
>> Thx,
>>
>> R.
>>> R.
>>>
>>>> Dave
>>>>
>>>> -----Original Message-----
>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>> Sent: Monday, 27 November 2006 1:15 p.m.
>>>> To: Dave Watkins
>>>> Cc: [email protected]
>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>
>>>> OK, and UP without trunking?
>>>>
>>>> R.
>>>>
>>>> Dave Watkins wrote:
>>>>> With or without trunking, it seems to generate the same problem.
>>>>>
>>>>> Without trunking I got
>>>>> BUG: soft lockup detected on CPU#0!
>>>>>
>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>        <ffffffff80289151>{update_process_times+66}
>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>        <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>        <ffffffff80224b87>{tcp_sendmsg+0}
>>>>> <ffffffff80413bba>{inet_ioctl+0}
>>>>>        <ffffffff88141216>{:iscsi_trgt:is_data_available+62}
>>>>>        <ffffffff881419e7>{:iscsi_trgt:istd+1460}
>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>        <ffffffff8027fef6>{__wake_up_common+67}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>        <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>        <ffffffff80231a7d>{kthread+200}
>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>        <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>        <ffffffff802319b5>{kthread+0}
>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>
>>>>> BUG: soft lockup detected on CPU#0!
>>>>>
>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>        <ffffffff80289151>{update_process_times+66}
>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>        <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>        <ffffffff80224b87>{tcp_sendmsg+0}
>>>>> <ffffffff881411c0>{:iscsi_trgt:nthread_wakeup+35}
>>>>>        <ffffffff881411b3>{:iscsi_trgt:nthread_wakeup+22}
>>>>> <ffffffff8814219a>{:iscsi_trgt:istd+3431}
>>>>>        <ffffffff80403ea6>{tcp_sendpage+0}
>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>        <ffffffff8025f8a2>{child_rip+8}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>        <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>        <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff802319b5>{kthread+0}
>>>>>        <ffffffff8025f89a>{child_rip+0}
>>>>>
>>>>> Re-enabling trunking, I get:
>>>>> BUG: soft lockup detected on CPU#0!
>>>>>
>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>        <ffffffff80289151>{update_process_times+66}
>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>        <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>        <ffffffff80254356>{tcp_ioctl+0}
>>>>> <ffffffff8020af50>{__might_sleep+30}
>>>>>        <ffffffff802326d7>{lock_sock+28}
>>>>> <ffffffff80263257>{_spin_lock_bh+9}
>>>>>        <ffffffff8022fd23>{release_sock+15}
>>>>> <ffffffff802543a2>{tcp_ioctl+76}
>>>>>        <ffffffff80413c44>{inet_ioctl+138}
>>>>> <ffffffff88141216>{:iscsi_trgt:is_data_available+62}
>>>>>        <ffffffff8814125a>{:iscsi_trgt:do_recv+41}
>>>>> <ffffffff8023081f>{qdisc_restart+24}
>>>>>        <ffffffff8022eaa6>{dev_queue_xmit+510}
>>>>> <ffffffff8807c266>{:bonding:bond_dev_queue_xmit+489}
>>>>>        <ffffffff8023277e>{lock_sock+195}
>>>>> <ffffffff8807fd96>{:bonding:bond_xmit_roundrobin+154}
>>>>>        <ffffffff80232136>{__tcp_push_pending_frames+1367}
>>>>> <ffffffff8022fd23>{release_sock+15}
>>>>>        <ffffffff80225551>{tcp_sendmsg+2506}
>>>>> <ffffffff80236f84>{do_sock_write+199}
>>>>>        <ffffffff803dbac1>{sock_writev+220}
>>>>> <ffffffff8025db21>{cache_alloc_refill+237}
>>>>>        <ffffffff80220d80>{tcp_transmit_skb+1579}
>>>>> <ffffffff80408067>{tcp_retransmit_skb+1352}
>>>>>        <ffffffff80254356>{tcp_ioctl+0}
>>>>> <ffffffff8024f5a4>{finish_wait+52}
>>>>>        <ffffffff803e0d10>{sk_stream_wait_memory+458}
>>>>> <ffffffff80291608>{autoremove_wake_function+0}
>>>>>        <ffffffff80291608>{autoremove_wake_function+0}
>>>>> <ffffffff8022fd23>{release_sock+15}
>>>>>        <ffffffff80246a25>{try_to_wake_up+955}
>>>>> <ffffffff88141609>{:iscsi_trgt:istd+470}
>>>>>        <ffffffff80403ea6>{tcp_sendpage+0}
>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>        <ffffffff8025f8a2>{child_rip+8}
>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>        <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>        <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>> <ffffffff802319b5>{kthread+0}
>>>>>        <ffffffff8025f89a>{child_rip+0}
>>>>>
>>>>> Without trunking, though, the write performance after this doesn't seem
>>>>> to be affected (still at about 80-90MB/sec rather than down at less than
>>>>> 10MB/sec).
>>>>>
>>>>> -----Original Message-----
>>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>>> Sent: Monday, 27 November 2006 12:27 p.m.
>>>>> To: Dave Watkins
>>>>> Cc: [email protected]
>>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>>
>>>>> Dave Watkins wrote:
>>>>>> Sorry about that, I remembered as soon as I sent it that I hadn't
>>>>>> included the version. It's x86_64 version 2.2 (did a conary updateall
>>>>>> from 2.1 beta). uname -r gives 2.6.17.14-0.3.smp.x86_64.
>>>>>>
>>>>>> I'll try with a UP kernel, although it will take some time as I have
>>>>>> to rebuild the e1000 module from the UP kernel sources.
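>>>>>>
>>>>>> (Roughly: unpack the Intel e1000 driver source and build it against
>>>>>> the UP kernel tree, something like
>>>>>>
>>>>>>     make -C src/ KSRC=/usr/src/kernels/2.6.17.14-0.3.x86_64 install
>>>>>>
>>>>>> -- the KSRC path is only a guess for this install.)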
>>>>> Try without the network trunking anyway in the meantime. Would be an
>>>>> interesting test.
>>>>>
>>>>> R.
>>>>>
>>>>>
>>>>>> I'll let you know if I can reproduce on the UP kernel.
>>>>>>
>>>>>> I don't think it's related to that ticket as they are all writes anyway
>>>>>> and they only see the problem on large files.
>>>>>>
>>>>>> Dave
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>>>> Sent: Monday, 27 November 2006 11:40 a.m.
>>>>>> To: Dave Watkins
>>>>>> Cc: [email protected]
>>>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>>>
>>>>>> Hi Dave,
>>>>>>
>>>>>> Excellent test and bug report.
>>>>>>
>>>>>> I wonder whether it may be related to this:
>>>>>>
>>>>>> https://project.openfiler.com/tracker/ticket/435
>>>>>>
>>>>>> Can you try to reproduce with a UP kernel pls.
>>>>>>
>>>>>> Also I need the output of `uname -r`
>>>>>>
>>>>>> Thx,
>>>>>>
>>>>>> R.
>>>>>>
>>>>>> FTR: this is running r58 from IET svn
>>>>>>
>>>>>>
>>>>>> Dave Watkins wrote:
>>>>>>> Hi All
>>>>>>>
>>>>>>> I think I've found a bug in the iSCSI target software in my
>>>>>>> benchmarking/testing.
>>>>>>>
>>>>>>> Some background on the hardware first, in case it may be related:
>>>>>>> Dual-core/dual Opteron with 2GB of RAM
>>>>>>> 3ware 8006 2-port RAID card for OS drives
>>>>>>> 3ware 9550SX card for data drives
>>>>>>> Dual GB Broadcom on-board NICs teamed into bond0 (management)
>>>>>>> Quad-port Intel PCI-E GB NIC with all 4 ports teamed into bond1 (main
>>>>>>> iSCSI data network)
>>>>>>> 4 x 250GB WD SATA HDDs in RAID5
>>>>>>>
>>>>>>> Of note here is that I have had to replace the e1000 driver with the
>>>>>>> latest from Intel to support the quad-port card.
>>>>>>>
>>>>>>> I have made some volumes and mounted them on various Windows servers
>>>>>>> and have been using iobench to tune performance of the system. When
>>>>>>> using a read-only test pattern I see this:
>>>>>>>
>>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>>
>>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>>>        <ffffffff80289151>{update_process_times+66}
>>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>>>        <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>>>        <ffffffff88141486>{:iscsi_trgt:istd+83}
>>>>>>> <ffffffff88141476>{:iscsi_trgt:istd+67}
>>>>>>>        <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>>>        <ffffffff8025f8a2>{child_rip+8}
>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>        <ffffffff802319b5>{kthread+0}
>>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>>>
>>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>>
>>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>>>        <ffffffff80289151>{update_process_times+66}
>>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>>>        <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>>>        <ffffffff802631ec>{_spin_unlock_irqrestore+8}
>>>>>>> <ffffffff80246a25>{try_to_wake_up+955}
>>>>>>>        <ffffffff881411cc>{:iscsi_trgt:nthread_wakeup+47}
>>>>>>> <ffffffff8814219a>{:iscsi_trgt:istd+3431}
>>>>>>>        <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>>>        <ffffffff8025f8a2>{child_rip+8}
>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>        <ffffffff802319b5>{kthread+0}
>>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>>>
>>>>>>> With write-only patterns this doesn't come up. After this, the
>>>>>>> performance of the system dives (from about 110MB/sec of iSCSI
>>>>>>> throughput to about 10MB/sec).
>>>>>>>
>>>>>>> This is fairly reproducible here, so if you need any more information
>>>>>>> just ask.
>>>>>>>
>>>>>>> Dave
>>>>>>>

_______________________________________________
Openfiler-users mailing list
[email protected]
https://lists.openfiler.com/mailman/listinfo/openfiler-users
