Re: [E1000-devel] Detected Hardware Unit Hang on Intel Wired Ethernet

2012-01-09 Thread Pratyush Anand
On 1/7/2012 12:25 AM, Dave, Tushar N wrote:
> Pratyush,
>
> Sorry I got your name reversed.
> Are you using the in-kernel driver or the one from SourceForge?

I am using the in-kernel driver from kernel 2.6.37.

> Please send me output of ethtool -i ethx.

root@192.168.1.10:~# ethtool -i eth0
driver: e1000e
version: 1.2.7-k2
firmware-version: 5.11-8
bus-info: :01:00.0

Regards
Pratyush

>
> -Tushar
>
> -Original Message-
> From: Pratyush Anand [mailto:pratyush.an...@st.com]
> Sent: Thursday, January 05, 2012 8:25 PM
> To: Dave, Tushar N
> Cc: Greg KH; Pratyush Anand; e1000-devel@lists.sourceforge.net; 
> net...@vger.kernel.org; Shiraz HASHIM; Deepak SIKRI; Bhavna YADAV; 
> linux-...@vger.kernel.org; Linux NICS
> Subject: Re: Detected Hardware Unit Hang on Intel Wired Ethernet
>
> Thanks Tushar,
>
> On 1/6/2012 5:24 AM, Dave, Tushar N wrote:
>> Anand,
>>
>> Sorry to hear that you have this issue with the card. And yes, thanks for
>> doing the debugging and providing the bus trace.
>> I think we should run the debug driver that prints the HW ring details when
>> the hang occurs. I can provide you with a debug driver. You can then install
>> it and leave the bus tracer running. Once the issue occurs, send me the
>> full dmesg output (which has the HW ring details) and the bus trace.
>>
>> Tell me which card you have, 1 gig or 10 gig? Which driver are you running:
>> e1000e, igb, or ixgbe?
>> Can you also provide the ethtool -i ethx output?
>>
>> Once I know which driver, I will send you the debug driver.
>
> I am using Intel PRO/1000 PT Server Adapter.
> http://www.intel.com/content/www/us/en/network-adapters/gigabit-network-adapters/pro-1000-pt.html
>
> I am using e1000e driver.
>
> I see the problem when I try to mount the root filesystem over NFS and use
> MSI interrupts. I see this issue even before I get a shell prompt.
> Please see the first mail in this thread.
>
> http://www.mail-archive.com/e1000-devel@lists.sourceforge.net/msg04894.html
>
> Here, you can also see the tx ring details when the issue occurs.
> Please let me know, if you need any more info.
>
> Regards
> Pratyush
>
>>
>> Thanks.
>>
>> -Tushar
>>
>> -Original Message-
>> From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] On 
>> Behalf Of Pratyush Anand
>> Sent: Wednesday, January 04, 2012 8:31 PM
>> To: Greg KH
>> Cc: Pratyush Anand; e1000-devel@lists.sourceforge.net; 
>> net...@vger.kernel.org; Shiraz HASHIM; Deepak SIKRI; Bhavna YADAV; 
>> linux-...@vger.kernel.org; Linux NICS
>> Subject: Re: Detected Hardware Unit Hang on Intel Wired Ethernet
>>
>> On 1/5/2012 12:52 AM, Greg KH wrote:
>>> On Wed, Jan 04, 2012 at 04:31:36PM +0530, Pratyush Anand wrote:
>>>> Adding PCI mailing list too, as the problem occurs only when MSI is enabled.
>>>>
>>>> If I connect a PCIe analyzer, I see that at the time of the issue an
>>>> MRd(64) for 32 words has been issued with a wrong 64-bit address
>>>> from the ethernet card to my RC.
>>>> In the normal course it always issues MRd(32) only.
>>>
>>> Bug in your pcie firmware controller?
>>>
>>> .
>>>
>>
>> When you say "Bug in your pcie firmware controller?", do you mean the RC's
>> software or the EP's software?
>>
>> Here I am pasting part of the analyzer log, converted into text.
>> Packet(177940) is an upstream request for MSI. Whenever any device
>> writes to address 0x58A8F8, my PCIe RC treats it as an MSI and generates
>> an interrupt. So I receive the MSI interrupt correctly in my software, and
>> the MSI controller correctly indicates that the interrupt is from the
>> ethernet card.
>>
>> Now, in Packet(178010), the ethernet controller sends another upstream
>> request, an MRd(64) of 32 dwords with Address(AFECEB87:A9D88B00). Since
>> this address does not exist in my RC's address space, a UR (Unsupported
>> Request) is returned, and hence the problem occurs.
>>
>> Now the question is why the ethernet card is generating an inbound request
>> with such a wrong address. I have logged all the tx_desc->buffer_addr
>> values programmed by software in the function e1000_tx_queue. None of them
>> is a 64-bit or otherwise invalid address.
>>
>> ___|___
>> Packet(177916) Upstream 2.5(x1) TLP(1475) Mem MWr(32)(10:0) Length(4)
>> ___| RequesterID(003:00:0) Tag(2) Address(0EB00200) 1st BE()
>> ___| Last BE() Data(4 dwords) LCRC(0x44E0407C)
>> ___| Time Stamp(0013 . 460 549 544 s)
>> ___|___
>> Packet(177918) Downstream 2.5(x1) DLLP ACK AckNak_Seq_Num(1475)
>> ___| CRC 16(0x0EB7) Time Stamp(0013 . 460 551 144 s)
>> ___|___
>> Packet(177940) Upstream 2.5(x1) TLP(1476) Mem MWr(32)(10:0) Length(1)
>> ___| RequesterID(003:00:0) Tag(30) Address(0058A8F8) 1st BE(0011)
>> ___| Last BE() Data(1 dword) LCRC(0xC21F32B6)
>> ___| Time Stamp(0013 . 460 588 544 s)
>> ___|___
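
The check described in the message above (confirming that none of the logged tx_desc->buffer_addr values is a 64-bit address) can be sketched in a few lines of C. PCIe requires the 32-bit request format for any address below 4 GB, so an MRd(64) on the wire implies non-zero upper address bits. The helper names below are hypothetical and are not part of e1000e; this is only an illustration of the comparison, not the logging code that was actually used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from e1000e: true when a DMA address has
 * non-zero upper 32 bits and therefore needs a 64-bit memory request
 * such as MRd(64). */
static bool needs_64bit_request(uint64_t dma_addr)
{
    return (dma_addr >> 32) != 0;
}

/* Count how many logged buffer addresses would need a 64-bit request.
 * A result of 0 corresponds to the observation in the mail that
 * software never programmed a 64-bit address. */
static size_t count_64bit_addrs(const uint64_t *addrs, size_t n)
{
    size_t hits = 0;

    assert(addrs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (needs_64bit_request(addrs[i]))
            hits++;
    return hits;
}
```

With the values quoted from the analyzer log, needs_64bit_request() is true for AFECEB87:A9D88B00 (the address reported for Packet(178010)) and false for 0EB00200 (the MWr(32) target in Packet(177916)).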

Re: [E1000-devel] Question about ixgbevf driver

2012-01-09 Thread David Yeung
Greg,

Thank you for the clarification!

David
On 1/9/2012 12:59 PM, Rose, Gregory V wrote:
> David,
>
> The results being reported are expected.
>
>>Supports auto-negotiation: No   ( We think this should be Yes )
> Actually no...  The virtual function (VF) device does not do any 
> auto-negotiation.  It requires the physical function (PF) device to do that 
> on its behalf.
>
>>Advertised auto-negotiation: No ( We think this should be Yes )
> Again, no.  The VF does no auto-negotiation.  It depends upon the PF device 
> for that.
>
>>Port: Unknown! (255)  ( We think this should be Twisted Pair )
> The VF has no knowledge of or need of such knowledge of the Phy port type.  
> Therefore it is in fact unknown.
>
>>Transceiver: Unknown!  ( We think this should be external )
> Same as previous - VF has no knowledge of it and shouldn't be concerned.
>
>>Auto-negotiation: off  ( We think this should be on )
> And the same here.  The VF does not initiate, advertise or engage in 
> auto-negotiation.  It can only report the link speed set by the PF device and 
> whether the link is up and it does that.
>
> The 82599 supports up to 63 VF devices.  If each VF was able to control and 
> initiate auto-negotiation parameters it would be a real mess to manage.  Our 
> controller doesn't allow that and doesn't even allow the VF to even see the 
> information for security reasons.
>
> - Greg
>
>> -Original Message-
>> From: David Yeung [mailto:david.ye...@oracle.com]
>> Sent: Monday, January 09, 2012 11:45 AM
>> To: Rose, Gregory V
>> Cc: Kirsher, Jeffrey T; e1000-devel; Allan, Bruce W; Brandeburg, Jesse;
>> Steve Sarvate
>> Subject: Re: Question about ixgbevf driver
>>
>> Greg,
>>
>> Thank you for your help!
>> In the OEL 5.5 guest domain, its ethtool reports strange results of
>> virtual interfaces of Twinville:
>>Supports auto-negotiation: No   ( We think this should be Yes )
>>Advertised auto-negotiation: No ( We think this should be Yes )
>>Port: Unknown! (255)  ( We think this should be Twisted Pair )
>>Transceiver: Unknown!  ( We think this should be external )
>>Auto-negotiation: off  ( We think this should be on )
>>
>> Do you see this problem in your guest OS OEL 5.6 and 5.5?
>>
>> For details:
>> ==
>> eth5 and eth6 are virtual interfaces of Twinville
>>
>> [root@Twinville_VM_156 ~]# ethtool eth5
>> Settings for eth5:
>>   Supported ports: [ ]
>>   Supported link modes:   1baseT/Full
>>   Supports auto-negotiation: No
>>   Advertised link modes:  Not reported
>>   Advertised auto-negotiation: No
>>   Speed: 1Mb/s
>>   Duplex: Full
>>   Port: Unknown! (255)
>>   PHYAD: 0
>>   Transceiver: Unknown!
>>   Auto-negotiation: off
>>   Current message level: 0x0007 (7)
>>   Link detected: yes
>> [root@Twinville_VM_156 ~]#
>> [root@Twinville_VM_156 ~]# ethtool eth6
>> Settings for eth6:
>>   Supported ports: [ ]
>>   Supported link modes:   1baseT/Full
>>   Supports auto-negotiation: No
>>   Advertised link modes:  Not reported
>>   Advertised auto-negotiation: No
>>   Speed: 1Mb/s
>>   Duplex: Full
>>   Port: Unknown! (255)
>>   PHYAD: 0
>>   Transceiver: Unknown!
>>   Auto-negotiation: off
>>   Current message level: 0x0007 (7)
>>   Link detected: yes
>> [root@Twinville_VM_162 ~]# ethtool -i eth5
>> driver: ixgbevf
>> version: 2.4.0-NAPI
>> firmware-version: N/A
>> bus-info: :00:08.0
>> [root@Twinville_VM_162 ~]#
>> [root@Twinville_VM_162 ~]# ethtool -i eth6
>> driver: ixgbevf
>> version: 2.4.0-NAPI
>> firmware-version: N/A
>> bus-info: :00:09.0
>> [root@Twinville_VM_162 ~]#
>> [root@Twinville_VM_156 ~]# more  /etc/*release
>> ::
>> /etc/enterprise-release
>> ::
>> Enterprise Linux Enterprise Linux Server release 5.5 (Carthage)
>> ::
>> /etc/redhat-release
>> ::
>> Red Hat Enterprise Linux Server release 5.5 (Tikanga)
>> [root@Twinville_VM_156 ~]#
>>
>>
>> Thanks,
>>
>> David
>>
>> On 01/09/12 09:52, Rose, Gregory V wrote:
>>> David,
>>>
>>> We've found that if we use RHEL 5.5 as the guest then the bug still
>> occurs but if we upgrade the guest OS to RHEL 5.6 then it does not occur.
>> So it does not appear to be a driver bug.  It looks like some issue with
>> RHEL 5.5 and OEL 5.5.
>>> We suggest upgrading to 5.6.
>>>
>>> Thanks,
>>>
>>> - Greg
>>>
>>>> -Original Message-
>>>> From: Rose, Gregory V
>>>> Sent: Friday, January 06, 2012 4:32 PM
>>>> To: Rose, Gregory V; Kirsher, Jeffrey T; david.ye...@oracle.com
>>>> Cc: e1000-devel; Allan, Bruce W; Brandeburg, Jesse; Steve Sarvate
>>>> Subject: RE: Question about ixgbevf driver
>>>>
>>>> David,
>>>>
>>>> We've confirmed the bug and are looking into it.  We'll keep you
>

Re: [E1000-devel] Intel ixgbe: uninitialized variable use in ixgbe_non_sfp_link_config()

2012-01-09 Thread Skidmore, Donald C


>-Original Message-
>From: Kirsher, Jeffrey T
>Sent: Sunday, January 08, 2012 9:07 PM
>To: Jesper Juhl; Skidmore, Donald C
>Cc: e1000-devel Mailing List; David S. Miller; Brandeburg, Jesse; netdev
>Subject: Re: Intel ixgbe: uninitialized variable use in
>ixgbe_non_sfp_link_config()
>
>On Sun, 2012-01-08 at 22:21 +0100, Jesper Juhl wrote:
>> Hi
>>
>> In ixgbe_non_sfp_link_config(), the variable 'negotiation' is declared
>> without initializer and unless we take the true branch in the 'if
>> ((!autoneg) && (hw->mac.ops.get_link_capabilities))' statement it will
>> remain uninitialized when it is subsequently read in the 'ret =
>> hw->mac.ops.setup_link(hw, autoneg, negotiation, link_up)' statement.
>>
>> The test of 'ret' after the 'if ((!autoneg) &&
>> (hw->mac.ops.get_link_capabilities))' statement also looks fairly
>> pointless if we do not take the true branch, since then 'ret' will not
>> have been changed since the previous identical test.
>>
>> The correct fix escapes me since I don't really know this code (and
>don't
>> plan to spend the time to get to know it), but I thought I'd just
>report
>> what I had noticed and then someone else can hopefully come up with a
>good
>> fix :-)
>>
>>
>> PS. Please CC me on replies.
>>
>
>Adding netdev mailing list and Don Skidmore (ixgbe maintainer)
>Removed Bruce Allan (e1000e maintainer)
>
>I see the potential issue you are referring to, I will defer to Don to
>either explain the reasoning in the logic or suggest a fix.


Thanks for bringing this up, Jesper.

I've actually noticed this before and have it on my list of things that need 
refactoring.  I'll try to get to it much sooner now. :)

The good news is that this doesn't actually cause a problem, since none of the
functions that the setup_link pointer can point to actually use the
'negotiation' variable until it has been initialized within that same function.
This does beg the question "why do we even pass it in then?" Well, the short
answer is we shouldn't.  I plan on refactoring the code to remove it.  The only
reason I haven't done it yet is that we have other drivers that use this same
code, which makes coordinating the change a bit more complicated.

I'll also fix the redundant conditional around the goto while I'm at it.

Thanks again,
-Don Skidmore 
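
The pattern discussed in this thread can be reduced to a small stand-alone model. This is a sketch only: the function-pointer types and every name below are hypothetical stand-ins rather than the ixgbe code, and initializing the flag is just one way to make the later read well-defined. The plan stated above is different, namely to remove the parameter altogether. The sketch also moves the error test inside the branch, so it only runs right after the call that can change the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's function pointers. */
typedef int (*get_caps_fn)(uint32_t *autoneg, bool *negotiation);
typedef int (*setup_link_fn)(uint32_t autoneg, bool negotiation);

static int link_config_sketch(uint32_t advertised, get_caps_fn get_caps,
                              setup_link_fn setup_link)
{
    uint32_t autoneg = advertised;
    bool negotiation = false; /* defined even if the branch is skipped */
    int ret;

    assert(setup_link != NULL); /* this sketch requires a setup callback */

    if (!autoneg && get_caps) {
        ret = get_caps(&autoneg, &negotiation);
        if (ret) /* only meaningful right after the call */
            return ret;
    }
    return setup_link(autoneg, negotiation);
}

/* Demo callbacks that record what the setup step received. */
static uint32_t last_autoneg;
static bool last_negotiation;

static int demo_get_caps(uint32_t *autoneg, bool *negotiation)
{
    *autoneg = 0x20;
    *negotiation = true;
    return 0;
}

static int demo_setup_link(uint32_t autoneg, bool negotiation)
{
    last_autoneg = autoneg;
    last_negotiation = negotiation;
    return 0;
}
```

When a non-zero advertised value is passed in, the capability query is skipped and the setup callback receives the initialized default instead of an indeterminate value.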
___
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit 
http://communities.intel.com/community/wired


Re: [E1000-devel] Question about ixgbevf driver

2012-01-09 Thread Rose, Gregory V
David,

The results being reported are expected.

>   Supports auto-negotiation: No   ( We think this should be Yes )

Actually no...  The virtual function (VF) device does not do any 
auto-negotiation.  It requires the physical function (PF) device to do that on 
its behalf.

>   Advertised auto-negotiation: No ( We think this should be Yes )

Again, no.  The VF does no auto-negotiation.  It depends upon the PF device for 
that.

>   Port: Unknown! (255)  ( We think this should be Twisted Pair )

The VF has no knowledge of or need of such knowledge of the Phy port type.  
Therefore it is in fact unknown.

>   Transceiver: Unknown!  ( We think this should be external )

Same as previous - VF has no knowledge of it and shouldn't be concerned.

>   Auto-negotiation: off  ( We think this should be on )

And the same here.  The VF does not initiate, advertise or engage in 
auto-negotiation.  It can only report the link speed set by the PF device and 
whether the link is up and it does that.

The 82599 supports up to 63 VF devices.  If each VF was able to control and 
initiate auto-negotiation parameters it would be a real mess to manage.  Our 
controller doesn't allow that and doesn't even allow the VF to even see the 
information for security reasons.

- Greg

> -Original Message-
> From: David Yeung [mailto:david.ye...@oracle.com]
> Sent: Monday, January 09, 2012 11:45 AM
> To: Rose, Gregory V
> Cc: Kirsher, Jeffrey T; e1000-devel; Allan, Bruce W; Brandeburg, Jesse;
> Steve Sarvate
> Subject: Re: Question about ixgbevf driver
> 
> Greg,
> 
> Thank you for your help!
> In the OEL 5.5 guest domain, its ethtool reports strange results of
> virtual interfaces of Twinville:
>   Supports auto-negotiation: No   ( We think this should be Yes )
>   Advertised auto-negotiation: No ( We think this should be Yes )
>   Port: Unknown! (255)  ( We think this should be Twisted Pair )
>   Transceiver: Unknown!  ( We think this should be external )
>   Auto-negotiation: off  ( We think this should be on )
> 
> Do you see this problem in your guest OS OEL 5.6 and 5.5?
> 
> For details:
> ==
> eth5 and eth6 are virtual interfaces of Twinville
> 
> [root@Twinville_VM_156 ~]# ethtool eth5
> Settings for eth5:
>  Supported ports: [ ]
>  Supported link modes:   1baseT/Full
>  Supports auto-negotiation: No
>  Advertised link modes:  Not reported
>  Advertised auto-negotiation: No
>  Speed: 1Mb/s
>  Duplex: Full
>  Port: Unknown! (255)
>  PHYAD: 0
>  Transceiver: Unknown!
>  Auto-negotiation: off
>  Current message level: 0x0007 (7)
>  Link detected: yes
> [root@Twinville_VM_156 ~]#
> [root@Twinville_VM_156 ~]# ethtool eth6
> Settings for eth6:
>  Supported ports: [ ]
>  Supported link modes:   1baseT/Full
>  Supports auto-negotiation: No
>  Advertised link modes:  Not reported
>  Advertised auto-negotiation: No
>  Speed: 1Mb/s
>  Duplex: Full
>  Port: Unknown! (255)
>  PHYAD: 0
>  Transceiver: Unknown!
>  Auto-negotiation: off
>  Current message level: 0x0007 (7)
>  Link detected: yes
> [root@Twinville_VM_162 ~]# ethtool -i eth5
> driver: ixgbevf
> version: 2.4.0-NAPI
> firmware-version: N/A
> bus-info: :00:08.0
> [root@Twinville_VM_162 ~]#
> [root@Twinville_VM_162 ~]# ethtool -i eth6
> driver: ixgbevf
> version: 2.4.0-NAPI
> firmware-version: N/A
> bus-info: :00:09.0
> [root@Twinville_VM_162 ~]#
> [root@Twinville_VM_156 ~]# more  /etc/*release
> ::
> /etc/enterprise-release
> ::
> Enterprise Linux Enterprise Linux Server release 5.5 (Carthage)
> ::
> /etc/redhat-release
> ::
> Red Hat Enterprise Linux Server release 5.5 (Tikanga)
> [root@Twinville_VM_156 ~]#
> 
> 
> Thanks,
> 
> David
> 
> On 01/09/12 09:52, Rose, Gregory V wrote:
> > David,
> >
> > We've found that if we use RHEL 5.5 as the guest then the bug still
> occurs but if we upgrade the guest OS to RHEL 5.6 then it does not occur.
> So it does not appear to be a driver bug.  It looks like some issue with
> RHEL 5.5 and OEL 5.5.
> >
> > We suggest upgrading to 5.6.
> >
> > Thanks,
> >
> > - Greg
> >
> >> -Original Message-
> >> From: Rose, Gregory V
> >> Sent: Friday, January 06, 2012 4:32 PM
> >> To: Rose, Gregory V; Kirsher, Jeffrey T; david.ye...@oracle.com
> >> Cc: e1000-devel; Allan, Bruce W; Brandeburg, Jesse; Steve Sarvate
> >> Subject: RE: Question about ixgbevf driver
> >>
> >> David,
> >>
> >> We've confirmed the bug and are looking into it.  We'll keep you
> updated
> >> on what we find.
> >>
> >> Thanks,
> >>
> >> - Greg
> >>
> >>> -Original Message-
> >>> From: Rose, Gregory V [mailto:gregory.v.r...@intel.com]
> >>> Sent: Thursday, January 05, 2012 8:06 AM
> >>> To: Kirsher, Jeffrey T; david

Re: [E1000-devel] Question about ixgbevf driver

2012-01-09 Thread David Yeung
Greg,

Thank you for your help!
In the OEL 5.5 guest domain, its ethtool reports strange results of
virtual interfaces of Twinville:
  Supports auto-negotiation: No   ( We think this should be Yes )
  Advertised auto-negotiation: No ( We think this should be Yes )
  Port: Unknown! (255)  ( We think this should be Twisted Pair )
  Transceiver: Unknown!  ( We think this should be external )
  Auto-negotiation: off  ( We think this should be on )

Do you see this problem in your guest OS OEL 5.6 and 5.5?

For details:
==
eth5 and eth6 are virtual interfaces of Twinville

[root@Twinville_VM_156 ~]# ethtool eth5
Settings for eth5:
 Supported ports: [ ]
 Supported link modes:   1baseT/Full
 Supports auto-negotiation: No
 Advertised link modes:  Not reported
 Advertised auto-negotiation: No
 Speed: 1Mb/s
 Duplex: Full
 Port: Unknown! (255)
 PHYAD: 0
 Transceiver: Unknown!
 Auto-negotiation: off
 Current message level: 0x0007 (7)
 Link detected: yes
[root@Twinville_VM_156 ~]#
[root@Twinville_VM_156 ~]# ethtool eth6
Settings for eth6:
 Supported ports: [ ]
 Supported link modes:   1baseT/Full
 Supports auto-negotiation: No
 Advertised link modes:  Not reported
 Advertised auto-negotiation: No
 Speed: 1Mb/s
 Duplex: Full
 Port: Unknown! (255)
 PHYAD: 0
 Transceiver: Unknown!
 Auto-negotiation: off
 Current message level: 0x0007 (7)
 Link detected: yes
[root@Twinville_VM_162 ~]# ethtool -i eth5
driver: ixgbevf
version: 2.4.0-NAPI
firmware-version: N/A
bus-info: :00:08.0
[root@Twinville_VM_162 ~]#
[root@Twinville_VM_162 ~]# ethtool -i eth6
driver: ixgbevf
version: 2.4.0-NAPI
firmware-version: N/A
bus-info: :00:09.0
[root@Twinville_VM_162 ~]#
[root@Twinville_VM_156 ~]# more  /etc/*release
::
/etc/enterprise-release
::
Enterprise Linux Enterprise Linux Server release 5.5 (Carthage)
::
/etc/redhat-release
::
Red Hat Enterprise Linux Server release 5.5 (Tikanga)
[root@Twinville_VM_156 ~]#


Thanks,

David

On 01/09/12 09:52, Rose, Gregory V wrote:
> David,
>
> We've found that if we use RHEL 5.5 as the guest then the bug still occurs 
> but if we upgrade the guest OS to RHEL 5.6 then it does not occur.  So it 
> does not appear to be a driver bug.  It looks like some issue with RHEL 5.5 
> and OEL 5.5.
>
> We suggest upgrading to 5.6.
>
> Thanks,
>
> - Greg
>
>> -Original Message-
>> From: Rose, Gregory V
>> Sent: Friday, January 06, 2012 4:32 PM
>> To: Rose, Gregory V; Kirsher, Jeffrey T; david.ye...@oracle.com
>> Cc: e1000-devel; Allan, Bruce W; Brandeburg, Jesse; Steve Sarvate
>> Subject: RE: Question about ixgbevf driver
>>
>> David,
>>
>> We've confirmed the bug and are looking into it.  We'll keep you updated
>> on what we find.
>>
>> Thanks,
>>
>> - Greg
>>
>>> -Original Message-
>>> From: Rose, Gregory V [mailto:gregory.v.r...@intel.com]
>>> Sent: Thursday, January 05, 2012 8:06 AM
>>> To: Kirsher, Jeffrey T; david.ye...@oracle.com
>>> Cc: e1000-devel; Allan, Bruce W; Brandeburg, Jesse; Steve Sarvate
>>> Subject: Re: [E1000-devel] Question about ixgbevf driver
>>>
>>>> -Original Message-
>>>> From: Kirsher, Jeffrey T
>>>> Sent: Wednesday, January 04, 2012 7:58 PM
>>>> To: david.ye...@oracle.com; Rose, Gregory V
>>>> Cc: Brandeburg, Jesse; Allan, Bruce W; Steve Sarvate; e1000-devel
>>>> Subject: Re: Question about ixgbevf driver
>>>>
>>>> Adding e1000-devel mailing list as well as Greg Rose (ixgbevf
>>>> maintainer)...
>>>
>>> I've never noticed this before but then I can't say as how I was looking
>>> for it either.  Let me check it out and I'll get back to you.
>>>
>>> - Greg
>>>

>>>> On Wed, 2012-01-04 at 17:26 -0800, David Yeung wrote:
>>>>> Hi  Bruce/Jeffrey/Jesse/,
>>>>>
>>>>> How do you do?
>>>>> We are using the  ixgbevf  driver ( version: 2.4.0-NAPI ) to test the
>>>>> VLAN interfaces of Twinville NICs inside the OEL 5 virtual machine, it
>>>>> looks like the bi-directional network traffic ran properly on the VLAN
>>>>> interfaces of Twinville NICs  inside the OEL 5 virtual machine for
>>>>> hours, but the ifconfig command reports strange amount ( it is 0 all
>>>>> the time )  of  RX packets and RX bytes of VLAN interfaces of Twinville:
>>>>>
>>>>> [root@Twinville_VM_156 ~]# ifconfig
>>>>> .
>>>>> eth6.10   Link encap:Ethernet  HWaddr F2:54:11:05:72:7C
>>>>>           inet addr:192.6.10.156  Bcast:192.6.10.255  Mask:255.255.255.0
>>>>>           UP BROADCAST RUNNING MULTICAST  MTU:9210  Metric:1
>>>>>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>>>           TX packets:111332155 errors:0 dropped:0 overruns:0 carrier:0
>>>>>           collisions

Re: [E1000-devel] igb + balance-rr + bridge + IPv6 = no go without promiscuous mode

2012-01-09 Thread Chris Boot
On 09/01/2012 17:19, Wyborny, Carolyn wrote:
>
>
>> -Original Message-
>> From: Chris Boot [mailto:bo...@bootc.net]
>> Sent: Wednesday, January 04, 2012 8:58 AM
>> To: Wyborny, Carolyn
>> Cc: Nicolas de Pesloüan; netdev; e1000-devel@lists.sourceforge.net
>> Subject: Re: igb + balance-rr + bridge + IPv6 = no go without
>> promiscuous mode
>>
>> On 04/01/2012 16:00, Wyborny, Carolyn wrote:
>>>
>>>
 -Original Message-
 From: netdev-ow...@vger.kernel.org [mailto:netdev-
>> ow...@vger.kernel.org]
 On Behalf Of Wyborny, Carolyn
 Sent: Tuesday, January 03, 2012 3:24 PM
 To: Chris Boot; Nicolas de Pesloüan
 Cc: netdev; e1000-devel@lists.sourceforge.net
 Subject: RE: igb + balance-rr + bridge + IPv6 = no go without
 promiscuous mode



> -Original Message-
> From: netdev-ow...@vger.kernel.org [mailto:netdev-
 ow...@vger.kernel.org]
> On Behalf Of Chris Boot
> Sent: Tuesday, December 27, 2011 1:53 PM
> To: Nicolas de Pesloüan
> Cc: netdev
> Subject: Re: igb + balance-rr + bridge + IPv6 = no go without
> promiscuous mode
>
> On 23/12/2011 10:56, Chris Boot wrote:
>> On 23/12/2011 10:48, Nicolas de Pesloüan wrote:
>>> [ Forwarded to netdev, because two previous e-mail erroneously
>> sent
> in
>>> HTML ]
>>>
>>> Le 23/12/2011 11:15, Chris Boot a écrit :
 On 23/12/2011 09:52, Nicolas de Pesloüan wrote:
>
>
> Le 23 déc. 2011 10:42, "Chris Boot" >   a écrit :
>>
>> Hi folks,
>>
>> As per Eric Dumazet and Dave Miller, I'm opening up a separate
> thread on this issue.
>>
>> I have two identical servers in a cluster for running KVM
 virtual
> machines. They each have a
> single connection to the Internet (irrelevant for this) and two
> gigabit connections between each
> other for cluster replication, etc... These two connections are
>> in
> a
> balance-rr bonded connection,
> which is itself member of a bridge that the VMs attach to. I'm
> running v3.2-rc6-140-gb9e26df on
> Debian Wheezy.
>>
>> When the bridge is brought up, IPv4 works fine but IPv6 does
 not.
> I can use neither the
> automatic link-local on the bridge nor the static global
>> address
 I
> assign. Neither machine can
> perform neighbour discovery over the link until I put the bond
> members (eth0 and eth1) into
> promiscuous mode. I can do this either with tcpdump or 'ip link
 set
> dev ethX promisc on' and this
> is enough to make the link spring to life.
>
> For as far as I remember, setting bond0 to promisc should set
>> the
> bonding member to promisc too.
> And inserting bond0 into br0 should set bond0 to promisc... So
> everything should be in promisc
> mode anyway... but you shouldn't have to do it by hand.
>

 Sorry, I should have added that I tried this. Setting bond0 or
>> br0
> to
 promisc has no effect. I
 discovered this by running tcpdump on br0 first, then bond0, then
 eventually each bond member in
 turn. Only at the last stage did things jump to life.

>>
>> This cluster is not currently live so I can easily test patches
> and various configurations.
>
> Can you try to remove the bonding part, connecting eth0 and eth1
> directly to br0 and see if it
> works better? (This is a test only. I perfectly understand that
>> you
> would lose balance-rr in this
> setup.)
>

 Good call. Let's see.

 I took br0 and bond0 apart, took eth0 and eth1 out of enforced
 promisc mode, then manually built a
 br0 with eth0 in only so I didn't cause a network loop. Adding
>> eth0
 to br0 did not make it go into
 promisc mode, but IPv6 does work over this setup. I also made
>> sure
> ip
 -6 neigh was empty on both
 machines before I started.

 I then decided to try the test with just the bond0 in balance-rr
 mode. Once again I took everything
 down and ensured no promisc mode and no ip -6 neigh. I noticed
 bond0
 wasn't getting a link-local and
 I found out for some reason
 /proc/sys/net/ipv6/conf/bond0/disable_ipv6 was set on both
>> servers
> so I
 set it to 0. That brought things to life.

 So then I put it all back together again and it didn't work. I
>> once
 again noticed disable_ipv6 was
 set on the bond0 interfaces, now part of the bridge. Toggling
>> this
> on
 the _bond_ interface made
 things work again.

 What's setting disable

Re: [E1000-devel] Question about ixgbevf driver

2012-01-09 Thread Rose, Gregory V
David,

We've found that if we use RHEL 5.5 as the guest then the bug still occurs but 
if we upgrade the guest OS to RHEL 5.6 then it does not occur.  So it does not 
appear to be a driver bug.  It looks like some issue with RHEL 5.5 and OEL 5.5.

We suggest upgrading to 5.6.

Thanks,

- Greg

> -Original Message-
> From: Rose, Gregory V
> Sent: Friday, January 06, 2012 4:32 PM
> To: Rose, Gregory V; Kirsher, Jeffrey T; david.ye...@oracle.com
> Cc: e1000-devel; Allan, Bruce W; Brandeburg, Jesse; Steve Sarvate
> Subject: RE: Question about ixgbevf driver
> 
> David,
> 
> We've confirmed the bug and are looking into it.  We'll keep you updated
> on what we find.
> 
> Thanks,
> 
> - Greg
> 
> > -Original Message-
> > From: Rose, Gregory V [mailto:gregory.v.r...@intel.com]
> > Sent: Thursday, January 05, 2012 8:06 AM
> > To: Kirsher, Jeffrey T; david.ye...@oracle.com
> > Cc: e1000-devel; Allan, Bruce W; Brandeburg, Jesse; Steve Sarvate
> > Subject: Re: [E1000-devel] Question about ixgbevf driver
> >
> > > -Original Message-
> > > From: Kirsher, Jeffrey T
> > > Sent: Wednesday, January 04, 2012 7:58 PM
> > > To: david.ye...@oracle.com; Rose, Gregory V
> > > Cc: Brandeburg, Jesse; Allan, Bruce W; Steve Sarvate; e1000-devel
> > > Subject: Re: Question about ixgbevf driver
> > >
> > > Adding e1000-devel mailing list as well as Greg Rose (ixgbevf
> > > maintainer)...
> >
> > I've never noticed this before but then I can't say as how I was looking
> > for it either.  Let me check it out and I'll get back to you.
> >
> > - Greg
> >
> > >
> > > On Wed, 2012-01-04 at 17:26 -0800, David Yeung wrote:
> > > > Hi  Bruce/Jeffrey/Jesse/,
> > > >
> > > > How do you do?
> > > > > We are using the  ixgbevf  driver ( version: 2.4.0-NAPI ) to test the
> > > > > VLAN interfaces of Twinville NICs inside the OEL 5 virtual machine, it
> > > > > looks like the bi-directional network traffic ran properly on the VLAN
> > > > > interfaces of Twinville NICs  inside the OEL 5 virtual machine for
> > > > > hours, but the ifconfig command reports strange amount ( it is 0 all
> > > > > the time )  of  RX packets and RX bytes of VLAN interfaces of Twinville:
> > > > >
> > > > > [root@Twinville_VM_156 ~]# ifconfig
> > > > > .
> > > > > eth6.10   Link encap:Ethernet  HWaddr F2:54:11:05:72:7C
> > > > >           inet addr:192.6.10.156  Bcast:192.6.10.255  Mask:255.255.255.0
> > > > >           UP BROADCAST RUNNING MULTICAST  MTU:9210  Metric:1
> > > > >           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > > > >           TX packets:111332155 errors:0 dropped:0 overruns:0 carrier:0
> > > > >           collisions:0 txqueuelen:0
> > > > >           RX bytes:0 (0.0 b)  TX bytes:2565278846027 (2.3 TiB)
> > > > >
> > > > > eth6.11   Link encap:Ethernet  HWaddr F2:54:11:05:72:7C
> > > > >           inet addr:192.6.11.156  Bcast:192.6.11.255  Mask:255.255.255.0
> > > > >           UP BROADCAST RUNNING MULTICAST  MTU:9210  Metric:1
> > > > >           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > > > >           TX packets:111289098 errors:0 dropped:0 overruns:0 carrier:0
> > > > >           collisions:0 txqueuelen:0
> > > > >           RX bytes:0 (0.0 b)  TX bytes:2562582125939 (2.3 TiB)
> > > > >
> > > > > eth6.12   Link encap:Ethernet  HWaddr F2:54:11:05:72:7C
> > > > >           inet addr:192.6.12.156  Bcast:192.6.12.255  Mask:255.255.255.0
> > > > >           UP BROADCAST RUNNING MULTICAST  MTU:9210  Metric:1
> > > > >           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > > > >           TX packets:111287930 errors:0 dropped:0 overruns:0 carrier:0
> > > > >           collisions:0 txqueuelen:0
> > > > >           RX bytes:0 (0.0 b)  TX bytes:2564949229588 (2.3 TiB)
> > > > >
> > > > > eth6.13   Link encap:Ethernet  HWaddr F2:54:11:05:72:7C
> > > > >           inet addr:192.6.93.156  Bcast:192.6.93.255  Mask:255.255.255.0
> > > > >           UP BROADCAST RUNNING MULTICAST  MTU:9210  Metric:1
> > > > >           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > > > >           TX packets:111070858 errors:0 dropped:0 overruns:0 carrier:0
> > > > >           collisions:0 txqueuelen:0
> > > > >           RX bytes:0 (0.0 b)  TX bytes:2566358628283 (2.3 TiB)
> > > > >
> > > > > eth6.14   Link encap:Ethernet  HWaddr F2:54:11:05:72:7C
> > > > >           inet addr:192.6.14.156  Bcast:192.6.14.255  Mask:255.255.255.0
> > > > >           UP BROADCAST RUNNING MULTICAST  MTU:9210  Metric:1
> > > > >           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > > > >           TX packets:111362848 errors:0 dropped:0 overruns:0 carrier:0
> > > > >           collisions:0 txqueuelen:0
> > > > >           RX bytes:0 (0.0 b)  TX bytes:2566349976603 (2.3 TiB)
> > > > .
> > > >
> > > > [root@Twinville_VM_156 ~]# ethtool -i eth6
> > > > driver: ixgbevf
> > > > version: 2.4.0-NAPI
> > > > firmware-version: N/A
> > > > bus-info: :00:09.0
>

Re: [E1000-devel] [PATCH net-next V3 1/2] igb: add PTP Hardware Clock code

2012-01-09 Thread Keller, Jacob E
> -Original Message-
> From: Richard Cochran [mailto:richardcoch...@gmail.com]
> Sent: Saturday, January 07, 2012 11:38 AM
> To: net...@vger.kernel.org
> Cc: e1000-devel@lists.sourceforge.net; Keller, Jacob E; Kirsher,
> Jeffrey T; Ronciak, John; John Stultz; Thomas Gleixner
> Subject: [PATCH net-next V3 1/2] igb: add PTP Hardware Clock code
> 
> This patch adds a source file implementing a PHC. Only the basic
> clock operations have been implemented, although the hardware
> would offer some ancillary functions. The code is fairly self
> contained and is not yet used in the main igb driver.
> 
> Every timestamp and clock read operation must consult the overflow
> counter to form a correct time value. Access to the counter is
> protected by a spin lock, and this would be sufficient, assuming that
> the time values are monotonic.
> 
> However, this assumption does not hold in general. Consider the
> following sequence.
> 
>  1. Hardware latches a receive timestamp (just before counter
> overflow).
>  2. User calls clock_gettime() (just after counter overflow).
>  3. User takes lock, detects 1-0 transition, and increments overflow
> count.
>  4. Driver takes lock, incorrectly combines overflow count with
> timestamp.
> 
> A very similar race condition exists between two nearly simultaneous
> hardware timestamps.
>
> The implementation detects this race by checking the two most
> significant bits for a transition from 11b to 00b. When this pattern
> is detected, the code uses the previous value of the overflow count.
> 
> The two most significant bits divide the overflow period into four
> regions, and so the watchdog timeout is set to sample the counter at
> just over four times per period.
>
Is there a reason for not using the kernel's timecounter structure? It is the
layer beneath the timecompare code and is meant to handle this condition. As far
as I can tell, this issue is already solved in the timecounter code; if it is
not, that would be a bug in the timecounter/cyclecounter code. I don't know
whether the issue occurs with the timecounter structure, because it handles the
ns conversion differently.
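
For reference, the overflow handling described in the quoted patch text could be sketched roughly as below. This is only an illustration and not the code from the patch: the struct and function names are hypothetical (the two fields mirror the names in the igb.h hunk), a 32-bit raw counter is assumed purely to keep the example short, and the caller is assumed to hold the lock and to sample the counter at least four times per overflow period, as the watchdog in the patch description does.

```c
#include <stdint.h>

/* Hypothetical state; field names follow the quoted igb.h hunk. */
struct ovf_state {
	uint32_t overflow_counter; /* number of wraps observed so far */
	unsigned int last_msb;     /* top two bits of the newest raw value */
};

/* Start tracking from a first raw counter reading. */
static void ovf_init(struct ovf_state *s, uint32_t raw)
{
	s->overflow_counter = 0;
	s->last_msb = raw >> 30;
}

/*
 * Extend a 32-bit raw counter value to 64 bits.
 * Assumes the caller holds the lock protecting *s and that the counter
 * is sampled in every quarter of the overflow period.
 */
static uint64_t ovf_extend(struct ovf_state *s, uint32_t raw)
{
	unsigned int msb = raw >> 30;
	uint32_t ovf = s->overflow_counter;

	if (s->last_msb == 3 && msb == 0) {
		/* 11b -> 00b: the counter wrapped since the last sample. */
		ovf = ++s->overflow_counter;
		s->last_msb = msb;
	} else if (s->last_msb == 0 && msb == 3) {
		/*
		 * Value latched just before a wrap that has already been
		 * counted: pair it with the previous overflow count and
		 * leave last_msb alone so the wrap is not counted twice.
		 */
		ovf = s->overflow_counter - 1;
	} else {
		s->last_msb = msb;
	}
	return ((uint64_t)ovf << 32) | raw;
}
```

With this sketch, a timestamp latched just before the wrap but processed after a post-wrap clock read still maps to a time earlier than that read, which is the ordering the four-step race in the patch description would otherwise break.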

 
> Signed-off-by: Richard Cochran 
> ---
>  drivers/net/ethernet/intel/igb/igb.h |8 +
>  drivers/net/ethernet/intel/igb/igb_ptp.c |  427
> ++
>  2 files changed, 435 insertions(+), 0 deletions(-)
>  create mode 100644 drivers/net/ethernet/intel/igb/igb_ptp.c
> 
> diff --git a/drivers/net/ethernet/intel/igb/igb.h
> b/drivers/net/ethernet/intel/igb/igb.h
> index c69feeb..f30458d 100644
> --- a/drivers/net/ethernet/intel/igb/igb.h
> +++ b/drivers/net/ethernet/intel/igb/igb.h
> @@ -37,6 +37,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
> 
> @@ -364,6 +365,13 @@ struct igb_adapter {
>   u32 wvbr;
>   int node;
>   u32 *shadow_vfta;
> +
> + struct ptp_clock *ptp_clock;
> + struct ptp_clock_info caps;
> + struct delayed_work overflow_work;
> + spinlock_t tmreg_lock;
> + u32 overflow_counter;
> + unsigned int last_msb;
>  };
> 
>  #define IGB_FLAG_HAS_MSI   (1 << 0)
> diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c
> b/drivers/net/ethernet/intel/igb/igb_ptp.c
> new file mode 100644
> index 000..dd27be1
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
> @@ -0,0 +1,427 @@
> +/*
> + * PTP Hardware Clock (PHC) driver for the Intel 82576 and 82580
> + *
> + * Copyright (C) 2011 Richard Cochran 
> + *
> + * This program is free software; you can redistribute it and/or
> modify
> + * it under the terms of the GNU General Public License as published
> by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> along
> + * with this program; if not, write to the Free Software Foundation,
> Inc.,
> + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
> + */
> +#include 
> +#include 
> +
> +#include "igb.h"
> +
> +#define INCVALUE_MASK 0x7fff
> +#define ISGN 0x8000
> +
> +/*
> + * Neither the 82576 nor the 82580 offer registers wide enough to hold
> + * nanoseconds time values for very long. For the 82580, SYSTIM always
> + * counts nanoseconds, but the upper 24 bits are not available. The
> + * frequency is adjusted by changing the 32 bit fractional nanoseconds
> + * register, TIMINCA.
> + *
> + * For the 82576, the SYSTIM register time unit is affected by the
> + * choice of the 24 bit TIMINCA:IV (incvalue) field. Five bits of this
> + * field are needed to provide the nominal 16 nanosecond period,
> + * leaving 19 bits for fractiona

Re: [E1000-devel] Linux igb issue

2012-01-09 Thread Azher Mughal
Hi Carolyn,

Please see comments inline:

On 1/9/2012 7:54 AM, Wyborny, Carolyn wrote:
>
>> -Original Message-
>> From: Azher Mughal [mailto:az...@hep.caltech.edu]
>> Sent: Saturday, January 07, 2012 6:59 AM
>> To: Kirsher, Jeffrey T; Wyborny, Carolyn
>> Subject: Linux igb issue
>>
>> Hi All,
>>
>> I downloaded the latest driver from sourceforge ver 3.3.6 and it has
>> your patch. Removing the older driver and installing a new one on
>> Scientific Linux 6.1 (kernel 2.6.39.4) still gives the error. Some Info
>> is below. Any suggestions how to fix it ?
>>
>> Thanks
>> -Azher
>>
>> SuperMicro SandyBridge based Server.
>>
>> # uname -a
>> Linux sc33-sm-g3-2 2.6.39.4 #1 SMP PREEMPT Fri Jan 6 10:28:06 PST 2012
>> x86_64 x86_64 x86_64 GNU/Linux
>>
>> Intel(R) Gigabit Ethernet Network Driver - version 3.3.6
>> Copyright (c) 2007-2011 Intel Corporation.
>> igb :04:00.0: PCI INT A -> GSI 27 (level, low) -> IRQ 27
>> igb :04:00.0: setting latency timer to 64
>> igb :04:00.0: irq 129 for MSI/MSI-X
>> igb :04:00.0: irq 130 for MSI/MSI-X
>> igb :04:00.0: The NVM Checksum Is Not Valid
>> igb :04:00.0: PCI INT A disabled
>> igb: probe of :04:00.0 failed with error -5
>> igb :04:00.1: PCI INT B -> GSI 30 (level, low) -> IRQ 30
>> igb :04:00.1: setting latency timer to 64
>> igb :04:00.1: irq 129 for MSI/MSI-X
>> igb :04:00.1: irq 130 for MSI/MSI-X
>> igb :04:00.1: The NVM Checksum Is Not Valid
>> igb :04:00.1: PCI INT B disabled
>> igb: probe of :04:00.1 failed with error -5
>> Intel(R) Gigabit Ethernet Network Driver - version 3.3.6
>> Copyright (c) 2007-2011 Intel Corporation.
>> igb :04:00.0: PCI INT A -> GSI 27 (level, low) -> IRQ 27
>> igb :04:00.0: setting latency timer to 64
>> igb :04:00.0: irq 129 for MSI/MSI-X
>> igb :04:00.0: irq 130 for MSI/MSI-X
>> igb :04:00.0: The NVM Checksum Is Not Valid
>> igb :04:00.0: PCI INT A disabled
>> igb: probe of :04:00.0 failed with error -5
>> igb :04:00.1: PCI INT B -> GSI 30 (level, low) -> IRQ 30
>> igb :04:00.1: setting latency timer to 64
>> igb :04:00.1: irq 129 for MSI/MSI-X
>> igb :04:00.1: irq 130 for MSI/MSI-X
>> igb :04:00.1: The NVM Checksum Is Not Valid
>> igb :04:00.1: PCI INT B disabled
>> igb: probe of :04:00.1 failed with error -5
>>
>> lspci
>> =
>> 04:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network
>> Connection (rev 01)
>>Flags: fast devsel, IRQ 27
>>Memory at dfd2 (32-bit, non-prefetchable) [size=128K]
>>I/O ports at 7020 [size=32]
>>Memory at dfdc4000 (32-bit, non-prefetchable) [size=16K]
>>Capabilities: [40] Power Management version 3
>>Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>>Capabilities: [70] MSI-X: Enable- Count=10 Masked-
>>Capabilities: [a0] Express Endpoint, MSI 00
>>Capabilities: [e0] Vital Product Data
>>Capabilities: [100] Advanced Error Reporting
>>Capabilities: [140] Device Serial Number 00-25-90-ff-ff-0f-f0-1e
>>Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
>>Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
>>Capabilities: [1a0] #17
>>Capabilities: [1c0] #18
>>Capabilities: [1d0] Access Control Services
>>Kernel modules: igb
>>
>> 04:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network
>> Connection (rev 01)
>>Flags: fast devsel, IRQ 30
>>Memory at dfd0 (32-bit, non-prefetchable) [size=128K]
>>I/O ports at 7000 [size=32]
>>Memory at dfdc (32-bit, non-prefetchable) [size=16K]
>>Capabilities: [40] Power Management version 3
>>Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>>Capabilities: [70] MSI-X: Enable- Count=10 Masked-
>>Capabilities: [a0] Express Endpoint, MSI 00
>>Capabilities: [e0] Vital Product Data
>>Capabilities: [100] Advanced Error Reporting
>>Capabilities: [140] Device Serial Number 00-25-90-ff-ff-0f-f0-1e
>>Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
>>Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
>>Capabilities: [1a0] #17
>>Capabilities: [1d0] Access Control Services
>>Kernel modules: igb
>>
>>
>>
> What version of the driver were you using before the update?
Scientific Linux 6.1 (all updated)

# modinfo igb
filename:   /lib/modules/2.6.39.4/kernel/drivers/net/igb/igb.ko
version:3.0.6-k2
license:GPL
description:Intel(R) Gigabit Ethernet Network Driver
author: Intel Corporation, 
srcversion: 928BEC232C2A15E5BA94453
..
depends:dca
vermagic:   2.6.39.4 SMP preempt mod_unload modversions
parm:   max_vfs:Maximum number of virtual functions to allocate
per physical function (uint)

>   Do you only have two i350 devices in your system or do you really have four?
The

Re: [E1000-devel] igb + balance-rr + bridge + IPv6 = no go without promiscuous mode

2012-01-09 Thread Wyborny, Carolyn


>-Original Message-
>From: Chris Boot [mailto:bo...@bootc.net]
>Sent: Wednesday, January 04, 2012 8:58 AM
>To: Wyborny, Carolyn
>Cc: Nicolas de Pesloüan; netdev; e1000-devel@lists.sourceforge.net
>Subject: Re: igb + balance-rr + bridge + IPv6 = no go without
>promiscuous mode
>
>On 04/01/2012 16:00, Wyborny, Carolyn wrote:
>>
>>
>>> -Original Message-
>>> From: netdev-ow...@vger.kernel.org [mailto:netdev-
>ow...@vger.kernel.org]
>>> On Behalf Of Wyborny, Carolyn
>>> Sent: Tuesday, January 03, 2012 3:24 PM
>>> To: Chris Boot; Nicolas de Pesloüan
>>> Cc: netdev; e1000-devel@lists.sourceforge.net
>>> Subject: RE: igb + balance-rr + bridge + IPv6 = no go without
>>> promiscuous mode
>>>
>>>
>>>
 -Original Message-
 From: netdev-ow...@vger.kernel.org [mailto:netdev-
>>> ow...@vger.kernel.org]
 On Behalf Of Chris Boot
 Sent: Tuesday, December 27, 2011 1:53 PM
 To: Nicolas de Pesloüan
 Cc: netdev
 Subject: Re: igb + balance-rr + bridge + IPv6 = no go without
 promiscuous mode

 On 23/12/2011 10:56, Chris Boot wrote:
> On 23/12/2011 10:48, Nicolas de Pesloüan wrote:
>> [ Forwarded to netdev, because two previous e-mails were erroneously sent in
>> HTML ]
>>
>> On 23/12/2011 11:15, Chris Boot wrote:
>>> On 23/12/2011 09:52, Nicolas de Pesloüan wrote:


 On 23 Dec 2011 10:42, "Chris Boot" wrote:
>
> Hi folks,
>
> As per Eric Dumazet and Dave Miller, I'm opening up a separate thread on
> this issue.
>
> I have two identical servers in a cluster for running KVM virtual machines.
> They each have a single connection to the Internet (irrelevant for this) and
> two gigabit connections between each other for cluster replication, etc...
> These two connections are in a balance-rr bonded connection, which is itself
> a member of a bridge that the VMs attach to. I'm running v3.2-rc6-140-gb9e26df
> on Debian Wheezy.
>
> When the bridge is brought up, IPv4 works fine but IPv6 does not. I can use
> neither the automatic link-local on the bridge nor the static global address
> I assign. Neither machine can perform neighbour discovery over the link until
> I put the bond members (eth0 and eth1) into promiscuous mode. I can do this
> either with tcpdump or 'ip link set dev ethX promisc on' and this is enough
> to make the link spring to life.

 As far as I remember, setting bond0 to promisc should set the bonding members
 to promisc too. And inserting bond0 into br0 should set bond0 to promisc... So
 everything should be in promisc mode anyway... but you shouldn't have to do it
 by hand.

>>>
>>> Sorry, I should have added that I tried this. Setting bond0 or br0 to
>>> promisc has no effect. I discovered this by running tcpdump on br0 first,
>>> then bond0, then eventually each bond member in turn. Only at the last stage
>>> did things jump to life.
>>>
>
> This cluster is not currently live so I can easily test patches and various
> configurations.

 Can you try to remove the bonding part, connecting eth0 and eth1 directly to
 br0, and see if it works better? (This is a test only. I perfectly understand
 that you would lose balance-rr in this setup.)

>>>
>>> Good call. Let's see.
>>>
>>> I took br0 and bond0 apart, took eth0 and eth1 out of enforced promisc mode,
>>> then manually built a br0 with eth0 in only so I didn't cause a network
>>> loop. Adding eth0 to br0 did not make it go into promisc mode, but IPv6 does
>>> work over this setup. I also made sure ip -6 neigh was empty on both
>>> machines before I started.
>>>
>>> I then decided to try the test with just the bond0 in balance-rr mode. Once
>>> again I took everything down and ensured no promisc mode and no ip -6 neigh.
>>> I noticed bond0 wasn't getting a link-local and I found out for some reason
>>> /proc/sys/net/ipv6/conf/bond0/disable_ipv6 was set on both servers so I set
>>> it to 0. That brought things to life.
>>>
>>> So then I put it all back together again and it didn't work. I once again
>>> noticed disable_ipv6 was set on the bond0 interfaces, now part of the
>>> bridge. Toggling this on the _bond_ interface made things work again.
>>>
>>> What's setting disable_ipv6? Should this be having an impact if the port is
>>> part of a bridge?
>
> Hmm, as a further update... I brought up my VMs on the bridge with
> disable_ipv6 turned off. The VMs on one hos

Re: [E1000-devel] Problem with 82575GB Card and packets/s capabilities.

2012-01-09 Thread Wyborny, Carolyn


>-Original Message-
>From: Zoop [mailto:junkm...@lone.ath.cx]
>Sent: Tuesday, December 20, 2011 7:07 AM
>To: e1000-devel@lists.sourceforge.net
>Subject: [E1000-devel] Problem with 82575GB Card and packets/s
>capabilities.
>
>I have a Dell 710 server using two 82575GB Intel Quad cards.  I am using two
>interfaces on each card and using the bonding driver within Linux.  I have run
>into a problem where it seems that when one of the interfaces in the bonding
>pair hits about 116,000 packets/s, it is unable to handle more traffic.  The
>odd thing is that I don't really see any interface errors via ethtool etc.
>
>What I do notice is that when it does this, the system load avg goes above 1.
>I have seen one of the ksoftirqd processes sit at about 70%.  Sometimes a
>restart of irqbalance will cause this to move or clear the issue.  I am not
>familiar enough with the irq affinity settings to play with them myself yet.
>
>I would expect to be able to get a lot more out of the cards, so I am assuming
>something is wrong with my setup.  I am using the default settings for the igb
>driver and have made some adjustments to the kernel itself.  Generally most of
>the packets through the system are being routed.
>
>Settings that I changed via /proc (sysctl):
>net.core.optmem_max = 20480
># Increase number of incoming connections backlog
>net.core.netdev_max_backlog = 4000
>net.core.dev_weight = 64
># # Bump up default r/wmem to max
>net.core.rmem_default = 262141
>net.core.wmem_default = 262141
># # Bump up max r/wmem
>net.core.rmem_max = 262141
>net.core.wmem_max = 262141
>
>This is a dual quad-core 2.4 GHz Intel system.
>
>
>Kernel messages for one of the interfaces:
>[5.404672] igb :0d:00.0: Intel(R) Gigabit Ethernet Network Connection
>[5.404677] igb :0d:00.0: eth6: (PCIe:2.5Gb/s:Width x4) 00:1b:21:14:34:54
>[5.404762] igb :0d:00.0: eth6: PBA No: E34573-001
>[5.404765] igb :0d:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
>[5.404796] igb :0d:00.1: PCI INT B -> GSI 50 (level, low) -> IRQ 50
>[5.404807] igb :0d:00.1: setting latency timer to 64
>[5.405636] igb :0d:00.1: irq 155 for MSI/MSI-X
>[5.405645] igb :0d:00.1: irq 156 for MSI/MSI-X
>[5.405655] igb :0d:00.1: irq 157 for MSI/MSI-X
>[5.405665] igb :0d:00.1: irq 158 for MSI/MSI-X
>[5.405675] igb :0d:00.1: irq 159 for MSI/MSI-X
>[5.405684] igb :0d:00.1: irq 160 for MSI/MSI-X
>[5.405694] igb :0d:00.1: irq 161 for MSI/MSI-X
>[5.405703] igb :0d:00.1: irq 162 for MSI/MSI-X
>[5.405720] igb :0d:00.1: irq 163 for MSI/MSI-X
>
>
>Some vmstat output
>
>procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r  b   swpd    free   buff   cache   si   so    bi    bo     in   cs us sy id wa
> 1  0    264 2018016 474692 4344560    0    0     0     0 180402 1209  0  4 95  0
> 1  0    264 2018296 474692 4344616    0    0     0     0 189968 1186  0  3 97  0
> 0  0    264 2017896 474692 4344668    0    0     0     0 183079 1345  0  3 97  0
> 0  0    264 2017892 474692 4344728    0    0     0    16 189246 1161  0  7 93  0
> 1  0    264 2018076 474692 4344772    0    0     0     0 177470 1013  0  6 94  0
> 0  0    264 2017972 474692 4344820    0    0     0     0 185231 1145  0  3 97  0
> 1  0    264 2018040 474692 4344864    0    0     0    16 165925 1577  0  7 93  0
> 1  0    264 2017820 474692 4344932    0    0     0     0 173961 1245  0  7 93  0
> 1  0    264 2017812 474692 4344988    0    0     0    20 184350 1282  0  3 97  0
> 0  0    264 2018040 474692 4345048    0    0     0     0 177809 1185  0  3 97  0
> 0  0    264 2017864 474692 4345108    0    0     0     0 181536 1217  0  3 97  0
> 0  0    264 2017860 474692 4345164    0    0     0     0 181901 1143  0  4 96  0
> 0  0    264 2017704 474692 4345216    0    0     0     0 170947 1233  0  9 90  0
> 1  0    264 2017660 474692 4345272    0    0     0    20 173678 1087  0  6 94  0
> 1  0    264 2016904 474692 4345928    0    0     0  1036 184045 1338  0  5 95  0
> 1  0    264 2010440 474692 4351944    0    0     0     0 179177 1356  0  7 92  0
> 1  0    264 2010300 474692 4352624    0    0     0   904 167765 1321  0  9 91  0
> 1  0    264 2010460 474692 4352068    0    0     0   148 195913 1262  0  4 96  0
> 0  0    264 2010464 474692 4352136    0    0     0     8 192341 1943  0  4 96  0
> 0  0    264 2010412 474692 4352220    0    0     0     0 179244 1673  0  4 95  0
> 0  0    264 2010516 474692 4352292    0    0     0     0 168384 1335  0  5 95  0
> 0  0    264 2010020 474692 4352364    0    0     0    16 188664 1350  0  3 97  0
>
>
>If you think the graphs and data from the interface and load on the server
>would be helpful, let me know and I will figure out how to make them
>available to you.
>
>Let me know what other information I can provide here.
>
>I did see this post and it looks like it may be related to what I am doing,
>but the thread seems not to have reached a conclusion.
>
>http://comments.gmane.org/gmane.linux.drivers.e1000.devel/7221
>
>Thank you.
>
>
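
Since IRQ affinity came up in the report above, here is a small sketch of how the value written to /proc/irq/<irq>/smp_affinity is formed: it is a hexadecimal bitmask with one bit per CPU. The helper names are hypothetical, the round-robin spreading is just one possible policy, and masks wider than 32 CPUs are deliberately not handled. Note that irqbalance, if left running, will typically overwrite manual settings.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format the hex CPU bitmask expected by /proc/irq/<irq>/smp_affinity
 * for a single CPU. Hypothetical helper; handles CPUs 0..31 only.
 * Returns 0 on success, -1 on bad input or a too-small buffer.
 */
static int affinity_mask_for_cpu(unsigned int cpu, char *buf, size_t len)
{
	int n;

	if (cpu >= 32 || buf == NULL)
		return -1;
	n = snprintf(buf, len, "%x", 1u << cpu);
	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Pick a CPU for a queue vector by simple round-robin (one policy of many). */
static unsigned int cpu_for_queue(unsigned int queue, unsigned int ncpus)
{
	return ncpus ? queue % ncpus : 0;
}
```

On the 8-CPU system described above, the vector for queue 2 would map to CPU 2, giving the mask "4", which could then be written to the smp_affinity file of the corresponding IRQ from the kernel log.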

Re: [E1000-devel] Linux igb issue

2012-01-09 Thread Wyborny, Carolyn


>-Original Message-
>From: Azher Mughal [mailto:az...@hep.caltech.edu]
>Sent: Saturday, January 07, 2012 6:59 AM
>To: Kirsher, Jeffrey T; Wyborny, Carolyn
>Subject: Linux igb issue
>
>Hi All,
>
>I downloaded the latest driver from sourceforge ver 3.3.6 and it has
>your patch. Removing the older driver and installing a new one on
>Scientific Linux 6.1 (kernel 2.6.39.4) still gives the error. Some Info
>is below. Any suggestions how to fix it ?
>
>Thanks
>-Azher
>
>SuperMicro SandyBridge based Server.
>
># uname -a
>Linux sc33-sm-g3-2 2.6.39.4 #1 SMP PREEMPT Fri Jan 6 10:28:06 PST 2012
>x86_64 x86_64 x86_64 GNU/Linux
>
>Intel(R) Gigabit Ethernet Network Driver - version 3.3.6
>Copyright (c) 2007-2011 Intel Corporation.
>igb :04:00.0: PCI INT A -> GSI 27 (level, low) -> IRQ 27
>igb :04:00.0: setting latency timer to 64
>igb :04:00.0: irq 129 for MSI/MSI-X
>igb :04:00.0: irq 130 for MSI/MSI-X
>igb :04:00.0: The NVM Checksum Is Not Valid
>igb :04:00.0: PCI INT A disabled
>igb: probe of :04:00.0 failed with error -5
>igb :04:00.1: PCI INT B -> GSI 30 (level, low) -> IRQ 30
>igb :04:00.1: setting latency timer to 64
>igb :04:00.1: irq 129 for MSI/MSI-X
>igb :04:00.1: irq 130 for MSI/MSI-X
>igb :04:00.1: The NVM Checksum Is Not Valid
>igb :04:00.1: PCI INT B disabled
>igb: probe of :04:00.1 failed with error -5
>Intel(R) Gigabit Ethernet Network Driver - version 3.3.6
>Copyright (c) 2007-2011 Intel Corporation.
>igb :04:00.0: PCI INT A -> GSI 27 (level, low) -> IRQ 27
>igb :04:00.0: setting latency timer to 64
>igb :04:00.0: irq 129 for MSI/MSI-X
>igb :04:00.0: irq 130 for MSI/MSI-X
>igb :04:00.0: The NVM Checksum Is Not Valid
>igb :04:00.0: PCI INT A disabled
>igb: probe of :04:00.0 failed with error -5
>igb :04:00.1: PCI INT B -> GSI 30 (level, low) -> IRQ 30
>igb :04:00.1: setting latency timer to 64
>igb :04:00.1: irq 129 for MSI/MSI-X
>igb :04:00.1: irq 130 for MSI/MSI-X
>igb :04:00.1: The NVM Checksum Is Not Valid
>igb :04:00.1: PCI INT B disabled
>igb: probe of :04:00.1 failed with error -5
>
>lspci
>=
>04:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network
>Connection (rev 01)
>Flags: fast devsel, IRQ 27
>Memory at dfd2 (32-bit, non-prefetchable) [size=128K]
>I/O ports at 7020 [size=32]
>Memory at dfdc4000 (32-bit, non-prefetchable) [size=16K]
>Capabilities: [40] Power Management version 3
>Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>Capabilities: [70] MSI-X: Enable- Count=10 Masked-
>Capabilities: [a0] Express Endpoint, MSI 00
>Capabilities: [e0] Vital Product Data
>Capabilities: [100] Advanced Error Reporting
>Capabilities: [140] Device Serial Number 00-25-90-ff-ff-0f-f0-1e
>Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
>Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
>Capabilities: [1a0] #17
>Capabilities: [1c0] #18
>Capabilities: [1d0] Access Control Services
>Kernel modules: igb
>
>04:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network
>Connection (rev 01)
>Flags: fast devsel, IRQ 30
>Memory at dfd0 (32-bit, non-prefetchable) [size=128K]
>I/O ports at 7000 [size=32]
>Memory at dfdc (32-bit, non-prefetchable) [size=16K]
>Capabilities: [40] Power Management version 3
>Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>Capabilities: [70] MSI-X: Enable- Count=10 Masked-
>Capabilities: [a0] Express Endpoint, MSI 00
>Capabilities: [e0] Vital Product Data
>Capabilities: [100] Advanced Error Reporting
>Capabilities: [140] Device Serial Number 00-25-90-ff-ff-0f-f0-1e
>Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
>Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
>Capabilities: [1a0] #17
>Capabilities: [1d0] Access Control Services
>Kernel modules: igb
>
>
>
What version of the driver were you using before the update?  Do you only have 
two i350 devices in your system or do you really have four?  Do you have access 
to any of our EEPROM updating tools, like eeupdate or eeedit?  Your EEPROM/NVM 
image likely needs to be updated.  Contact your device vendor for an updated 
image.  

Let me know if you have any problems getting that image updated.

Thanks,

Carolyn

Carolyn Wyborny
Linux Development
LAN Access Division
Intel Corporation
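
As background for the "The NVM Checksum Is Not Valid" message and the -5 (-EIO) probe failure in the log above, the generic check in the Intel drivers sums the first 64 16-bit NVM words and expects the result to be 0xBABA. A minimal user-space sketch of that rule follows. The function names are hypothetical, the constants are the ones I recall from the e1000/igb sources and should be treated as an assumption, and the I350 may validate more than this single 64-word section, so this is not a substitute for the vendor's EEPROM tools.

```c
#include <stdint.h>
#include <stddef.h>

#define NVM_CHECKSUM_WORD 0x3F   /* last word covered; stores the checksum */
#define NVM_SUM           0xBABA /* expected 16-bit sum of words 0x00..0x3F */

/* Return 1 if words 0x00..0x3F sum to NVM_SUM (mod 2^16), else 0. */
static int nvm_checksum_valid(const uint16_t *words, size_t nwords)
{
	uint16_t sum = 0;
	size_t i;

	if (words == NULL || nwords <= NVM_CHECKSUM_WORD)
		return 0;
	for (i = 0; i <= NVM_CHECKSUM_WORD; i++)
		sum = (uint16_t)(sum + words[i]);
	return sum == NVM_SUM;
}

/* Compute the value word 0x3F must hold for the image to validate. */
static uint16_t nvm_checksum_compute(const uint16_t *words)
{
	uint16_t sum = 0;
	size_t i;

	for (i = 0; i < NVM_CHECKSUM_WORD; i++)
		sum = (uint16_t)(sum + words[i]);
	return (uint16_t)(NVM_SUM - sum);
}
```

Any single changed word in the covered range makes the sum differ from 0xBABA, which is why a stale or partially written image fails the probe until it is reflashed.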


