[CentOS] Centos 6.3 Network bnx2 Problem on HP DL360

2013-03-19 Thread Woehrle Hartmut SBB CFF FFS (Extern)
Hello Mailing List

I got a severe network error message on an HP DL360 server.
The kernel log says:

--- /var/log/messages ---
Mar 19 15:45:06 server kernel: do_IRQ: 2.168 No irq handler for vector (irq -1)
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: intr_sem[0] PCI_CMD[00100446]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: PCI_PM[19002108] PCI_MISC_CFG[9288]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: EMAC_TX_STATUS[0008] EMAC_RX_STATUS[0006]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: RPM_MGMT_PKT_CTRL[4088]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: HC_STATS_INTERRUPT_STATUS[017f0080]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: PBA[]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: <--- start MCP states dump --->
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: MCP_STATE_P0[0003610e] MCP_STATE_P1[0003610e]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: MCP mode[b880] state[80008000] evt_mask[0500]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: pc[0800adec] pc[0800aeb0] instr[8fb10014]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: shmem states:
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: drv_mb[0103000f] fw_mb[000f] link_status[006f] drv_pulse_mb[432b]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: dev_info_signature[44564903] reset_type[01005254] condition[0003610e]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: 03cc:    0a3c
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: 03dc: 0ffe
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: 03ec:    0002
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: DEBUG: 0x3fc[]
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: <--- end MCP states dump --->
Mar 19 15:45:17 server kernel: bnx2 :02:00.1: eth1: NIC Copper Link is Down
Mar 19 15:45:20 server kernel: bnx2 :02:00.1: eth1: NIC Copper Link is Up, 1000 Mbps full duplex
---

Does anyone know this problem?

The system is CentOS 6.3 with this kernel:
Linux server 2.6.32-279.5.2.el6.centos.plus.x86_64 #1 SMP Fri Aug 24 00:25:34 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux


Thanks
Hartmut

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Centos 6.3 Network bnx2 Problem on HP DL360

2013-03-19 Thread Nathan Duehr

On Mar 19, 2013, at 9:32 AM, Woehrle Hartmut SBB CFF FFS (Extern) wrote:

> Hello Mailing List
> 
> I got a severe network error message at a HP DL360 Server.
> The kernel log says:

If that's a DL360 G7 server, make sure you've applied all of the latest 
firmware patches from HP on it.  The G7 version has been almost notorious for 
firmware issues with drive controllers, ethernet interfaces, etc.

Nate


Re: [CentOS] Centos 6.3 Network bnx2 Problem on HP DL360

2013-03-20 Thread Woehrle Hartmut SBB CFF FFS (Extern)

> On Mar 19, 2013, at 9:32 AM, Woehrle Hartmut SBB CFF FFS (Extern) wrote:
>
> > Hello Mailing List
> >
> > I got a severe network error message at a HP DL360 Server.
> > The kernel log says:
>
> If that's a DL360 G7 server, make sure you've applied all of the latest
> firmware patches from HP on it.  The G7 version has been almost notorious
> for firmware issues with drive controllers, ethernet interfaces, etc.
>
> Nate

Hello Nate

It is a G6 Server and the firmware is more or less the latest version:

# bash CP017428.scexe -c
MAC           PCI-ID               NIC
18A90576C820  14E4-1639-103C-7055  HP NC382i DP Multifunction Gigabit Server Adapter

     (Installed)          (Available)       Interface
  Image    Version     Image    Version     eth0
  -----------------------------------------------
  BC       5.2.3       BC       5.2.3
  iSCSI    4.2.10      iSCSI    7.4.2       <-- I don't use iSCSI on this machine
  NCSI     2.0.6       NCSI     2.0.12


Hartmut


Re: [CentOS] Centos 6.3 Network bnx2 Problem on HP DL360

2013-03-20 Thread Banyan He
Which IRQ number is assigned to the device? You may have to find the
driver development guide to figure out what the debug messages mean.

The first line alone says that an interrupt arrived on a vector with no
IRQ handler registered. You can look up the device's IRQ in
/proc/interrupts, then find the matching entry under /proc/irq/
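
In case it helps, here is a minimal sketch of that lookup. The helper name irqs_for_iface is made up for illustration, and it assumes the usual /proc/interrupts layout where the IRQ number (followed by a colon) is the first column and the device name is the last one. With MSI-X the NIC may register several vectors named like eth1-0, eth1-1, so the pattern allows for that.

```shell
# Sketch only: irqs_for_iface is a hypothetical helper, not a standard tool.
# It prints the IRQ numbers whose last column in /proc/interrupts names the
# given interface (for example "eth1" or, with MSI-X, "eth1-0", "eth1-1").
irqs_for_iface() {
    iface="$1"
    file="${2:-/proc/interrupts}"   # optional second argument: a saved copy
    awk -v ifc="$iface" '
        $NF ~ ("^" ifc "(-|$)") { sub(":", "", $1); print $1 }
    ' "$file"
}

# List the per-IRQ directories for eth1, if this machine has any.
if [ -r /proc/interrupts ]; then
    for irq in $(irqs_for_iface eth1); do
        ls -d "/proc/irq/$irq"
    done
fi
```

If nothing is printed for eth1, the interface has no handler registered under that name, which would be worth comparing against the vector mentioned in the do_IRQ message.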


Banyan He
Blog: http://www.rootong.com
Email: ban...@rootong.com



Re: [CentOS] Centos 6.3 Network bnx2 Problem on HP DL360

2013-03-20 Thread Svavar Örn Eysteinsson
How often are you getting these crashes?

I had a similar problem on my HP DL380 G7 server.

I disabled Active State Power Management (ASPM) for PCI Express.

Try it.

Add pcie_aspm=off as a kernel boot option.
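
A rough sketch of how one might confirm the option took effect after a reboot. The helper name has_boot_arg is hypothetical, and the commented grubby line assumes the boot entries are managed by grubby (as on a stock CentOS 6 install); otherwise append the option to the kernel line in the GRUB configuration by hand.

```shell
# Sketch only: has_boot_arg is a hypothetical helper name.
# Returns success if the exact token appears on the kernel command line.
has_boot_arg() {
    arg="$1"
    file="${2:-/proc/cmdline}"      # optional second argument: a saved copy
    [ -r "$file" ] || return 1
    # One token per line, then an exact, fixed-string, whole-line match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$arg"
}

# To add the option on a grubby-managed system (assumption: run as root):
#   grubby --update-kernel=ALL --args="pcie_aspm=off"

if has_boot_arg pcie_aspm=off; then
    echo "pcie_aspm=off is active"
else
    echo "pcie_aspm=off is not on the running kernel's command line"
fi
```

The check reads the command line of the running kernel, so it only reports the option once the machine has been rebooted with it.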


Best regards,

Svavar O
Reykjavik - Iceland





Re: [CentOS] Centos 6.3 Network bnx2 Problem on HP DL360

2013-03-25 Thread Woehrle Hartmut SBB CFF FFS (Extern)
Hello Svavar

This was the first time the problem occurred, across 60 servers and about
half a year of running CentOS 6 (CentOS 5 before that).
But because the interfaces carry a permanent load, really 24x7, problems with
power management would be a disaster.
I will try switching it off.

Thanks
Hartmut



Re: [CentOS] Centos 6.3 Network bnx2 Problem on HP DL360

2013-03-25 Thread Svavar Örn Eysteinsson

After you have tried the pcie_aspm boot option, also try:

echo performance > /sys/module/pcie_aspm/parameters/policy

This disables ASPM on PCIe so the links operate at maximum performance.

This is what I use today on the DL380 G7.
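
A small sketch for checking which policy is in effect after the echo. The helper name active_aspm_policy is made up; it relies on the kernel listing all policies on one line with the active one in square brackets. Note that a value written through sysfs does not survive a reboot, so it has to be reapplied at boot (for example from an init script) if you want it permanently.

```shell
# Sketch only: active_aspm_policy is a hypothetical helper name.
# The kernel lists every policy on one line and brackets the active one,
# e.g. "default [performance] powersave"; print just the bracketed word.
active_aspm_policy() {
    file="${1:-/sys/module/pcie_aspm/parameters/policy}"
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$file"
}

policy_file=/sys/module/pcie_aspm/parameters/policy
if [ -r "$policy_file" ]; then
    echo "active ASPM policy: $(active_aspm_policy)"
fi
```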



