Re: [CentOS] Network bond - one port goes down from time to time

2016-03-28 Thread Götz Reinicke - IT Koordinator
On 28.03.16 at 16:23, Marcelo Ricardo Leitner wrote:
> On 28-03-2016 06:27, Götz Reinicke wrote:
>> Hi,
>>
>> maybe someone has an idea:
>>
>> We have three Supermicro servers with two 10Gb ports each, connected
>> to 1Gb ports on a Cisco switch stack. All ports are set to auto speed.
>>
>> I configured an LACP bond on both sides for all servers, first with
>> Citrix XenServer.
>>
>> On one server eth0 goes down from time to time: sometimes within
>> minutes, on some days it stays up for a few hours.
>>
>> Two servers are fine; the bond has been up for 24 days(!) now without
>> any problem.
>>
>> Recently I installed CentOS 7.2 on the server in question and - bam -
>> eth0 is going down from time to time …
>>
>> I checked the patch cables, tried another switch port channel,
>> reconfigured the ports, and reinstalled the OS. Same behavior.
>>
>> And: we got a replacement server. Same behavior … :)
>>
>> Currently the Cisco tech guys don’t see a problem on the switch (which
>> has been up for 3 years now with 10+ servers connected … no problem so
>> far), and from the Citrix side I don’t get many more hints.
>>
>> In the logs I just have "NIC Link is Down" … "NIC Link is Up". It is
>> always eth0.
>>
>> Question:
>>
>> Any ideas? One suggestion was to disable all power-saving features in
>> the server BIOS. I have not done that yet.
>>
>> Is there any way to set some sort of higher debug level for that
>> NIC/kernel/whatever to get some feedback from the server OS side on
>> why the port goes down?
>>
>> Regards and thanks for any hint! Götz
> 
> If you are seeing NIC Link is Down as in:
> [710442.668059] e1000e: enp0s25 NIC Link is Down
> then the NIC lost its link and bond is just protecting you as you
> probably didn't have any downtime due to that. IOW bonding is not the
> issue.
> 
> Which NIC do you have on those servers?


The mainboard is a Supermicro X10DRI-T with an Intel X540 dual-port
10GBase-T controller.

Regards, Götz

___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Network bond - one port goes down from time to time

2016-03-28 Thread Götz Reinicke - IT Koordinator
On 28.03.16 at 12:12, Leon Fauster wrote:
> On 28.03.2016 at 11:27, Götz Reinicke wrote:
>> We have three Supermicro servers with two 10Gb ports each, connected
>> to 1Gb ports on a Cisco switch stack. All ports are set to auto speed.
>>
>> I configured an LACP bond on both sides for all servers, first with
>> Citrix XenServer.
>>
>> On one server eth0 goes down from time to time: sometimes within
>> minutes, on some days it stays up for a few hours.
>>
>> Two servers are fine; the bond has been up for 24 days(!) now without
>> any problem.
>>
>> Recently I installed CentOS 7.2 on the server in question and - bam -
>> eth0 is going down from time to time …
>>
>> I checked the patch cables, tried another switch port channel,
>> reconfigured the ports, and reinstalled the OS. Same behavior.
>>
>> And: we got a replacement server. Same behavior … :)
>>
>> Currently the Cisco tech guys don’t see a problem on the switch (which
>> has been up for 3 years now with 10+ servers connected … no problem so
>> far), and from the Citrix side I don’t get many more hints.
>>
>> In the logs I just have "NIC Link is Down" … "NIC Link is Up". It is
>> always eth0.
>>
>> Question:
>>
>> Any ideas? One suggestion was to disable all power-saving features in
>> the server BIOS. I have not done that yet.
>>
>> Is there any way to set some sort of higher debug level for that
>> NIC/kernel/whatever to get some feedback from the server OS side on
>> why the port goes down?
>
>
> How exactly is your interface configured?


TYPE=Bond   #Interface type set to bond
BOOTPROTO=static
BONDING_MASTER=yes
BONDING_OPTS="mode=4"  # mode 4 = 802.3ad (LACP)
DEFROUTE=yes
IPADDR="192.168.xxx.xxx"
NETMASK=255.255.255.0
GATEWAY="192.168.xxx.xxx"
IPV4_FAILURE_FATAL=no
IPV6INIT=no
NAME=bond0
DEVICE=bond0
ONBOOT=yes


TYPE="Ethernet"
MASTER=bond0
SLAVE=yes
NAME="enp4s0f0"
UUID="xxx"
DEVICE="enp4s0f0"
ONBOOT="yes"

TYPE="Ethernet"
MASTER=bond0
SLAVE=yes
NAME="enp4s0f1"
UUID="xxx"
DEVICE="enp4s0f1"
ONBOOT="yes"
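
When comparing ifcfg files like these across several servers, it can help to
check them mechanically. The following is a minimal Python sketch, not a
replacement for the network scripts themselves. The helper names parse_ifcfg,
bonding_mode and naming_problems are made up for this illustration. The
number-to-name table is the one used by the Linux bonding driver, where mode 4
is 802.3ad (LACP) and mode 1 is active-backup. A NAME that differs from DEVICE
is not necessarily an error, so the check only reports it as something to
look at.

```python
import shlex

# Numeric bonding modes and their names in the Linux bonding driver.
BONDING_MODES = {
    "0": "balance-rr",
    "1": "active-backup",
    "2": "balance-xor",
    "3": "broadcast",
    "4": "802.3ad",
    "5": "balance-tlb",
    "6": "balance-alb",
}


def parse_ifcfg(text):
    """Parse KEY=value lines of an ifcfg file into a dict.

    Quotes are removed and trailing '#' comments are ignored.
    """
    settings = {}
    for line in text.splitlines():
        tokens = shlex.split(line, comments=True)
        if tokens and "=" in tokens[0]:
            key, _, value = tokens[0].partition("=")
            settings[key] = value
    return settings


def bonding_mode(settings):
    """Return the bonding mode name from BONDING_OPTS, or None if unset."""
    pairs = (opt.partition("=")[::2] for opt in settings.get("BONDING_OPTS", "").split())
    mode = dict(pairs).get("mode")
    # A mode already given by name (e.g. "802.3ad") is returned unchanged.
    return BONDING_MODES.get(mode, mode)


def naming_problems(settings):
    """Report a NAME that differs from DEVICE (often a copy-paste slip)."""
    name, device = settings.get("NAME"), settings.get("DEVICE")
    if name and device and name != device:
        return [f"NAME={name} does not match DEVICE={device}"]
    return []


# Illustrative input only, not taken from a real server.
bond = parse_ifcfg('DEVICE=bond0\nBONDING_OPTS="mode=4"  # comment\n')
print(bonding_mode(bond))  # 802.3ad
```

In practice the text would come from reading each ifcfg file of the bond and
its slaves and passing the contents to parse_ifcfg.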


/Götz





Re: [CentOS] Network bond - one port goes down from time to time

2016-03-28 Thread Marcelo Ricardo Leitner

On 28-03-2016 06:27, Götz Reinicke wrote:
> Hi,
>
> maybe someone has an idea:
>
> We have three Supermicro servers with two 10Gb ports each, connected
> to 1Gb ports on a Cisco switch stack. All ports are set to auto speed.
>
> I configured an LACP bond on both sides for all servers, first with
> Citrix XenServer.
>
> On one server eth0 goes down from time to time: sometimes within
> minutes, on some days it stays up for a few hours.
>
> Two servers are fine; the bond has been up for 24 days(!) now without
> any problem.
>
> Recently I installed CentOS 7.2 on the server in question and - bam -
> eth0 is going down from time to time …
>
> I checked the patch cables, tried another switch port channel,
> reconfigured the ports, and reinstalled the OS. Same behavior.
>
> And: we got a replacement server. Same behavior … :)
>
> Currently the Cisco tech guys don’t see a problem on the switch (which
> has been up for 3 years now with 10+ servers connected … no problem so
> far), and from the Citrix side I don’t get many more hints.
>
> In the logs I just have "NIC Link is Down" … "NIC Link is Up". It is
> always eth0.
>
> Question:
>
> Any ideas? One suggestion was to disable all power-saving features in
> the server BIOS. I have not done that yet.
>
> Is there any way to set some sort of higher debug level for that
> NIC/kernel/whatever to get some feedback from the server OS side on
> why the port goes down?
>
> Regards and thanks for any hint! Götz


If you are seeing NIC Link is Down as in:
[710442.668059] e1000e: enp0s25 NIC Link is Down
then the NIC lost its link and bond is just protecting you as you 
probably didn't have any downtime due to that. IOW bonding is not the issue.
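
To see how the bond itself accounts for such link losses, the bonding driver
exposes a per-slave status under /proc/net/bonding/<bond>, including an
"MII Status" and a "Link Failure Count" line for each slave. The following is
a minimal Python sketch for reading those two fields. The function name
parse_bond_status is made up for this illustration, and bond0 is assumed as
the bond name because that is what the configuration in this thread uses.

```python
def parse_bond_status(text):
    """Parse the contents of /proc/net/bonding/<bond>.

    Returns {slave_name: {"mii_status": str, "link_failures": int}}.
    Bond-level lines that appear before the first slave are ignored.
    """
    slaves = {}
    current = None
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # blank line or line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = slaves.setdefault(value, {})
        elif current is not None and key == "MII Status":
            current["mii_status"] = value
        elif current is not None and key == "Link Failure Count":
            current["link_failures"] = int(value)
    return slaves
```

On the affected server this could be called as
parse_bond_status(open("/proc/net/bonding/bond0").read()). A failure count
that keeps growing on one slave only would be consistent with a problem on
that physical link rather than in the bonding setup.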


Which NIC do you have on those servers?

  Marcelo



Re: [CentOS] Network bond - one port goes down from time to time

2016-03-28 Thread Leon Fauster
On 28.03.2016 at 11:27, Götz Reinicke wrote:
> We have three Supermicro servers with two 10Gb ports each, connected
> to 1Gb ports on a Cisco switch stack. All ports are set to auto speed.
>
> I configured an LACP bond on both sides for all servers, first with
> Citrix XenServer.
>
> On one server eth0 goes down from time to time: sometimes within
> minutes, on some days it stays up for a few hours.
>
> Two servers are fine; the bond has been up for 24 days(!) now without
> any problem.
>
> Recently I installed CentOS 7.2 on the server in question and - bam -
> eth0 is going down from time to time …
>
> I checked the patch cables, tried another switch port channel,
> reconfigured the ports, and reinstalled the OS. Same behavior.
>
> And: we got a replacement server. Same behavior … :)
>
> Currently the Cisco tech guys don’t see a problem on the switch (which
> has been up for 3 years now with 10+ servers connected … no problem so
> far), and from the Citrix side I don’t get many more hints.
>
> In the logs I just have "NIC Link is Down" … "NIC Link is Up". It is
> always eth0.
>
> Question:
>
> Any ideas? One suggestion was to disable all power-saving features in
> the server BIOS. I have not done that yet.
>
> Is there any way to set some sort of higher debug level for that
> NIC/kernel/whatever to get some feedback from the server OS side on
> why the port goes down?


How exactly is your interface configured?

--
LF




[CentOS] Network bond - one port goes down from time to time

2016-03-28 Thread Götz Reinicke
Hi,

maybe someone has an idea:

We have three Supermicro servers with two 10Gb ports each, connected to
1Gb ports on a Cisco switch stack. All ports are set to auto speed.

I configured an LACP bond on both sides for all servers, first with
Citrix XenServer.

On one server eth0 goes down from time to time: sometimes within minutes,
on some days it stays up for a few hours.

Two servers are fine; the bond has been up for 24 days(!) now without any
problem.

Recently I installed CentOS 7.2 on the server in question and - bam - eth0
is going down from time to time …

I checked the patch cables, tried another switch port channel,
reconfigured the ports, and reinstalled the OS. Same behavior.

And: we got a replacement server. Same behavior … :)

Currently the Cisco tech guys don’t see a problem on the switch (which has
been up for 3 years now with 10+ servers connected … no problem so far),
and from the Citrix side I don’t get many more hints.

In the logs I just have "NIC Link is Down" … "NIC Link is Up". It is
always eth0.

Question:

Any ideas? One suggestion was to disable all power-saving features in the
server BIOS. I have not done that yet.

Is there any way to set some sort of higher debug level for that
NIC/kernel/whatever to get some feedback from the server OS side on why
the port goes down?

Regards and thanks for any hint! Götz
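
Before raising any debug level, it may already help to quantify how often
each interface flaps from the kernel messages that are there. The following
is a minimal Python sketch that counts "NIC Link is Up" and "NIC Link is
Down" events per interface. The function name count_link_events is made up
for this illustration. The text before the interface name differs between
drivers, so the pattern only relies on the interface name standing directly
in front of "NIC Link is", as in the e1000e example elsewhere in this thread;
other drivers may need an adjusted pattern.

```python
import re
from collections import Counter

# Matches kernel messages such as
#   [710442.668059] e1000e: enp0s25 NIC Link is Down
# Only the word directly before "NIC Link is Up/Down" is captured as the
# interface name; an optional trailing colon after the name is dropped.
LINK_EVENT = re.compile(r"(?P<iface>\S+?):? NIC Link is (?P<state>Up|Down)")


def count_link_events(log_lines):
    """Return a Counter keyed by (interface, state) for all link events."""
    events = Counter()
    for line in log_lines:
        match = LINK_EVENT.search(line)
        if match:
            events[(match.group("iface"), match.group("state"))] += 1
    return events
```

The input can be any iterable of lines, for example the saved output of
dmesg or journalctl -k. If the Down events are all on one interface, the
timestamps of those lines can then be compared with the switch logs. The
driver's own counters, as shown by ethtool -S for that interface, may give
further hints.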

 