Re: [CentOS] Failing Network card

2012-06-21 Thread Chuck Munro

> Date: Wed, 20 Jun 2012 10:54:33 -0700
> From: John R Pierce
> Subject: Re: [CentOS] Failing Network card
> To:centos@centos.org
> Message-ID:<4fe20e59.20...@hogranch.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> On 06/20/12 8:44 AM, Gregory P. Ennis wrote:
>> >  01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
>> >  RTL8111/8168B PCI Express Gigabit Ethernet controller (rev ff)
>
> pure unmitigated junk.
>
> -- john r pierce N 37, W 122 santa cruz ca mid-left coast

I agree with John's comment.  Realtek chips are junk with unpredictable 
reliability, especially under heavy load.  I have had several problems 
with various versions of the 81xx chips.  When I tossed the cards in the 
garbage and switched to Intel-based NICs, all the problems went away.

Every time I build systems with Realtek network chips on the 
motherboard, I disable them in the BIOS and add Intel NICs instead.

YMMV, but please consider ditching Realtek altogether.

Chuck
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Failing Network card

2012-06-20 Thread m . roth
John R Pierce wrote:
> On 06/20/12 12:21 PM, Gregory P. Ennis wrote:
>> I am being persuaded that you are right.  I'll have to look at the
>> mother board to answer Dale's question; the machine and the nic card
>> came from Fry's.  I have had pretty good luck with Fry's in the past,
>> but this has turned out to be a real pain.
>>
>> What chip set, or what pci-e nic card would you recommend?
>
> I've had good luck with Intel ethernet chips/cards, especially the
> server oriented ones.

We've got a lot of Broadcom ones. IIRC, Realtek tends towards consumer
grade, not server grade.

  mark




Re: [CentOS] Failing Network card

2012-06-20 Thread John R Pierce
On 06/20/12 12:21 PM, Gregory P. Ennis wrote:
> I am being persuaded that you are right.  I'll have to look at the
> mother board to answer Dale's question; the machine and the nic card
> came from Fry's.  I have had pretty good luck with Fry's in the past,
> but this has turned out to be a real pain.
>
> What chip set, or what pci-e nic card would you recommend?

I've had good luck with Intel ethernet chips/cards, especially the 
server oriented ones.

one of my recent servers has these NICs...

03:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet 
Controller (rev 06)
03:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet 
Controller (rev 06)
07:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network 
Connection (rev 01)
07:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network 
Connection (rev 01)

another has...

03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network 
Connection
04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network 
Connection

these are both recent Xeon 5600 class CPUs.  The first is a HP DL380g6, 
the 2nd is a whitebox server with a SuperMicro X8DTE-F motherboard.



-- 
john r pierce    N 37, W 122
santa cruz ca mid-left coast



Re: [CentOS] Failing Network card

2012-06-20 Thread Gregory P. Ennis
On 06/20/12 11:17 AM, Dale Dellutri wrote:
> Or it could mean that the PCI-e slots are not providing enough power
> for this card, or the slots are specialized to run only certain types of
> cards.  What motherboard does the OP have?

more likely, it means once again Fry's is selling junk that belongs in a 
scrap pile.

-

John,

I am being persuaded that you are right.  I'll have to look at the
mother board to answer Dale's question; the machine and the nic card
came from Fry's.  I have had pretty good luck with Fry's in the past,
but this has turned out to be a real pain.

What chip set, or what pci-e nic card would you recommend?

Greg




Re: [CentOS] Failing Network card

2012-06-20 Thread John R Pierce
On 06/20/12 11:17 AM, Dale Dellutri wrote:
> Or it could mean that the PCI-e slots are not providing enough power
> for this card, or the slots are specialized to run only certain types of
> cards.  What motherboard does the OP have?

more likely, it means once again Fry's is selling junk that belongs in a 
scrap pile.

-- 
john r pierce    N 37, W 122
santa cruz ca mid-left coast



Re: [CentOS] Failing Network card

2012-06-20 Thread Dale Dellutri
On Wed, Jun 20, 2012 at 12:18 PM, Chris Beattie  wrote:
> On 6/20/2012 9:34 AM, Gregory P. Ennis wrote:
>>
>> I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
>> card.  After adding the card to a machine with a new Centos 6.2 install
>> and naming it 'eth4' it works well for 6 to 12 hours and then fails.
>
> Try moving the network card to a new slot, especially if you can swap
> the network card with another card which is known to work.  Also, try
> swapping the card into a spare server.
>
> If the problem follows the network card, then the card is probably bad.
>  If a known-good card misbehaves in the slot where you previously had
> the network card, then the slot may be bad as well.

Or it could mean that the PCI-e slots are not providing enough power
for this card, or the slots are specialized to run only certain types of
cards.  What motherboard does the OP have?

-- 
Dale Dellutri


Re: [CentOS] Failing Network card

2012-06-20 Thread John R Pierce
On 06/20/12 8:44 AM, Gregory P. Ennis wrote:
> 01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
> RTL8111/8168B PCI Express Gigabit Ethernet controller (rev ff)

pure unmitigated junk.



-- 
john r pierce    N 37, W 122
santa cruz ca mid-left coast



Re: [CentOS] Failing Network card

2012-06-20 Thread Bob Hoffman
On 6/20/2012 11:09 AM, Gregory P. Ennis wrote:
> That's interesting.  Here are the log entries for the previous card as
> well as the eth4 that is currently installed.
>
> # PCI device 0x10ec:0x8168 (r8169)
> SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", 
> ATTR{address}=="00:e0:b3:10:f6:81", ATTR{type}=="1", KERNEL=="eth*", 
> NAME="eth3"
>
> # PCI device 0x10ec:0x8168 (r8169)
> SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", 
> ATTR{address}=="00:e0:b3:10:fc:6e", ATTR{type}=="1", KERNEL=="eth*", 
> NAME="eth4"

Have you deleted the udev entries for the old card you pulled out?
I'm not sure, but that could be an issue if you are using the same slot.
Sometimes you get bad batches, though, and one failure can mean many more.

If both cards had the same issue, then I doubt udev or any of that is at
fault.
Having to unplug power to the machine is odd, but it would support the
bad-card idea.
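
One way to check for leftover udev entries is to compare the MAC addresses
recorded in the persistent-net rules with the interfaces the kernel currently
sees. The following is only a minimal sketch: the helper names (parse_rules,
stale_entries, installed_macs) are made up for illustration, and the sample
text is the pair of rules quoted in this thread.

```python
import re
from pathlib import Path

# Matches the MAC address and interface name of one persistent-net rule.
RULE_RE = re.compile(
    r'ATTR\{address\}=="(?P<mac>[0-9a-fA-F:]{17})".*?NAME="(?P<name>[^"]+)"',
    re.DOTALL,
)


def parse_rules(text):
    """Return {interface_name: mac} for every rule found in the text."""
    return {m.group("name"): m.group("mac").lower() for m in RULE_RE.finditer(text)}


def stale_entries(rules, present_macs):
    """Return the rules whose MAC is not on any installed interface."""
    present = {mac.lower() for mac in present_macs}
    return {name: mac for name, mac in rules.items() if mac not in present}


def installed_macs(sys_net=Path("/sys/class/net")):
    """Read the MAC of each interface the kernel currently exposes."""
    macs = set()
    for iface in sys_net.iterdir():
        addr_file = iface / "address"
        if addr_file.is_file():
            macs.add(addr_file.read_text().strip().lower())
    return macs


# Hypothetical input: the two rules quoted in this thread.
SAMPLE = '''
# PCI device 0x10ec:0x8168 (r8169)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
ATTR{address}=="00:e0:b3:10:f6:81", ATTR{type}=="1", KERNEL=="eth*",
NAME="eth3"

# PCI device 0x10ec:0x8168 (r8169)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
ATTR{address}=="00:e0:b3:10:fc:6e", ATTR{type}=="1", KERNEL=="eth*",
NAME="eth4"
'''

if __name__ == "__main__":
    rules = parse_rules(SAMPLE)
    # Assumption for the demo: only the second card is still installed.
    print(stale_entries(rules, {"00:e0:b3:10:fc:6e"}))
```

On a live system the second argument would come from installed_macs(); any
entry the function reports is a candidate for removal from the rules file, not
proof that udev is at fault.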

Instead of pulling the plug, try rebooting with the network cable unplugged
first, and see if that has an effect.

I would just return it and get a different type of card...or try an 
extra one you have lying around.

All I know is that with computers it comes down to two things:
1) it's broken, return it
2) it's something really silly, usually one misconfiguration or error,
something simple but overlooked.




Re: [CentOS] Failing Network card

2012-06-20 Thread Chris Beattie
On 6/20/2012 9:34 AM, Gregory P. Ennis wrote:
>
> I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
> card.  After adding the card to a machine with a new Centos 6.2 install
> and naming it 'eth4' it works well for 6 to 12 hours and then fails.

Try moving the network card to a new slot, especially if you can swap 
the network card with another card which is known to work.  Also, try 
swapping the card into a spare server.

If the problem follows the network card, then the card is probably bad. 
  If a known-good card misbehaves in the slot where you previously had 
the network card, then the slot may be bad as well.
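
The swap test above boils down to a small decision table. Here is a minimal
sketch of it; the function name and the idea of returning a list of suspects
are illustrative choices, not anything from the thread.

```python
def diagnose(fault_follows_card, known_good_card_fails_in_slot):
    """Map the two swap-test observations to the likely culprits.

    fault_follows_card: the problem reappears when the suspect card is
        moved to another slot or another machine.
    known_good_card_fails_in_slot: a card known to work misbehaves in the
        slot that held the suspect card.
    """
    suspects = []
    if fault_follows_card:
        suspects.append("card")
    if known_good_card_fails_in_slot:
        suspects.append("slot")
    # An empty list means the swap test was inconclusive.
    return suspects


if __name__ == "__main__":
    print(diagnose(True, False))
    print(diagnose(True, True))
```

An empty result would point away from both the card and the slot, toward
something like the driver, cabling, or configuration.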

-- 
-Chris



Re: [CentOS] Failing Network card

2012-06-20 Thread Gregory P. Ennis
On 6/20/2012 11:13 AM, Gregory P. Ennis wrote:
>> Gregory P. Ennis wrote:
>> 
>>> I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
>>> card.  After adding the card to a machine with a new Centos 6.2 install
>>> and naming it 'eth4' it works well for 6 to 12 hours and then fails.
>>> The failure is characterized by dropping its connection speed from 1000
>>> to 100 while not allowing any data to flow in or out.  When this happens
>>> a shutdown and reboot does not solve the problem, but shutting down and
>>> then removing the power does solve the problem.
>> 
>>> Some additional information that may be useful.  The TrendNet card is
>>> the second TrendNet card I have used.  The first card had the same
>>> symptoms, and I deduced the card was bad, and purchased another one. The
>>> symptoms are the same with the second card.
>> 
>> Several questions: do you have another machine on the same network? Does
>> *it* show the problem, around the same time?
>>
>> And, finally, did you buy both TrendNet cards from the same vendor? Are
>> their MACs close? If so, it could be the vendor got a bad batch, either
>> OEM's fault, or the gorilla who un/loaded it during shipping.
>>
>>  mark
>>
>> -
>>
>> Mark,
>>
>> I have several machines on that network, and only one machine is having
>> the problem.  The machine is being used as a mail server, web server,
>> and gateway for the network.  After this problem surfaced with the
>> failure of the eth4 card (internal network), I created a gateway out of
>> one of the other machines that is working without incident.
>>
>> I did purchase both TrendNet Cards from Fry's.  Fry's was good about
>> taking the first one back without question, but now that the second one
>> has failed, I thought it best to look deeper.  I don't have the previous
>> card's MAC address, but my first thought was that this was a bad card
>> too. Both the first and second cards did not appear to have any damage
>> on the boxes or the card itself.  Before I tried to get a third card
>> from a different manufacturer I wanted to post things here to see if
>> there was an obvious problem I am missing.
>>
>> Thanks for your help!!!
>>
>> Greg
>>
> If you are having to fully 'cold boot' the system before it will work
> again I can't help but wonder if it is a conflict between special
> motherboard functions/settings and the card. I've seen this with some
> high end video cards under Winders. I am totally speculating here and
> have nothing to draw from, but wake on lan functions and such just
> leaves me wondering. Do you have a different machine/motherboard around
> where it wouldn't be hard to set up this testing? Maybe Googling a bit
> on motherboard model and eth card model might give a helpful return?
>
> 
>
> John,
>
> That is a good idea !!!
>
> I have appended the output of 'ethtool eth4' below.  Is there a way to
> change the wake setting from the command line as opposed to changing the
> bios setting at boot.
>
> Greg
>
> Settings for eth4:
>  Supported ports: [ TP MII ]
>  Supported link modes:   10baseT/Half 10baseT/Full
>  100baseT/Half 100baseT/Full
>  1000baseT/Half 1000baseT/Full
>  Supports auto-negotiation: Yes
>  Advertised link modes:  10baseT/Half 10baseT/Full
>  100baseT/Half 100baseT/Full
>  1000baseT/Half 1000baseT/Full
>  Advertised pause frame use: No
>  Advertised auto-negotiation: Yes
>  Link partner advertised link modes:  10baseT/Half 10baseT/Full
>   100baseT/Half 100baseT/Full
>   1000baseT/Half 1000baseT/Full
>  Link partner advertised pause frame use: No
>  Link partner advertised auto-negotiation: Yes
>  Speed: 1000Mb/s
>  Duplex: Full
>  Port: MII
>  PHYAD: 0
>  Transceiver: internal
>  Auto-negotiation: on
>  Supports Wake-on: pumbg
>  Wake-on: pumbg
>  Current message level: 0x0033 (51)
>  Link detected: yes
>
>
I always disable wake on lan on the motherboard and so far have never 
had an issue. To me this 'feature' should never be on by default but 
most of my experience has shown the opposite. I suppose there is good 
use for this, but I sure don't have one. At the mb bios level, it just 
seems like another level of security to worry about with little info on 
'knowing' the potential. I have no experience with disabling wake on lan 
on the cards themselves. If this is a mailserver, it seems it s

Re: [CentOS] Failing Network card

2012-06-20 Thread Gregory P. Ennis
Gregory P. Ennis wrote:
> Gregory P. Ennis wrote:
>> 
>>>> Some additional information that may be useful.  The TrendNet card is
>>>> the second TrendNet card I have used.  The first card had the same
>>>> symptoms, and I deduced the card was bad, and purchased another one.
>>>> The symptoms are the same with the second card.

>> Looks like addresses are close.
>
> So-so; not *that* close. I have some servers with two on-board NICs whose
> MAC addresses end in things like fe:ab, fe:ac, fe:36, fe:37. Still
>
> Actually, I missed the beginning of this thread. Are there no on-board
> NICs? I've not seen a m/b in a long time without that; even Raspberry Pi
> has one.
>
> There is an on board nic with the m/b.  Here is the mac entry of it.

Are those in use? If not, why not use them?

   mark "I must be missing something"



Mark,

I have the m/b nic set as the external (open to the internet) card.  The
pci-e nic was set for the internal network card.  I had this machine set
to be a gateway for the rest of the internal machines.  I only have two
nics on this system, eth0 and eth4.  The reason it is labeled eth4 is
related to some installation problems I had during the installation of
the pci-e card.  Once I got eth4 to work, I have been too lazy to go
back and modify things to relabel it as eth1.  Now that it is failing, I
am glad I left it alone.

Greg



Re: [CentOS] Failing Network card

2012-06-20 Thread John Hinton
On 6/20/2012 11:13 AM, Gregory P. Ennis wrote:
>> Gregory P. Ennis wrote:
>> 
>>> I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
>>> card.  After adding the card to a machine with a new Centos 6.2 install
>>> and naming it 'eth4' it works well for 6 to 12 hours and then fails.
>>> The failure is characterized by dropping its connection speed from 1000
>>> to 100 while not allowing any data to flow in or out.  When this happens
>>> a shutdown and reboot does not solve the problem, but shutting down and
>>> then removing the power does solve the problem.
>> 
>>> Some additional information that may be useful.  The TrendNet card is
>>> the second TrendNet card I have used.  The first card had the same
>>> symptoms, and I deduced the card was bad, and purchased another one. The
>>> symptoms are the same with the second card.
>> 
>> Several questions: do you have another machine on the same network? Does
>> *it* show the problem, around the same time?
>>
>> And, finally, did you buy both TrendNet cards from the same vendor? Are
>> their MACs close? If so, it could be the vendor got a bad batch, either
>> OEM's fault, or the gorilla who un/loaded it during shipping.
>>
>>  mark
>>
>> -
>>
>> Mark,
>>
>> I have several machines on that network, and only one machine is having
>> the problem.  The machine is being used as a mail server, web server,
>> and gateway for the network.  After this problem surfaced with the
>> failure of the eth4 card (internal network), I created a gateway out of
>> one of the other machines that is working without incident.
>>
>> I did purchase both TrendNet Cards from Fry's.  Fry's was good about
>> taking the first one back without question, but now that the second one
>> has failed, I thought it best to look deeper.  I don't have the previous
>> card's MAC address, but my first thought was that this was a bad card
>> too. Both the first and second cards did not appear to have any damage
>> on the boxes or the card itself.  Before I tried to get a third card
>> from a different manufacturer I wanted to post things here to see if
>> there was an obvious problem I am missing.
>>
>> Thanks for your help!!!
>>
>> Greg
>>
> If you are having to fully 'cold boot' the system before it will work
> again I can't help but wonder if it is a conflict between special
> motherboard functions/settings and the card. I've seen this with some
> high end video cards under Winders. I am totally speculating here and
> have nothing to draw from, but wake on lan functions and such just
> leaves me wondering. Do you have a different machine/motherboard around
> where it wouldn't be hard to set up this testing? Maybe Googling a bit
> on motherboard model and eth card model might give a helpful return?
>
> 
>
> John,
>
> That is a good idea !!!
>
> I have appended the output of 'ethtool eth4' below.  Is there a way to
> change the wake setting from the command line as opposed to changing the
> bios setting at boot.
>
> Greg
>
> Settings for eth4:
>  Supported ports: [ TP MII ]
>  Supported link modes:   10baseT/Half 10baseT/Full
>  100baseT/Half 100baseT/Full
>  1000baseT/Half 1000baseT/Full
>  Supports auto-negotiation: Yes
>  Advertised link modes:  10baseT/Half 10baseT/Full
>  100baseT/Half 100baseT/Full
>  1000baseT/Half 1000baseT/Full
>  Advertised pause frame use: No
>  Advertised auto-negotiation: Yes
>  Link partner advertised link modes:  10baseT/Half 10baseT/Full
>   100baseT/Half 100baseT/Full
>   1000baseT/Half 1000baseT/Full
>  Link partner advertised pause frame use: No
>  Link partner advertised auto-negotiation: Yes
>  Speed: 1000Mb/s
>  Duplex: Full
>  Port: MII
>  PHYAD: 0
>  Transceiver: internal
>  Auto-negotiation: on
>  Supports Wake-on: pumbg
>  Wake-on: pumbg
>  Current message level: 0x0033 (51)
>  Link detected: yes
>
>
I always disable wake on lan on the motherboard and so far have never 
had an issue. To me this 'feature' should never be on by default but 
most of my experience has shown the opposite. I suppose there is good 
use for this, but I sure don't have one. At the mb bios level, it just 
seems like another level of security to worry about with little info on 
'knowing' the potential. I have no experience with disabling wake on lan 
on the cards themselves. If this is a mailserver, it seems it s

Re: [CentOS] Failing Network card

2012-06-20 Thread m . roth
Gregory P. Ennis wrote:
> Gregory P. Ennis wrote:
>> 
>>>> Some additional information that may be useful.  The TrendNet card is
>>>> the second TrendNet card I have used.  The first card had the same
>>>> symptoms, and I deduced the card was bad, and purchased another one.
>>>> The symptoms are the same with the second card.

>> Looks like addresses are close.
>
> So-so; not *that* close. I have some servers with two on-board NICs whose
> MAC addresses end in things like fe:ab, fe:ac, fe:36, fe:37. Still
>
> Actually, I missed the beginning of this thread. Are there no on-board
> NICs? I've not seen a m/b in a long time without that; even Raspberry Pi
> has one.
>
> There is an on board nic with the m/b.  Here is the mac entry of it.

Are those in use? If not, why not use them?

   mark "I must be missing something"



Re: [CentOS] Failing Network card

2012-06-20 Thread Gregory P. Ennis
Gregory P. Ennis wrote:
> 
>>> Some additional information that may be useful.  The TrendNet card is
>>> the second TrendNet card I have used.  The first card had the same
>>> symptoms, and I deduced the card was bad, and purchased another one.
>>> The symptoms are the same with the second card.
>> 
> Ah, but you should have it in your logs, or - if you're running 6.2 -
> possibly in /etc/udev/rules.d/70-persistent-net.rules.
>
>> too. Both the first and second cards did not appear to have any damage
>> on the boxes or the card itself.  Before I tried to get a third card
> 
>
> In that case, sounds like the OEM had a q/c problem.
>
> That's interesting.  Here are the log entries for the previous card as
> well as the eth4 that is currently installed.
>
> # PCI device 0x10ec:0x8168 (r8169)
> SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
> ATTR{address}=="00:e0:b3:10:f6:81", ATTR{type}=="1", KERNEL=="eth*",
> NAME="eth3"
>
> # PCI device 0x10ec:0x8168 (r8169)
> SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
> ATTR{address}=="00:e0:b3:10:fc:6e", ATTR{type}=="1", KERNEL=="eth*",
> NAME="eth4"
>
> Looks like addresses are close.

So-so; not *that* close. I have some servers with two on-board NICs whose
MAC addresses end in things like fe:ab, fe:ac, fe:36, fe:37. Still

Actually, I missed the beginning of this thread. Are there no on-board
NICs? I've not seen a m/b in a long time without that; even Raspberry Pi
has one.

mark

Mark,

There is an on board nic with the m/b.  Here is the mac entry of it.

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
ATTR{address}=="38:60:77:ed:41:a0", ATTR{type}=="1", KERNEL=="eth*",
NAME="eth0"

Both NICs apparently have the same chipset:

"lspci | grep net" outputs:
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8111/8168B PCI Express Gigabit Ethernet controller (rev ff)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
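
One detail in the lspci output above may be worth a second look: the two
controllers report different revisions (ff and 06). A revision of ff is often
what lspci prints when reads of a device's configuration space come back as
all ones, which can happen when a device has stopped responding. That reading
is an assumption here, not something the thread confirms. A minimal sketch for
flagging such lines follows; the function name is illustrative, and the sample
input is the output above with each wrapped entry joined onto one line.

```python
import re

# One lspci line: slot, free-form description, then "(rev xx)".
LSPCI_RE = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) (?P<desc>.*?) "
    r"\(rev (?P<rev>[0-9a-f]{2})\)$",
    re.MULTILINE,
)


def suspicious_devices(lspci_text):
    """Return (slot, rev) for every device that reports revision ff."""
    return [
        (m.group("slot"), m.group("rev"))
        for m in LSPCI_RE.finditer(lspci_text)
        if m.group("rev") == "ff"
    ]


# Sample input: the two entries from this thread, unwrapped.
SAMPLE = (
    "01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. "
    "RTL8111/8168B PCI Express Gigabit Ethernet controller (rev ff)\n"
    "03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. "
    "RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)\n"
)

if __name__ == "__main__":
    print(suspicious_devices(SAMPLE))
```

A flagged slot is only a hint about where to look next; it says nothing by
itself about whether the card, the slot, or the driver is responsible.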

Greg



Re: [CentOS] Failing Network card

2012-06-20 Thread m . roth
Greg,

Gregory P. Ennis wrote:
> 
>>> Some additional information that may be useful.  The TrendNet card is
>>> the second TrendNet card I have used.  The first card had the same
>>> symptoms, and I deduced the card was bad, and purchased another one.
>>> The symptoms are the same with the second card.
>> 
> Ah, but you should have it in your logs, or - if you're running 6.2 -
> possibly in /etc/udev/rules.d/70-persistent-net.rules.
>
>> too. Both the first and second cards did not appear to have any damage
>> on the boxes or the card itself.  Before I tried to get a third card
> 
>
> In that case, sounds like the OEM had a q/c problem.
>
> That's interesting.  Here are the log entries for the previous card as
> well as the eth4 that is currently installed.
>
> # PCI device 0x10ec:0x8168 (r8169)
> SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
> ATTR{address}=="00:e0:b3:10:f6:81", ATTR{type}=="1", KERNEL=="eth*",
> NAME="eth3"
>
> # PCI device 0x10ec:0x8168 (r8169)
> SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
> ATTR{address}=="00:e0:b3:10:fc:6e", ATTR{type}=="1", KERNEL=="eth*",
> NAME="eth4"
>
> Looks like addresses are close.

So-so; not *that* close. I have some servers with two on-board NICs whose
MAC addresses end in things like fe:ab, fe:ac, fe:36, fe:37. Still

Actually, I missed the beginning of this thread. Are there no on-board
NICs? I've not seen a m/b in a long time without that; even Raspberry Pi
has one.

mark



Re: [CentOS] Failing Network card

2012-06-20 Thread Gregory P. Ennis
> Gregory P. Ennis wrote:
> 
>> I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
>> card.  After adding the card to a machine with a new Centos 6.2 install
>> and naming it 'eth4' it works well for 6 to 12 hours and then fails.
>> The failure is characterized by dropping its connection speed from 1000
>> to 100 while not allowing any data to flow in or out.  When this happens
>> a shutdown and reboot does not solve the problem, but shutting down and
>> then removing the power does solve the problem.
> 
>> Some additional information that may be useful.  The TrendNet card is
>> the second TrendNet card I have used.  The first card had the same
>> symptoms, and I deduced the card was bad, and purchased another one. The
>> symptoms are the same with the second card.
> 
> Several questions: do you have another machine on the same network? Does
> *it* show the problem, around the same time?
>
> And, finally, did you buy both TrendNet cards from the same vendor? Are
> their MACs close? If so, it could be the vendor got a bad batch, either
> OEM's fault, or the gorilla who un/loaded it during shipping.
>
> mark
>
> -
>
> Mark,
>
> I have several machines on that network, and only one machine is having
> the problem.  The machine is being used as a mail server, web server,
> and gateway for the network.  After this problem surfaced with the
> failure of the eth4 card (internal network), I created a gateway out of
> one of the other machines that is working without incident.
>
> I did purchase both TrendNet Cards from Fry's.  Fry's was good about
> taking the first one back without question, but now that the second one
> has failed, I thought it best to look deeper.  I don't have the previous
> card's MAC address, but my first thought was that this was a bad card
> too. Both the first and second cards did not appear to have any damage
> on the boxes or the card itself.  Before I tried to get a third card
> from a different manufacturer I wanted to post things here to see if
> there was an obvious problem I am missing.
>
> Thanks for your help!!!
>
> Greg
>
If you are having to fully 'cold boot' the system before it will work 
again I can't help but wonder if it is a conflict between special 
motherboard functions/settings and the card. I've seen this with some 
high end video cards under Winders. I am totally speculating here and 
have nothing to draw from, but wake on lan functions and such just 
leaves me wondering. Do you have a different machine/motherboard around 
where it wouldn't be hard to set up this testing? Maybe Googling a bit 
on motherboard model and eth card model might give a helpful return?



John,

That is a good idea !!!

I have appended the output of 'ethtool eth4' below.  Is there a way to
change the wake setting from the command line, as opposed to changing the
BIOS setting at boot?

Greg

Settings for eth4:
    Supported ports: [ TP MII ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
    Supports auto-negotiation: Yes
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                         100baseT/Half 100baseT/Full
                                         1000baseT/Half 1000baseT/Full
    Link partner advertised pause frame use: No
    Link partner advertised auto-negotiation: Yes
    Speed: 1000Mb/s
    Duplex: Full
    Port: MII
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
    Supports Wake-on: pumbg
    Wake-on: pumbg
    Current message level: 0x0033 (51)
    Link detected: yes
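
On the command-line question: ethtool itself can change the setting with
"ethtool -s eth4 wol d", where d disables Wake-on-LAN. The other letters in
"pumbg" stand for PHY activity, unicast, multicast, broadcast and MagicPacket.
The sketch below reads the current flags from output like the above and builds
that command. The function names are illustrative, and it assumes ethtool is
installed and the script is run as root.

```python
import re
import subprocess


def wake_on_flags(ethtool_text):
    """Return the flags on the 'Wake-on:' line, or None if the line is absent."""
    m = re.search(r"^\s*Wake-on:\s*(\S+)\s*$", ethtool_text, re.MULTILINE)
    return m.group(1) if m else None


def disable_wol_command(iface):
    """Build the ethtool call that sets Wake-on-LAN to 'd' (disabled)."""
    return ["ethtool", "-s", iface, "wol", "d"]


def disable_wol(iface):
    """Run the command; needs root, and the change is lost on reboot."""
    subprocess.run(disable_wol_command(iface), check=True)


# Sample input: the relevant lines from the output in this message.
SAMPLE = (
    "    Supports Wake-on: pumbg\n"
    "    Wake-on: pumbg\n"
    "    Current message level: 0x0033 (51)\n"
)

if __name__ == "__main__":
    print("current flags:", wake_on_flags(SAMPLE))
    print("command:", " ".join(disable_wol_command("eth4")))
```

Because the setting does not survive a reboot, it would have to be reapplied
at boot if it turns out to help.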




Re: [CentOS] Failing Network card

2012-06-20 Thread Gregory P. Ennis

>> Some additional information that may be useful.  The TrendNet card is
>> the second TrendNet card I have used.  The first card had the same
>> symptoms, and I deduced the card was bad, and purchased another one. The
>> symptoms are the same with the second card.
> 
> Several questions: do you have another machine on the same network? Does
> *it* show the problem, around the same time?
>
> And, finally, did you buy both TrendNet cards from the same vendor? Are
> their MACs close? If so, it could be the vendor got a bad batch, either
> OEM's fault, or the gorilla who un/loaded it during shipping.
>
> I have several machines on that network, and only one machine is having
> the problem.  The machine is being used as a mail server, web server,
> and gateway for the network.  After this problem surfaced with the
> failure of the eth4 card (internal network), I created a gateway out of
> one of the other machines that is working without incident.

Good deal.
>
> I did purchase both TrendNet Cards from Fry's.  Fry's was good about
> taking the first one back without question, but now that the second one
> has failed, I thought it best to look deeper.  I don't have the previous
> card's MAC address, but my first thought was that this was a bad card

Ah, but you should have it in your logs, or - if you're running 6.2 -
possibly in /etc/udev/rules.d/70-persistent-net.rules.

> too. Both the first and second cards did not appear to have any damage
> on the boxes or the card itself.  Before I tried to get a third card


In that case, sounds like the OEM had a q/c problem.

 mark

Mark,

That's interesting.  Here are the log entries for the previous card as
well as the eth4 that is currently installed.

# PCI device 0x10ec:0x8168 (r8169)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", 
ATTR{address}=="00:e0:b3:10:f6:81", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x10ec:0x8168 (r8169)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", 
ATTR{address}=="00:e0:b3:10:fc:6e", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"


Looks like addresses are close.
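
How close the two addresses are can be put in numbers. The sketch below
compares the first three bytes (the vendor prefix) and the numeric distance
between the two MACs from the rules above; the function names are illustrative
only.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to its 48-bit integer value."""
    return int(mac.replace(":", ""), 16)


def compare_macs(a, b):
    """Return (same_vendor_prefix, numeric_distance) for two MAC addresses."""
    same_prefix = a.lower().split(":")[:3] == b.lower().split(":")[:3]
    distance = abs(mac_to_int(a) - mac_to_int(b))
    return same_prefix, distance


if __name__ == "__main__":
    # The eth3 and eth4 addresses from the udev rules in this message.
    print(compare_macs("00:e0:b3:10:f6:81", "00:e0:b3:10:fc:6e"))
    # prints (True, 1517)
```

Both cards share the 00:e0:b3 prefix and sit 1517 addresses apart, which fits
two cards from the same maker but does not by itself prove they came from the
same batch.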

Greg



Re: [CentOS] Failing Network card

2012-06-20 Thread m . roth
John Hinton wrote:
> On 6/20/2012 10:27 AM, Gregory P. Ennis wrote:
>> Gregory P. Ennis wrote:
>> 
>>> I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
>>> card.  After adding the card to a machine with a new Centos 6.2 install

> If you are having to fully 'cold boot' the system before it will work
> again I can't help but wonder if it is a conflict between special
> motherboard functions/settings and the card. I've seen this with some
> high end video cards under Winders. I am totally speculating here and
> have nothing to draw from, but wake on lan functions and such just
> leaves me wondering. Do you have a different machine/motherboard around
> where it wouldn't be hard to set up this testing? Maybe Googling a bit
> on motherboard model and eth card model might give a helpful return?

Interesting questions. Is wake-on-lan enabled? (Try turning it off, so the
card is always on.) Also, if it's 6.2, check that udev rule.

mark



Re: [CentOS] Failing Network card

2012-06-20 Thread m . roth
Greg,

Gregory P. Ennis wrote:
> Gregory P. Ennis wrote:

>> Some additional information that may be useful.  The TrendNet card is
>> the second TrendNet card I have used.  The first card had the same
>> symptoms, and I deduced the card was bad, and purchased another one. The
>> symptoms are the same with the second card.
> 
> Several questions: do you have another machine on the same network? Does
> *it* show the problem, around the same time?
>
> And, finally, did you buy both TrendNet cards from the same vendor? Are
> their MACs close? If so, it could be the vendor got a bad batch, either
> OEM's fault, or the gorilla who un/loaded it during shipping.
>
> I have several machines on that network, and only one machine is having
> the problem.  The machine is being used as a mail server, web server,
> and gateway for the network.  After this problem surfaced with the
> failure of the eth4 card (internal network), I created a gateway out of
> one of the other machines that is working without incident.

Good deal.
>
> I did purchase both TrendNet Cards from Fry's.  Fry's was good about
> taking the first one back without question, but now that the second one
> has failed, I thought it best to look deeper.  I don't have the previous
> card's MAC address, but my first thought was that this was a bad card

Ah, but you should have it in your logs, or - if you're running 6.2 -
possibly in /etc/udev/rules.d/70-persistent-net.rules.

> too. Both the first and second cards did not appear to have any damage
> on the boxes or the card itself.  Before I tried to get a third card


In that case, sounds like the OEM had a q/c problem.

 mark



Re: [CentOS] Failing Network card

2012-06-20 Thread John Hinton
On 6/20/2012 10:27 AM, Gregory P. Ennis wrote:
> Gregory P. Ennis wrote:
> 
>> I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
>> card.  After adding the card to a machine with a new Centos 6.2 install
>> and naming it 'eth4' it works well for 6 to 12 hours and then fails.
>> The failure is characterized by dropping its connection speed from 1000
>> to 100 while not allowing any data to flow in or out.  When this happens
>> a shutdown and reboot does not solve the problem, but shutting down and
>> then removing the power does solve the problem.
> 
>> Some additional information that may be useful.  The TrendNet card is
>> the second TrendNet card I have used.  The first card had the same
>> symptoms, and I deduced the card was bad, and purchased another one. The
>> symptoms are the same with the second card.
> 
> Several questions: do you have another machine on the same network? Does
> *it* show the problem, around the same time?
>
> And, finally, did you buy both TrendNet cards from the same vendor? Are
> their MACs close? If so, it could be the vendor got a bad batch, either
> OEM's fault, or the gorilla who un/loaded it during shipping.
>
> mark
>
> -
>
> Mark,
>
> I have several machines on that network, and only one machine is having
> the problem.  The machine is being used as a mail server, web server,
> and gateway for the network.  After this problem surfaced with the
> failure of the eth4 card (internal network), I created a gateway out of
> one of the other machines that is working without incident.
>
> I did purchase both TrendNet Cards from Fry's.  Fry's was good about
> taking the first one back without question, but now that the second one
> has failed, I thought it best to look deeper.  I don't have the previous
> card's MAC address, but my first thought was that this was a bad card
> too. Both the first and second cards did not appear to have any damage
> on the boxes or the card itself.  Before I tried to get a third card
> from a different manufacturer I wanted to post things here to see if
> there was an obvious problem I am missing.
>
> Thanks for your help!!!
>
> Greg
>
If you are having to fully 'cold boot' the system before it will work 
again I can't help but wonder if it is a conflict between special 
motherboard functions/settings and the card. I've seen this with some 
high end video cards under Winders. I am totally speculating here and 
have nothing to draw from, but wake on lan functions and such just 
leaves me wondering. Do you have a different machine/motherboard around 
where it wouldn't be hard to set up this testing? Maybe Googling a bit 
on motherboard model and eth card model might give a helpful return?

-- 
John Hinton
877-777-1407 ext 502
http://www.ew3d.com
Comprehensive Online Solutions



Re: [CentOS] Failing Network card

2012-06-20 Thread Gregory P. Ennis
Gregory P. Ennis wrote:

> I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
> card.  After adding the card to a machine with a new Centos 6.2 install
> and naming it 'eth4' it works well for 6 to 12 hours and then fails.
> The failure is characterized by dropping its connection speed from 1000
> to 100 while not allowing any data to flow in or out.  When this happens
> a shutdown and reboot does not solve the problem, but shutting down and
> then removing the power does solve the problem.

> Some additional information that may be useful.  The TrendNet card is
> the second TrendNet card I have used.  The first card had the same
> symptoms, and I deduced the card was bad, and purchased another one. The
> symptoms are the same with the second card.

Several questions: do you have another machine on the same network? Does
*it* show the problem, around the same time?

And, finally, did you buy both TrendNet cards from the same vendor? Are
their MACs close? If so, it could be the vendor got a bad batch, either
OEM's fault, or the gorilla who un/loaded it during shipping.

   mark

-

Mark,

I have several machines on that network, and only one machine is having
the problem.  The machine is being used as a mail server, web server,
and gateway for the network.  After this problem surfaced with the
failure of the eth4 card (internal network), I created a gateway out of
one of the other machines that is working without incident.  

I did purchase both TrendNet Cards from Fry's.  Fry's was good about
taking the first one back without question, but now that the second one
has failed, I thought it best to look deeper.  I don't have the previous
card's MAC address, but my first thought was that this was a bad card
too. Both the first and second cards did not appear to have any damage
on the boxes or the card itself.  Before I tried to get a third card
from a different manufacturer I wanted to post things here to see if
there was an obvious problem I am missing.  

Thanks for your help!!!

Greg



Re: [CentOS] Failing Network card

2012-06-20 Thread m . roth
Gregory P. Ennis wrote:

> I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
> card.  After adding the card to a machine with a new Centos 6.2 install
> and naming it 'eth4' it works well for 6 to 12 hours and then fails.
> The failure is characterized by dropping its connection speed from 1000
> to 100 while not allowing any data to flow in or out.  When this happens
> a shutdown and reboot does not solve the problem, but shutting down and
> then removing the power does solve the problem.

> Some additional information that may be useful.  The TrendNet card is
> the second TrendNet card I have used.  The first card had the same
> symptoms, and I deduced the card was bad, and purchased another one. The
> symptoms are the same with the second card.

Several questions: do you have another machine on the same network? Does
*it* show the problem, around the same time?

And, finally, did you buy both TrendNet cards from the same vendor? Are
their MACs close? If so, it could be the vendor got a bad batch, either
OEM's fault, or the gorilla who un/loaded it during shipping.

   mark



[CentOS] Failing Network card

2012-06-20 Thread Gregory P. Ennis
Everyone,

Most of the time I am over my head in trying to troubleshoot problems.
However, after reading manuals, man pages, and getting advice from this
list I have been able to work my way through difficulties, and at the
end, I usually have a better understanding of what 'is going on'.  I can
only hope this method will work on this problem too.

I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
card.  After adding the card to a machine with a new Centos 6.2 install
and naming it 'eth4' it works well for 6 to 12 hours and then fails.
The failure is characterized by dropping its connection speed from 1000
to 100 while not allowing any data to flow in or out.  When this happens
a shutdown and reboot does not solve the problem, but shutting down and
then removing the power does solve the problem.  
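
For anyone who wants to catch the moment the link drops from 1000 to 100,
a small check along these lines might help. It is a sketch under
assumptions: Python 3, ethtool available, and output containing the usual
"Speed: ...Mb/s" and "Link detected:" lines. The function names and the
1000 Mb/s expectation are illustrative, not part of any existing tool.

```python
import re
import subprocess


def parse_link(ethtool_output):
    """Return (speed_mbps, link_up) from `ethtool <iface>` output.

    speed_mbps is None when ethtool reports no numeric speed
    (for example "Speed: Unknown!" while the link is down).
    """
    speed = re.search(r'^[ \t]*Speed:[ \t]*(\d+)Mb/s', ethtool_output, re.M)
    link = re.search(r'^[ \t]*Link detected:[ \t]*(yes|no)',
                     ethtool_output, re.M)
    return (int(speed.group(1)) if speed else None,
            link is not None and link.group(1) == "yes")


def is_degraded(speed_mbps, link_up, expected_mbps=1000):
    """True when the link is down or running below the expected speed."""
    return (not link_up) or speed_mbps is None or speed_mbps < expected_mbps


def check(iface="eth4"):
    """Run ethtool for the interface and return the parsed link state."""
    out = subprocess.check_output(["ethtool", iface]).decode()
    return parse_link(out)
```

Running check("eth4") from cron every minute and logging the result with
a timestamp would give a record of when the speed change happens relative
to the kernel messages.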

I wrote a perl script that uses the  eth4 interface by pinging another
machine every 60 seconds to try to figure out the relationship of the
message log entries with the time of failure, and I think there is a
correlation of the failure of eth4 to function with the below entry.
Unfortunately, I am way over my head on this one.  If any of you can
help I would surely appreciate your thoughts.
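
The perl script itself is not included here. A rough Python 3 equivalent
of the idea (ping a host through eth4 every 60 seconds and log the result
with a timestamp similar to the one in /var/log/messages) might look like
the sketch that follows. The function names, the log file name and the
5-second timeout are assumptions for illustration; the -I, -c and -W
options are those of the iputils ping shipped with Linux.

```python
import subprocess
import time
from datetime import datetime


def ping_command(host, iface="eth4", timeout_s=5):
    """Build a one-shot ping bound to the given interface."""
    # -I: source interface, -c 1: single echo request, -W: reply timeout (s)
    return ["ping", "-I", iface, "-c", "1", "-W", str(timeout_s), host]


def log_line(when, host, ok):
    """Format a result line with a syslog-like 'Mon DD HH:MM:SS' stamp."""
    stamp = when.strftime("%b %d %H:%M:%S")
    return "%s ping %s %s" % (stamp, host, "ok" if ok else "FAILED")


def probe(host, iface="eth4"):
    """Return True if a single ping through the interface gets a reply."""
    rc = subprocess.call(ping_command(host, iface),
                         stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL)
    return rc == 0


def monitor(host, iface="eth4", interval_s=60, logfile="ping-eth4.log"):
    """Probe forever, appending one timestamped line per attempt."""
    while True:
        line = log_line(datetime.now(), host, probe(host, iface))
        with open(logfile, "a") as fh:
            fh.write(line + "\n")
        time.sleep(interval_s)
```

Starting monitor() with the address of another machine on the internal
network leaves a file whose first FAILED line can be compared against the
kernel entries logged around the same time.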

Some additional information that may be useful.  The TrendNet card is
the second TrendNet card I have used.  The first card had the same
symptoms, and I deduced the card was bad, and purchased another one. The
symptoms are the same with the second card.  

Before I purchase a third card from a different manufacturer I thought I
would post this to see what some of you think.  This is the first pci-e
card I have used; are there problems with the pci-e interfaces as
opposed to pci?  Do you think the motherboard could be the problem, and
moving eth4 to a different slot on the motherboard would be worthwhile?

Any ideas ???

Greg Ennis
P.S.  Here is the appropriate log entry in the /var/log/messages file.

Jun 20 03:08:38 Mail kernel: [ cut here ]
Jun 20 03:08:38 Mail kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280() (Not tainted)
Jun 20 03:08:38 Mail kernel: Hardware name: p7-1220
Jun 20 03:08:38 Mail kernel: NETDEV WATCHDOG: eth4 (r8169): transmit queue 0 timed out
Jun 20 03:08:38 Mail kernel: Modules linked in: ipt_REDIRECT ipt_LOG
xt_limit ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat
xt_CHECKSUM iptable_mangle bridge autofs4 sunrpc bnx2fc cnic uio fcoe
libfcoe libfc 8021q scsi_transport_fc garp stp llc scsi_tgt
cpufreq_ondemand powernow_k8 freq_table mperf ipt_REJECT
nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT
nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter
ip6_tables ipv6 vhost_net macvtap macvlan tun kvm uinput sg btusb
bluetooth rfkill microcode snd_hda_codec_realtek snd_hda_intel
snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd
soundcore snd_page_alloc i2c_piix4 r8169 mii ext4 mbcache jbd2 sr_mod
cdrom sd_mod crc_t10dif usb_storage sdhci_pci sdhci mmc_core ahci radeon
ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash
dm_log dm_mod [last unloaded: scsi_wait_scan]
Jun 20 03:08:38 Mail kernel: Pid: 0, comm: swapper Not tainted 2.6.32-220.23.1.el6.centos.plus.x86_64 #1
Jun 20 03:08:38 Mail kernel: Call Trace:
Jun 20 03:08:38 Mail kernel:   [] ? warn_slowpath_common+0x87/0xc0
Jun 20 03:08:38 Mail kernel: [] ? warn_slowpath_fmt+0x46/0x50
Jun 20 03:08:38 Mail kernel: [] ? warn_slowpath_fmt+0x46/0x50
Jun 20 03:08:38 Mail kernel: [] ? dev_watchdog+0x26d/0x280
Jun 20 03:08:38 Mail kernel: [] ? dev_watchdog+0x0/0x280
Jun 20 03:08:38 Mail kernel: [] ? trace_nowake_buffer_unlock_commit+0x43/0x60
Jun 20 03:08:38 Mail kernel: [] ? dev_watchdog+0x0/0x280
Jun 20 03:08:38 Mail kernel: [] ? run_timer_softirq+0x197/0x340
Jun 20 03:08:38 Mail kernel: [] ? __do_softirq+0xc1/0x1d0
Jun 20 03:08:38 Mail kernel: [] ? hrtimer_interrupt+0x140/0x250
Jun 20 03:08:38 Mail kernel: [] ? call_softirq+0x1c/0x30
Jun 20 03:08:38 Mail kernel: [] ? do_softirq+0x65/0xa0
Jun 20 03:08:38 Mail kernel: [] ? irq_exit+0x85/0x90
Jun 20 03:08:38 Mail kernel: [] ? smp_apic_timer_interrupt+0x70/0x9b
Jun 20 03:08:38 Mail kernel: [] ? apic_timer_interrupt+0x13/0x20
Jun 20 03:08:38 Mail kernel:   [] ? acpi_idle_enter_simple+0x114/0x14b
Jun 20 03:08:38 Mail kernel: [] ? acpi_idle_enter_simple+0x110/0x14b
Jun 20 03:08:38 Mail kernel: [] ? cpuidle_idle_call+0xa7/0x140
Jun 20 03:08:38 Mail kernel: [] ? cpu_idle+0xb6/0x110
Jun 20 03:08:38 Mail kernel: [] ? start_secondary+0x202/0x245
Jun 20 03:08:38 Mail kernel: ---[ end trace 24f15998c117ac8f ]---
Jun 20 03:08:38 Mail kernel: r8169 :01:00.0: eth4: link up
Jun 20 03:08:39 Mail abrtd: Directory 'oops-2012-06-20-03:08:39-2420-0' creation detected
Jun 20 03:08:39 Mail abrt-dump-oops: Reported 1 kernel oopses to Abrt
Jun 20 03:08:39 Mail abrtd: Can't open file '/var/spool/abrt/oops-2012-06-20-03:08:39-2420-0/uid': No such file or directory
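
To line these kernel entries up against the ping results, it may help to
pull out just the timestamps of the "NETDEV WATCHDOG" messages. The
following is a minimal Python 3 sketch; it assumes each syslog record
sits on a single line in /var/log/messages (the wrapping seen in email is
an artifact of the mail client), and the names WATCHDOG_RE,
find_watchdog_events and scan_file are invented for this example.

```python
import re

# Syslog stamp, host, "kernel:", then the watchdog message from the trace.
WATCHDOG_RE = re.compile(
    r'^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:.*'
    r'NETDEV WATCHDOG:\s+(?P<iface>\S+)\s+\((?P<driver>[^)]+)\):\s+'
    r'transmit\s+queue\s+(?P<queue>\d+)\s+timed out'
)


def find_watchdog_events(lines):
    """Return a list of (timestamp, interface, driver, queue) tuples."""
    events = []
    for line in lines:
        m = WATCHDOG_RE.search(line)
        if m:
            events.append((m.group("stamp"), m.group("iface"),
                           m.group("driver"), int(m.group("queue"))))
    return events


def scan_file(path="/var/log/messages"):
    """Scan a syslog file for transmit-queue timeouts (usually needs root)."""
    with open(path, errors="replace") as fh:
        return find_watchdog_events(fh)
```

Fed the log above, this would report one event for eth4 (driver r8169,
queue 0) at Jun 20 03:08:38; several events clustered just before each
ping failure would point at the driver or card rather than the network.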