Bug#577747: linux-image-2.6.32-5-amd64 - same bug, different HW

2010-10-21 Thread Jan Schermer
 Hi,
I'm experiencing the same problem. My kernel is
linux-image-2.6.32-5-amd64, on a fresh squeeze/testing installation.

I just installed a new home server (Core2Duo, 3 ethernet ports,
wireless, 2x SATA in RAID1).

I needed to copy some backup data to the server, but once I plugged my
laptop directly into the gigabit port and started copying (rsync+ssh),
SSH died with the message:

Corrupted MAC on input. Disconnecting: Packet corrupt

This was a direct cable between my laptop and server - eth2 - both
gigabit ethernets.

Googling suggested that my hardware is faulty. Since I also have that
data on an internet server, I started scp in the background and went to
work.

When I logged back in to the server, it had died again! Same message!

This time it was a different ethernet port and a different remote server.

r...@gw:~# ls -l /sys/class/net/*/device/driver
lrwxrwxrwx 1 root root 0 Oct 21 11:35 /sys/class/net/eth0/device/driver -> ../../../../bus/pci/drivers/e100
lrwxrwxrwx 1 root root 0 Oct 21 11:35 /sys/class/net/eth1/device/driver -> ../../../../bus/pci/drivers/3c59x
lrwxrwxrwx 1 root root 0 Oct 21 11:35 /sys/class/net/eth2/device/driver -> ../../../bus/pci/drivers/e1000e
lrwxrwxrwx 1 root root 0 Oct 21 11:35 /sys/class/net/mon.wlan0/device/driver -> ../../../../bus/pci/drivers/ath9k
lrwxrwxrwx 1 root root 0 Oct 21 11:35 /sys/class/net/wlan0/device/driver -> ../../../../bus/pci/drivers/ath9k
r...@gw:~# grep . /sys/class/net/*/features
/sys/class/net/eth0/features:0x0
/sys/class/net/eth1/features:0x803
/sys/class/net/eth2/features:0x1109a9
/sys/class/net/lo/features:0x13865
/sys/class/net/mon.wlan0/features:0x2000
/sys/class/net/wlan0/features:0x2000
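
The feature masks above are easier to compare once decoded into flag
names, for example to see which interfaces have checksum or segmentation
offload enabled. The following is a minimal sketch only: the bit-to-name
table is an assumption based on the NETIF_F_* flags in
include/linux/netdevice.h of 2.6.32-era kernels and should be checked
against the actual kernel source; decode_features is a hypothetical
helper name.

```python
# ASSUMPTION: NETIF_F_* bit layout as in 2.6.32-era kernels.
# Verify against include/linux/netdevice.h before relying on it.
FEATURE_BITS = {
    0x1: "SG",
    0x2: "IP_CSUM",
    0x4: "NO_CSUM",
    0x8: "HW_CSUM",
    0x10: "IPV6_CSUM",
    0x20: "HIGHDMA",
    0x40: "FRAGLIST",
    0x80: "HW_VLAN_TX",
    0x100: "HW_VLAN_RX",
    0x200: "HW_VLAN_FILTER",
    0x400: "VLAN_CHALLENGED",
    0x800: "GSO",
    0x1000: "LLTX",
    0x2000: "NETNS_LOCAL",
    0x4000: "GRO",
    0x8000: "LRO",
    0x10000: "TSO",
    0x20000: "UFO",
    0x40000: "GSO_ROBUST",
    0x80000: "TSO_ECN",
    0x100000: "TSO6",
}

# All bits the table above knows about, OR-ed together.
KNOWN_MASK = 0
for _bit in FEATURE_BITS:
    KNOWN_MASK |= _bit


def decode_features(mask):
    """Return the flag names set in `mask`, lowest bit first."""
    names = [name for bit, name in sorted(FEATURE_BITS.items()) if mask & bit]
    unknown = mask & ~KNOWN_MASK
    if unknown:
        # Report bits the table does not cover instead of dropping them.
        names.append("unknown:0x%x" % unknown)
    return names


if __name__ == "__main__":
    # Masks copied from the grep output above.
    for iface, mask in [("eth0", 0x0), ("eth1", 0x803), ("eth2", 0x1109A9)]:
        print(iface, decode_features(mask))
```

Under that assumed table, eth0 (e100) advertises no offloads at all, while
eth2 (e1000e) advertises checksum and TCP segmentation offload.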

For the record, my laptop has an atl1c gigabit ethernet card, and the
internet server is a xen domU on unknown hardware, but there was never a
problem with either.

So, is it hardware? I guess memory could be faulty, but that would have
manifested itself sooner, and in other places too.

I believe there is a problem either in those drivers (sharing
something?), or in some algorithm.
I switched SSH MACs from
hmac-md5,hmac-sha1,umac...@openssh.com,hmac-ripemd160
to hmac-ripemd160 and that didn't help either. B0rked openssl? B0rked
openssh?

It happens more often on large (4GB) files, but I have now also seen it
on a bunch of 1KB files. Restarting gets it going again...

I'm obviously going to run a memtest once I get home, but for now I
believe this is a software fault.

Jan





--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/4cc00a77.8050...@zviratko.net



Re: Bug#577747: linux-image-2.6.32-5-amd64 - same bug, different HW

2010-10-21 Thread Jan Schermer
 I don't think that's it. Right now I'm looking at the same problem with
e100 on one side and some domU on the other (I'm pretty sure the domU is
OK), this time over the internet/WAN. Same error.

Jan



On 10/21/2010 02:04 PM, Frédéric Boiteux wrote:
   Hello,

 Alas, I've searched the web and found some people having a similar
 problem with the same atl1c driver. I don't know if it's a hardware
 defect or a software bug, but I'll avoid any hardware driven by it
 in the future.

 Fred.





Bug#577747: linux-image-2.6.32-5-amd64 - same bug, different HW

2010-10-21 Thread Jan Schermer
 Update: The problem is not in SSH/crypto; it is a network problem.

I tried netcat over the network and the file also got corrupted (5
transfers OK, 1 corrupt; a 1GB file that used to be a swap file, so half
random data and half zeroes).
About 10 bytes are corrupted in the middle of the file, very close
together (on one page in vbindiff), so probably one packet/fragment/frame.
I also have a tcpdump capture of the whole connection, but nothing in it
looks obviously fishy.
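
The comparison done by eye in vbindiff could also be scripted, so that
repeated netcat runs can be checked quickly. This is a minimal sketch
only: diff_offsets and cluster are hypothetical helper names, and the
1460-byte grouping gap is an assumed TCP payload size for a 1500-byte
MTU, not something measured on this setup.

```python
import sys


def diff_offsets(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offsets at which the two files differ."""
    offsets = []
    base = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break
            if a != b:
                # Only scan byte by byte inside chunks that differ.
                for i in range(max(len(a), len(b))):
                    if a[i:i + 1] != b[i:i + 1]:
                        offsets.append(base + i)
            base += max(len(a), len(b))
    return offsets


def cluster(offsets, max_gap=1460):
    """Group ascending offsets into (first, last, count) runs.

    Offsets closer than max_gap bytes to the previous one join its run.
    max_gap=1460 is an ASSUMED TCP payload size, used as a rough
    "could this be a single packet?" threshold.
    """
    runs = []
    for off in offsets:
        if runs and off - runs[-1][1] <= max_gap:
            first, _, count = runs[-1]
            runs[-1] = (first, off, count + 1)
        else:
            runs.append((off, off, 1))
    return runs


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python <this script> original.bin received.bin
    for first, last, count in cluster(diff_offsets(sys.argv[1], sys.argv[2])):
        print("%d differing bytes in offsets %d..%d" % (count, first, last))
```

A single short run would be consistent with one damaged packet; several
runs spread over the file would point elsewhere.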

Right now I'm testing without netfilter enabled and so far so good (I
also had to reboot, so that might have fixed it instead).
A memory test with memtester on 75% of memory passed without a problem;
a proper memtest86 run will be done tonight.

Any suggestions where to go next? I'm thinking of making a 1GB plaintext
file so that the corruption will be readable and searchable in a data
stream and I can inspect the corrupted packets - but what to look for?
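
One way to build such a plaintext test file is to make every line carry
its own byte offset, so a damaged region identifies its position both in
the file and in a packet capture. This is a minimal sketch only: the
64-byte line layout and the helper names make_line, write_pattern_file
and find_bad_lines are illustrative assumptions.

```python
def make_line(offset, width=64):
    """Return a `width`-byte ASCII line that starts with its own offset."""
    prefix = "%015d " % offset
    filler = "abcdefghijklmnopqrstuvwxyz0123456789" * 2
    body = (prefix + filler)[: width - 1]
    return (body + "\n").encode("ascii")


def write_pattern_file(path, total_bytes, width=64):
    """Write total_bytes of self-describing lines to path."""
    with open(path, "wb") as out:
        for offset in range(0, total_bytes, width):
            # The last line is truncated if total_bytes is not a multiple
            # of width.
            out.write(make_line(offset, width)[: total_bytes - offset])


def find_bad_lines(path, width=64):
    """Return the starting offsets of lines that do not match the pattern."""
    bad = []
    offset = 0
    with open(path, "rb") as f:
        while True:
            line = f.read(width)
            if not line:
                break
            if line != make_line(offset, width)[: len(line)]:
                bad.append(offset)
            offset += width
    return bad


if __name__ == "__main__":
    # 1 GiB test file; "pattern.txt" is a placeholder name.
    write_pattern_file("pattern.txt", 1 << 30)
```

After a transfer, running find_bad_lines on the received copy gives the
damaged offsets, and the readable prefix of the neighbouring lines can be
searched for in the capture to find the matching packets.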

Jan




Archive: http://lists.debian.org/4cc05bbc.7050...@zviratko.net