I have not experienced data corruption when transferring data via Ethernet.
Over wifi, I seem to get about one corruption for every 2 to 3 GB transferred.

The symptom is a block of data, roughly 1400 bytes in size, that appears
twice in a row in the destination file.  The first instance of the repeated
data is erroneous: it does not match the source file at that offset.  The
second instance is correct and matches the source file.  The source and
destination files are the same length.
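
To check a corrupted transfer for this exact pattern, something like the
following might help.  It is only a sketch: the function names and the
max_gap parameter are made up for illustration, and it assumes both files
fit in memory (the byte-by-byte loop is slow on multi-gigabyte files).  It
finds ranges where the two files differ and tests whether the bad range is
an early copy of the source data that immediately follows it.

```python
def mismatch_runs(src, dst, max_gap=64):
    """Return [start, end) ranges where src and dst differ.

    Mismatching bytes no more than max_gap bytes apart are merged into
    one run, so a few coincidentally equal bytes do not split a block.
    """
    runs = []
    start = last = None
    for i in range(min(len(src), len(dst))):
        if src[i] != dst[i]:
            if start is None:
                start = i
            elif i - last > max_gap:
                runs.append((start, last + 1))
                start = i
            last = i
    if start is not None:
        runs.append((start, last + 1))
    return runs


def looks_like_early_copy(src, dst, start, end):
    """True if dst[start:end] equals the source data right after it."""
    size = end - start
    return end + size <= len(src) and dst[start:end] == src[end:end + size]


def report(src_path, dst_path):
    """Print each differing range and whether it fits the symptom."""
    with open(src_path, "rb") as f:
        src = f.read()
    with open(dst_path, "rb") as f:
        dst = f.read()
    for start, end in mismatch_runs(src, dst):
        early = looks_like_early_copy(src, dst, start, end)
        print(f"{start:#x}-{end:#x} ({end - start} bytes) early copy: {early}")
```

Calling report("source.bin", "dest.bin") (hypothetical file names) prints
one line per differing range.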

Before looking more closely at instances of corruption, I tried to reproduce the
problem by transferring a data file containing only zero bytes.  The destination
file was always correct.  Hence, a file containing all 'A's is also unlikely to
show visible corruption: displacing a block of constant data doesn't change the
file contents.
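
A test file whose contents depend on position would avoid that blind spot.
Here is a rough sketch, assuming a pattern where every 8-byte big-endian
word holds its own word index; the function names and the chunk size are
my own choices, not anything from the driver or an existing tool.  With
such a file, a displaced block shows both where it landed and where it
came from.

```python
import struct

WORD = 8  # bytes per pattern word


def offset_pattern(first_word, count):
    """Return count words, each holding its own index, big-endian."""
    return struct.pack(f">{count}Q", *range(first_word, first_word + count))


def write_pattern_file(path, size_bytes, chunk_words=65536):
    """Write a pattern file of size_bytes (rounded down to whole words)."""
    total_words = size_bytes // WORD
    with open(path, "wb") as f:
        for first in range(0, total_words, chunk_words):
            count = min(chunk_words, total_words - first)
            f.write(offset_pattern(first, count))


def displaced_words(data, first_word=0):
    """Return (expected_index, found_value) for every wrong word in data."""
    count = len(data) // WORD
    found = struct.unpack(f">{count}Q", data[:count * WORD])
    return [(first_word + i, value)
            for i, value in enumerate(found)
            if value != first_word + i]
```

If the found values in a bad range are all larger than the expected
indices by the same amount, the block was copied from later in the file,
which is what the symptom above suggests.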

To me it looks like the code is handling the same data block twice or
the card is overwriting a receive buffer that it shouldn't.

If the card erroneously generates two receive interrupts for the same data
packet, would that produce the error I'm seeing, or would higher-level networking
code throw out the packet because of a TCP/IP sequence number error?

I can do experiments, such as reducing the number of receive buffers or adding
code to ath5k_tasklet_rx that checks for duplicated data.  The symptom
I'm seeing is pretty specific.  I'm hopeful that there are a small number of
ways this type of corruption might be caused.
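
The duplicate check I have in mind could work roughly like this.  This is
a user-space model of the logic only, not driver code: the class name, the
depth of 16, and the use of SHA-1 are all assumptions for illustration,
and nothing here reflects how ath5k_tasklet_rx is actually structured.

```python
import hashlib
from collections import deque


class RecentPayloads:
    """Remember digests of the last few payloads and flag repeats."""

    def __init__(self, depth=16):
        # Oldest digests fall off automatically once depth is reached.
        self._digests = deque(maxlen=depth)

    def seen_before(self, payload):
        """Record payload; return True if it matches a recent one."""
        digest = hashlib.sha1(payload).digest()
        duplicate = digest in self._digests
        self._digests.append(digest)
        return duplicate
```

A hit would only be a hint, since retransmitted packets can legitimately
carry identical payloads.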

bob

====================

On Tue, May 25, 2010 at 10:16 AM, Bob Copeland <m...@bobcopeland.com> wrote:
> On Sun, May 23, 2010 at 11:49 AM, Robert Brown <robert.br...@gmail.com> wrote:
>> I do think it's worthwhile to think about how a repeated block of data could
>> get into the destination file, assuming something in the networking code is
>> buggy.  TCP/IP should be verifying packet checksums, so isn't it likely
>> (necessary?) that the repeated data is a complete TCP/IP packet?
>
> Yes, there shouldn't be a way to get corrupt data from the
> TCP stack generally.  But we already know that ath5k corrupts
> memory in some corner cases due to a use-after-free which
> we're still trying to pin down.  Sl*b debug/kmemcheck should
> indicate if this is your problem.
>
> I don't understand your point about repeated data: are the
> two files the same except one has a repeated block of data
> in it (and is therefore larger than file2?)  If so I'd expect
> a much bigger set of changes from cmp.
>
> I would suggest transferring a large file with a recognizable
> bit pattern (like all 'A's) to make it easier to see the
> corruption.  I looked at the data from cmp; it doesn't look like
> the sort of thing that ath5k would scribble in memory -- no
> discernable 802.11 frames etc.
>
> You may also try using fsx-linux or try a wired network to
> ensure the problem doesn't lie in fs/hardware.
>
> --
> Bob Copeland %% www.bobcopeland.com
_______________________________________________
ath5k-devel mailing list
ath5k-devel@lists.ath5k.org
https://lists.ath5k.org/mailman/listinfo/ath5k-devel
