Hi,

Sorry, hopefully that will be a plain-text.

There are a lot of bug announcement, just make a search:
gcc.gnu.org/bugzilla/show_bug.cgi?id=33721
gcc.gnu.org/bugzilla/show_bug.cgi?id=16660

Also, I printed out the buffer addresses, and that temporary RX buffer was not aligned. So the transmit function rounded it down to the alignment boundary, and so caused invalid data transmission. (By the way. Shouldn't the transmit function check whether the alignment is proper, and throw an error message, instead of round it down? That would make more sense.)

Best regards,
Robert Hodaszi

On 2013-09-12 12:50, Marek Vasut wrote:
Dear Hector Palacios,

Hello,

Going back to this old thread I have some news regarding the problem with
TFTP transmissions blocking (timed out) after 10 seconds on the FEC of the
MX28. See below:

On 07/17/2013 05:55 PM, Hector Palacios wrote:
Dear Marek,

On 07/16/2013 06:44 AM, Marek Vasut wrote:
Dear Fabio Estevam,

On Tue, Jul 16, 2013 at 12:51 AM, Fabio Estevam<feste...@gmail.com>
wrote:
On Mon, Jul 15, 2013 at 12:09 PM, Hector Palacios

<hector.palac...@digi.com>  wrote:
@Fabio: could you manually run the command 'tftp ${loadaddr}
file100M' in your EVK?
Yes, this is what I have been running since the beginning.

If it doesn't fail, could you try running it again after playing with
the environment (setting/printing some variables).
I can't reproduce the problem here.

As I said, this issue appeared with different TFTP servers and is
independent of whether the dcache is or not enabled.
Can you transfer from a PC to another PC via TFTP?
Yes I can.

Another thing of interest would be a 'tcpdump' pcap capture of that
connection.
I was initially filtering out only TFTP packets of my wireshark trace and
all looked correct. After taking a second look to the full trace I see
now a hint. Around 7 seconds after starting the TFTP transfer the server
is sending an ARP to the target asking for the owner of the target's IP.
The target is receiving this ARP and apparently responding (at least
this is what my debug code shows as it gets into arp.c:ArpReceive(),
case ARPOP_REQUEST and sending a packet), but this ARP reply from the
target is not reaching the network. My sniffer does not capture this
reply.

The server resends the ARP request twice more (seconds 8 and 9) to the
target and since it doesn't get a reply then sends a broadcast ARP
(seconds 10) asking who has that IP. Since nobody responds it stops
sending data.

The times that it works (and I don't know the magic behind using a
numeric address versus using ${loadaddr} when they have the same value),
the ARP replies do reach the network and the server continues the
transmission normally.

Using a v2009 U-Boot, the behaviour is exactly the same, but the target's
ARP replies always reach the network, and the transfers always succeed.

Since Fabio cannot reproduce it I guess it must be a local ghost. :o(
We tracked down the issue to an ARP request from the server that was never
answered by the target. We later noticed that the problem did not happen
anymore when building U-Boot with a different toolchain and that the issue
seemed to be in the alignment of the RX buffer in the stack, which old GCC
compilers seem to do wrong.

Here is a patch:

From: Robert Hodaszi<robert.hoda...@digi.com>
Date: Fri, 6 Sep 2013 09:50:52 +0200
Subject: [PATCH] net: fec: fix invalid temporary RX buffer alignment
because of GCC bug

Older GCC versions don't handle well alignment on stack variables.
The temporary RX buffer is a local variable, so it is on the stack.
Because the FEC driver is using DMA for transmission, receive and
transmit buffers should be aligned on 64 byte. The transmit buffers are
not allocated in the driver internally, it sends the packets directly
as it gets them. So these packets should be aligned.
When the ARP layer wants to reply to an ARP request, it uses the FEC
driver's temporary RX buffer (used to pass data to the ARP layer) to
store the new packet, and pass it back to the FEC driver's send function.
Because of a GCC bug
Can you point to this GCC bug in the GCC bugzilla or something?

this buffer is not aligned well, and when the
driver tries to send it, it first rounds the address down to the
alignment boundary. That causes invalid data.

To fix it, don't put the temporary onto the stack.

Signed-off-by: Robert Hodaszi<robert.hoda...@digi.com>
Signed-off-by: Hector Palacios<hector.palac...@digi.com>
---
   drivers/net/fec_mxc.c | 5 ++++-
   1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/fec_mxc.c b/drivers/net/fec_mxc.c
index f4f72b7..315017e 100644
--- a/drivers/net/fec_mxc.c
+++ b/drivers/net/fec_mxc.c
@@ -828,7 +828,10 @@ static int fec_recv(struct eth_device *dev)
          uint16_t bd_status;
          uint32_t addr, size, end;
          int i;
-       uchar buff[FEC_MAX_PKT_SIZE] __aligned(ARCH_DMA_MINALIGN);
+       /* Don't place this variable on the stack, because older GCC
version +        * doesn't handle alignement on stack well.
+        */
+       static uchar buff[FEC_MAX_PKT_SIZE] __aligned(ARCH_DMA_MINALIGN);
The buffer might as well be allocated using ALLOC_CACHE_ALIGN_BUFFER() from
include/common.h . Still, are you _really_ sure the buffer is unaligned ? Do you
have a testcase maybe ?

btw. I am able to replicate this issue sometimes even using GCC 4.8.0 .

Best regards,
Marek Vasut

_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot

Reply via email to