On Thu, 2011-10-06 at 15:57 +0200, Eli Cohen wrote: > On Wed, Oct 05, 2011 at 10:15:02AM +0200, Eli Cohen wrote: > > How about this patch - can you give it a try? > > > >From dee60547aa9e35a02835451d9e694cd80dd3072f Mon Sep 17 00:00:00 2001 > From: Eli Cohen <e...@mellanox.co.il> > Date: Thu, 6 Oct 2011 15:50:02 +0200 > Subject: [PATCH] mlx4_en: Fix blue flame on powerpc > > The source buffer used for copying into the blue flame register is already in > big endian. However, when copying to device on powerpc, the endianess is > swapped so the data reaches th device in little endian which is wrong. On x86 > based platform no swapping occurs so it reaches the device with the correct > endianess. Fix this by calling le32_to_cpu() on the buffer. On LE systems > there > is no change; on BE there will be a swap.
That looks wrong. What is this __iowrite64_copy... oh I see Nice, somebody _AGAIN_ added a bunch of "generic" IO accessors that are utterly wrong on all archs except x86 (ok, -almost-). There isn't a single bloody memory barrier in there ! So, __iowrite64_copy is doing raw_writel which will -not- swap, so your buffer is going to have the same endianness in the destination than it has in the source. This is _NOT_ the right place to do a swap. It's the original construction of the descriptor that needs change. The data itself should never need to be affected accross a copy operation (unless your HW is terminally busted). Cheers, Ben. > Signed-off-by: Eli Cohen <e...@mellanox.co.il> > --- > drivers/net/mlx4/en_tx.c | 10 ++++++++++ > 1 files changed, 10 insertions(+), 0 deletions(-) > > diff --git a/drivers/net/mlx4/en_tx.c b/drivers/net/mlx4/en_tx.c > index 16337fb..3743acc 100644 > --- a/drivers/net/mlx4/en_tx.c > +++ b/drivers/net/mlx4/en_tx.c > @@ -601,6 +601,16 @@ u16 mlx4_en_select_queue(struct net_device *dev, struct > sk_buff *skb) > > static void mlx4_bf_copy(unsigned long *dst, unsigned long *src, unsigned > bytecnt) > { > + int i; > + __le32 *psrc = (__le32 *)src; > + > + /* > + * the buffer is already in big endian. For little endian machines > that's > + * fine. For big endain machines we must swap since the chipset swaps > again > + */ > + for (i = 0; i < bytecnt / 4; ++i) > + psrc[i] = le32_to_cpu(psrc[i]); > + > __iowrite64_copy(dst, src, bytecnt / 8); > } > > -- > 1.7.7.rc0.70.g82660 > > > > > On Tue, Oct 04, 2011 at 05:26:20PM -0300, Thadeu Lima de Souza Cascardo > > wrote: > > > > I believe we have an endianess problem here. The source buffer is in > > big endian - in x86 archs, it will rich the pci device unswapped since > > both x86 and pci are little endian. In ppc, it wil be swapped by the > > chipset so it will reach the device in little endian which is wrong. > > So, in mlx4_bf_copy, you could loop over the buffer and swap32 the all > > the dwords before the call to __iowrite64_copy. Of course which should > > fix this in an arch independent manner. Let me know this works for > > you. > > > > > On Tue, Oct 04, 2011 at 08:02:12AM +0200, Benjamin Herrenschmidt wrote: > > > > On Mon, 2011-10-03 at 17:53 -0300, Thadeu Lima de Souza Cascardo wrote: > > > > > > > > .../... > > > > > > > > > > Can you also send me the output of ethtool -i? > > > > > > It seems that there is a problem with write combining on Power > > > > > > processors, we will check this issue. > > > > > > > > > > > > Yevgeny > > > > > > > > > > Hello, Yevgeny. > > > > > > > > > > You will find the output of ethtool -i below. > > > > > > > > > > I am copying Ben and powerpc list, in case this is an issue with Power > > > > > processors. They can provide us some more insight into this. > > > > > > > > May I get some background please ? :-) > > > > > > > > I'm not aware of a specific issue with write combining but I'd need to > > > > know more about what you are doing and the code to do it to comment on > > > > whether it should work or not. > > > > > > > > Cheers, > > > > Ben. > > > > > > > > > > > > > > Hello, Ben. > > > > > > Sorry for that. I am testing mlx4_en driver on a POWER. Yevgeny has > > > added blue frame support, that does not require writing to the device > > > memory to indicate a new packet (the doorbell register as it is called). > > > > > > Well, the ring is getting full with no interrupt or packets transmitted. > > > I simply added a write to the doorbell register and it works for me. > > > Yevgeny says this is not the right fix, claiming there is a problem with > > > write combining on POWER. The code uses memory barriers, so I don't know > > > why there is any problem. > > > > > > I am posting the code here to show better what the situation is. > > > Yevgeny can tell more about the device and the driver. > > > > > > The code below is the driver as of now, including a diff with what I > > > changed and had resulted OK for me. Before the blue frame support, the > > > only code executed was the else part. I can't tell much what the device > > > should be seeing and doing after the blue frame part of the code is > > > executed. But it does send the packet if I write to the doorbell > > > register. > > > > > > Yevgeny, can you tell us what the device should be doing and why you > > > think this is a problem on POWER? Is it possible that this is simply a > > > problem with the firmware version? > > > > > > Thanks, > > > Cascardo. > > > > > > --- > > > if (ring->bf_enabled && desc_size <= MAX_BF && !bounce && > > > !vlan_tag) { > > > *(u32 *) (&tx_desc->ctrl.vlan_tag) |= > > > ring->doorbell_qpn; > > > op_own |= htonl((bf_index & 0xffff) << 8); > > > /* Ensure new descirptor hits memory > > > * before setting ownership of this descriptor to HW */ > > > wmb(); > > > tx_desc->ctrl.owner_opcode = op_own; > > > > > > wmb(); > > > > > > mlx4_bf_copy(ring->bf.reg + ring->bf.offset, (unsigned > > > long *) &tx_desc->ctrl, > > > desc_size); > > > > > > wmb(); > > > > > > ring->bf.offset ^= ring->bf.buf_size; > > > } else { > > > /* Ensure new descirptor hits memory > > > * before setting ownership of this descriptor to HW */ > > > wmb(); > > > tx_desc->ctrl.owner_opcode = op_own; > > > - wmb(); > > > - writel(ring->doorbell_qpn, ring->bf.uar->map + > > > MLX4_SEND_DOORBELL); > > > } > > > > > > + wmb(); > > > + writel(ring->doorbell_qpn, ring->bf.uar->map + > > > MLX4_SEND_DOORBELL); > > > + > > > --- _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev