Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-08-18 Thread David Miller
From: Herbert Xu herb...@gondor.apana.org.au
Date: Wed, 19 Aug 2009 13:19:26 +1000

 I'm in the process of repeating the same experiment with cxgb3
 which hopefully should let me turn interrupts off on descriptors
 while still reporting completion status.

Ok, I look forward to seeing your work however it turns out.

Once I see what you've done, I'll give it a spin on niu.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-07-04 Thread Herbert Xu
On Fri, Jul 03, 2009 at 08:13:47PM -0700, David Miller wrote:

 NIU
 
 I basically can't defer interrupts because the chip supports
 per-TX-desc interrupt indications but it lacks an all TX queue sent
 event.  So if, say, tell it to interrupt every 1/4 of the TX queue
 then up to 1/4 of the queue can have packets stuck in there
 if TX activity all of a sudden ceases.

Here's an idea: We let the sender decide whether we need to enable
notification.  This decision would be carried as a flag in the skb.
For example, UDP would set this flag when its socket buffer is close
to capacity.  Routing would set this flag per NAPI run, etc.

Of course you'd ignore this flag completely if the qdisc queue is
non-empty.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-07-04 Thread Herbert Xu
On Sat, Jul 04, 2009 at 05:09:10PM +0800, Herbert Xu wrote:

 One potential problem is if the socket is constantly running
 close to capacity, but that should only happen if the device
 TX queue is also close to capacity which means that the qdisc
 queue should be non-empty.

Here's a another crazy idea:

Let's use dummy TX descriptors to generate an interrupt, either
with or without transmitting an actual packet on the wire depending
on the NIC.

xmit(skb)

if (TX queue contains no interrupting descriptor 
qdisc is empty)
mark TX descriptor as interrupting

clean()

do work
if (TX queue contains no interrupting descriptor 
TX queue non-empty 
qdisc is empty)
send dummy TX descriptor

This is based on the assumption that in the time it takes for
the NIC to process one packet and interrupt us, we can generate
more packets.  Obviously if we can't then even if the NIC had
a queue-empty interrupt capability it wouldn't help.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-07-04 Thread Herbert Xu
On Sun, Jul 05, 2009 at 11:26:58AM +0800, Herbert Xu wrote:
 
 Here's a another crazy idea:
 
 Let's use dummy TX descriptors to generate an interrupt, either
 with or without transmitting an actual packet on the wire depending
 on the NIC.

Here's an even crazier idea that doesn't use dummy descriptors.

xmit(skb)

if (TX queue contains no interrupting descriptor 
qdisc is empty)
mark TX descriptor as interrupting

if (TX queue now contains an interrupting descriptor 
qdisc len  2)
stop queue

if (TX ring full)
stop queue

clean()

do work
wake queue as per usual

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-07-03 Thread David Miller
From: Herbert Xu herb...@gondor.apana.org.au
Date: Fri, 3 Jul 2009 15:55:30 +0800

 Calling skb_orphan like this should be forbidden.  Apart from the
 problems already raised, it is a sign that the driver is trying to
 paper over a more serious issue of not cleaning up skb's timely.
 
 Yes skb_orphan will work for the cases where calling the skb
 destructor allows forward progress, but for the cases where you
 really need to the skb to be freed (e.g., iSCSI or Xen), this
 simply doesn't work.
 
 So anytime someone tries to propose such a solution it is a sign
 that they have bigger problems.

Agreed, but alas we are foaming at the mouth until we have a truly
usable alternative.

In particular the case of handling a device without usable TX
completion event indications is still quite troublesome.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-07-03 Thread Herbert Xu
On Fri, Jul 03, 2009 at 08:02:54PM -0700, David Miller wrote:

 In particular the case of handling a device without usable TX
 completion event indications is still quite troublesome.

Which particular devices do you have in mind?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-07-03 Thread David Miller
From: Herbert Xu herb...@gondor.apana.org.au
Date: Sat, 4 Jul 2009 11:08:30 +0800

 On Fri, Jul 03, 2009 at 08:02:54PM -0700, David Miller wrote:

 In particular the case of handling a device without usable TX
 completion event indications is still quite troublesome.
 e
 Which particular devices do you have in mind?

NIU

I basically can't defer interrupts because the chip supports
per-TX-desc interrupt indications but it lacks an all TX queue sent
event.  So if, say, tell it to interrupt every 1/4 of the TX queue
then up to 1/4 of the queue can have packets stuck in there
if TX activity all of a sudden ceases.

The only thing I've come up with to be able to mitigate interrupts is
to use an hrtimer of some sort.  But that's going to be hard to get
right, and who knows what kind of latencies will be introduced for TX
completion packet freeing unless I am very carefull.

And finally this belongs in generic code, not in the NIU driver,
whatever we come up with.  Especially since my understanding is that
this is similar to what Rusty needs.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-03 Thread Eric Dumazet
Rusty Russell a écrit :
 On Sat, 30 May 2009 12:41:00 am Eric Dumazet wrote:
 Rusty Russell a écrit :
 DaveM points out that there are advantages to doing it generally (it's
 more likely to be on same CPU than after xmit), and I couldn't find
 any new starvation issues in simple benchmarking here.
 If really no starvations are possible at all, I really wonder why some
 guys added memory accounting to UDP flows. Maybe they dont run simple
 benchmarks but real apps ? :)
 
 Well, without any accounting at all you could use quite a lot of memory as 
 there are many places packets can be queued.
 
 For TCP, I agree your patch is a huge benefit, since its paced by remote
 ACKS and window control
 
 I doubt that.  There'll be some cache friendliness, but I'm not sure it'll be 
 measurable, let alone huge.  It's the win to drivers which don't have a 
 timely and batching tx free mechanism which I aim for.

At 250.000 packets/second on a Gigabit link, this is huge, I can tell you.
(250.000 incoming packets and 250.000 outgoing packets per second, 700 Mbit/s)

According to this oprofile on CPU0 (dedicated to softirqs on one bnx2 eth 
adapter)

We can see sock_wfree() being number 2 on the profile, because it touches three 
cache lines per socket and
transmited packet in TX completion handler.

Also, taking a reference on socket for each xmit packet in flight is very 
expensive, since it slows
down receiver in __udp4_lib_lookup(). Several cpus are fighting for sk-refcnt 
cache line.


CPU: Core 2, speed 3000.24 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask 
of 0x00 (Unhalted core cycles) count 10
samples  cum. samples  %cum. % symbol name
2121521215 11.8847  11.8847bnx2_poll_work
1723938454  9.6573  21.5420sock_wfree   effect of udp 
memory accounting 
1481753271  8.3005  29.8425__slab_free
1463567906  8.1986  38.0411__udp4_lib_lookup
1142579331  6.4003  44.4414__alloc_skb
9710 89041  5.4396  49.8810__slab_alloc
8095 97136  4.5348  54.4158__udp4_lib_rcv
7831 104967 4.3869  58.8027sock_def_write_space
7586 112553 4.2497  63.0524ip_rcv
7518 120071 4.2116  67.2640skb_dma_unmap
6711 126782 3.7595  71.0235netif_receive_skb
6272 133054 3.5136  74.5371udp_queue_rcv_skb
5262 138316 2.9478  77.4849skb_release_data
5023 143339 2.8139  80.2988__kmalloc_track_caller
4070 147409 2.2800  82.5788kmem_cache_alloc
3216 150625 1.8016  84.3804ipt_do_table
2576 153201 1.4431  85.8235skb_queue_tail



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-03 Thread Rusty Russell
On Thu, 4 Jun 2009 06:32:53 am Eric Dumazet wrote:
 Rusty Russell a écrit :
  On Sat, 30 May 2009 12:41:00 am Eric Dumazet wrote:
  For TCP, I agree your patch is a huge benefit, since its paced by remote
  ACKS and window control
 
  I doubt that.  There'll be some cache friendliness, but I'm not sure
  it'll be measurable, let alone huge.
...
 We can see sock_wfree() being number 2 on the profile, because it touches
 three cache lines per socket and transmited packet in TX completion
 handler.

Interesting, I take it back: got some after stats as well?

 Also, taking a reference on socket for each xmit packet in flight is very
 expensive, since it slows down receiver in __udp4_lib_lookup(). Several
 cpus are fighting for sk-refcnt cache line.

Now we have decent dynamic per-cpu, we can finally implement bigrefs.  More 
obvious for device counts than sockets, but perhaps applicable here as well?

Rusty.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-03 Thread David Miller
From: Rusty Russell ru...@rustcorp.com.au
Date: Thu, 4 Jun 2009 13:24:57 +0930

 On Thu, 4 Jun 2009 06:32:53 am Eric Dumazet wrote:
 Also, taking a reference on socket for each xmit packet in flight is very
 expensive, since it slows down receiver in __udp4_lib_lookup(). Several
 cpus are fighting for sk-refcnt cache line.
 
 Now we have decent dynamic per-cpu, we can finally implement bigrefs.  More 
 obvious for device counts than sockets, but perhaps applicable here as well?

It might be very beneficial for longer lasting, active, connections, but
for high connection rates it's going to be a lose in my estimation.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-03 Thread Eric Dumazet
David Miller a écrit :
 From: Rusty Russell ru...@rustcorp.com.au
 Date: Thu, 4 Jun 2009 13:24:57 +0930
 
 On Thu, 4 Jun 2009 06:32:53 am Eric Dumazet wrote:
 Also, taking a reference on socket for each xmit packet in flight is very
 expensive, since it slows down receiver in __udp4_lib_lookup(). Several
 cpus are fighting for sk-refcnt cache line.
 Now we have decent dynamic per-cpu, we can finally implement bigrefs.  More 
 obvious for device counts than sockets, but perhaps applicable here as well?
 
 It might be very beneficial for longer lasting, active, connections, but
 for high connection rates it's going to be a lose in my estimation.

Agreed.

We also can avoid the sock_put()/sock_hold() pair for each tx packet,
to only touch sk_wmem_alloc (with appropriate atomic_sub_return() in 
sock_wfree()
and atomic_dec_test in sk_free

We could initialize sk-sk_wmem_alloc to one instead of 0, so that
sock_wfree() could just synchronize itself with sk_free()

void sk_free(struct sock *sk)
{
if (atomic_dec_test(sk-sk_wmem_alloc))
__sk_free(sk)
}

 static inline void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
 {
-   sock_hold(sk);
skb-sk = sk;
skb-destructor = sock_wfree;
atomic_add(skb-truesize, sk-sk_wmem_alloc);
 }

 void sock_wfree(struct sk_buff *skb)
 {
struct sock *sk = skb-sk;
+   int res;

/* In case it might be waiting for more memory. */
-   atomic_sub(skb-truesize, sk-sk_wmem_alloc);
+   res = atomic_sub_return(skb-truesize, sk-sk_wmem_alloc);
if (!sock_flag(sk, SOCK_USE_WRITE_QUEUE))
sk-sk_write_space(sk);
-   sock_put(sk);
+   if (res == 0)
+   __sk_free(sk);
 }

Patch will follow after some testing

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-03 Thread David Miller
From: Eric Dumazet eric.duma...@gmail.com
Date: Thu, 04 Jun 2009 06:54:24 +0200

 We also can avoid the sock_put()/sock_hold() pair for each tx packet,
 to only touch sk_wmem_alloc (with appropriate atomic_sub_return() in 
 sock_wfree()
 and atomic_dec_test in sk_free
 
 We could initialize sk-sk_wmem_alloc to one instead of 0, so that
 sock_wfree() could just synchronize itself with sk_free()

Excellent idea Eric.

 Patch will follow after some testing

I look forward to it :-)
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-02 Thread David Miller
From: Patrick Ohly patrick.o...@intel.com
Date: Mon, 01 Jun 2009 21:47:22 +0200

 On Fri, 2009-05-29 at 23:44 +0930, Rusty Russell wrote:
 This patch adds skb_orphan to the start of dev_hard_start_xmit(): it
 can be premature in the NETDEV_TX_BUSY case, but that's uncommon.
 
 Would it be possible to make the new skb_orphan() at the start of
 dev_hard_start_xmit() conditionally so that it is not executed for
 packets that are to be time stamped?
 
 As discussed before
 (http://article.gmane.org/gmane.linux.network/121378/), the skb-sk
 socket pointer is required for sending back the send time stamp from
 inside the device driver. Calling skb_orphan() unconditionally as in
 this patch would break the hardware time stamping of outgoing packets.

Indeed, we need to check that case, at a minimum.

And there are other potentially other problems.  For example, I
wonder how this interacts with the new TX MMAP af_packet support
in net-next-2.6 :-/

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-02 Thread Rusty Russell
On Tue, 2 Jun 2009 04:55:53 pm David Miller wrote:
 From: Patrick Ohly patrick.o...@intel.com
 Date: Mon, 01 Jun 2009 21:47:22 +0200

  On Fri, 2009-05-29 at 23:44 +0930, Rusty Russell wrote:
  This patch adds skb_orphan to the start of dev_hard_start_xmit(): it
  can be premature in the NETDEV_TX_BUSY case, but that's uncommon.
 
  Would it be possible to make the new skb_orphan() at the start of
  dev_hard_start_xmit() conditionally so that it is not executed for
  packets that are to be time stamped?
 
  As discussed before
  (http://article.gmane.org/gmane.linux.network/121378/), the skb-sk
  socket pointer is required for sending back the send time stamp from
  inside the device driver. Calling skb_orphan() unconditionally as in
  this patch would break the hardware time stamping of outgoing packets.

 Indeed, we need to check that case, at a minimum.

 And there are other potentially other problems.  For example, I
 wonder how this interacts with the new TX MMAP af_packet support
 in net-next-2.6 :-/

I think I'll do this in the driver for now, and let's revisit doing it 
generically later?

Thanks,
Rusty.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-02 Thread Patrick Ohly
On Fri, 2009-05-29 at 23:44 +0930, Rusty Russell wrote:
 This patch adds skb_orphan to the start of dev_hard_start_xmit(): it
 can be premature in the NETDEV_TX_BUSY case, but that's uncommon.

Would it be possible to make the new skb_orphan() at the start of
dev_hard_start_xmit() conditionally so that it is not executed for
packets that are to be time stamped?

As discussed before
(http://article.gmane.org/gmane.linux.network/121378/), the skb-sk
socket pointer is required for sending back the send time stamp from
inside the device driver. Calling skb_orphan() unconditionally as in
this patch would break the hardware time stamping of outgoing packets.

This should work without much overhead (skb_tx() expands to a lookup of
the skb_shared_info):
if (!skb_tx(skb)-flags)
skb_orphan(skb);

Instead of looking at the individual skb_shared_tx hardware and
software bits I'm just checking the whole byte here because it is
shorter and perhaps faster. Semantically the result is the same.

-- 
Best Regards, Patrick Ohly

The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-02 Thread David Miller
From: Rusty Russell ru...@rustcorp.com.au
Date: Tue, 2 Jun 2009 23:38:29 +0930

 On Tue, 2 Jun 2009 04:55:53 pm David Miller wrote:
 From: Patrick Ohly patrick.o...@intel.com
 Date: Mon, 01 Jun 2009 21:47:22 +0200

  On Fri, 2009-05-29 at 23:44 +0930, Rusty Russell wrote:
  This patch adds skb_orphan to the start of dev_hard_start_xmit(): it
  can be premature in the NETDEV_TX_BUSY case, but that's uncommon.
 
  Would it be possible to make the new skb_orphan() at the start of
  dev_hard_start_xmit() conditionally so that it is not executed for
  packets that are to be time stamped?
 
  As discussed before
  (http://article.gmane.org/gmane.linux.network/121378/), the skb-sk
  socket pointer is required for sending back the send time stamp from
  inside the device driver. Calling skb_orphan() unconditionally as in
  this patch would break the hardware time stamping of outgoing packets.

 Indeed, we need to check that case, at a minimum.

 And there are other potentially other problems.  For example, I
 wonder how this interacts with the new TX MMAP af_packet support
 in net-next-2.6 :-/
 
 I think I'll do this in the driver for now, and let's revisit doing it 
 generically later?

That might be the best course of action for the time being.
This whole area is a rat's nest.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-01 Thread Rusty Russell
On Sat, 30 May 2009 12:41:00 am Eric Dumazet wrote:
 Rusty Russell a écrit :
  DaveM points out that there are advantages to doing it generally (it's
  more likely to be on same CPU than after xmit), and I couldn't find
  any new starvation issues in simple benchmarking here.

 If really no starvations are possible at all, I really wonder why some
 guys added memory accounting to UDP flows. Maybe they dont run simple
 benchmarks but real apps ? :)

Well, without any accounting at all you could use quite a lot of memory as 
there are many places packets can be queued.

 For TCP, I agree your patch is a huge benefit, since its paced by remote
 ACKS and window control

I doubt that.  There'll be some cache friendliness, but I'm not sure it'll be 
measurable, let alone huge.  It's the win to drivers which don't have a 
timely and batching tx free mechanism which I aim for.

 , but an UDP sender will likely be able to saturate
 a link.

I couldn't see any difference in saturation here (with default scheduler and an 
100MBit e1000e).  Two reasons come to mind: firstly, only the hardware queue is 
unregulated: the tx queue is still accounted.  And when you add scheduling to 
the mix, I can't in practice cause starvation of other senders.

Hope that clarifies,
Rusty.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-05-29 Thread Eric Dumazet
Rusty Russell a écrit :
 Various drivers do skb_orphan() because they do not free transmitted
 skbs in a timely manner (causing apps which hit their socket limits to
 stall, socket close to hang, etc.).
 
 DaveM points out that there are advantages to doing it generally (it's
 more likely to be on same CPU than after xmit), and I couldn't find
 any new starvation issues in simple benchmarking here.
 


If really no starvations are possible at all, I really wonder why some
guys added memory accounting to UDP flows. Maybe they dont run simple
benchmarks but real apps ? :)

For TCP, I agree your patch is a huge benefit, since its paced by remote ACKS
and window control, but an UDP sender will likely be able to saturate a link.

Maybe we can try to selectively call skb_orphan() only for tcp packets ?

I understand your motivations are the driver simplifications, so you
want all packets being orphaned... hmm...

 This patch adds skb_orphan to the start of dev_hard_start_xmit(): it
 can be premature in the NETDEV_TX_BUSY case, but that's uncommon.
 
 I removed the drivers' skb_orphan(), though it's harmless.
 
 Signed-off-by: Rusty Russell ru...@rustcorp.com.au
 Cc: Divy Le Ray d...@chelsio.com
 Cc: Roland Dreier rola...@cisco.com
 Cc: Pavel Emelianov xe...@openvz.org
 Cc: Dan Williams d...@redhat.com
 Cc: libertas-...@lists.infradead.org
 ---
  drivers/net/cxgb3/sge.c|   27 ---
  drivers/net/loopback.c |2 --
  drivers/net/mlx4/en_tx.c   |4 
  drivers/net/niu.c  |3 +--
  drivers/net/veth.c |2 --
  drivers/net/wireless/libertas/tx.c |3 ---
  net/core/dev.c |   21 +
  7 files changed, 6 insertions(+), 56 deletions(-)
 
 diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
 --- a/drivers/net/cxgb3/sge.c
 +++ b/drivers/net/cxgb3/sge.c
 @@ -1288,33 +1288,6 @@ int t3_eth_xmit(struct sk_buff *skb, str
   dev-trans_start = jiffies;
   spin_unlock(q-lock);
  
 - /*
 -  * We do not use Tx completion interrupts to free DMAd Tx packets.
 -  * This is good for performamce but means that we rely on new Tx
 -  * packets arriving to run the destructors of completed packets,
 -  * which open up space in their sockets' send queues.  Sometimes
 -  * we do not get such new packets causing Tx to stall.  A single
 -  * UDP transmitter is a good example of this situation.  We have
 -  * a clean up timer that periodically reclaims completed packets
 -  * but it doesn't run often enough (nor do we want it to) to prevent
 -  * lengthy stalls.  A solution to this problem is to run the
 -  * destructor early, after the packet is queued but before it's DMAd.
 -  * A cons is that we lie to socket memory accounting, but the amount
 -  * of extra memory is reasonable (limited by the number of Tx
 -  * descriptors), the packets do actually get freed quickly by new
 -  * packets almost always, and for protocols like TCP that wait for
 -  * acks to really free up the data the extra memory is even less.
 -  * On the positive side we run the destructors on the sending CPU
 -  * rather than on a potentially different completing CPU, usually a
 -  * good thing.  We also run them without holding our Tx queue lock,
 -  * unlike what reclaim_completed_tx() would otherwise do.
 -  *
 -  * Run the destructor before telling the DMA engine about the packet
 -  * to make sure it doesn't complete and get freed prematurely.
 -  */
 - if (likely(!skb_shared(skb)))
 - skb_orphan(skb);
 -
   write_tx_pkt_wr(adap, skb, pi, pidx, gen, q, ndesc, compl);
   check_ring_tx_db(adap, q);
   return NETDEV_TX_OK;
 diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
 --- a/drivers/net/loopback.c
 +++ b/drivers/net/loopback.c
 @@ -72,8 +72,6 @@ static int loopback_xmit(struct sk_buff 
  {
   struct pcpu_lstats *pcpu_lstats, *lb_stats;
  
 - skb_orphan(skb);
 -
   skb-protocol = eth_type_trans(skb,dev);
  
   /* it's OK to use per_cpu_ptr() because BHs are off */
 diff --git a/drivers/net/mlx4/en_tx.c b/drivers/net/mlx4/en_tx.c
 --- a/drivers/net/mlx4/en_tx.c
 +++ b/drivers/net/mlx4/en_tx.c
 @@ -807,10 +807,6 @@ int mlx4_en_xmit(struct sk_buff *skb, st
   if (tx_desc == (struct mlx4_en_tx_desc *) ring-bounce_buf)
   tx_desc = mlx4_en_bounce_to_desc(priv, ring, index, desc_size);
  
 - /* Run destructor before passing skb to HW */
 - if (likely(!skb_shared(skb)))
 - skb_orphan(skb);
 -
   /* Ensure new descirptor hits memory
* before setting ownership of this descriptor to HW */
   wmb();
 diff --git a/drivers/net/niu.c b/drivers/net/niu.c
 --- a/drivers/net/niu.c
 +++ b/drivers/net/niu.c
 @@ -6667,8 +6667,7 @@ static int niu_start_xmit(struct sk_buff
   }
   kfree_skb(skb);