On Thursday, July 10, 2014 6:31:43 pm Rick Macklem wrote:
> John Baldwin wrote:
> > On Thursday, July 03, 2014 8:51:01 pm Rick Macklem wrote:
> > > Russell L. Carter wrote:
> > > > On 07/02/14 19:09, Rick Macklem wrote:
> > > > > Could you please post the dmesg stuff for the network interface,
> > > > > so I can tell what driver is being used?  I'll take a look at it,
> > > > > in case it needs to be changed to use m_defrag().
> > > >
> > > > em0: <Intel(R) PRO/1000 Network Connection 7.4.2> port 0xd020-0xd03f
> > > > mem 0xfe4a0000-0xfe4bffff,0xfe480000-0xfe49ffff irq 44 at device 0.0
> > > > on pci2
> > > > em0: Using an MSI interrupt
> > > > em0: Ethernet address: 00:15:17:bc:29:ba
> > > > 001.000007 [2323] netmap_attach success for em0 tx 1/1024 rx 1/1024
> > > > queues/slots
> > > >
> > > > This is one of those dual nic cards, so there is em1 as well...
> > > >
> > > Well, I took a quick look at the driver and it does use m_defrag(), but
> > > I think that the "retry:" label it does a goto to after doing so might
> > > be in the wrong place.
> > >
> > > The attached untested patch might fix this.
> > >
> > > Is it convenient to build a kernel with this patch applied and then
> > > try it with TSO enabled?
> > >
> > > rick
> > > ps: It does have the transmit segment limit set to 32.  I have no
> > > idea if this is a hardware limitation.
> >
> > I think the retry is not in the wrong place, but the overhead of all
> > those pullups is apparently quite severe.
>
> The m_defrag() call after the first failure will just barely squeeze
> the just-under-64K TSO segment into 32 mbuf clusters.  Then I think any
> m_pullup() done during the retry will allocate an mbuf (at a glance it
> seems to always do this when the old mbuf is a cluster) and prepend
> that to the list.
> --> Now the list is > 32 mbufs again and the bus_dmamap_load_mbuf_sg()
>     will fail again on the retry, this time fatally, I think?
>
> I can't see any reason to re-do all the stuff using m_pullup(), and
> Russell reported that moving the "retry:" fixed his problem, from what
> I understood.
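(For readers following the thread: the shape of the change Rick describes
is roughly the following -- a sketch against the 2014 if_em.c transmit
path, not his actual attached patch, so the surrounding context and error
handling are approximate.)

	/*
	 * Sketch: em_xmit() with the "retry:" label moved below the
	 * m_pullup() header setup.  m_defrag() packs a just-under-64K TSO
	 * chain into exactly EM_MAX_SCATTER (32) clusters; re-running the
	 * m_pullup() calls on the retry would prepend a 33rd mbuf and make
	 * the second load fail fatally.  Jumping here instead retries the
	 * DMA load with the defragmented chain untouched.
	 */

	/* ... TSO/checksum header m_pullup() calls happen above, once ... */
retry:
	error = bus_dmamap_load_mbuf_sg(txr->txtag, map, *m_headp,
	    segs, &nsegs, BUS_DMA_NOWAIT);
	if (error == EFBIG && remap) {
		struct mbuf *m;

		m = m_defrag(*m_headp, M_NOWAIT);
		if (m == NULL) {
			m_freem(*m_headp);
			*m_headp = NULL;
			return (ENOBUFS);
		}
		*m_headp = m;
		remap = 0;	/* try the load again, but only once */
		goto retry;
	}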
Ah, I had assumed (incorrectly) that the m_pullup()s would all be nops in
this case.  It seems the NIC would really like to have all those things
in a single segment, but it is not required, so I agree that your patch
is fine.

> > It would be interesting to test the following in addition to your
> > change to see if it improves performance further:
> >
> > Index: if_em.c
> > ===================================================================
> > --- if_em.c	(revision 268495)
> > +++ if_em.c	(working copy)
> > @@ -1959,7 +1959,9 @@ retry:
> >  	if (error == EFBIG && remap) {
> >  		struct mbuf *m;
> >
> > -		m = m_defrag(*m_headp, M_NOWAIT);
> > +		m = m_collapse(*m_headp, M_NOWAIT, EM_MAX_SCATTER);
> > +		if (m == NULL)
> > +			m = m_defrag(*m_headp, M_NOWAIT);
>
> Since a just-under-64K TSO segment barely fits in 32 mbuf clusters,
> I'm at least 99% sure the m_collapse() will fail, but it can't hurt to
> try it.  (If it supported 33 or 34 segments, I think m_collapse() would
> have a reasonable chance of success.)
>
> Right now the NFS and krpc code creates 2 small mbufs in front of the
> read/write data clusters, and I think the TCP layer adds another one.
> Even if this was modified to put it all in one cluster, I don't think
> m_collapse() would succeed, since it only copies the data up and
> deletes an mbuf from the chain if it will all fit in the preceding one.
> Since the read/write data clusters are full (except the last one), they
> can't fit in the M_TRAILINGSPACE() of the preceding one unless it is
> empty, from my reading of m_collapse().

Correct, ok.

-- 
John Baldwin
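(Aside, for readers unfamiliar with m_collapse(): the behaviour Rick
describes boils down to the merge test below, simplified from
sys/kern/uipc_mbuf.c; the initial pass that coalesces small non-cluster
mbufs and the fragment-count bookkeeping are omitted.)

	/*
	 * m_collapse() only folds an mbuf n into its predecessor m when
	 * n's data fits in m's free trailing space.  Full read/write data
	 * clusters have no trailing space, so a chain of full clusters
	 * cannot be shortened this way, and the EM_MAX_SCATTER (32) limit
	 * still cannot be met.
	 */
	if (M_WRITABLE(m) && n->m_len < M_TRAILINGSPACE(m)) {
		bcopy(mtod(n, void *), mtod(m, char *) + m->m_len,
		    n->m_len);
		m->m_len += n->m_len;
		m->m_next = n->m_next;
		m_free(n);
	}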