Re: NFS client READ performance on -current

Julian Elischer Thu, 17 Jul 2014 04:46:36 -0700

On 7/15/14, 10:34 PM, John Baldwin wrote:

On Saturday, July 12, 2014 5:14:00 pm Rick Macklem wrote:

Yonghyeon Pyun wrote:

On Fri, Jul 11, 2014 at 09:54:23AM -0400, John Baldwin wrote:

On Thursday, July 10, 2014 6:31:43 pm Rick Macklem wrote:

John Baldwin wrote:

On Thursday, July 03, 2014 8:51:01 pm Rick Macklem wrote:

Russell L. Carter wrote:


On 07/02/14 19:09, Rick Macklem wrote:

Could you please post the dmesg stuff for the network
interface,
so I can tell what driver is being used? I'll take a look
at
it,
in case it needs to be changed to use m_defrag().

em0: <Intel(R) PRO/1000 Network Connection 7.4.2> port
0xd020-0xd03f
mem
0xfe4a0000-0xfe4bffff,0xfe480000-0xfe49ffff irq 44 at
device 0.0
on
pci2
em0: Using an MSI interrupt
em0: Ethernet address: 00:15:17:bc:29:ba
001.000007 [2323] netmap_attach             success for em0
tx
1/1024
rx
1/1024 queues/slots

This is one of those dual nic cards, so there is em1 as
well...

Well, I took a quick look at the driver and it does use
m_defrag(),
but
I think that the "retry:" label it does a goto after doing so
might
be in
the wrong place.

The attached untested patch might fix this.

Is it convenient to build a kernel with this patch applied
and then
try
it with TSO enabled?

rick
ps: It does have the transmit segment limit set to 32. I have
no
idea if
     this is a hardware limitation.

I think the retry is not in the wrong place, but the overhead
of all
those
pullups is apparently quite severe.

The m_defrag() call after the first failure will just barely
squeeze
the just under 64K TSO segment into 32 mbuf clusters. Then I
think any
m_pullup() done during the retry will allocate an mbuf
(at a glance it seems to always do this when the old mbuf is a
cluster)
and prepend that to the list.
--> Now the list is > 32 mbufs again and the
bus_dmammap_load_mbuf_sg()
     will fail again on the retry, this time fatally, I think?

I can't see any reason to re-do all the stuff using m_pullup()
and Russell
reported that moving the "retry:" fixed his problem, from what I
understood.

Ah, I had assumed (incorrectly) that the m_pullup()s would all be
nops in this
case.  It seems the NIC would really like to have all those things
in a single
segment, but it is not required, so I agree that your patch is
fine.

I recall em(4) controllers have various limitation in TSO. Driver
has to update IP header to make TSO work so driver has to get a
writable mbufs.  bpf(4) consumers will see IP packet length is 0
after this change.  I think tcpdump has a compile time option to
guess correct IP packet length.  The firmware of controller also
should be able to access complete IP/TCP header in a single buffer.
I don't remember more details in TSO limitation but I guess you may
be able to get more details TSO limitation from publicly available
Intel data sheet.

I think that the patch should handle this ok. All of the m_pullup()
stuff gets done the first time. Then, if the result is more than 32
mbufs in the list, m_defrag() is called to copy the chain. This should
result in all the header stuff in the first mbuf cluster and the map
call is done again with this list of clusters. (Without the patch,
m_pullup() would allocate another prepended mbuf and make the chain
more than 32mbufs again.)

Hmm, I am surprised by the m_pullup() behavior that it doesn't just
notice that the first mbuf with a cluster has the desired data already
and returns without doing anything.  That is, I'm surprised the first
statement in m_pullup() isn't just:

        if (n->m_len >= len)
                return (n);

I seem to remember that the standard behaviour is for the caller to doexactly that.


_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: NFS client READ performance on -current

Reply via email to