FreeBSD netmap build from ports error: no member named '_Ios_Openmode' in namespace 'std'

2014-03-20 Thread FreeBSD ML Dev
Hello Friends, on a freshly installed FreeBSD 10.0-RELEASE there is an error when building netmap from ports: root@test01:/usr/ports/net/netmap # make install clean === Building for netmap-0.1.3_1 gmake[1]: Entering directory `/usr/ports/net/netmap/work/netmap-0.1.3' gmake -C belgolib gmake[2]:

some problem about netmap

2014-03-20 Thread mstian88
./pkt-gen -i eth1 -f rx -X the printed info shows the slot len is 2048? Why? The packets were sent with tcpreplay. When multiple packets of length 1514 are sent, pkt-gen reports avail 1 and a slot len of 2048, like this:

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Markus Gebert
On 19.03.2014, at 20:17, Christopher Forgeron csforge...@gmail.com wrote: Hello, I can report this problem as well on 10.0-RELEASE. I think it's the same as kern/183390? Possible. We still see this on nfsclients only, but I’m not convinced that nfs is the only trigger. I have

Re: some problem about netmap

2014-03-20 Thread Luigi Rizzo
What OS and device driver? Are you using emulated or native netmap mode? Do you have TSO and receive-side coalescing enabled? (You can disable them with ethtool.) cheers, luigi On Thu, Mar 20, 2014 at 11:03 AM, mstian88 mstia...@163.com wrote: ./pkt-gen -i eth1 -f rx -X the print info shows
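For reference, since the poster appears to be on Linux (eth1, ethtool), the offloads Luigi mentions can be inspected and disabled roughly like this; eth1 comes from the original post and the exact feature names can vary by driver:

    # show current offload settings
    ethtool -k eth1
    # disable TSO, generic receive offload and large receive offload
    ethtool -K eth1 tso off gro off lro off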

Network stack returning EFBIG?

2014-03-20 Thread Garrett Wollman
I recently put a new server running 9.2 (with local patches for NFS) into production, and it's immediately started to fail in an odd way. Since I pounded this server pretty heavily and never saw the error in testing, I'm more than a little bit taken aback. We have identical hardware in

Re: Network stack returning EFBIG?

2014-03-20 Thread Daniel Braniss
Turn off TSO; the problem sounds similar to the one I reported a while back. Turning off TSO fixed it. danny On Mar 20, 2014, at 3:26 PM, Garrett Wollman woll...@bimajority.org wrote: I recently put a new server running 9.2 (with local patches for NFS) into production, and it's immediately
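On FreeBSD the equivalent quick test is an ifconfig flag; ix0 below is only an example interface name:

    # disable TSO on the interface for this boot
    ifconfig ix0 -tso
    # re-enable later with: ifconfig ix0 tso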

Re: Network stack returning EFBIG?

2014-03-20 Thread wollman
In article 21290.60558.750106.630...@hergotha.csail.mit.edu, I wrote: Since we put this server into production, random network system calls have started failing with [EFBIG] or maybe sometimes [EIO]. I've observed this with a simple ping, but various daemons also log the errors: Mar 20 09:22:04

Re: Network loss

2014-03-20 Thread Christopher Forgeron
Sure: $ uname -a FreeBSD SAN0.XX 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 I normally run a slightly tweaked NFS kernel, but I'm back on the default build for now until this problem is resolved.

Re: Network stack returning EFBIG?

2014-03-20 Thread Markus Gebert
On 20.03.2014, at 14:51, woll...@bimajority.org wrote: In article 21290.60558.750106.630...@hergotha.csail.mit.edu, I wrote: Since we put this server into production, random network system calls have started failing with [EFBIG] or maybe sometimes [EIO]. I've observed this with a simple

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
On Thu, Mar 20, 2014 at 7:40 AM, Markus Gebert markus.geb...@hostpoint.ch wrote: Possible. We still see this on nfsclients only, but I'm not convinced that nfs is the only trigger. Just to clarify, I'm experiencing this error with NFS, but also with iSCSI - I turned off my NFS server in

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
Markus, I just wanted to clarify what dtrace will output in a 'no-error' situation. I'm seeing the following during a normal ping (no errors) on ix0, or even on a non-problematic bge NIC: On Thu, Mar 20, 2014 at 7:40 AM, Markus Gebert markus.geb...@hostpoint.ch wrote: Also, if you have

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
(Struggling with this mail client for some reason, sorry, here's the paste) # dtrace -n 'fbt:::return / arg1 == EFBIG && execname == "ping" / { stack(); }' dtrace: description 'fbt:::return ' matched 24892 probes CPU ID FUNCTION:NAME 19 29656 maybe_yield:return

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Markus Gebert
On 20.03.2014, at 16:50, Christopher Forgeron csforge...@gmail.com wrote: Markus, I just wanted to clarify what dtrace will output in a 'no-error' situation. I'm seeing the following during a normal ping (no errors) on ix0, or even on a non-problematic bge NIC: This is expected. This

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
Output from the patch you gave me (I have screens of it); let me know what you're hoping to see. Mar 20 16:37:22 SAN0 kernel: after mbcnt=33 pklen=65538 actl=65538 Mar 20 16:37:22 SAN0 kernel: before pklen=65538 actl=65538 Mar 20 16:37:22 SAN0 kernel: after mbcnt=33 pklen=65538 actl=65538 Mar 20

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
BTW, when I have the problem, this is what I see from netstat -m:
4080/2956/7036/6127254 mbuf clusters in use (current/cache/total/max)
4080/2636 mbuf+clusters out of packet secondary zone in use (current/cache)
0/50/50/3063627 4k (page size) jumbo clusters in use (current/cache/total/max)

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
Re: the cpuset ping: I can report that I do not get any failures with this ping - I have screens of failed flood pings on the ix0 NIC, but these always pass (I have that cpuset ping looping constantly). I can't report about the dtrace yet, as I'm running Rick's ixgbe patch, and there seems to be a .ko
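The cpuset ping being referred to is presumably something along these lines: a ping pinned to one CPU at a time, to see whether the failure follows a particular core (the CPU list and target address are placeholders):

    for c in 0 1 2 3; do
        cpuset -l $c ping -c 10 10.0.0.1
    done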

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Garrett Wollman
In article cab2_nwaomptzjb03pdditk2ovqgqk-tyf83jq4ukt9jnza8...@mail.gmail.com, csforge...@gmail.com writes: 50/27433/0 requests for jumbo clusters denied (4k/9k/16k) This is going to screw you. You need to make sure that no NIC driver ever allocates 9k jumbo pages -- unless you are using one of
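The denial counters Garrett is pointing at can be watched directly with stock tools; no special tooling is assumed here:

    # summary, including "requests for jumbo clusters denied (4k/9k/16k)"
    netstat -m
    # per-zone detail for the mbuf jumbo cluster pools
    vmstat -z | grep -i jumbo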

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
I have found this: http://lists.freebsd.org/pipermail/freebsd-net/2013-October/036955.html I think what you're saying is that:
- an MTU of 9000 doesn't need to equal a 9k mbuf / jumbo cluster
- modern NIC drivers can gather 9000 bytes of data from various memory locations
- the fact that I'm

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
Any recommendations on what to do? I'm experimenting with disabling TSO right now, but it's too early to tell if it fixes my problem. On my 9.2 box, we don't see this number climbing. With TSO off on 10.0, I also see the number is not climbing. I'd appreciate any links you may have so I can read

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Jack Vogel
What he's saying is that the driver should not be using 9K mbuf clusters. I thought this had been changed, but I see the code in HEAD is still using the larger clusters when you up the MTU. I will put it on my list to change with the next update to HEAD. What version of ixgbe are you using?

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
Hi Jack, I'm on ixgbe 2.5.15. I see a few other threads about using MJUMPAGESIZE instead of MJUM9BYTES. If you have a patch you'd like me to test, I'll compile it in and let you know. I was just looking at Garrett's if_em.c patch and thinking about applying it to ixgbe. As it stands I seem

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Jack Vogel
I strongly discourage anyone from disabling TSO on 10G; it's necessary to get the performance one wants to see on the hardware. Here is a patch to do what I'm talking about: *** ixgbe.c Fri Jan 10 18:12:20 2014 --- ixgbe.jfv.c Thu Mar 20 23:04:15 2014 ***
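The diff itself is truncated above, but the change Jack describes (stop handing 9k clusters to the RX ring and cap the buffer size at a page) looks roughly like the sketch below. This is an illustration of the approach, not the committed ixgbe code; the constants stand in for FreeBSD's MCLBYTES, MJUMPAGESIZE and MJUM9BYTES.

    /* Illustrative constants; the real kernel macros are MCLBYTES,
     * MJUMPAGESIZE and MJUM9BYTES. */
    #define CLUSTER_2K 2048
    #define CLUSTER_4K 4096

    /* Pick the RX cluster size from the MTU-derived frame size, capping
     * at the 4k page-sized cluster so the 9k jumbo pool is never used;
     * larger frames are then assembled from multiple descriptors. */
    static int
    rx_buf_size(int max_frame_size)
    {
            return (max_frame_size <= CLUSTER_2K) ? CLUSTER_2K : CLUSTER_4K;
    }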

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
I agree, performance is noticeably worse with TSO off, but I thought it would be a good step in troubleshooting. I'm glad you're a regular reader of the list, so I don't have to settle for slow performance. :-) I'm applying your patch now; I think it will fix it - but I'll report in after it's

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Jack Vogel
Your 4K mbuf pool is not being used; make sure you increase its size once you are using it, or you'll just be having the same issue with a different pool. Oh, and that patch was against the code in HEAD; it might need some manual hacking if you're using anything older. Not sure what you mean

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
Ah, good point about the 4k buffer size: I will allocate more to kern.ipc.nmbjumbop, perhaps taking it from 9 and 16. Yes, I did have to tweak the patch slightly to work on 10.0, but it's basically the same thing I was trying after looking at Garrett's notes. I see this is part of a larger
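A hedged example of the tuning being discussed; the limits are sysctls, so they can be inspected and raised at runtime, and the value below is an arbitrary placeholder to size against your own memory and traffic:

    # current limits for the 4k (page size), 9k and 16k jumbo pools
    sysctl kern.ipc.nmbjumbop kern.ipc.nmbjumbo9 kern.ipc.nmbjumbo16
    # raise the 4k pool limit; add the same line to /etc/sysctl.conf to persist
    sysctl kern.ipc.nmbjumbop=2000000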

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Rick Macklem
Christopher Forgeron wrote: Output from the patch you gave me (I have screens of it); let me know what you're hoping to see. Mar 20 16:37:22 SAN0 kernel: after mbcnt=33 pklen=65538 actl=65538 Mar 20 16:37:22 SAN0 kernel: before pklen=65538 actl=65538 Hmm. I think this means that the loop
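As an aside on those numbers (an observation, not Rick's wording): IP_MAXPACKET is 65535, so pklen=65538 is already past the usual TSO limit, and 33 mbufs is one more than the 32 DMA segments the ixgbe transmit path is commonly limited to; hitting either condition is the kind of thing that makes a driver fail the send with EFBIG. A minimal sketch of such a check, with illustrative constants:

    #include <errno.h>

    #define TSO_MAX_BYTES 65535  /* IP_MAXPACKET */
    #define TX_MAX_SEGS   32     /* typical 82599 scatter/gather limit */

    /* Return EFBIG if a TSO chain is too long or too fragmented to DMA. */
    static int
    check_tso_chain(int pktlen, int nsegs)
    {
            if (pktlen > TSO_MAX_BYTES || nsegs > TX_MAX_SEGS)
                    return (EFBIG);
            return (0);
    }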

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Rick Macklem
Christopher Forgeron wrote: On Thu, Mar 20, 2014 at 7:40 AM, Markus Gebert markus.geb...@hostpoint.ch wrote: Possible. We still see this on nfsclients only, but I’m not convinced that nfs is the only trigger. Since Christopher is getting a bunch of the before

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
Yes, there is something broken in TSO for sure, as disabling it allows me to run without error. It is possible that the drop in performance is allowing me to stay under a critical threshold for the problem, but I'd feel happier testing to make sure. I understand what you're asking for in the

Re: Network stack returning EFBIG?

2014-03-20 Thread Rick Macklem
Markus Gebert wrote: On 20.03.2014, at 14:51, woll...@bimajority.org wrote: In article 21290.60558.750106.630...@hergotha.csail.mit.edu, I wrote: Since we put this server into production, random network system calls have started failing with [EFBIG] or maybe sometimes [EIO].

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Rick Macklem
Christopher Forgeron wrote: Yes, there is something broken in TSO for sure, as disabling it allows me to run without error. It is possible that the drop in performance is allowing me to stay under a critical threshold for the problem, but I'd feel happier testing to make sure. I

Re: 9.2 ixgbe tx queue hang

2014-03-20 Thread Christopher Forgeron
BTW - I think this will end up being a TSO issue, not the patch that Jack applied. When I boot Jack's patch (MJUM9BYTES removal) this is what netstat -m shows:
21489/2886/24375 mbufs in use (current/cache/total)
4080/626/4706/6127254 mbuf clusters in use (current/cache/total/max)
4080/587