Re: ipsec with ipfw

2017-03-13 Thread Hooman Fazaeli

On 2017-03-13 11:01, Andrey V. Elsukov wrote:

On 12.03.2017 00:23, Hooman Fazaeli wrote:

Hi,

As you know, ipsec/setkey provides only a limited syntax for defining security
policies: a single subnet/host, a protocol number and an optional port
may be used to specify the traffic's source and destination.

I was thinking about the idea of using ipfw as the packet selector for
ipsec, much like it is used with dummynet. Something like:

ipfw add 100 ipsec 2 tcp from  to 
80,443,110,139

What should this rule do? How do you plan to implement policy lookup for
inbound packets?



For instance, outbound packets matching the rule would go through the
tunnel whose index is 2. The tunnel itself is defined using setkey.
Something like:

spdadd 2 esp/tunnel/1.1.1.1-2.2.2.2/require

It's basically the same as spdadd without the src/dst/proto/port
specification. A similar rule would be written for inbound packets.
This is just to illustrate the idea. Obviously, the exact mechanism
needs further thought and investigation (e.g., the issue of stateful vs.
stateless rules).
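
The mapping described above could be modelled roughly as follows. This is only
a toy illustration in Python, not kernel code: the "ipsec <index>" ipfw action
and the index form of spdadd are proposals from this thread and do not exist
in ipfw or setkey, and every name below (SelectorRule, TUNNELS, RULES,
select_tunnel) is made up for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectorRule:
    """Hypothetical model of: ipfw add <number> ipsec <tunnel_index> <proto> ... <ports>."""
    number: int
    proto: str
    dst_ports: frozenset
    tunnel_index: int


# Hypothetical tunnel table, mirroring "spdadd 2 esp/tunnel/1.1.1.1-2.2.2.2/require".
TUNNELS = {2: ("esp/tunnel", "1.1.1.1", "2.2.2.2")}

# Hypothetical rule set, mirroring the example rule number 100.
RULES = [SelectorRule(100, "tcp", frozenset({80, 443, 110, 139}), 2)]


def select_tunnel(proto, dst_port):
    """Return the tunnel for an outbound packet, or None if no rule matches."""
    # Rules are checked in ascending rule-number order, first match wins.
    for rule in sorted(RULES, key=lambda r: r.number):
        if rule.proto == proto and dst_port in rule.dst_ports:
            return TUNNELS.get(rule.tunnel_index)
    return None


print(select_tunnel("tcp", 443))
print(select_tunnel("udp", 53))
```

The sketch covers only the stateless outbound case; inbound policy lookup and
the IKE negotiation are left open, as in the discussion.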

One important aspect, as s...@zxy.spb.ru pointed out, is how to deal with
IKE/ISAKMP to support the mechanism, as the current protocol requires the
negotiating parties to exchange and match the specification of traffic
subject to ipsec in SA payloads (which is restricted to a single
subnet+proto+port). I was thinking about some form of labeling (like MPLS)
plus custom payload types or DOIs.

Your ideas are welcome.

--
Best regards
Hooman Fazaeli

___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


ipsec with ipfw

2017-03-11 Thread Hooman Fazaeli

Hi,

As you know, ipsec/setkey provides only a limited syntax for defining security
policies: a single subnet/host, a protocol number and an optional port
may be used to specify the traffic's source and destination.

I was thinking about the idea of using ipfw as the packet selector for ipsec,
much like it is used with dummynet. Something like:

ipfw add 100 ipsec 2 tcp from  to  
80,443,110,139

What do you think? Are you interested in such a feature?
Is it worth the effort? What are the implementation challenges?

--
Best regards
Hooman Fazaeli



Re: projects/routing announcement/status

2016-08-27 Thread Hooman Fazaeli

On 2016-01-22 03:11, Alexander V. Chernikov wrote:

I would like to introduce the routing rework which started as the projects/routing SVN 
branch.
It has been around for quite a long time and some of the code has made its way to 
HEAD, but there haven't been any public announcements.

So, what is projects/routing about?

First, it is about bringing more scalability by solving most annoying problems 
on packet output path.
To be more specific, it eliminates 2 out of 4 locks, converts other 2 to 
rmlock(9) and adds infrastructure to reduce locking to single rmlock for 
certain traffic types.
With these changes, OS is able to forward 12MPPS on 16-core box for both 
IPv4/IPv6 which is 6-10 times better than stock HEAD.

Second, it eases hacking by avoiding direct access to route/lltable internals 
and providing higher level API instead.

Third, it is about bringing advanced features like route multipath, and even 
more speed by adding modular lookup API permitting to use different route 
lookup algorithms based on server role.

Description with graphs and links is available at: 
http://wiki.freebsd.org/ProjectsRoutingProposal
Used API is described in http://wiki.freebsd.org/ProjectsRoutingProposal/API
Current status is available at 
http://wiki.freebsd.org/ProjectsRoutingProposal/ConversionStatus

It is probably much more convenient to read project details on wiki, however 
I’ll try to summarise the most important things here (wiki readers can skip 
till the end).

Typical packet processing (forwarding for router, or output for web server) 
path consists of:

- doing routing lookup (radix read rwlock + routing entry (rte) mutex lock)
- (optionally) interface address (ifa) atomic refcount acquire/release
- doing link level entry (lle, llentry) lookup (afdata read rwlock + llentry read 
  (or write) lock)


The most annoying one is the rtentry mutex. The only goal of this mutex is to 
provide rtentry refcounting so consumer code can use it without the risk of 
the rtentry being deleted.
We solve this by saving all needed data into an on-stack optimised structure 
instead of refcounting.
Additionally, we are trying to pre-calculate the data we need to pass by using 
special next-hop structures instead of route entries.
Several different (in terms of returned info and relative overhead) functions 
for retrieving routing data are provided.
Most of the consumers have already been switched to the new KPI. The actual 
output/forward paths are not converted yet.

It should be noted that, since individual rtentries are not returned, it is not 
possible to do per-ifa output packet accounting (can be observed in netstat -s).

The route table lock is switched to an ipfw-like dual-locking mode (read rmlock() for 
the data path, rwlock for config changes, route export, etc.).
The reasons for having the rwlock are to 1) provide serialization for things in 
the control plane not directly used for the data path and 2) avoid acquiring 
contested/sleeping locks for rmlock. See projects/routing r287078 for an 
example.

Lltable entry locks were eliminated in r291853, r292155.

The lltable lock is also planned to be converted to the dual-locking model, with 
similar reasoning.
However, instead of (ab)using AFDATA lock, it needs to be converted to 
per-lltable set of locks.


Open problems:
SCTP/Flowtable references rtentries directly. It is not possible to convert 
ip[6]_output() path without dealing with that.

Brief merge plan:
Discuss/merge new routing KPI for data path
Discuss/merge lltable dual-lock (WIP)
Discuss/merge explicit nexthop changes
Discuss/merge IPv4/IPv6 output path (along with converted sctp/flowtable)
Discuss/merge route table dual-lock

Current outstanding reviews (I encourage you to take a look at these)

D5009 (IPv4 fast forwarding conversion)
D5010 (IPv6 forwarding conversion)
D4794 (Deal with per-ifa output counters)
D4962 (new LLE lookup functions, no sockaddrs in lltable data path)
D4751 (move all lltable code to separate files)

___
freebsd-a...@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscr...@freebsd.org"


First, thanks for the effort. I personally very much appreciate
any improvements made to the network-related stuff.

Second, have you considered replacing the existing radix tree with
a faster data structure, especially Luigi's DXR
tables? (http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf)
I apologize if the question is not very relevant to your work.



--
Best regards
Hooman Fazaeli


Re: kernel panic with netgraph and mpd3.8

2016-07-10 Thread Hooman Fazaeli

On 2016-07-10 10:49, Donald Baud via freebsd-net wrote:

Hi, I'm running an l2tp lns through mpd3.8 and it has crashed twice in 24h.
This is a new project replacing a cisco 7206, 700-sessions 800mbit/s


I am not familiar with troubleshooting kernel panics.

I suspect that the crash is happening inside the netgraph module because 
the crash is happening at the

instruction pointer = 0x20:0x81c38283
I included the two crash logs.  I need some help to figure out what to do 
next.

-Dbaud




- Upgrade to mpd 5 (/usr/ports/net/mpd5)
- Try the workarounds below:
https://lists.freebsd.org/pipermail/freebsd-bugs/2014-June/056548.html
https://lists.freebsd.org/pipermail/freebsd-bugs/2014-June/056549.html
https://lists.freebsd.org/pipermail/freebsd-net/2014-June/038954.html


--
Best regards
Hooman Fazaeli



Re: tcp window scaling + syn cookies problem

2016-03-07 Thread Hooman Fazaeli

A few minutes after posting, I found the following thread, which describes
exactly the same problem:
https://lists.freebsd.org/pipermail/freebsd-net/2013-February/034519.html



--
Best regards
Hooman Fazaeli



tcp window scaling + syn cookies problem

2016-03-07 Thread Hooman Fazaeli


Hi,

In our network, Windows clients connect to the Internet via our custom-developed
transparent TCP proxy (running on FreeBSD 7.3). Things work fine, except that
_sometimes_ downloads by some Windows clients become very slow. To debug the
problem, we inspected a few packet traces and found that it happens because
the proxy's TCP stack forgets the client's window scale factor, as illustrated
in the following packet trace (it is for a download from the ftp.freebsd.org
site; x.y.z.y is a Windows 8 client):

1. 15:09:32.765713 IP (tos 0x0, ttl 63, id 16510, offset 0, flags [DF], proto TCP (6), length 52)
   x.y.z.y.57430 > 96.47.72.72.80: S, cksum 0x8343 (correct), 1530161492:1530161492(0) win 8192

2. 15:09:32.765729 IP (tos 0x0, ttl 64, id 55869, offset 0, flags [none], proto TCP (6), length 52)
   96.47.72.72.80 > x.y.z.y.57430: S, cksum 0xe2c0 (correct), 503882603:503882603(0) ack 1530161493 win 65535

3. 15:09:32.766071 IP (tos 0x0, ttl 63, id 16511, offset 0, flags [DF], proto TCP (6), length 40)
   x.y.z.y.57430 > 96.47.72.72.80: ., cksum 0x2192 (correct), ack 1 win 256

4. 15:09:32.770074 IP (tos 0x0, ttl 63, id 16512, offset 0, flags [DF], proto TCP (6), length 408)
   x.y.z.y.57430 > 96.47.72.72.80: P, cksum 0x259c (correct), 1:369(368) ack 1 win 256

5. 15:09:32.869286 IP (tos 0x0, ttl 64, id 57834, offset 0, flags [none], proto TCP (6), length 40)
   96.47.72.72.80 > x.y.z.y.57430: ., cksum 0x2122 (correct), ack 369 win 65535

6. 15:09:33.180983 IP (tos 0x0, ttl 64, id 64495, offset 0, flags [none], proto TCP (6), length 296)
   96.47.72.72.80 > x.y.z.y.57430: ., cksum 0xbd5a (correct), 1:257(256) ack 369 win 65535

7. 15:09:33.231475 IP (tos 0x0, ttl 63, id 16513, offset 0, flags [DF], proto TCP (6), length 40)
   x.y.z.y.57430 > 96.47.72.72.80: ., cksum 0x1f23 (correct), ack 257 win 255

8. 15:09:33.231494 IP (tos 0x0, ttl 64, id 248, offset 0, flags [none], proto TCP (6), length 295)
   96.47.72.72.80 > x.y.z.y.57430: ., cksum 0xc9b6 (correct), 257:512(255) ack 369 win 65535

9. 15:09:33.282256 IP (tos 0x0, ttl 63, id 16514, offset 0, flags [DF], proto TCP (6), length 40)
   x.y.z.y.57430 > 96.47.72.72.80: ., cksum 0x1e25 (correct), ack 512 win 254

10. 15:09:33.282279 IP (tos 0x0, ttl 64, id 1283, offset 0, flags [none], proto TCP (6), length 294)
    96.47.72.72.80 > x.y.z.y.57430: ., cksum 0x1e25 (correct), 512:766(254) ack 369 win 65535

11. 15:09:33.333006 IP (tos 0x0, ttl 63, id 16515, offset 0, flags [DF], proto TCP (6), length 40)
    x.y.z.y.57430 > 96.47.72.72.80: ., cksum 0x1d28 (correct), ack 766 win 253

12. 15:09:33.333023 IP (tos 0x0, ttl 64, id 2520, offset 0, flags [none], proto TCP (6), length 293)
    96.47.72.72.80 > x.y.z.y.57430: ., cksum 0x1d28 (correct), 766:1019(253) ack 369 win 65535

13. 15:09:33.383926 IP (tos 0x0, ttl 63, id 16516, offset 0, flags [DF], proto TCP (6), length 40)
    x.y.z.y.57430 > 96.47.72.72.80: ., cksum 0x1c2c (correct), ack 1019 win 252

As can be seen, the client advertises a window scale factor of 8 and then
correctly sets each packet's window size based on the advertised factor. But
the proxy seems to forget the client's scale factor and sends only as much
data as the unscaled window value in the client's previous ACK.
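
To make the arithmetic concrete, here is a small Python sketch using the
numbers from the trace. It assumes a shift count of 8, as stated above (the
TCP options themselves are not visible in the trace as posted); the function
and variable names are only illustrative.

```python
WSCALE = 8  # shift count the client is said to advertise in its SYN


def effective_window(raw_win: int, shift: int) -> int:
    """Receive window in bytes once the window scale option is applied."""
    return raw_win << shift


# (raw "win" in the client's ACK, payload bytes in the proxy's next segment),
# taken from packets 3/6, 7/8, 9/10 and 11/12 of the trace.
observed = [(256, 256), (255, 255), (254, 254), (253, 253)]

for raw, sent in observed:
    scaled = effective_window(raw, WSCALE)
    print(f"raw win {raw}: client accepts {scaled} bytes, proxy sent {sent}")
```

In every pair the proxy's segment size equals the raw window field, which is
what one would expect if the scale factor had been dropped: the client is
offering roughly 64 KB while the proxy sends about 256 bytes per round trip.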

Now, setting 'net.inet.tcp.syncookies' to zero seems to fix the problem,
and the download speed becomes as expected.

Is this bad interaction between window scaling and syn cookies
a known problem? Why does it happen? Has it been fixed in later FreeBSD
versions?

Thanks in advance.

--
Best regards
Hooman Fazaeli



Re: Bridge Interfaces and ARPs

2015-12-03 Thread Hooman Fazaeli

On 12/3/2015 5:24 PM, Jason Van Patten wrote:

Hey gang -

I posted this to the FreeBSD user forums but figured I'd send a message off to the list to see if anyone has any input, guidance, or ideas. Emailing diagrams around isn't good form (IMHO) but having 
a diagram handy will help with the discussion.  So please glance at:


http://pics.lateapex.net/vz.png

Background: I have a business class Verizon FIOS connection for Internet at home.  Along with that connection, I have 13 (not 14!) static IPs from VZ.  They almost fall within a proper CIDR block, 
but not quite: 1.2.3.210 - 1.2.3.222.  I don't own .209, so I can't claim 1.2.3.208/28 as my IP block (dammit!)  The subnet for the static IPs is a /24, and the default route is *Verizon's* router: 
1.2.3.1.
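
The CIDR point can be checked with Python's standard ipaddress module. This
is just a sketch using the placeholder addresses from the message above; the
variable names are illustrative.

```python
import ipaddress

first = ipaddress.IPv4Address("1.2.3.210")
last = ipaddress.IPv4Address("1.2.3.222")
block = ipaddress.ip_network("1.2.3.208/28")

# The 13 addresses actually assigned.
owned = {ipaddress.IPv4Address(a) for a in range(int(first), int(last) + 1)}

# Host addresses of the /28 (network and broadcast addresses excluded).
usable = set(block.hosts())

# Host addresses of the /28 that are not part of the assignment.
missing = sorted(usable - owned)

# Smallest list of CIDR blocks that covers exactly the assigned range.
covering = list(ipaddress.summarize_address_range(first, last))

print(len(owned), [str(a) for a in missing])
print([str(n) for n in covering])
```

The only host address of 1.2.3.208/28 outside the assignment is 1.2.3.209,
and describing the range exactly takes five separate prefixes.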


There are a number of different choices for this network layout: DMZ, bridging, or binat.  I chose bridging so that I don't have the complexity of binatting, and yet have some protection for the 
servers via my router.  So, per the drawing, the FreeBSD router's em0 is connected to the Verizon equipment, while re0 and re1 are both connected to a managed Cisco switch, on different VLANs.


VLAN 10 for re0: Public IPs (public services, etc)
VLAN 20 for re1: Private IPs (NAS, wireless AP, etc)

Via the router, VLAN 10 and Verizon's network are bridged together.  The bridge interface on the router has IP: 1.2.3.222/24 with a default route set to 1.2.3.1.  All servers on VLAN 10 have IPs 
within the allocated range (.210 - .220) and the same default route.


Now: the problem.  I used the LAGG'd server as an example in the diagram, but the same thing is happening with other servers: the router is learning ARP entries for the IPs I own *from* Verizon's 
router.  As soon as the router caches that bad entry, it no longer routes traffic to those public IPs *from* VLAN 20 (private side). So, in other words, a laptop on the wireless network won't be 
able to get to 1.2.3.215.


My work-around for now has been a series of static ARP entries on the router 
for each of my public servers.  That seems to work fine, but I wonder if 
there's something I might be doing wrong?

If I didn't include enough info, fire away.  Thanks!


Can you post the output of the following commands (on the FreeBSD router):

# ifconfig
# ifconfig bridgeX addr
# arp -na
# netstat -nr -f inet
# sysctl net.inet.ip

--
Best regards
Hooman Fazaeli



mbuf statistics

2015-12-01 Thread Hooman Fazaeli


Hi,

On an idle FreeBSD 9.3 system:


vmstat -z | egrep "mbuf_cluster|ITEM" | column -t

ITEM           SIZE   LIMIT   USED   FREE  REQ    FAIL  SLEEP
mbuf_cluster:  2048,  10284,  1152,  56,   4237,  0,    0


netstat -mb | grep "mbuf clusters in use"

512/696/1208/10284 mbuf clusters in use (current/cache/total/max)

one can see that:

current + cache == total == USED + FREE

but the current/cache values reported by netstat
are very different from the USED/FREE values reported by vmstat, so
they must have different meanings.
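
The numbers above can be cross-checked with a few lines of Python. The values
are copied from the two command outputs; the dictionary and variable names are
only illustrative, and the sketch does not explain where the difference comes
from.

```python
# Figures copied from the vmstat -z and netstat -mb output above.
vmstat_z = {"LIMIT": 10284, "USED": 1152, "FREE": 56}
netstat_mb = {"current": 512, "cache": 696, "total": 1208, "max": 10284}

zone_total = vmstat_z["USED"] + vmstat_z["FREE"]
netstat_total = netstat_mb["current"] + netstat_mb["cache"]

# Both tools agree on the total and on the limit.
print(zone_total, netstat_total, netstat_mb["total"])

# The two tools split that total differently, by the same amount on each side.
used_minus_current = vmstat_z["USED"] - netstat_mb["current"]
cache_minus_free = netstat_mb["cache"] - vmstat_z["FREE"]
print(used_minus_current, cache_minus_free)
```

So the totals match, but 640 clusters that vmstat counts as USED are counted
by netstat as cache rather than current.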

The question is: what is the exact meaning of USED/FREE and
current/cache values? Is there any relationship between them?

--
Best regards
Hooman Fazaeli



tcp window scaling (rfc1323) problem

2015-07-30 Thread Hooman Fazaeli

Hi,

We connect to the Internet through a TCP proxy running on FreeBSD 8.3-RELEASE.
Everything works, except that Instagram clients frequently fail to get/refresh
some images and feeds. I have checked everything that might cause the problem
and found that setting net.inet.tcp.rfc1323 to zero improves the situation.

Googling a bit, I found reports of a window scaling implementation bug
in older FreeBSD versions (e.g.,
https://lists.freebsd.org/pipermail/freebsd-hackers/2007-January/019070.html).

My question is: which version of FreeBSD is known to have the window scaling
bug fixed? Are there any known problems related to window
scaling in newer (8+) FreeBSD versions?

Thanks in advance.


--
Best regards
Hooman Fazaeli



Re: Locking Memory Question

2015-07-30 Thread Hooman Fazaeli

On 7/30/2015 5:22 AM, Laurie Jennings via freebsd-net wrote:


On Wed, 7/29/15, John-Mark Gurney j...@funkthat.com wrote:

  Subject: Re: Locking Memory Question
  To: Laurie Jennings laurie_jennings_1...@yahoo.com
  Cc: John Baldwin j...@freebsd.org, freebsd-net@freebsd.org
  Date: Wednesday, July 29, 2015, 7:25 PM
  
  Laurie Jennings via freebsd-net wrote this message on Wed, Jul 29, 2015
  at 15:26 -0700:

   I have a problem and I can't quite figure out where to look. This is
   what Im doing:

   I have an IOCTL to read a block of data, but the data is too large to
   return via ioctl. So to get the data, I allocate a block in a kernel
   module:

   foo = malloc(1024000,M_DEVBUF,M_WAITOK);

   I pass up a pointer and in user space map it using /dev/kmem:

  An easier solution would be for your ioctl to pass in a userland
  pointer and then use copyout(9) to push the data to userland...  This
  means the userland process doesn't have to have /dev/kmem access...

  Is there a reason you need to use kmem?  The only reason you list above
  is that it's too large via ioctl, but a copyout is fine, and would
  handle all page faults for you..
  
  __

I'm using kmem because the only options I could think of were to

1) use shared memory
2) use kmem
3) use a huge ioctl structure.

I'm not clear how I'd do that. The data being passed up from the kernel is of
variable size. To use copyout I'd have to pass a pointer with a static buffer,
right? Is there a way to malloc user space memory from within an ioctl call?
Or would I just have to pass down a pointer to a huge buffer large enough for
the largest possible answer?

thanks

Laurie


You can use two IOCTLs. Get the block size from the kernel module with the
first ioctl, and malloc(3) a buffer of that size in userland. Then use a
second ioctl to pass the address of the allocated buffer to the kernel
module. The module can then use copyout(9) to copy the in-kernel data to the
user space buffer.
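
The control flow of the two-ioctl approach can be sketched as below. This is a
pure-Python simulation so that it runs without any device: FakeModule stands
in for the kernel module, and the request codes MYDEV_GET_SIZE and
MYDEV_GET_DATA are hypothetical names, not part of any real driver.

```python
import struct

# Hypothetical request codes; a real module would define its own.
MYDEV_GET_SIZE = 1
MYDEV_GET_DATA = 2


class FakeModule:
    """Stand-in for the kernel module so the sketch runs without a device."""

    def __init__(self, payload: bytes):
        self._payload = payload

    def ioctl(self, request: int, arg: bytearray) -> None:
        if request == MYDEV_GET_SIZE:
            # Kernel side: report how many bytes are currently available.
            struct.pack_into("=Q", arg, 0, len(self._payload))
        elif request == MYDEV_GET_DATA:
            # Kernel side: the copyout(9) step, bounded by the caller's buffer.
            n = min(len(arg), len(self._payload))
            arg[:n] = self._payload[:n]
        else:
            raise OSError("unknown request")


def read_block(dev) -> bytes:
    """Userland side: ask for the size, allocate, then fetch the data."""
    size_arg = bytearray(8)
    dev.ioctl(MYDEV_GET_SIZE, size_arg)           # first ioctl: size only
    (size,) = struct.unpack_from("=Q", size_arg, 0)
    buf = bytearray(size)                         # plays the role of malloc(3)
    dev.ioctl(MYDEV_GET_DATA, buf)                # second ioctl: fill buffer
    return bytes(buf)


dev = FakeModule(b"x" * 1024000)
print(len(read_block(dev)))
```

One thing such a design presumably has to handle is the data size changing
between the two calls; bounding the copy by the caller's buffer length, as the
fake module does here, is one way to keep that safe.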






--
Best regards
Hooman Fazaeli



Re: IPSEC in GENERIC [was: Re: netmap in GENERIC, by default, on HEAD]

2014-11-06 Thread Hooman Fazaeli

On 11/6/2014 1:30 PM, Olivier Cochard-Labbé wrote:

How to correctly bench IPSec performance ?

For benching forwarding performance I generate minimum-size packet (2000
flows: 100 different source IP * 20 different destination IP) like with
this netmap's pkt-gen example:
pkt-gen -i ix0 -f tx -n 10 -l 60 -d 9.1.1.1:2000-9.1.1.100
-s 8.1.1.1:2000-8.1.1.20

=> This permits me to obtain the maximum PPS forwarded by the server.

Maybe off-topic: how many PPS, and on which hardware?


But for benching IPSec: Is the PPS with minimum-size packet a useful value ?



--

Best regards.
Hooman Fazaeli



Re: transparent udp proxy

2014-11-02 Thread Hooman Fazaeli

On 10/31/2014 8:11 PM, Adrian Chadd wrote:

Hi,

If it's missing in 10 or later then please file a bug and I'll see
what it'll take to add another socket option to return the original
destination address+port.

Thanks,


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194758



-adrian

On 31 October 2014 08:00, Hooman Fazaeli hoomanfaza...@gmail.com wrote:

On 10/31/2014 5:30 PM, Mark Felder wrote:

I'm not sure if this is what you're looking for, but perhaps the
solution is in net/samplicator ?

  From the project's website:

This simple program listens for UDP datagrams on a network port, and
sends copies of these datagrams on to a set of destinations. Optionally,
it can perform sampling, i.e. rather than forwarding every packet,
forward only 1 in N. Another option is that it can spoof the IP source
address, so that the copies appear to come from the original source,
rather than the relay. Currently only supports IPv4.


Thanks. I do not think it provides what I am looking for.

I am not looking for an application performing a specific task, but for a
mechanism to get the __original__ destination address and port of packets
forwarded to a local UDP proxy by ipfw fwd rules. As far as I have figured
out so far, the original destination address may be obtained with
IP_RECVDSTADDR on 9.0+ (but not on 8.x and older versions), but there seems
to be no mechanism to get the _original_ destination _port_ (apart from this
missing mechanism, my proxy is functional and does what it is intended
to do).


--

Best regards.
Hooman Fazaeli




--

Best regards.
Hooman Fazaeli



Re: transparent udp proxy

2014-11-01 Thread Hooman Fazaeli

On 10/31/2014 8:30 PM, Ian Smith wrote:

On Fri, 31 Oct 2014 18:30:00 +0330, Hooman Fazaeli wrote:
   On 10/31/2014 5:30 PM, Mark Felder wrote:
I'm not sure if this is what you're looking for, but perhaps the
solution is in net/samplicator ?
   
 From the project's website:
   
This simple program listens for UDP datagrams on a network port, and
sends copies of these datagrams on to a set of destinations. Optionally,
it can perform sampling, i.e. rather than forwarding every packet,
forward only 1 in N. Another option is that it can spoof the IP source
address, so that the copies appear to come from the original source,
rather than the relay. Currently only supports IPv4.

   Thanks. I do not think it provides what I am looking for.
  
   I am not looking for an application performing a specific task, but for a
   mechanism to get the __original__ destination address and port of
   packets forwarded to a local UDP proxy by ipfw fwd rules. As far as I
   have figured out so far, the original destination address may be
   obtained with IP_RECVDSTADDR on 9.0+ (but not on 8.x and older
   versions), but there seems to be no mechanism to get the _original_
   destination _port_ (apart from this missing mechanism, my proxy is
   functional and does what it is intended to do).

  : ipfw add 10 fwd localhost,7000 udp from any to any recv em1

Given these are local packets and that ipfw(8) /fwd states:

 The fwd action does not change the contents of the packet at all.
 In particular, the destination address remains unmodified, so
 packets forwarded to another system will usually be rejected by
 that system unless there is a matching rule on that system to
 capture them.  For packets forwarded locally, the local address
 of the socket will be set to the original destination address of
 the packet.  This makes the netstat(1) entry look rather weird
 but is intended for use with transparent proxy servers.

For FreeBSD versions before 9.0, that description is correct only for TCP
packets. For 9.0+, it is true for both UDP and TCP.

Old kernels (before 9.0) change the destination of UDP packets forwarded to a
local address to the forwarded-to address and port (those specified in the
fwd rule).


Has the destination port in the received packet been changed to 7000?

If not, you're all set.  If so, where else could the dst port be stored?

cheers, Ian

There is no way to get the destination port. That is the problem.
recvmsg(2) returns only the source address+port and the destination IP
address (on 9.0+).

--

Best regards.
Hooman Fazaeli



Re: transparent udp proxy

2014-11-01 Thread Hooman Fazaeli

On 10/31/2014 8:11 PM, Adrian Chadd wrote:

Hi,

If it's missing in 10 or later then please file a bug and I'll see
what it'll take to add another socket option to return the original
destination address+port.

Thanks,


-adrian



Thanks. I will check ASAP.

--

Best regards.
Hooman Fazaeli



transparent udp proxy

2014-10-31 Thread Hooman Fazaeli

Hi,

In my setup, I use a fwd rule to forward all udp traffic to my local proxy:

ipfw add 10 fwd localhost,7000 udp from any to any recv em1

The proxy needs to know the original destination address of forwarded 
datagrams, but
there seems to be no way to obtain that address.

Using recvmsg with IP_RECVDSTADDR does not help because it returns next-hop 
address
instead of original destination. This is because udp_input() overwrites 
packet's destination
with next-hop address before doing ip_savecontrol.

It seems easy to change udp_input to pass the original dest. address to
ip_savecontrol. Another solution would be to implement an IP_RECVDSTSOCKADDR
option, which records the original destination address:port as a
'struct sockaddr_in[6]' in the packet's control data.
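
For reference, this is roughly how a proxy reads the destination address
through the existing IP_RECVDSTADDR option. It is only a sketch in Python: the
helper names are made up, the fallback value 7 for IP_RECVDSTADDR is an
assumption for platforms where Python's socket module does not expose the
constant, and the proposed IP_RECVDSTSOCKADDR option is not used because it
does not exist.

```python
import socket

# Python exposes IP_RECVDSTADDR only on platforms that define it.
# The fallback value 7 is an assumption (it is the value FreeBSD uses).
IP_RECVDSTADDR = getattr(socket, "IP_RECVDSTADDR", 7)


def dst_from_ancdata(ancdata):
    """Return the destination IPv4 address found in recvmsg() control data."""
    for level, ctype, data in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_RECVDSTADDR:
            # The control message carries a 4-byte struct in_addr.
            return socket.inet_ntoa(data[:4])
    return None


def open_proxy_socket(port=7000):
    """UDP socket matching the 'fwd localhost,7000' rule from the message."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, IP_RECVDSTADDR, 1)
    sock.bind(("127.0.0.1", port))
    return sock


def recv_with_dst(sock, bufsize=65535):
    """Receive one datagram plus the destination address the kernel reports."""
    data, ancdata, _flags, src = sock.recvmsg(bufsize, socket.CMSG_SPACE(4))
    return data, src, dst_from_ancdata(ancdata)
```

Note that the control data holds an address only, with no port field, so even
when the address is the original one the destination port is not available
this way.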

Comments/suggestions are welcome.


--

Best regards.
Hooman Fazaeli



Re: transparent udp proxy

2014-10-31 Thread Hooman Fazaeli

On 10/31/2014 2:18 PM, Andrey V. Elsukov wrote:

On 31.10.2014 12:50, Hooman Fazaeli wrote:

Hi,

In my setup, I use a fwd rule to forward all udp traffic to my local proxy:

ipfw add 10 fwd localhost,7000 udp from any to any recv em1

The proxy needs to know the original destination address of forwarded
datagrams, but
there seems to be no way to obtain that address.

Using recvmsg with IP_RECVDSTADDR does not help because it returns
next-hop address
instead of original destination. This is because udp_input() overwrites
packet's destination
with next-hop address before doing ip_savecontrol.

Hi,

udp_input() doesn't overwrite destination address. Probably you have NAT
that does this.


There is no NAT stuff.
I checked that on 8.4 source: 
http://fxr.watson.org/fxr/source/netinet/udp_usrreq.c?v=FREEBSD8#L461


--

Best regards.
Hooman Fazaeli



Re: transparent udp proxy

2014-10-31 Thread Hooman Fazaeli

On 10/31/2014 3:38 PM, Andrey V. Elsukov wrote:

On 31.10.2014 15:04, Hooman Fazaeli wrote:

Hi,

udp_input() doesn't overwrite destination address. Probably you have NAT
that does this.


There is no NAT stuff.
I checked that on 8.4 source:
http://fxr.watson.org/fxr/source/netinet/udp_usrreq.c?v=FREEBSD8#L461

The more recent FreeBSD versions don't overwrite destination address.

https://svnweb.freebsd.org/base?view=revision&revision=225044


Yes, it seems so.
But the problem of obtaining the original destination port still remains.

--

Best regards.
Hooman Fazaeli



Re: transparent udp proxy

2014-10-31 Thread Hooman Fazaeli

On 10/31/2014 5:30 PM, Mark Felder wrote:

I'm not sure if this is what you're looking for, but perhaps the
solution is in net/samplicator ?

 From the project's website:

This simple program listens for UDP datagrams on a network port, and
sends copies of these datagrams on to a set of destinations. Optionally,
it can perform sampling, i.e. rather than forwarding every packet,
forward only 1 in N. Another option is that it can spoof the IP source
address, so that the copies appear to come from the original source,
rather than the relay. Currently only supports IPv4.


Thanks. I do not think it provides what I am looking for.

I am not looking for an application performing a specific task, but a mechanism
to get the __original__ destination address and port of packets forwarded to a
local UDP proxy by ipfw fwd rules. As far as I have figured out so far, the
original destination address may be obtained with IP_RECVDSTADDR on 9.0+ (but
not on 8.x and older versions), but there seems to be no mechanism to get the
_original_ destination _port_. (Apart from this missing mechanism, my proxy is
functional and performs what it is intended to do.)
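
For readers following along, here is a minimal sketch of the IP_RECVDSTADDR
approach discussed above, written in Python for brevity. It is not the proxy's
actual code. The helper names (make_proxy_socket, original_dst, recv_with_dst)
are hypothetical, and the fallback value 7 for IP_RECVDSTADDR is the FreeBSD
definition, used only when the Python build does not export the constant. As
noted in the message, this yields the destination address only, not the port.

```python
import socket

# Python exports IP_RECVDSTADDR only on BSD-derived systems; 7 is the
# FreeBSD value and is an assumption anywhere else.
IP_RECVDSTADDR = getattr(socket, "IP_RECVDSTADDR", 7)


def make_proxy_socket(port=7000):
    """Create a UDP socket that asks the kernel for each datagram's destination."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, IP_RECVDSTADDR, 1)
    s.bind(("", port))
    return s


def original_dst(ancdata):
    """Return the destination IPv4 address found in recvmsg() ancillary data."""
    for level, ctype, data in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_RECVDSTADDR and len(data) >= 4:
            # The control message carries a struct in_addr (4 bytes).
            return socket.inet_ntoa(data[:4])
    return None


def recv_with_dst(sock, bufsize=65535):
    """Receive one datagram and report (payload, source, destination address)."""
    data, ancdata, _flags, src = sock.recvmsg(bufsize, socket.CMSG_SPACE(4))
    return data, src, original_dst(ancdata)
```

On 8.x this would be expected to report the next-hop address, per the
discussion above; on 9.0+ it should report the original destination.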


--

Best regards.
Hooman Fazaeli



Re: pf stuck

2014-09-30 Thread Hooman Fazaeli

On 9/30/2014 12:12 AM, Andrea Venturoli wrote:

On 09/29/14 20:21, Ermal Luçi wrote:

Probably is better you ask this on freebsd-pf@.


Thanks, I see you have already cc:ed it.




Though this sounds like state limit reached.


Can this happen even if all my pf rules have no state?

No. Anyway, you can check state statistics  with: pfctl -s i ; pfctl -s m

--

Best regards.
Hooman Fazaeli


Re: UDP/TCP versus IP frames - subtle out of order packets with hardware hashing

2014-07-15 Thread Hooman Fazaeli

On 7/15/2014 5:14 AM, Adrian Chadd wrote:

Hi,

Whilst digging into UDP receive side scaling on the intel ixgbe(4)
NIC, I stumbled across how it hashes traffic between IP fragmented
traffic and non IP-fragmented traffic.

Here's how it surfaced:

* the ixgbe(4) NIC is configured to hash on both IP (2-tuple) and
TCP/UDP (4-tuple);
* when a non-fragmented UDP frame comes in, it's hashed on the 4-tuple
and comes into queue A;
* when a fragmented UDP frame comes in, it's hashed on the IP 2-tuple
and comes into queue B.

So if there's a mix of small and large datagrams, we'll end up with
some packets coming in via queue A and some by queue B. In normal
operation that'll result in out of order packets.
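
The queue split described above can be illustrated with a toy model. This is
only a sketch: toy_hash is a stand-in (CRC32) for the NIC's real RSS hash,
NUM_QUEUES and the sample flow are made-up values, and rx_queue is a
hypothetical helper, not driver code.

```python
import zlib

NUM_QUEUES = 8  # hypothetical number of receive queues


def toy_hash(*fields):
    """Stand-in for the hardware RSS hash; real NICs do not use CRC32 here."""
    return zlib.crc32("|".join(str(f) for f in fields).encode())


def rx_queue(src_ip, dst_ip, src_port, dst_port, fragmented):
    """Pick a receive queue the way the message describes the NIC doing it."""
    if fragmented:
        # Fragmented datagrams are hashed on the IP 2-tuple only.
        h = toy_hash(src_ip, dst_ip)
    else:
        # Non-fragmented datagrams are hashed on the full 4-tuple.
        h = toy_hash(src_ip, dst_ip, src_port, dst_port)
    return h % NUM_QUEUES


flow = ("10.0.0.1", "10.0.0.2", 5000, 6000)
print("small datagram      -> queue", rx_queue(*flow, fragmented=False))
print("fragmented datagram -> queue", rx_queue(*flow, fragmented=True))
```

Because the two cases hash different inputs, nothing ties the two queue
numbers together for one flow, which is the reordering hazard being described.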

For the RSS stuff I'm working on it means that some packets will match
the PCBGROUP setup and some won't. By default UDP configures a 2-tuple
hash so it expects packets to come in hashed appropriately. But that
only matches for large frames. For small frames it'll be hashed via
the 4-tuple and it won't match.

The ip reassembly code doesn't recalculate the flowid/flowtype once
it's finished. It'd be nice to do that before further processing so it
can be placed in the right netisr.

So there's a couple of semi-overlapping issues:

* Right now we could get TCP and UDP frames out of order. I'd like to
at least have ixgbe(4) hash on the 2-tuple for UDP rather than the
4-tuple. That fixes that silly corner case. It's not likely going to
show up except for things like forwarding workloads. Maybe people
doing memcached work, I'm not sure.

* Whether or not to calculate the flowid/flowtype in ip_reass() (or
maybe in the netisr input path, in case there's no flowid assigned) so
work is better distributed;

* .. then if we do that, we could do 4-tuple UDP hashing again and
we'd just recalculate for any large frames.

Here's what happened with Linux and ixgbe in 2010 on this topic:

http://comments.gmane.org/gmane.linux.network/166687

What do people think?


-a
___
freebsd-a...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to freebsd-arch-unsubscr...@freebsd.org

Doesn't the problem apply to TCP too?
TCP segments may be fragmented as well, although that is less likely because of MSS.

--

Best regards.
Hooman Fazaeli



Re: FreeBSD 9 w/ MPD5 crashes as LNS with 300+ tunnels. Netgraph issue?

2014-06-15 Thread Hooman Fazaeli
 and core dump somewhere for download 
so we can have a closer look at panic trace.

--

Best regards.
Hooman Fazaeli



Re: TSO and FreeBSD vs Linux

2013-09-03 Thread Hooman Fazaeli

On 9/4/2013 9:23 AM, Julian Elischer wrote:

On 9/4/13 6:49 AM, David Wolfskill wrote:

On Tue, Sep 03, 2013 at 12:27:34PM -0700, David Wolfskill wrote:

...
As soon as I issued sudo net.inet.tcp.tso=0 ... the copy worked without
a hitch or a whine.  And I was able to copy all 117709618 bytes, not just
2097152 (2^21).

The above command should (of course) have read

sudo sysctl net.inet.tcp.tso=0

Also: I normally had the em0 NIC on the machine in question connected to
a Netgear GS105 (5-port Gigabit switch).  In the process of
trouble-shooting the problem with NFS writes, I bypassed that switch and
connected the em0 NIC directly to the jack in my cube.

In that configuration, the em0 NIC showed media: Ethernet 1000baseT
(autoselect), while connected to the GS105, it showed media: Ethernet
100baseTX (autoselect).

While the NFS write worked whether or not I had the GS105 in the path,
it seemed ... suboptimal ... to have a NIC capable of 1000baseT
connected to a Gigabit switch, but negotiating at 100baseTX.

So I tried setting the media via ifconfig em0 media 1000baseT; after a
few seconds, it finally woke back up, and now reports media: Ethernet
1000baseT (1000baseT full-duplex).

So it appears that the em(4) driver and Intel 82578DM NIC fail to
negotiate 1000baseT with the Netgear GS105.


yeah auto-negotiation seems a bit fragile.. not just for us either..
I often end up hardwiring it in rc.conf.



I have also experienced similar problems (one case was an 82574 with a Cisco 3550).
I also remember cases where auto-select worked but fixed media did not (the link
settled down to 100 or half-duplex).
I am curious what the exact technical reason(s) for such media problems are.
Are they more hardware or driver related?



--

Best regards.
Hooman Fazaeli



Re: Intel 4-port ethernet adaptor link aggregation issue

2013-08-02 Thread Hooman Fazaeli

On 8/2/2013 2:44 AM, Joe Moog wrote:

On Aug 1, 2013, at 4:27 PM, Joe Moog joem...@ebureau.com wrote:


On Aug 1, 2013, at 3:55 PM, Ryan Stone ryst...@gmail.com wrote:


Have you tried using only two ports, but both from the NIC?  My suspicion would 
be that the problem is in the lagg's handling of more than 2 ports rather than 
the driver, especially given that it is the igb driver in all cases.

Ryan:

We have done this successfully with two ports on the NIC, on another 
hardware-identical host. That said, it is entirely possible that this is a 
shortcoming of lagg.

Can you think of any sort of workaround? Our desired implementation really 
requires the inclusion of all 4 ports in the lagg. Failing this we're looking 
at the likelihood of 10G ethernet, but with that comes significant overhead, 
both cost and administration (before anybody tries to force the cost debate, 
remember that there are 10G router modules and 10G-capable distribution 
switches involved, never mind the cabling and SFPs -- it's not just a $600 10G 
card for the host). I'd like to defer that requirement as long as possible. 4 
aggregated gig ports would serve us perfectly well for the near-term.

Thanks

Joe

UPDATE: After additional testing, I'm beginning to suspect the igb driver. With our 
setup, ifconfig identifies all the ethernet ports as igb(0-5). I configured igb0 with a 
single static IP address (say, 192.168.1.10), and was able to connect to the host 
administratively. While connected, I enabled another port as a second standalone port, 
again with a unique address (say, 192.168.1.20), and was able to access the host via that 
interface as well. The problem arises when we attempt to similarly add a third interface 
to the mix -- and it doesn't seem to matter what interface(s) we use, or in what order we 
activate them. Always on the third interface, that third interface fails to respond 
despite showing active both in ifconfig and on the switch.

If there is anything else I could try that would be useful to help identify 
where the issue may reside, please let me know.

Thanks

Joe



Assign IP addresses from __different__ subnets to the four NIC ports and
re-test (e.g., 192.168.0.10/24, 192.168.1.10/24, 192.168.2.10/24,
192.168.3.10/24).
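
A small sketch for checking an address plan before re-testing. It assumes /24
masks for the two addresses Joe mentioned (the thread does not state his
netmask), and same_subnet is a hypothetical helper name.

```python
import ipaddress


def same_subnet(a, b):
    """True if two interface addresses (address/prefix) fall in one network."""
    return ipaddress.ip_interface(a).network == ipaddress.ip_interface(b).network


# Addresses from the earlier test, with an assumed /24 mask.
print(same_subnet("192.168.1.10/24", "192.168.1.20/24"))

# The suggested plan: one subnet per port.
suggested = ["192.168.0.10/24", "192.168.1.10/24",
             "192.168.2.10/24", "192.168.3.10/24"]
networks = {ipaddress.ip_interface(a).network for a in suggested}
print(len(networks) == len(suggested))
```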

--

Best regards.
Hooman Fazaeli



Re: netmap bridge can tranmit big packet in line rate ?

2013-05-21 Thread Hooman Fazaeli
On 5/21/2013 5:10 PM, Barney Cordoba wrote:

 --- On Tue, 5/21/13, liujie liu...@263.net wrote:

 From: liujie liu...@263.net
 Subject: Re: netmap bridge can tranmit big packet in line rate ?
 To: freebsd-net@freebsd.org
 Date: Tuesday, May 21, 2013, 5:25 AM
 Hi, Prof.Luigi RIZZO

  Firstly, I should thank you for netmap. I tried to send an e-mail to you
 yesterday, but it was rejected.

  I used two machines to test the netmap bridge, both with an i7-2600 CPU and
 an Intel 82599 dual-interface card.

  One worked as sender and receiver with pkt-gen; the other worked as a bridge
 with bridge.c.

  As you said, I felt confused too when I saw the big-packet performance drop.
 I tried to change the memory parameters of netmap (netmap_mem1.c,
 netmap_mem2.c), but that did not seem to resolve the problem.

   packet size   send (pps)   recv (pps)
   60-byte       14882289     13994753
   124-byte       8445770      7628942
   252-byte       4529819      3757843
   508-byte       2350815      1645647
   1514-byte       814288       489133
 These numbers indicate you're tx'ing 7.2Gb/s with 60 byte packets and
 9.8Gb/s with 1514, so maybe you just need a new calculator?

 BC

As Barney already pointed out, your numbers are reasonable. You have almost
saturated the link with 1514-byte packets. In the case of 64-byte packets, you
probably do not achieve line rate because of congestion on the bus. Can you
show us the 'top -SI' output on the sender machine?
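
The arithmetic behind these figures can be sketched as follows. This assumes
the reported sizes exclude the 4-byte CRC, so each frame occupies another 24
bytes on the wire (CRC, preamble/SFD and inter-frame gap); that overhead figure
is an assumption about how the sizes were counted, not something stated in the
thread. The gbps helper is hypothetical.

```python
# Per-frame bytes on the wire that are not part of the reported size:
# 4 (CRC) + 8 (preamble/SFD) + 12 (inter-frame gap). Assumed, see above.
ETH_OVERHEAD = 24


def gbps(pps, frame_bytes, overhead=0):
    """Convert a packet rate into Gb/s for a given frame size."""
    return pps * (frame_bytes + overhead) * 8 / 1e9


# Send rates reported in the quoted message.
samples = [(60, 14882289), (124, 8445770), (252, 4529819),
           (508, 2350815), (1514, 814288)]

for size, pps in samples:
    print(f"{size:5d} bytes: {gbps(pps, size):5.2f} Gb/s of frame data, "
          f"{gbps(pps, size, ETH_OVERHEAD):5.2f} Gb/s on the wire")
```

Counting frame data alone gives roughly 7.1 Gb/s at 60 bytes and 9.9 Gb/s at
1514 bytes, close to the figures Barney quotes. With the assumed per-frame
overhead included, the send rates work out to about 10 Gb/s for every size.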


-- 

Best regards.
Hooman Fazaeli



Re: High CPU interrupt load on intel I350T4 with igb on 8.3

2013-05-11 Thread Hooman Fazaeli
On 5/11/2013 8:26 PM, Barney Cordoba wrote:
 Clearly you don't understand the problem. Your logic is that because other
 drivers are defective also, therefore it's not a driver problem? The problem
 is caused by a multi-threaded driver that haphazardly launches tasks and that
 doesn't manage the case where the rest of the system can't handle the load.
 It's no different than a driver that barfs when mbuf clusters are exhausted.
 The answer isn't to increase memory or mbufs, even though that may alleviate
 the problem. The answer is to fix the driver, so that it doesn't crash the
 system for an event that is wholly predictable. igb has 1) too many locks and
 2) exacerbates the problem by binding to CPUs, which causes it to not only
 have to wait for the lock to free, but also for a specific CPU to become
 free. So it chugs along happily until it encounters a bottleneck, at which
 point it quickly blows up the entire system in a domino effect. It needs to
 manage locks more efficiently, and also to detect when the backup is
 unmanageable. Ever since FreeBSD 5 the answer has been "it's fixed in 7", or
 "it's fixed in 9", or "it's fixed in 10". There will always be bottlenecks,
 and no driver should blow up the system no matter what intermediate code may
 present a problem. It's the driver's responsibility to behave and to drop
 packets if necessary.

 BC

And how should the driver behave? You suggest dropping the packets. Even if we
accept that dropping packets is a good strategy in all configurations (which I
doubt), the driver is definitely not the best place to implement it, since that
involves duplicating similar code across drivers. Somewhere like the Ethernet
layer is a much better place to watch the packet load and drop packets to
prevent them from eating all the cores. Furthermore, ignoring the fact that pf
is not optimized for multi-processor systems, and blaming drivers for not
adjusting themselves to this fault of pf, is a bit unfair, I believe.


-- 

Best regards.
Hooman Fazaeli



Re: 'no buffer space available' after switch goes down on freeBSD 7.3

2012-12-25 Thread Hooman Fazaeli

On 12/25/2012 4:31 AM, Ryan Stone wrote:

I don't believe that this is fixed in later versions of the driver. The
problem is that when the interface loses link the transmit queue can fill
up. Once that happens the driver never gets any more calls from the network
stack to make it send packets. Pinging the interface fixes it because the
driver processes rx and tx from the same context, so when it receives a
packet it starts transmitting again.

The patch that I sent fixes the problem by forcing the driver to process
the tx queue whenever the link goes from down to up.




I have not tested it but it is apparently fixed:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/e1000/if_em.c#rev1.21.2.23



--

Best regards.
Hooman Fazaeli



Re: ping: sendto: No buffer space available

2012-09-29 Thread Hooman Fazaeli



On 9/27/2012 9:38 PM, Rudy wrote:

On 09/27/2012 11:00 AM, Rudy wrote:

Rebooting and/or the settings change seems to have stopped the errors.
Here is a pretty little graph showing error rate on em1 for the past 3
days.

  http://www.monkeybrains.net/images/ErrorRate-em1.png



Interesting... if I zoom in on the graph, I see the errors were 'every other 
sample period' until I rebooted the box.

http://www.monkeybrains.net/images/ErrorRate-em1-zoom.png




How much traffic (bytes/s and packets/s) and of what type is passing through 
this box?


Re: ping: sendto: No buffer space available

2012-09-25 Thread Hooman Fazaeli


On 9/24/2012 7:50 PM, Rudy (bulk) wrote:


Sometimes when I try to ping a neighbor machine (plugged directly in with no 
switch involved), I get:

ping: sendto: No buffer space available
ping: sendto: No buffer space available

If I reset the interface  'ifconfig em1 down; ifconfig em1 up' the problem goes 
away.

The pings are:
 FreeBSD 8.3  em1   -- FreeBSD 9.0  em2
and I am seeing the issue on the FreeBSD 8.3 machine.  The box has 6GB of free 
ram and is a quagga router.

What do I need to tune?

Thanks!

Rudy




# netstat -m
10236/8454/18690 mbufs in use (current/cache/total)
10234/5388/15622/262144 mbuf clusters in use (current/cache/total/max)
10234/5382 mbuf+clusters out of packet secondary zone in use (current/cache)
0/327/327/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/3070/3070/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
23027K/41827K/64854K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

# ifconfig em1
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
ether 00:25:90:56:60:7f
inet 10.1.1.1 netmask 0xfffc broadcast 10.1.1.3
media: Ethernet autoselect (1000baseT full-duplex)
status: active



FreeBSD 8.3
### loader.conf:
net.link.ifqmaxlen=1024
hw.em.rxd=1024
hw.em.txd=1024


### sysctl.conf:
kern.timecounter.hardware=HPET

net.route.netisr_maxqlen=2048
net.inet.ip.intr_queue_maxlen=1024

kern.ipc.somaxconn=256
kern.random.sys.harvest.interrupt=0
kern.random.sys.harvest.ethernet=0

net.inet.raw.maxdgram=16384
net.inet.raw.recvspace=16384

net.inet.icmp.icmplim=1000
net.inet.ip.fastforwarding=1
kern.ipc.nmbclusters=262144
net.inet.icmp.drop_redirect=1

dev.em.0.rx_processing_limit=200
dev.em.1.rx_processing_limit=200
dev.em.2.rx_processing_limit=200
dev.em.3.rx_processing_limit=200

net.link.ether.inet.max_age=300
hw.intr_storm_threshold=9000

# Security
net.inet.ip.redirect=0
net.inet.ip.sourceroute=0
net.inet.ip.accept_sourceroute=0
net.inet.icmp.maskrepl=0




Not sure if it matters, but here are the tunings on the other box:

FreeBSD 9.0
### loader.conf:
net.link.ifqmaxlen=512

### sysctl.conf:
net.inet.ip.fastforwarding=1
kern.ipc.nmbclusters=262144
kern.timecounter.hardware=HPET
net.inet.ip.rtminexpire=2
net.inet.ip.rtmaxcache=1024


dev.igb.0.rx_processing_limit=480
dev.igb.1.rx_processing_limit=480

net.inet.icmp.icmplim=1000
kern.random.sys.harvest.interrupt=0
kern.random.sys.harvest.ethernet=0

net.link.ether.inet.max_age=300

##Sat Apr 21 00:06:48 PDT 2012
net.inet.ip.redirect=0
net.route.netisr_maxqlen=2048



The most likely cause is that the interface send queue has become full and
stayed in that condition. What type of NIC is at the other end of the link?
Can you post the output of:

# sysctl dev.em.1
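
For context, "No buffer space available" is the ENOBUFS error returned by
sendto(). The sketch below shows how an application might tolerate a
transiently full send queue. The send_with_backoff helper and its retry
parameters are hypothetical, and retrying cannot help if the queue stays full
permanently, which is the condition suspected here.

```python
import errno
import time


def send_with_backoff(sock, payload, addr, retries=5, delay=0.01):
    """Retry sendto() with exponential backoff while the kernel reports ENOBUFS."""
    for attempt in range(retries):
        try:
            return sock.sendto(payload, addr)
        except OSError as exc:
            if exc.errno != errno.ENOBUFS:
                raise  # some other failure: do not mask it
            time.sleep(delay * (2 ** attempt))
    raise OSError(errno.ENOBUFS, "send queue still full after retries")
```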






Re: ping: sendto: No buffer space available

2012-09-25 Thread Hooman Fazaeli



On 9/25/2012 11:08 AM, Rudy (bulk) wrote:

On 9/24/12 11:52 PM, Hooman Fazaeli wrote:
sysctl dev.em.1 


From the side having the 'No buffer space available' (FreeBSD 8.3  Sep 13 2012)

# sysctl dev.em.1
dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.3.2
dev.em.1.%driver: em
dev.em.1.%location: slot=0 function=0
dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x15d9 
subdevice=0x class=0x02
dev.em.1.%parent: pci5
dev.em.1.nvm: -1
dev.em.1.debug: -1
dev.em.1.fc: 3
dev.em.1.rx_int_delay: 0
dev.em.1.tx_int_delay: 66
dev.em.1.rx_abs_int_delay: 66
dev.em.1.tx_abs_int_delay: 66
dev.em.1.rx_processing_limit: 200
dev.em.1.eee_control: 0
dev.em.1.link_irq: 6379725883
dev.em.1.mbuf_alloc_fail: 0
dev.em.1.cluster_alloc_fail: 0
dev.em.1.dropped: 0
dev.em.1.tx_dma_fail: 0
dev.em.1.rx_overruns: 0
dev.em.1.watchdog_timeouts: 0
dev.em.1.device_control: 1477444168
dev.em.1.rx_control: 67141634
dev.em.1.fc_high_water: 18432
dev.em.1.fc_low_water: 16932
dev.em.1.queue0.txd_head: 188
dev.em.1.queue0.txd_tail: 188
dev.em.1.queue0.tx_irq: 760427663
dev.em.1.queue0.no_desc_avail: 0
dev.em.1.queue0.rxd_head: 300
dev.em.1.queue0.rxd_tail: 297
dev.em.1.queue0.rx_irq: 838300057
dev.em.1.mac_stats.excess_coll: 0
dev.em.1.mac_stats.single_coll: 0
dev.em.1.mac_stats.multiple_coll: 0
dev.em.1.mac_stats.late_coll: 0
dev.em.1.mac_stats.collision_count: 0
dev.em.1.mac_stats.symbol_errors: 0
dev.em.1.mac_stats.sequence_errors: 0
dev.em.1.mac_stats.defer_count: 0
dev.em.1.mac_stats.missed_packets: 580251107926
dev.em.1.mac_stats.recv_no_buff: 895
dev.em.1.mac_stats.recv_undersize: 0
dev.em.1.mac_stats.recv_fragmented: 0
dev.em.1.mac_stats.recv_oversize: 0
dev.em.1.mac_stats.recv_jabber: 0
dev.em.1.mac_stats.recv_errs: 0
dev.em.1.mac_stats.crc_errs: 0
dev.em.1.mac_stats.alignment_errs: 0
dev.em.1.mac_stats.coll_ext_errs: 0
dev.em.1.mac_stats.xon_recvd: 809
dev.em.1.mac_stats.xon_txd: 684
dev.em.1.mac_stats.xoff_recvd: 580251112172
dev.em.1.mac_stats.xoff_txd: 580251108668
dev.em.1.mac_stats.total_pkts_recvd: 582154845658
dev.em.1.mac_stats.good_pkts_recvd: 1903732156
dev.em.1.mac_stats.bcast_pkts_recvd: 923
dev.em.1.mac_stats.mcast_pkts_recvd: 0
dev.em.1.mac_stats.rx_frames_64: 257128416
dev.em.1.mac_stats.rx_frames_65_127: 702676478
dev.em.1.mac_stats.rx_frames_128_255: 225331435
dev.em.1.mac_stats.rx_frames_256_511: 59888288
dev.em.1.mac_stats.rx_frames_512_1023: 4176
dev.em.1.mac_stats.rx_frames_1024_1522: 610930363
dev.em.1.mac_stats.good_octets_recvd: 1057190106675
dev.em.1.mac_stats.good_octets_txd: 1502996801989
dev.em.1.mac_stats.total_pkts_txd: 582709483882
dev.em.1.mac_stats.good_pkts_txd: 2458374408
dev.em.1.mac_stats.bcast_pkts_txd: 73
dev.em.1.mac_stats.mcast_pkts_txd: 0
dev.em.1.mac_stats.tx_frames_64: 314613253
dev.em.1.mac_stats.tx_frames_65_127: 841961719
dev.em.1.mac_stats.tx_frames_128_255: 268669868
dev.em.1.mac_stats.tx_frames_256_511: 73341358
dev.em.1.mac_stats.tx_frames_512_1023: 62765737
dev.em.1.mac_stats.tx_frames_1024_1522: 897022473
dev.em.1.mac_stats.tso_txd: 1880
dev.em.1.mac_stats.tso_ctx_fail: 0
dev.em.1.interrupts.asserts: 6331439142
dev.em.1.interrupts.rx_pkt_timer: 0
dev.em.1.interrupts.rx_abs_timer: 0
dev.em.1.interrupts.tx_pkt_timer: 0
dev.em.1.interrupts.tx_abs_timer: 0
dev.em.1.interrupts.tx_queue_empty: 0
dev.em.1.interrupts.tx_queue_min_thresh: 0
dev.em.1.interrupts.rx_desc_min_thresh: 0
dev.em.1.interrupts.rx_overrun: 74346455


And the other end of the link (FreeBSD 9.0-STABLE  Feb 1 2012)

# sysctl dev.em.2
dev.em.2.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.2.%driver: em
dev.em.2.%location: slot=0 function=0
dev.em.2.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x15d9 
subdevice=0x10d3 class=0x02
dev.em.2.%parent: pci7
dev.em.2.nvm: -1
dev.em.2.debug: -1
dev.em.2.rx_int_delay: 0
dev.em.2.tx_int_delay: 66
dev.em.2.rx_abs_int_delay: 66
dev.em.2.tx_abs_int_delay: 66
dev.em.2.rx_processing_limit: 100
dev.em.2.flow_control: 3
dev.em.2.eee_control: 0
dev.em.2.link_irq: 6379294926
dev.em.2.mbuf_alloc_fail: 0
dev.em.2.cluster_alloc_fail: 0
dev.em.2.dropped: 0
dev.em.2.tx_dma_fail: 0
dev.em.2.rx_overruns: 0
dev.em.2.watchdog_timeouts: 0
dev.em.2.device_control: 1477444168
dev.em.2.rx_control: 67141634
dev.em.2.fc_high_water: 18432
dev.em.2.fc_low_water: 16932
dev.em.2.queue0.txd_head: 735
dev.em.2.queue0.txd_tail: 735
dev.em.2.queue0.tx_irq: 839960061
dev.em.2.queue0.no_desc_avail: 0
dev.em.2.queue0.rxd_head: 237
dev.em.2.queue0.rxd_tail: 236
dev.em.2.queue0.rx_irq: 762108556
dev.em.2.mac_stats.excess_coll: 0
dev.em.2.mac_stats.single_coll: 0
dev.em.2.mac_stats.multiple_coll: 0
dev.em.2.mac_stats.late_coll: 0
dev.em.2.mac_stats.collision_count: 0
dev.em.2.mac_stats.symbol_errors: 0
dev.em.2.mac_stats.sequence_errors: 0
dev.em.2.mac_stats.defer_count: 0
dev.em.2.mac_stats.missed_packets: 580252415422
dev.em.2.mac_stats.recv_no_buff: 3211
dev.em.2.mac_stats.recv_undersize: 0
dev.em.2.mac_stats.recv_fragmented: 0
dev.em.2.mac_stats.recv_oversize: 0

Re: FreeBSD 9.0-R em0 issues?

2012-08-12 Thread Hooman Fazaeli

On 8/11/2012 2:17 PM, Karl Pielorz wrote:



--On 11 August 2012 12:36 +0430 Hooman Fazaeli hoomanfaza...@gmail.com wrote:



Name  Mtu   Network  Address            Ipkts   Ierrs           Idrop  Opkts   Oerrs          Coll
em0   1500  Link#5   00:25:90:31:82:46  355482  10612864185945  0      291109  3032246910270  1516123455135

82574L with ASPM enabled is known to cause a problem like yours.
(See: http://www.google.com/#hl=en&sclient=psy-ab&q=82574L+%2B+ASPM)
However, some time ago, Jack committed a fix which disabled ASPM to fix
the problem.
I recommend getting and compiling the latest e1000 source from CVS (which is
version 7.3.2) and seeing what happens.


Hi,

In the midst of trying to get this onto the machine (without the NIC working - 
which was fun), during a reboot the NIC suddenly disappeared completely.

Rebooting the machine again gives a 50/50 on the NIC probing when FreeBSD runs 
up - half the time I'm left with em1 only, and no em0.

It looks like this has gone from a 'possible software' issue to a 'probable 
hardware' issue now? - I've moved the connection over to em1, I'll see how I 
get on with that.


I have also seen this problem on different hardware, but I cannot recall whether
I fixed it with a hardware replacement or a driver update. Anyway, it is worth
giving the driver update a try.



Re: FreeBSD 9.0-R em0 issues?

2012-08-11 Thread Hooman Fazaeli



On 8/10/2012 11:24 PM, Karl Pielorz wrote:


Hi,

Apologies for posting to -net as well - I originally posted this to -hackers, 
but was advised to re-post it here...

A FreeBSD 9.0-R amd64 box - based on a SuperMicro X8DTL-IF Rev. 2.01 w/Intel L5630,
6Gb of RAM - seems to have issues with its onboard NIC (em driver based - i.e.
em0).

The machine runs fine - but then suddenly loses all network connectivity. 
Nothing is logged on the console, or /var/log/messages.

Doing an 'ifconfig em0 down' then up doesn't fix it. Rebooting the box does
fix it for a while. Having dug around Google, I've now set
hw.em.enable_msix=0 - the box ran the whole of the day with that set, before
em0 wedged up again.


When it does this, 'netstat -n -i' returns silly figures - i.e. if I catch it
even moments after it's done it, it'll claim to have suffered billions of
input/output and collision errors (far more than the amount of traffic
that machine would have handled) - e.g.



Name  Mtu   Network  Address            Ipkts   Ierrs           Idrop  Opkts   Oerrs          Coll
em0   1500  Link#5   00:25:90:31:82:46  355482  10612864185945  0      291109  3032246910270  1516123455135


The 82574L with ASPM enabled is known to cause a problem like yours.
(See: http://www.google.com/#hl=ensclient=psy-abq=82574L+%2B+ASPM)
However, some time ago Jack committed a fix that disables ASPM to work around
the problem.
I recommend getting and compiling the latest e1000 source from CVS (which is
version 7.3.2) and seeing what happens.



___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: FreeBSD 10G forwarding performance @Intel

2012-07-17 Thread Hooman Fazaeli

On 7/16/2012 10:13 PM, Alexander V. Chernikov wrote:

Old kernel from previous letters, same setup:

net.inet.ip.fw.enable=0
2.3 MPPS
net.inet.ip.fw.update_counters=0
net.inet.ip.fw.enable=1
1.93MPPS
net.inet.ip.fw.update_counters=1
1.74MPPS

Kernel with ipfw pcpu counters:

net.inet.ip.fw.enable=0
2.3 MPPS
net.inet.ip.fw.update_counters=0
net.inet.ip.fw.enable=1
1.93MPPS
net.inet.ip.fw.update_counters=1
1.93MPPS

Counters seems to be working without any (significant) overhead.
(Maybe I'm wrong somewhere?)

Additionally, I've got (from my previous pcpu attempt) a small patch permitting 
ipfw to re-use rule map allocation instead of reallocating on every rule. This 
saves a bit of system time:

loading 20k rules with ipfw binary gives us:
5.1s system time before and 4.1s system time after.



This may be slightly off-topic, but have you tested (or do you plan to test)
with bidirectional traffic?




___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: Intel 82574L interface wedging - em7.3.2/8.2-STABLE

2012-03-10 Thread Hooman Fazaeli

Dear Jason

With a link_irq of 4, my guess is still that your problem is snd_buf filling
up during a temporary link loss (see:
http://lists.freebsd.org/pipermail/freebsd-net/2011-November/030424.html).

I use a patched version of e1000 which addresses this issue and
works well for me, but it is based on 7.2.3 and I have only tested it
on 7.3-RELEASE.

If you are interested, I can send you the sources to test.
You may also port my changes to 7.3.2 and roll your own version.


On 3/8/2012 12:27 AM, Jason Wolfe wrote:


I'm sure it's getting old with all of the recent work put into the
e1000 driver, but this is still ongoing with MSI-X enabled.  Most
machines are running an 8.2-STABLE from early Feb, though it appears
there have been no relevant changes in RELENG_8 since then.  I've
disabled all possible em options on the devices also to rule that out
and am still seeing the issue.  I guess reverting back to MSI-X
disabled is the next step if nothing is spotted.  This box had been
doing between 1 and 1.5Gb/s steady for the 26 days before the network
hang.



___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: Intel 82574L interface wedging - em7.3.2/8.2-STABLE

2012-03-10 Thread Hooman Fazaeli

On 3/11/2012 5:31 AM, Adrian Chadd wrote:

Are you able to post the patch here?
Maybe Jack can look at what's going on and apply it to the latest
intel ethernet driver.


Adrian



Below is the patch for if_em.c (7.2.3). It simply checks the driver's
queue status when the link state changes (inactive -> active) and
starts the transmit task if the queue(s) are not empty.

It also contains changes I added so that it compiles on 7, plus some code
for testing and diagnostics.

Hope it helps.

--- if_em.c.orig2011-10-27 14:47:20.0 +0330
+++ if_em.c2011-11-19 16:11:54.0 +0330
@@ -85,6 +85,14 @@
 #include "e1000_82571.h"
 #include "if_em.h"

+#if !defined(DISABLE_FIXUPS) && __FreeBSD_version < 80
+static __inline int
+pci_find_cap(device_t dev, int capability, int *capreg)
+{
+return (PCI_FIND_EXTCAP(device_get_parent(dev), dev, capability, capreg));
+}
+#endif
+
 /*
  *  Set this to one to display debug statistics
  */
@@ -93,7 +101,11 @@
 /*
  *  Driver version:
  */
+#ifdef PKG_VERSION
+char em_driver_version[] = "version 7.2.3 (ifdrivers-" PKG_VERSION ")";
+#else
 char em_driver_version[] = "7.2.3";
+#endif

 /*
  *  PCI Device ID Table
@@ -293,6 +305,11 @@
 static poll_handler_t em_poll;
 #endif /* POLLING */

+#ifndef DISABLE_FIXUPS
+static int em_sysctl_snd_ifq_len(SYSCTL_HANDLER_ARGS);
+static int em_sysctl_snd_ifq_drv_len(SYSCTL_HANDLER_ARGS);
+#endif
+
 /*
  *  FreeBSD Device Interface Entry Points
  */
@@ -399,6 +416,23 @@
 /* Global used in WOL setup with multiport cards */
 static int global_quad_port_a = 0;

+#ifndef DISABLE_FIXUPS
+static int enable_hang_fixup = 1;
+TUNABLE_INT("hw.em.enable_hang_fixup", &enable_hang_fixup);
+SYSCTL_INT(_hw_em, OID_AUTO, enable_hang_fixup, CTLFLAG_RW, &enable_hang_fixup, 0,
+"Enable rx/tx hang fixup");
+
+static int em_regard_tx_link_status = 1;
+TUNABLE_INT("hw.em.regard_tx_link_status", &em_regard_tx_link_status);
+SYSCTL_INT(_hw_em, OID_AUTO, regard_tx_link_status, CTLFLAG_RW, &em_regard_tx_link_status, 0,
+"Regard tx link status");
+
+static int link_master_slave = e1000_ms_hw_default;
+TUNABLE_INT("hw.em.link_master_slave", &link_master_slave);
+SYSCTL_INT(_hw_em, OID_AUTO, link_master_slave, CTLFLAG_RW, &link_master_slave,
+0, "Link negotiation master/slave type");
+#endif
+
 /*
  *  Device identification routine
  *
@@ -411,7 +445,11 @@
 static int
 em_probe(device_t dev)
 {
+#ifdef PKG_VERSION
+char adapter_name[sizeof(em_driver_version) + 60];
+#else
 char adapter_name[60];
+#endif
 u16 pci_vendor_id = 0;
 u16 pci_device_id = 0;
 u16 pci_subvendor_id = 0;
@@ -864,7 +902,11 @@
 int err = 0, enq = 0;

 if ((ifp->if_drv_flags & (IFF_DRV_RUNNING | IFF_DRV_OACTIVE)) !=
+#ifndef DISABLE_FIXUPS
 IFF_DRV_RUNNING || adapter->link_active == 0) {
+#else
+IFF_DRV_RUNNING || (em_regard_tx_link_status && !adapter->link_active)) {
+#endif
 if (m != NULL)
 err = drbr_enqueue(ifp, txr->br, m);
 return (err);
@@ -962,9 +1004,17 @@
 if ((ifp->if_drv_flags & (IFF_DRV_RUNNING|IFF_DRV_OACTIVE)) !=
 IFF_DRV_RUNNING)
 return;
+#ifdef _TEST
+if (adapter->forced_link_status == 0)
+return;
+#endif

+#ifdef DISABLE_FIXUPS
 if (!adapter->link_active)
+#else
+if (em_regard_tx_link_status && !adapter->link_active)
 return;
+#endif

 while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
 /* Call cleanup if number of TX descriptors low */
@@ -977,6 +1027,17 @@
 IFQ_DRV_DEQUEUE(&ifp->if_snd, m_head);
 if (m_head == NULL)
 break;
+#ifdef _TEST
+if (adapter->forced_xmit_error == ENOMEM) {
+ifp->if_drv_flags |= IFF_DRV_OACTIVE;
+IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
+break;
+} else if (adapter->forced_xmit_error != 0) {
+m_freem(m_head);
+m_head = NULL;
+break;
+} else
+#endif
 /*
  *  Encapsulation can modify our pointer, and or make it
  *  NULL on failure.  In that event, we can't requeue.
@@ -1141,6 +1202,10 @@
 adapter->hw.phy.reset_disable = FALSE;
 /* Check SOL/IDER usage */
 EM_CORE_LOCK(adapter);
+#ifndef DISABLE_FIXUPS
+if (adapter->hw.phy.media_type == e1000_media_type_copper)
+adapter->hw.phy.ms_type = link_master_slave;
+#endif
 if (e1000_check_reset_block(&adapter->hw)) {
 

Re: em0 hangs on 8-STABLE again

2012-02-04 Thread Hooman Fazaeli

Dear Jack

Is the problem related to link loss fixed in this version?
The problem was that if if_snd fills up during a link_active == 0
period, stack never calls em_start again, because em does not
kick off tx when link becomes active again.
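
The failure mode described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not driver code: the struct and every toy_* name are hypothetical, and the "stack" is reduced to the rule that a successful enqueue is followed by a start call while an enqueue into a full queue is just a drop.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in if_em.c. */
struct toy_nic {
    int  qlen;        /* packets waiting in the send queue (if_snd) */
    int  qmax;        /* send queue capacity */
    bool link_active; /* stands in for adapter->link_active */
    int  sent;        /* packets handed to the hardware */
    int  dropped;     /* packets dropped because the queue was full */
};

/* Stand-in for em_start: drains the queue, but only while the link is up. */
static void toy_start(struct toy_nic *n)
{
    if (!n->link_active)
        return;
    n->sent += n->qlen;
    n->qlen = 0;
}

/* Stand-in for the stack: start is called only after a successful enqueue. */
static void toy_enqueue(struct toy_nic *n)
{
    if (n->qlen >= n->qmax) {
        n->dropped++;
        return;
    }
    n->qlen++;
    toy_start(n);
}

/* Stand-in for the link task; kick_on_up selects the proposed behaviour. */
static void toy_link_change(struct toy_nic *n, bool up, bool kick_on_up)
{
    bool was_active = n->link_active;

    n->link_active = up;
    if (kick_on_up && !was_active && up && n->qlen > 0)
        toy_start(n);
}

/* One link flap with traffic before and after; returns packets sent. */
static int toy_run(bool kick_on_up)
{
    struct toy_nic n = { .qlen = 0, .qmax = 4, .link_active = true };
    int i;

    toy_link_change(&n, false, kick_on_up);
    for (i = 0; i < 10; i++)      /* queue fills up while the link is down */
        toy_enqueue(&n);
    toy_link_change(&n, true, kick_on_up);
    for (i = 0; i < 10; i++)      /* traffic continues after link-up */
        toy_enqueue(&n);
    return n.sent;
}
```

In this model toy_run(false) never transmits again after the flap, because every later enqueue hits a full queue and so never reaches toy_start, whereas toy_run(true) drains the backlog at link-up and carries on normally.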


On 1/29/2012 9:51 PM, Jack Vogel wrote:

No, I told Mike I'd get it into 8.x, have just been busy, but will try
and get it pushed up in the queue.

Jack


2012/1/29 Lev Serebryakov l...@freebsd.org


Hello, Mike.
You wrote on 29 January 2012, 16:54:59:


   My home server lost its connection on em0 again last night. This was a
persistent problem some time ago; with version 7.2.3 this is the first
time, but the symptoms are worse.

7.3.0 from HEAD is quite stable for me.  Hopefully it will be MFC'd soon

:)
  I'm afraid that "MFC'd" means to 9-STABLE now :(


--
// Black Lion AKA Lev Serebryakov l...@freebsd.org

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org




Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-11-10 Thread Hooman Fazaeli

On 11/10/2011 3:39 AM, Adrian Chadd wrote:

There's no locking around the OACTIVE flag set/clear, right?
Is it possible that multiple TX threads are fiddling with OACTIVE and
then it's not being properly cleared and tx kicked?


Adrian

If we check for OACTIVE periodically (for instance, in local_timer), then under
a transient resource shortage the driver will eventually end up with OACTIVE
cleared. Under frequent resource shortages, the driver may remain OACTIVE
longer than it is ~OACTIVE, or it may toggle constantly, but there is
not much the driver can do about this, and simple locking around the OACTIVE
set/clear does not change the situation. The problem _is_ low resources and the
only fix is to increase them.

We should focus on two things here:

1- The driver _must_ be able to recover from OACTIVE after transient resource
   shortages.
2- It is desirable to do this as fast as possible.

Doing recovery in local_timer meets the first need but falls far short of
the second.

One possible solution for 2 would be to defer setting OACTIVE until N
consecutive transmissions fail (e.g., N == 75% of (if_snd.ifq_maxlen -
if_snd.ifq_len)). The overhead is a little wasted CPU time in longer OACTIVE
states. We still need local_timer to recover from these states.






___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-11-10 Thread Hooman Fazaeli

On 11/10/2011 3:39 AM, Adrian Chadd wrote:

There's no locking around the OACTIVE flag set/clear, right?
Is it possible that multiple TX threads are fiddling with OACTIVE and
then it's not being properly cleared and tx kicked?


Adrian

Sorry! I forgot to clean up the last message ... here is the correct one:

If we check for OACTIVE periodically (for instance, in local_timer), then under
a transient resource shortage the driver will eventually end up with OACTIVE
cleared. Under frequent resource shortages, the driver may remain OACTIVE
longer than it is ~OACTIVE, or it may toggle constantly, but there is
not much the driver can do about this, and simple locking around the OACTIVE
set/clear does not change the situation. The problem _is_ low resources and the
only fix is to increase them.

We should focus on two things here:

1- The driver _must_ be able to recover from OACTIVE after transient resource
   shortages.
2- It is desirable to do this as fast as possible.

Doing recovery in local_timer meets the first need but falls far short of
the second.

One possible solution for 2 would be to defer setting OACTIVE until N
consecutive transmissions fail (e.g., N == 75% of (if_snd.ifq_maxlen -
if_snd.ifq_len)). The overhead is a little wasted CPU time consumed in longer
OACTIVE states. We still need local_timer to recover from these states.
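
A minimal sketch of the deferred-OACTIVE idea, assuming the 75% figure suggested above. The struct and both function names are hypothetical helpers invented for illustration and are not part of any driver; rounding down and the floor of one failure are also assumptions.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping, not part of if_em.c. */
struct tx_backoff {
    int consecutive_failures; /* failed transmit attempts in a row */
};

/* N = 75% of the free space left in the send queue, never below 1. */
static int oactive_threshold(int ifq_maxlen, int ifq_len)
{
    int n = (3 * (ifq_maxlen - ifq_len)) / 4;

    return n > 0 ? n : 1;
}

/*
 * Record the outcome of one transmit attempt. Returns true once enough
 * consecutive failures have accumulated that OACTIVE should be set.
 */
static bool tx_result(struct tx_backoff *b, bool ok,
    int ifq_maxlen, int ifq_len)
{
    if (ok) {
        b->consecutive_failures = 0;
        return false;
    }
    b->consecutive_failures++;
    return b->consecutive_failures >= oactive_threshold(ifq_maxlen, ifq_len);
}
```

A single successful transmission resets the count, so only an unbroken run of failures sets OACTIVE; a periodic check would still be needed to clear it afterwards.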

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-11-09 Thread Hooman Fazaeli

On 11/8/2011 11:00 PM, Adrian Chadd wrote:

On 8 November 2011 09:21, Hooman Fazaeli hoomanfaza...@gmail.com wrote:


With MSIX enabled, the link task (em_handle_link) does _not_ trigger
_start when the link changes state from inactive to active (which it
should).
If if_snd fills up quickly during a temporary link loss, transmission
stops forever and the driver never recovers from that state.

The last patch should have reduced the frequency of the problem,
but it assumes every IFQ_ENQUEUE is followed by an if_start, which
is not a true assumption.


FWIW, I saw something very similar with the if_arge code port from
Linux. If the TX queue filled up and wasn't serviced before it hit
completely full, it was never drained.

It may be worthwhile auditing some of the other NIC drivers to ensure
this kind of situation isn't occurring. Especially if they came from
Linux. :-)

That's a great catch, I hope it finally fixes the if_em issues with MSIX. :-)


Adrian

Just for the record, I should mention that igb, ixgb and ixgbe have the
same issue. I have not checked other drivers.

There is another subtle problem with all these drivers: if transmit (xxx_xmit)
fails because of a temporary memory shortage (i.e., a DMA failure with ENOMEM),
the driver may enter the OACTIVE state and _never_ recover! The scenario is
much the same as before:

- if_start is executed.
- xxx_xmit fails with ENOMEM.
- xxx_start_locked sets OACTIVE. Note that this is different from a low TX
  descriptor condition, which also sets OACTIVE.
- The stack enqueues packets in if_snd but does not call if_start, since the
  driver is OACTIVE.
- The stack enqueues more packets until if_snd fills up and packets start to
  drop.
- Since there is nowhere in the driver's code to retry transmission when
  memory becomes available again (xxx_local_timer is a candidate), the driver
  remains OACTIVE forever, until it is re-initialized.
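
The scenario above, together with a timer-based retry, can be sketched as a toy model. Everything here is hypothetical and heavily simplified: the toy_* names are invented, queue overflow is not modelled, and the memory shortage is reduced to a single flag.

```c
#include <stdbool.h>

/* Toy transmit state; these names do not exist in the drivers. */
struct toy_tx {
    bool oactive;       /* stands in for IFF_DRV_OACTIVE */
    bool mem_available; /* false makes the toy xmit fail as with ENOMEM */
    int  queued;        /* packets waiting in if_snd */
    int  sent;          /* packets transmitted */
};

/* Stand-in for xxx_start_locked: sets OACTIVE when xmit fails. */
static void toy_start_locked(struct toy_tx *t)
{
    while (t->queued > 0) {
        if (!t->mem_available) {
            t->oactive = true;
            return;
        }
        t->queued--;
        t->sent++;
    }
    t->oactive = false;
}

/* Stand-in for the stack: no start call while the driver is OACTIVE. */
static void toy_stack_send(struct toy_tx *t)
{
    t->queued++;
    if (!t->oactive)
        toy_start_locked(t);
}

/* Stand-in for xxx_local_timer with the suggested periodic retry. */
static void toy_local_timer(struct toy_tx *t)
{
    if (t->oactive && t->queued > 0) {
        t->oactive = false;
        toy_start_locked(t); /* sets OACTIVE again if memory is still short */
    }
}
```

Without the timer call, nothing in this model ever clears oactive once it has been set by a failed transmit, which mirrors the stuck state in the list above.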

I am working on patches for em/igb/ixgb/ixgbe to fix these issues and would be
happy to share them with anyone who is interested.

Since these are really severe problems, I hope the gurus apply official fixes
ASAP.








___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-11-09 Thread Hooman Fazaeli

On 11/8/2011 10:23 PM, Jason Wolfe wrote:

On Tue, Nov 8, 2011 at 10:21 AM, Hooman Fazaeli hoomanfaza...@gmail.com wrote:

I have spent more time on the problem and I think I can explain what
your problem is.

With MSIX disabled, the driver uses the fast interrupt handler (em_irq_fast),
which calls the rx/tx task and then checks for a link status change. This
implies that the rx/tx task is executed on every link state change. This is
not efficient, as it is a waste of time to start transmission when the link is
down. However, it has the effect that after a temporary link loss
(active -> inactive -> active), _start is executed and transmission continues
normally. The value of link_toggles (3) clearly indicates that you had such a
transition when the problem occurred.

With MSIX enabled, the link task (em_handle_link) does _not_ trigger
_start when the link changes state from inactive to active (which it
should).
If if_snd fills up quickly during a temporary link loss, transmission
stops forever and the driver never recovers from that state.

The last patch should have reduced the frequency of the problem,
but it assumes every IFQ_ENQUEUE is followed by an if_start, which
is not a true assumption.

If you are willing to test, I can prepare another patch for you to fix
the issue in a different and more reliable way.


Hooman,

Thanks again for the assist; it sounds like this may also be why we see a bit
higher latency with MSI-X disabled on this chipset.

I'm happy to test any patches, as I have a handful of boxes set aside to
'research' this issue.  Hopefully the testing here helps along any patches to
the tree for others' benefit also.

Jason

Latency may or may not be related. I am doing more tests and will post
my findings soon.



___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-11-08 Thread Hooman Fazaeli

On 11/7/2011 9:24 PM, Jason Wolfe wrote:

On Mon, Oct 31, 2011 at 1:13 AM, Hooman Fazaeli hoomanfaza...@gmail.com wrote:


Attached is a patch for if_em.c. It flushes the interface queue when it is full
and the link is not active. Please note that when this happens, drops increase
on the interface and this will trigger your scripts as before. You need to
change the scripts slightly, as follows:

  check interface TX status
  if (interface TX seems hung) {
    sleep 5
    check interface TX status
    if (interface TX seems hung) {
      reset the interface.
    }
  }

For MULTIQUEUE, it just disables the check for link status (which is not
good), so please test in non-MULTIQUEUE mode.

The patch also contains some minor fixups to compile on 7 plus
a fix from r1.69 which addressed the RX hang problem (the fix was
later removed in r1.70). I included it for Emil to give it a
try.

Please let me know if you have any problems with the patch.


Hooman,

Unfortunately one of the servers had a wedge event a couple of hours ago with
this patch.  To confirm that your changes should cause a recovery within the
time I'm allowing, here is the current format:

  check interface TX status
  if (interface TX seems hung) {
    sleep 3
    check packets out
    sleep 2
    check packets out
    if (packets not incrementing) {
      reset the interface
    }
  }

I bounced em0 because dropped packets incremented 1749543 to 1749708 and the 
interface is not incrementing packets out.

4:10AM up 6 days, 15:23, 0 users, load averages: 0.02, 0.12, 0.14

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether X
        inet6 X%em0 prefixlen 64 scopeid 0x1
        nd6 options=1<PERFORMNUD>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether X
        inet6 X%em1 prefixlen 64 scopeid 0x2
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

ipfw0: flags=8801<UP,SIMPLEX,MULTICAST> metric 0 mtu 65536

lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=3<RXCSUM,TXCSUM>
        inet 127.0.0.1 netmask 0xff00
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether X
        inet X.X.X.X netmask 0xff00 broadcast X.X.X.X
        inet6 X%lagg0 prefixlen 64 scopeid 0x5
        inet6 X prefixlen 64 autoconf
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
        media: Ethernet autoselect
        status: active
        laggproto loadbalance
        laggport: em0 flags=4<ACTIVE>
        laggport: em1 flags=4<ACTIVE>

interrupt total rate
irq3: uart1 3810 0
cpu0: timer 1147568087 2000
irq256: em0:rx 0 59779710 104
irq257: em0:tx 0 2771888652 4831
irq258: em0:link 1 0
irq259: em1:rx 0 3736828886 6512
irq260: em1:tx 0 2790566376 4863
irq261: em1:link 27286 0
irq262: mps0 395687386 689
cpu1: timer 1147559894 2000
cpu2: timer 1147559901 2000
cpu3: timer 1147559902 2000
Total 14345029891 25001

13466/4144/17610 mbufs in use (current/cache/total)
2567/2635/5202/5853720 mbuf clusters in use (current/cache/total/max)
2567/633 mbuf+clusters out of packet secondary zone in use (current/cache)
6798/554/7352/2926859 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
35692K/8522K/44214K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll Drop
em0 1500 Link#1 00:25:90:2b:e5:75 60747643 0 0 11246408092 0 0 1750763
em0 1500 fe80:1::225:9 fe80:1::225:90ff: 0 - - 4 - - -
em1 1500 Link#2 00:25:90:2b:e5:75 11237195776 123950 0 11344722383 0 0 545682
em1 1500 fe80:2::225:9 fe80:2::225:90ff: 0 - - 1 - - -
lagg0 1500 Link#5 00:25:90:2b:e5:75 11297850142 0 0 22588666102 2296445 0 0
lagg0 1500 69.164.38.0/2 69.164.38.83 10189108030 - - 22592881776 - - -
lagg0 1500 fe80:5::225:9 fe80:5::225:90ff: 24 - - 28 - - -
lagg0 1500 2607:f4e8:310 2607:f4e8:310:12: 19578 - - 19591 - - -

kern.msgbuf:

Nov 7 04:10:06 cds1033 kernel

Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-31 Thread Hooman Fazaeli

On 10/31/2011 7:33 AM, Jason Wolfe wrote:



Thanks for looking into this.  I'd be happy to test any patch thrown my way,
but keep in mind my issue is only tickled when MSI-X is enabled.  My interfaces
aren't bouncing, though it might be possible that some unique path in the MSI-X
code is causing a throughput hang akin to connectivity loss?


Jack, is the delta you're speaking of the 7.2.4 code?  I did manage to get the
code from Intel compiled with a couple of minutes of work, but haven't loaded it
up yet as I didn't see anything that caught my untrained eye in the diffs.  I'll
wait until it's ported over and would be happy to test if needed.


Conveniently enough, I just received another report from my test boxes with a
pretty stock loader.conf.  I had forgotten to remove the advanced options from
the interfaces after I cycled them to pick up the fc_setting=0.  Fixed that up
just meow.


hw.em.fc_setting=0
cc_cubic_load=YES

I bounced em0 because dropped packets incremented 368756 to 369124 and the 
interface is not incrementing packets out.

5:35PM up 2 days, 17:45, 0 users, load averages: 0.34, 0.45, 0.48

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether X
        inet6 X%em0 prefixlen 64 scopeid 0x1
        nd6 options=1<PERFORMNUD>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether X
        inet6 X%em1 prefixlen 64 scopeid 0x2
        inet6 X prefixlen 64 autoconf
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

ipfw0: flags=8801<UP,SIMPLEX,MULTICAST> metric 0 mtu 65536

lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=3<RXCSUM,TXCSUM>
        inet 127.0.0.1 netmask 0xff00
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether X
        inet X.X.X.X netmask 0xff00 broadcast X.X.X.X
        inet6 X%lagg0 prefixlen 64 scopeid 0x5
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
        media: Ethernet autoselect
        status: active
        laggproto loadbalance
        laggport: em0 flags=4<ACTIVE>
        laggport: em1 flags=4<ACTIVE>

interrupt total rate
irq3: uart1 3456 0
cpu0: timer 473404250 2000
irq256: em0:rx 0 24614350 103
irq257: em0:tx 0 1220810972 5157
irq258: em0:link 1 0
irq259: em1:rx 0 1533295149 6477
irq260: em1:tx 0 1194032538 5044
irq261: em1:link 3272 0
irq262: mps0 189602667 801
cpu3: timer 473396089 2000
cpu1: timer 473396089 2000
cpu2: timer 473396081 2000
Total 6055954914 25585

32999/8476/41475 mbufs in use (current/cache/total)
4064/3398/7462/5872038 mbuf clusters in use (current/cache/total/max)
4064/800 mbuf+clusters out of packet secondary zone in use (current/cache)
24900/669/25569/2936019 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
115977K/11591K/127568K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
61 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll Drop
em0 1500 Link#1 00:25:90:2a:a2:d7 24946787 0 0 5734180355 0 0 369844
em0 1500 fe80:1::225:9 fe80:1::225:90ff: 0 - - 2 - - -
em1 1500 Link#2 00:25:90:2a:a2:d7 5220869518 15996 0 5429971995 0 0 37009
em1 1500 fe80:2::225:9 fe80:2::225:90ff: 0 - - 1 - - -
em1 1500 2607:f4e8:310 2607:f4e8:310:12: 0 - - 0 - - -
lagg0 1500 Link#5 00:25:90:2a:a2:d7 5245767782 0 0 11162877037 406853 0 0
lagg0 1500 69.164.38.0/2 69.164.38.69 4776881809 - - 11164303625 - - -
lagg0 1500 fe80:5::225:9 fe80:5::225:90ff: 0 - - 3 - - -

kern.msgbuf:

Oct 30 17:08:38 cds1019 kernel: ifa_add_loopback_route: insertion failed
Oct 30 17:12:10 cds1019 kernel: ifa_add_loopback_route: insertion failed
Oct 30 17:20:20 cds1019 last message repeated 3 times
Oct 30 17:32:13 cds1019 last message repeated 4 times
Oct 30 17:34:27 cds1019 kernel: ifa_add_loopback_route: insertion failed
Oct 30 17:35:03 cds1019 kernel: Interface is RUNNING and INACTIVE
Oct 30 17:35:03 cds1019 kernel: em0: hw tdh = 818, hw tdt = 818
Oct 30 17:35:03 cds1019 kernel: em0: hw rdh = 99, hw rdt = 98
Oct 30 17:35:03 cds1019 kernel: em0: Tx 

Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-31 Thread Hooman Fazaeli

On 10/31/2011 11:43 AM, Emil Muratov wrote:




You may try these settings and see if they help:

- hw.em.fc_setting=0 (in /boot/loader.conf)
- hw.em.rxd=4096 (in /boot/loader.conf)
- hw.em.txd=4096 (in /boot/loader.conf)
- Fix speed and duplex at both ends of the link. After doing that, confirm on
  the FreeBSD box (with ifconfig) and the other device (with whatever command
  it provides) that both devices use the same speed and duplex.

You also have high values for dev.em.0.rx/tx_[abs]_int_delay. If you
have set them manually, remove them or replace them with these in loader.conf:

hw.em.rx_int_delay=0
hw.em.tx_int_delay=66
hw.em.tx_abs_int_delay=66
hw.em.rx_abs_int_delay=66

These may be set via the corresponding sysctls too.



Still no luck with the above settings; I've had another couple of lockups.
Here are the details of the most recent one:


=
11.10.30-23:43:06 ... interface em0 is down...
we have Ierrs and no ingoing packets for 5 secs, interface em0 must be toggled

11:43PM  up 1 day,  3:01, 2 users, load averages: 0.76, 0.64, 0.70

 == vmstat -i ==
interrupt                  total       rate
irq18: ehci0             1145540         11
irq22: nfe0            473895599       4872
cpu0: timer            195004026       2005
irq256: ahci0           12832958        131
irq257: em0:rx 0        95571051        982
irq258: em0:tx 0        88777545        912
irq259: em0:link             946          0
cpu3: timer            195003397       2005
cpu1: timer            195003398       2005
cpu2: timer            195003399       2005
Total                 1452237859      14932

 == netstat -m ==
5424/1701/7125 mbufs in use (current/cache/total)
719/1185/1904/51200 mbuf clusters in use (current/cache/total/max)
719/582 mbuf+clusters out of packet secondary zone in use (current/cache)
329/583/912/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
4095/342/4437/12800 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
40978K/8205K/49183K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/6663503/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

 == netstat -ind ==
Name   Mtu   Network        Address            Ipkts      Ierrs  Idrop  Opkts      Oerrs  Coll  Drop
usbus  0     Link#1                            0          0      0      0          0      0     0
usbus  0     Link#2                            0          0      0      0          0      0     0
nfe0   1500  Link#3         00:25:22:21:86:89  196018201  0      0      350650768  0      0     664
nfe0   1500  fe80::225:22f  fe80::225:22ff:fe  0          -      -      0          -      -     -
nfe0   1500  10.16.128.0/1  10.16.189.71       6          -      -      29787707   -      -     -
em0    9000  Link#4         00:1b:21:ab:bf:4a  175676617  949    0      101627139  0      0     0
em0    9000  192.168.168.0  192.168.168.1      7628423    -      -      13654747   -      -     -
em0    9000  fe80::21b:21f  fe80::21b:21ff:fe  45         -      -      5747       -      -     -
em0    9000  2002:d5xx:xxx  2002:d5xx::x:      153        -      -      159        -      -     -

Oct 30 23:43:06 ion kernel: Interface is RUNNING and INACTIVE
Oct 30 23:43:07 ion kernel: em0: hw tdh = 2656, hw tdt = 3271
Oct 30 23:43:07 ion kernel: em0: hw rdh = 2112, hw rdt = 2111
Oct 30 23:43:07 ion kernel: em0: Tx Queue Status = 1
Oct 30 23:43:07 ion kernel: em0: TX descriptors avail = 3481
Oct 30 23:43:07 ion kernel: em0: Tx Descriptors avail failure = 0
Oct 30 23:43:07 ion kernel: em0: RX discarded packets = 0
Oct 30 23:43:07 ion kernel: em0: RX Next to Check = 2112
Oct 30 23:43:07 ion kernel: em0: RX Next to Refresh = 2111
net.inet.ip.intr_queue_maxlen: 4096
net.inet.ip.intr_queue_drops: 0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0xa01f class=0x02
dev.em.0.%parent: pci2
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 0
dev.em.0.eee_control: 0
dev.em.0.link_irq: 956
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 1
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1074790984
dev.em.0.rx_control: 100827170
dev.em.0.fc_high_water: 11264
dev.em.0.fc_low_water: 9764
dev.em.0.queue0.txd_head: 2656
dev.em.0.queue0.txd_tail: 3274
dev.em.0.queue0.tx_irq: 88769608
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 2112

Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-31 Thread Hooman Fazaeli

On 10/31/2011 12:51 PM, Emil Muratov wrote:

On 31.10.2011 12:13, Hooman Fazaeli wrote:





Thanks for looking into this.  I'd be happy to test any patch thrown my way,
but keep in mind my issue is only tickled when MSI-X is enabled.  My interfaces
aren't bouncing, though it might be possible that some unique path in the MSI-X
code is causing a throughput hang akin to connectivity loss?


Jack, is the delta you're speaking of the 7.2.4 code?  I did manage to get the
code from Intel compiled with a couple of minutes of work, but haven't loaded it
up yet as I didn't see anything that caught my untrained eye in the diffs.  I'll
wait until it's ported over and would be happy to test if needed.


Conveniently enough, I just received another report from my test boxes with a
pretty stock loader.conf.  I had forgotten to remove the advanced options from
the interfaces after I cycled them to pick up the fc_setting=0.  Fixed that up
just meow.


hw.em.fc_setting=0
cc_cubic_load=YES




Jason

Attached is a patch for if_em.c. It flushes the interface queue when the queue is full
and the link is not active. Please note that when this happens, the drop counter on the
interface still increases, so it will trigger your scripts as before. You need to change
the scripts slightly, as follows:

  check interface TX status
  if (interface TX seems hung) {
      sleep 5
      check interface TX status
      if (interface TX seems hung) {
          reset the interface.
      }
  }
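
The double check above could be sketched in C roughly as follows. This is only an illustration: should_reset(), the tx_hung_fn and wait_fn hook types, and the sample probe are hypothetical names, not part of the patch or of the monitoring scripts. A real script would compare netstat counters and really sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hooks: a real check would read interface counters and sleep. */
typedef bool (*tx_hung_fn)(void *ctx);
typedef void (*wait_fn)(unsigned seconds);

/* Return true only if TX looks hung on two checks taken `delay` seconds apart. */
static bool should_reset(tx_hung_fn tx_hung, wait_fn wait, void *ctx, unsigned delay)
{
    assert(tx_hung != NULL && wait != NULL);
    if (!tx_hung(ctx))
        return false;           /* first check is fine, nothing to do */
    wait(delay);                /* give the driver time to flush its queue */
    return tx_hung(ctx);        /* reset only if the hang persists */
}

/* Sample probe: reports "hung" for the first *ctx calls, then recovers. */
static bool hung_for_n_checks(void *ctx)
{
    int *left = ctx;
    if (*left > 0) {
        (*left)--;
        return true;
    }
    return false;
}

/* Sample wait hook that does not actually sleep. */
static void no_wait(unsigned seconds)
{
    (void)seconds;
}
```

With the sample probe, a hang that clears before the second check does not lead to a reset, while one that is still present after the delay does.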

For MULTIQUEUE, the patch just disables the link status check (which is not good),
so please test in non-MULTIQUEUE mode.

The patch also contains some minor fixups so that it compiles on 7, plus
a fix from r1.69 that addressed an RX hang problem (the fix was
later removed in r1.70). I included it for Emil to give it a
try.

Please let me know if you have any problems with the patch.





Hi! Thanks for the update. But I can't make it, there is an error in build 
process. Can you kindly take a look at it?


-emil@ion-/usr/src/sys/dev/e1000
--(0) sudo patch < /home/emil/patches/if_em/if_em.c.patch
Password:
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|--- if_em.c.orig   2011-10-31 11:43:35.0 +0330
|+++ if_em.c2011-10-31 11:43:35.0 +0330
--
Patching file if_em.c using Plan A...
Hunk #1 succeeded at 85.
Hunk #2 succeeded at 101.
Hunk #3 succeeded at 382 (offset -29 lines).
Hunk #4 succeeded at 400 (offset -29 lines).
Hunk #5 succeeded at 857 (offset -29 lines).
Hunk #6 succeeded at 960 (offset -29 lines).
Hunk #7 succeeded at 1420 (offset -29 lines).
Hunk #8 succeeded at 1436 (offset -29 lines).
Hunk #9 succeeded at 1466 (offset -29 lines).
Hunk #10 succeeded at 2230 (offset -29 lines).
Hunk #11 succeeded at 2338 (offset -29 lines).
Hunk #12 succeeded at 2350 (offset -29 lines).
Hunk #13 succeeded at 3799 (offset -29 lines).
Hunk #14 succeeded at 5164 with fuzz 2 (offset -29 lines).
Hunk #15 succeeded at 5616 (offset -4 lines).
done

-emil@ion-/usr/src/sys/dev/e1000
--(0) sudo patch < /home/emil/patches/if_em/if_em.h.patch
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|--- if_em.h.orig   2011-10-31 11:43:34.0 +0330
|+++ if_em.h2011-10-31 11:43:35.0 +0330
--
Patching file if_em.h using Plan A...
Hunk #1 succeeded at 438.
done


#root@ion-/usr/src/sys/modules/em
#-(0) make
Warning: Object directory not changed from original /usr/src/sys/modules/em
awk -f @/tools/makeobjops.awk @/kern/device_if.m -h
awk -f @/tools/makeobjops.awk @/kern/bus_if.m -h
awk -f @/tools/makeobjops.awk @/dev/pci/pci_if.m -h
: opt_inet.h
cc -O2 -pipe -march=nocona -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc  -I/usr/src/sys/modules/em/../../dev/e1000 -I. -I@ -I@/contrib/altq 
-finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-common  -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone  -mfpmath=387 
-mno-sse -mno-sse2 -mno-sse3 -mno-mmx -mno-3dnow  -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -std=iso9899:1999 -fstack-protector -Wall 
-Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions -c 
/usr/src/sys/modules/em/../../dev/e1000/if_em.c

/usr/src/sys/modules/em/../../dev/e1000/if_em.c:387: error: 
'sysctl__hw_em_children' undeclared here (not in a function)
*** Error code 1

Stop in /usr/src/sys/modules/em.





Please sync your sys/dev/e1000 with HEAD and try again:

setenv CVSROOT :pserver:anon...@anoncvs.freebsd.org:/home/ncvs
cvs login
password: enter anonymous
cd /usr/src

Re: kern/162028: [ixgbe] [patch] misplaced #endif in ixgbe.c

2011-10-30 Thread Hooman Fazaeli
The following reply was made to PR kern/162028; it has been noted by GNATS.

From: Hooman Fazaeli hoomanfaza...@gmail.com
To: Sergey Kandaurov pluk...@gmail.com
Cc: bug-follo...@freebsd.org
Subject: Re: kern/162028: [ixgbe] [patch] misplaced #endif in ixgbe.c
Date: Sun, 30 Oct 2011 11:03:44 +0330

 On 10/29/2011 4:28 PM, Sergey Kandaurov wrote:
  I have a more complete patch. Can you test it please?
 
  Index: sys/dev/ixgbe/ixgbe.c
  ===================================================================
  --- sys/dev/ixgbe/ixgbe.c   (revision 226068)
  +++ sys/dev/ixgbe/ixgbe.c   (working copy)
  @@ -867,16 +867,15 @@ static int
   ixgbe_ioctl(struct ifnet * ifp, u_long command, caddr_t data)
   {
           struct adapter  *adapter = ifp->if_softc;
  -        struct ifreq    *ifr = (struct ifreq *) data;
  +        struct ifreq    *ifr = (struct ifreq *)data;
   #if defined(INET) || defined(INET6)
  -        struct ifaddr *ifa = (struct ifaddr *)data;
  -        bool            avoid_reset = FALSE;
  +        struct ifaddr   *ifa = (struct ifaddr *)data;
   #endif
  -        int             error = 0;
  +        bool            avoid_reset = FALSE;
  +        int             error = 0;
   
           switch (command) {
  -
  -        case SIOCSIFADDR:
  +        case SIOCSIFADDR:
   #ifdef INET
                   if (ifa->ifa_addr->sa_family == AF_INET)
                           avoid_reset = TRUE;
  @@ -885,7 +884,6 @@ ixgbe_ioctl(struct ifnet * ifp, u_long command, ca
                   if (ifa->ifa_addr->sa_family == AF_INET6)
                           avoid_reset = TRUE;
   #endif
  -#if defined(INET) || defined(INET6)
                   /*
                   ** Calling init results in link renegotiation,
                   ** so we avoid doing it when possible.
  @@ -894,12 +892,13 @@ ixgbe_ioctl(struct ifnet * ifp, u_long command, ca
                           ifp->if_flags |= IFF_UP;
                           if (!(ifp->if_drv_flags & IFF_DRV_RUNNING))
                                   ixgbe_init(adapter);
  +#ifdef INET
                           if (!(ifp->if_flags & IFF_NOARP))
                                   arp_ifinit(ifp, ifa);
  +#endif
                   } else
                           error = ether_ioctl(ifp, command, data);
                   break;
  -#endif
           case SIOCSIFMTU:
                   IOCTL_DEBUGOUT("ioctl: SIOCSIFMTU (Set Interface MTU)");
                   if (ifr->ifr_mtu > IXGBE_MAX_FRAME_SIZE - ETHER_HDR_LEN) {
 
 
 sure.
 I am very busy right now.
 Will test as soon as I can.
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-30 Thread Hooman Fazaeli


I finally managed to reproduce an effect similar to Jason's case. It
may not be exactly the same issue, but it is a serious problem and must
be addressed.

1. Push out packets on em/igb at a high rate.
2. Disconnect the cable and wait a few seconds. netstat -ind shows that
   Drops are increasing.
3. Reconnect the cable. Both sides of the link renegotiate and the link comes up.
4. But no packet is ever transmitted again, and Drops keep increasing!

This is because em/lem/igb and some other drivers (e.g., bce) have
a check at the very beginning of their _start function
that tests the link status and returns immediately if the link is inactive.
This behavior causes if_snd to fill up in step 2, and once that happens,
IFQ_HANDOFF never calls if_start again, even after the link becomes
active again.

A cable unplug is not necessary to trigger the issue. Any temporary
link loss (e.g., during renegotiation) can potentially lead to the
aforementioned problem.

IMHO, this is not a driver issue and the real fix would be to change
IFQ_HANDOFF to call if_start when the queue is full.
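
To make the stall concrete, here is a small user-space model in C. It is a sketch under simplifying assumptions, not kernel code: struct nic, nic_start(), handoff(), run() and the QLEN of 4 are made-up names and values that only mimic a start routine that returns early when the link is down and a handoff that drops the packet without calling start when the send queue is full.

```c
#include <assert.h>
#include <stdbool.h>

#define QLEN 4                  /* tiny queue so that it fills quickly */

struct nic {
    int  queued;                /* packets waiting in the software send queue */
    int  sent;                  /* packets handed to the "hardware" */
    int  dropped;               /* packets rejected because the queue was full */
    bool link_up;
};

/* Model of the driver start routine: returns early when the link is down. */
static void nic_start(struct nic *n)
{
    if (!n->link_up)
        return;
    n->sent += n->queued;
    n->queued = 0;
}

/*
 * Model of the handoff: a full queue drops the packet and, unless
 * kick_when_full is set, never calls the start routine.
 */
static void handoff(struct nic *n, bool kick_when_full)
{
    assert(n->queued >= 0 && n->queued <= QLEN);
    if (n->queued == QLEN) {
        n->dropped++;
        if (kick_when_full)
            nic_start(n);       /* the proposed change */
        return;
    }
    n->queued++;
    nic_start(n);
}

/* Send `down` packets with the link down, then `up` packets with it up. */
static struct nic run(int down, int up, bool kick_when_full)
{
    struct nic n = { 0, 0, 0, false };
    for (int i = 0; i < down; i++)
        handoff(&n, kick_when_full);
    n.link_up = true;
    for (int i = 0; i < up; i++)
        handoff(&n, kick_when_full);
    return n;
}
```

In this model, run(10, 5, false) ends with nothing sent and the queue still full even though the link came back, while run(10, 5, true) drains the queue on the first packet after link-up and transmits normally from then on.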

Jason, if you are interested, I can prepare a patch for you
that addresses this issue in if_em, so you can see if it helps.





--- if_em.c.orig2011-10-27 21:09:33.0 +0330
+++ if_em.c 2011-10-27 21:46:18.0 +0330
@@ -85,6 +85,14 @@
 #include "e1000_82571.h"
 #include "if_em.h"
 
+#if !defined(DISABLE_FIXUPS)  __FreeBSD_version  80
+static __inline int
+pci_find_cap(device_t dev, int capability, int *capreg)
+{
+return (PCI_FIND_EXTCAP(device_get_parent(dev), dev, capability, 
capreg));
+}
+#endif
+
 /*
  *  Set this to one to display debug statistics
  */
@@ -399,6 +407,12 @@
 /* Global used in WOL setup with multiport cards */
 static int global_quad_port_a = 0;
 
+#ifndef DISABLE_FIXUPS
+static int em_rx_hang_fixup = 0;
+SYSCTL_INT(_hw_em, OID_AUTO, rx_hang_fixup, CTLFLAG_RW, &em_rx_hang_fixup, 0,
+    "Enable/disable r1.69 RX hang fixup code");
+#endif
+
 /*
  *  Device identification routine
  *
@@ -864,7 +878,11 @@
        int err = 0, enq = 0;
 
        if ((ifp->if_drv_flags & (IFF_DRV_RUNNING | IFF_DRV_OACTIVE)) !=
+#ifdef DISABLE_FIXUPS
            IFF_DRV_RUNNING || adapter->link_active == 0) {
+#else
+           IFF_DRV_RUNNING) {
+#endif
                if (m != NULL)
                        err = drbr_enqueue(ifp, txr->br, m);
                return (err);
@@ -963,8 +981,10 @@
            IFF_DRV_RUNNING)
                return;
 
+#ifdef DISABLE_FIXUPS
        if (!adapter->link_active)
                return;
+#endif
 
        while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
                /* Call cleanup if number of TX descriptors low */
@@ -1414,7 +1434,11 @@
  *  Legacy polling routine: note this only works with single queue
  *
  */
+#if !defined(DISABLE_FIXUPS)  __FreeBSD_version  80
+static void
+#else
 static int
+#endif
 em_poll(struct ifnet *ifp, enum poll_cmd cmd, int count)
 {
        struct adapter *adapter = ifp->if_softc;
@@ -1426,7 +1450,11 @@
        EM_CORE_LOCK(adapter);
        if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) {
                EM_CORE_UNLOCK(adapter);
+#if !defined(DISABLE_FIXUPS)  __FreeBSD_version  80
+   return;
+#else
return (0);
+#endif
}
 
if (cmd == POLL_AND_CHECK_STATUS) {
@@ -1452,8 +1480,11 @@
em_start_locked(ifp, txr);
 #endif
EM_TX_UNLOCK(txr);
-
+#if !defined(DISABLE_FIXUPS)  __FreeBSD_version  80
+   return;
+#else
return (rx_done);
+#endif
 }
 #endif /* DEVICE_POLLING */
 
@@ -2213,6 +2244,16 @@
            e1000_get_laa_state_82571(&adapter->hw))
                e1000_rar_set(&adapter->hw, adapter->hw.mac.addr, 0);
 
+#ifndef DISABLE_FIXUPS
+       if (em_rx_hang_fixup) {
+               /* trigger tq to refill rx ring queue if it is empty */
+               for (int i = 0; i < adapter->num_queues; i++, rxr++) {
+                       if (rxr->next_to_check == rxr->next_to_refresh) {
+                               taskqueue_enqueue(rxr->tq, &rxr->rx_task);
+                       }
+               }
+       }
+#endif
        /* Mask to use in the irq trigger */
        if (adapter->msix_mem)
                trigger = rxr->ims; /* RX for 82574 */
@@ -3766,7 +3807,7 @@
  * If we have a minimum free, clear IFF_DRV_OACTIVE
  * to tell the stack that it is OK to send packets.
  */
-        if (txr->tx_avail > EM_MAX_SCATTER)
+        if (txr->tx_avail >= EM_MAX_SCATTER)
                 ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
 
/* Disable watchdog if all clean */
@@ -5553,4 +5594,8 @@
            rxr->rx_discarded);
        device_printf(dev, "RX Next to Check = %d\n", rxr->next_to_check);

Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-30 Thread Hooman Fazaeli

On 10/30/2011 6:03 PM, Ryan Stone wrote:

On Sun, Oct 30, 2011 at 4:57 AM, Hooman Fazaeli hoomanfaza...@gmail.com wrote:

IMHO, this is not a driver issue and the real fix would be to change
IFQ_HANDOFF to call if_start when the queue is full.

I'm not sure that's the right approach.  99% of the time, calling
if_start when the queue is full will be a waste of time. It seems to
me that the link interrupt handler needs to kick off the tx task to
drain the tx queue instead.

If the queue were not full, the system would be spending CPU time on sending
packets. Now that it is full, only a little time is spent recovering
from a (temporary) problem. Not a big deal!

Furthermore, the most common reason for the queue being full is the stack
sending packets too fast. In that case OACTIVE is set and if_start
is not called at all.

Changing HANDOFF has the benefit that it is simple, can be implemented quickly, and
fixes the problem once for all drivers, including similar dangerous bugs not yet
discovered. It also makes the drivers' code simpler.




Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-27 Thread Hooman Fazaeli

On 10/27/2011 9:59 AM, Emil Muratov wrote:


Hi Hooman

Here is what I've got when the script triggered just in time when the interface 
was locked


11.10.26-23:39:10 ... interface em0 is down...

FreeBSD ion.hotplug.ru 8.2-STABLE FreeBSD 8.2-STABLE #0: Thu Oct 20 20:20:25 MSD 2011 r...@epia.home.lan:/usr/obj/usr/src/sys/ION6debug  amd64
11:39PM  up  1:12, 2 users, load averages: 0.26, 0.48, 0.58


 == vmstat -i ==
interrupt                          total       rate
irq22: nfe0                     16644480       3865
cpu0: timer                      8610122       1999
irq256: ahci0                     606705        140
irq257: em0:rx 0                 3896622        904
irq258: em0:tx 0                 2762957        641
irq259: em0:link                     620          0
cpu3: timer                      8609499       1999
cpu1: timer                      8609499       1999
cpu2: timer                      8609499       1999
Total                           58350003      13550

 == netstat -ind ==
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll Drop
usbus     0 Link#1                                 0     0     0        0     0     0    0
usbus     0 Link#2                                 0     0     0        0     0     0    0
nfe0   1500 Link#3        00:25:22:21:86:89  7157140     0     0 12266747     0     0    0
nfe0   1500 fe80::225:22f fe80::225:22ff:fe        0     -     -       85     -     -    -
nfe0   1500 10.16.128.0/1 10.16.189.71             0     -     -    48135     -     -    -
em0    9000 Link#4        00:1b:21:ab:bf:4a  5465087   623     0  2862028     0     0  113
em0    9000 192.168.168.0 192.168.168.1       764085     -     -  1005078     -     -    -
em0    9000 fe80::21b:21f fe80::21b:21ff:fe       45     -     -      252     -     -    -
em0    9000 2002:d58d:871 2002:d58d:8715:1:       73     -     -       38     -     -    -
wifi   1500 Link#7        00:1b:21:ab:bf:4a      347     0     0      350     0     0    0
wifi   1500 192.168.168.6 192.168.168.65           0     -     -        0     -     -    -
wifi   1500 fe80::225:x   fe80::225:x:x            0     -     -      349     -     -    -
wifi   1500 2002:x:x      2002:x:x:2:              0     -     -        0     -     -    -
wifio  1500 Link#8        00:1b:21:ab:bf:4a    59559     0     0   114639     0     0    0
wifio  1500 192.168.168.8 192.168.168.81           0     -     -      160     -     -    -
wifio  1500 fe80::225:x   fe80::225:x:x            0     -     -        0     -     -    -
stf0   1280 Link#9                              5725     0     0     6125   420     0    0
stf0   1280 2002:x:x      2002:x:x::1           1878     -     -     1121     -     -    -
ng0*   1500 Link#10                                0     0     0        0     0     0    0
ng1*   1500 Link#11                                0     0     0        0     0     0    0
ng2    1492 Link#12                          7143733     0     0 12234436     0     0    0
ng2    1492 213.141.x.x   213.141.x.x        4735932     -     -  8480089     -     -    -
ng2    1492 fe80::x:x     fe80::x:x:x              0     -     -        1     -     -    -
tun0   1455 Link#13                              350     0     0      172     0     0    0
tun0   1455 fe80::225:x   fe80::225:x:x            0     -     -        2     -     -    -
tun0   1455 192.168.169.1 192.168.169.1          117     -     -      167     -     -    -

Oct 26 23:39:11 ion kernel: em0: hw tdh = 975, hw tdt = 944
Oct 26 23:39:11 ion kernel: em0: hw rdh = 960, hw rdt = 959
Oct 26 23:39:11 ion kernel: em0: Tx Queue Status = 1
Oct 26 23:39:11 ion kernel: em0: TX descriptors avail = 31
Oct 26 23:39:11 ion kernel: em0: Tx Descriptors avail failure = 0
Oct 26 23:39:11 ion kernel: em0: RX discarded packets = 0
Oct 26 23:39:11 ion kernel: em0: RX Next to Check = 960
Oct 26 23:39:11 ion kernel: em0: RX Next to Refresh = 959

net.inet.ip.intr_queue_maxlen: 4096
net.inet.ip.intr_queue_drops: 0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 
subdevice=0xa01f class=0x02
dev.em.0.%parent: pci2
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 200
dev.em.0.tx_int_delay: 200
dev.em.0.rx_abs_int_delay: 4096
dev.em.0.tx_abs_int_delay: 4096
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 3
dev.em.0.eee_control: 0
dev.em.0.link_irq: 648
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1477444168
dev.em.0.rx_control: 100827170
dev.em.0.fc_high_water: 11264
dev.em.0.fc_low_water: 9764
dev.em.0.queue0.txd_head: 975
dev.em.0.queue0.txd_tail: 944
dev.em.0.queue0.tx_irq: 2762762
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 960
dev.em.0.queue0.rxd_tail: 959
dev.em.0.queue0.rx_irq: 3895860
dev.em.0.mac_stats.excess_coll: 0

Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-26 Thread Hooman Fazaeli

Hi Jason

Have you tried:

hw.em.fc_setting=0 (in loader.conf)
ifconfig emX -tso -lro -rxcsum -txcsum -vlanhwtag -wol

with MSIX and no multiqueue?

Advanced features have always been a source of problems.
It is worth a try and would help to narrow down the possibilities.

It would also be helpful if you provided 'ifconfig' output
from when the problem happens.

And a question: does the interface RX also hang, or is it just TX?

On 10/26/2011 12:25 AM, Jason Wolfe wrote:

On Fri, Oct 7, 2011 at 2:14 PM, Jason Wolfe nitrobo...@gmail.com wrote:


Bumping rx/tx descriptors to 2048 was actually for performance reasons and
not to try to get around the issue. I did some fairly in depth testing and
found under heavy load it performed the best with those settings.

As mentioned on the other thread I'll re enable MSI-X on a few servers here
and collect uptime and the kernel msgbuf in addition. I'll bump the
descriptors down to 512 to try and increase our chances and compile the
driver with EM_MULTIQUEUE also.

Jason


Hi there,

So I have a small pool of servers running EM_MULTIQUEUE with lower
descriptors as promised and just received an alert of an event.  I have a
fairly large pool of servers on the same hardware running the same OS/driver
sans MSI-X and multiqueue with not a single 'wedge' event in about 2 months
now, and it seems multiqueue has not changed the commonality of the issue.
  Here is my loader.conf followed by everything collected:

net.inet.tcp.tcbhashsize=4096
net.inet.tcp.syncache.hashsize=1024
net.inet.tcp.syncache.bucketlimit=512
net.inet.tcp.syncache.cachelimit=65536
net.inet.tcp.hostcache.hashsize=1024
net.inet.tcp.hostcache.bucketlimit=512
net.inet.tcp.hostcache.cachelimit=65536
hw.em.rxd=512
hw.em.txd=512
cc_cubic_load=YES

I bounced em1 because dropped packets incremented 1386169 to 1386355 and the
interface is not incrementing packets out.

1:30PM up 4 days, 6:19, 0 users, load averages: 0.18, 0.38, 0.42

interrupt total rate
irq3: uart1 5816 0
cpu0: timer 736655476 2000
irq256: em0:rx 0 38122306 103
irq257: em0:tx 0 1605535054 4359
irq258: em0:link 1 0
irq259: em1:rx 0 2192460862 5952
irq260: em1:tx 0 1599049303 4341
irq261: em1:link 4172 0
irq262: mps0 212448927 576
cpu2: timer 736647277 2000
cpu3: timer 736647302 2000
cpu1: timer 736647302 2000
Total 8594223798 2

27653/6022/33675 mbufs in use (current/cache/total)
3054/3196/6250/5700670 mbuf clusters in use (current/cache/total/max)
3054/1041 mbuf+clusters out of packet secondary zone in use (current/cache)
23266/1642/24908/2850335 4k (page size) jumbo clusters in use
(current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
106085K/14465K/120550K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
22 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll Drop
em0 1500Link#1  00:25:90:1f:f5:7d 38575296 0 0 6300959828 0 0 706638
em0 1500 fe80:1::225:9 fe80:1::225:90ff: 0 - - 3 - - -
em1 1500 Link#2  00:25:90:1f:f5:7d 6091053202 22415 0 6327642657 0 0 1386797
em1 1500 fe80:2::225:9 fe80:2::225:90ff: 0 - - 1 - - -
lagg0 1500 Link#5  00:25:90:1f:f5:7d 6129556798 0 0 12627493094 2093435 0 0
lagg0 1500 69.164.38.0/2 69.164.38.93 5429109508 - - 12630422599 - - -
lagg0 1500 fe80:5::225:9 fe80:5::225:90ff: 12 - - 17 - - -
lagg0 1500 2607:f4e8:310 2607:f4e8:310:12: 13655 - - 13663 - - -

kern.msgbuf:

Oct 25 13:30:04 cds1043 kernel: Interface is RUNNING and INACTIVE
Oct 25 13:30:04 cds1043 kernel: em0: hw tdh = 105, hw tdt = 158
Oct 25 13:30:04 cds1043 kernel: em0: hw rdh = 191, hw rdt = 190
Oct 25 13:30:04 cds1043 kernel: em0: Tx Queue Status = 0
Oct 25 13:30:04 cds1043 kernel: em0: TX descriptors avail = 422
Oct 25 13:30:04 cds1043 kernel: em0: Tx Descriptors avail failure = 0
Oct 25 13:30:04 cds1043 kernel: em0: RX discarded packets = 0
Oct 25 13:30:04 cds1043 kernel: em0: RX Next to Check = 192
Oct 25 13:30:04 cds1043 kernel: em0: RX Next to Refresh = 191
Oct 25 13:30:04 cds1043 kernel: Interface is RUNNING and INACTIVE
Oct 25 13:30:04 cds1043 kernel: em1: hw tdh = 159, hw tdt = 159
Oct 25 13:30:04 cds1043 kernel: em1: hw rdh = 193, hw rdt = 191
Oct 25 13:30:04 cds1043 kernel: em1: Tx Queue Status = 0
Oct 25 13:30:04 cds1043 kernel: em1: TX descriptors avail = 512
Oct 25 13:30:04 cds1043 kernel: em1: Tx Descriptors avail failure = 0
Oct 25 13:30:04 cds1043 kernel: em1: RX discarded packets = 0
Oct 25 13:30:04 cds1043 kernel: em1: RX Next to Check = 407
Oct 25 13:30:04 cds1043 kernel: em1: RX Next to Refresh = 436

net.inet.ip.intr_queue_maxlen: 512
net.inet.ip.intr_queue_drops: 0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.0.%driver: em
dev.em.0.%location: 

Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-26 Thread Hooman Fazaeli

Hi,

Could you please post the output of these commands _when_ the problem happens?

uname -a
sysctl dev.em
netstat -ind
ifconfig


I've got almost the same problem with an Intel 82574L based NIC. My platform is an NVIDIA Ion running an Atom 1.6 and the NIC is an external PCI Express adapter. Unlike Jason's case, mine always gets stuck
receiving traffic: its Ierrs keep increasing while Ipkts do not. Thanks to Jason's script I can see those lockups and the interface flapping every several hours. My system is not a heavily loaded server but just a
home NAS/router, usually routing at 100 Mbps or less. Neither disabling MSIX nor tuning txd/rxd helps.






misplaced #endif in ixgbe

2011-10-17 Thread Hooman Fazaeli


A misplaced #endif in ixgbe_ioctl() causes all sorts of problems
when INET and INET6 are undefined. Please see the attached patch.



--- ixgbe.c.orig2011-10-17 20:37:17.0 +0330
+++ ixgbe.c 2011-10-17 20:38:40.0 +0330
@@ -898,8 +898,8 @@
                        arp_ifinit(ifp, ifa);
                } else
                        error = ether_ioctl(ifp, command, data);
-               break;
 #endif
+               break;
        case SIOCSIFMTU:
                IOCTL_DEBUGOUT("ioctl: SIOCSIFMTU (Set Interface MTU)");
                if (ifr->ifr_mtu > IXGBE_MAX_FRAME_SIZE - ETHER_HDR_LEN) {
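
The effect of the misplaced #endif can be reproduced in a few lines of stand-alone C. This is a simplified sketch, not the driver code: HAVE_INET, HAVE_INET6, the CMD_* constants and the two functions are hypothetical stand-ins for INET, INET6 and the ioctl commands. With neither macro defined, the break disappears together with the conditional block and the address case falls through into the MTU case.

```c
#include <assert.h>
#include <stddef.h>

/* Neither HAVE_INET nor HAVE_INET6 is defined here, on purpose. */
enum { CMD_SET_ADDR = 1, CMD_SET_MTU = 2 };

/* Buggy layout: the break sits inside the conditional block. */
static void buggy_ioctl(int cmd, int *mtu_writes)
{
    assert(mtu_writes != NULL);
    switch (cmd) {
    case CMD_SET_ADDR:
#if defined(HAVE_INET) || defined(HAVE_INET6)
        /* address handling would go here */
        break;
#endif
    case CMD_SET_MTU:
        (*mtu_writes)++;        /* reached for CMD_SET_ADDR as well */
        break;
    }
}

/* Fixed layout: the break is outside the conditional block. */
static void fixed_ioctl(int cmd, int *mtu_writes)
{
    assert(mtu_writes != NULL);
    switch (cmd) {
    case CMD_SET_ADDR:
#if defined(HAVE_INET) || defined(HAVE_INET6)
        /* address handling would go here */
#endif
        break;
    case CMD_SET_MTU:
        (*mtu_writes)++;
        break;
    }
}
```

Calling buggy_ioctl(CMD_SET_ADDR, ...) bumps the MTU counter, which mirrors an address ioctl being handled as an MTU change; fixed_ioctl() leaves the counter alone.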

Re: em(4) high latency w/o msix

2011-10-04 Thread Hooman Fazaeli


8.2-RELEASE and stable/8 have the same problem. Ping RTT triples
when MSIX is disabled.

On 10/3/2011 11:50 AM, Jack Vogel wrote:

Can you try the driver in 8.2 and possibly stable/8 to see the behavior there.

And, just curious, why are you disabling MSIX?

Jack


On Mon, Oct 3, 2011 at 12:51 AM, Hooman Fazaeli faza...@sepehrs.com wrote:

Hi Jack

The hardware is a PCIe network appliance with 3 port modules. The ports I have
used in the test are 82574L residing on a 4 port module. Anyway, as I noted
in last mail, the stock 7.3-RELEASE driver does not expose this problem on
the same hardware.


On 10/2/2011 7:38 PM, Jack Vogel wrote:

On what hardware?

Jack


On Sun, Oct 2, 2011 at 6:42 AM, Hooman Fazaeli faza...@sepehrs.com wrote:


Latest em(4) driver from HEAD seems to have high latency
when MSIX is disabled.

With MSIX enabled (hw.em.enable_msix=1):

# ping -c5 192.168.1.83
PING 192.168.1.83 (192.168.1.83): 56 data bytes
64 bytes from 192.168.1.83: icmp_seq=0 ttl=64 time=0.055 ms
64 bytes from 192.168.1.83: icmp_seq=1 ttl=64 time=0.076 ms
64 bytes from 192.168.1.83: icmp_seq=2 ttl=64 time=0.066 ms
64 bytes from 192.168.1.83: icmp_seq=3 ttl=64 time=0.051 ms
64 bytes from 192.168.1.83: icmp_seq=4 ttl=64 time=0.063 ms

--- 192.168.1.83 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.051/0.062/0.076/0.009 ms

With MSIX disabled:

# ping -c5 192.168.1.83
PING 192.168.1.83 (192.168.1.83): 56 data bytes
64 bytes from 192.168.1.83: icmp_seq=0 ttl=64 time=0.180 ms
64 bytes from 192.168.1.83: icmp_seq=1 ttl=64 time=0.164 ms
64 bytes from 192.168.1.83: icmp_seq=2 ttl=64 time=0.169 ms
64 bytes from 192.168.1.83: icmp_seq=3 ttl=64 time=0.172 ms
64 bytes from 192.168.1.83: icmp_seq=4 ttl=64 time=0.167 ms

--- 192.168.1.83 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.164/0.170/0.180/0.005 ms

As you can see, without MSIX the average RTT rises from 0.062 ms to 0.170 ms, almost a factor of 3.

I also tested the following drivers:
   - igb(4) from HEAD: OK.
   - Stock 7.3-RELEASE: OK.
   - Stock 7.4-RELEASE: problem exists.

Any ideas?













intel checksum offload

2011-09-15 Thread Hooman Fazaeli

Hi list,

The data sheet for the Intel 82576 advertises IP TX/RX checksum offload,
but the driver does not set CSUM_IP in ifp->if_hwassist. Does this mean that
the driver (and chip) do not support IP TX checksum offload, or that support for
TX is not yet included in the driver?



Re: em driver, 82574L chip, and possibly ASPM

2011-07-12 Thread Hooman Fazaeli


I have similar problems on a couple of 7.3 boxes with the latest driver from
-CURRENT.
I just wanted to know whether your 7 boxes work fine, so that I can look for the cause elsewhere.

On 2/7/2011 3:23 AM, Mike Tancsa wrote:

So far so good.  I would often get a hang on the level zero dumps to my
backup server Sunday AM, and it made it through!  So a good sign, but
not a definitive sign.

I have a PCIe em card that has this chipset as well and was showing the
same sort of problem in a customer's RELENG_7 box.  I will see if I can
get the customer to try the card in their box with the patch for
RELENG_7 as it would show this issue at least once a day until I pulled
the card for an older version

---Mike


On 2/4/2011 1:12 PM, Jack Vogel wrote:

Was curious too, but being more patient than you :)

Jack


On Fri, Feb 4, 2011 at 10:09 AM, Sean Bruno sean...@yahoo-inc.com wrote:


Any more data on this problem or do we have to wait a while?

Sean


On Wed, 2011-02-02 at 10:28 -0800, Mike Tancsa wrote:

On 2/2/2011 12:37 PM, Jack Vogel wrote:

So has everyone that wanted to get something  testing been able to do

so?

I have been testing in the back and will deploy to my production box
this afternoon.  As I am not able to reproduce it easily, it will be a
bit before I can say the issue is gone.  Jan however, was able to
trigger it with greater ease ?

 ---Mike


Jack


On Tue, Feb 1, 2011 at 7:03 PM, Mike Tancsa m...@sentex.net wrote:


On 2/1/2011 5:03 PM, Sean Bruno wrote:

On Tue, 2011-02-01 at 13:43 -0800, Jack Vogel wrote:

To those who are going to test, here is the if_em.c, based on head,
with my
changes, I have to leave for the afternoon, and have not had a

chance

to build
this, but it should work. I will check back in the later evening.

Any blatant problems Sean, feel free to fix them :)

Jack



I suspect that line 1490 should be:
   if (more_rx || (ifp->if_drv_flags & IFF_DRV_OACTIVE)) {



I have hacked up a RELENG_8 version which I think is correct including
the above change

http://www.tancsa.com/if_em-8.c



--- if_em.c.orig2011-02-01 21:47:14.0 -0500
+++ if_em.c 2011-02-01 21:47:19.0 -0500
@@ -30,7 +30,7 @@
   POSSIBILITY OF SUCH DAMAGE.




  
**/

-/*$FreeBSD: src/sys/dev/e1000/if_em.c,v 1.21.2.20 2011/01/22 01:37:53
jfv Exp $*/
+/*$FreeBSD$*/

  #ifdef HAVE_KERNEL_OPTION_HEADERS
  #include opt_device_polling.h
@@ -93,7 +93,7 @@


  /*

  *  Driver version:


  */

-char em_driver_version[] = 7.1.9;
+char em_driver_version[] = 7.1.9-test;



  /*

  *  PCI Device ID Table
@@ -927,11 +927,10 @@
if (!adapter-link_active)
return;

-/* Call cleanup if number of TX descriptors low */
-   if (txr-tx_avail= EM_TX_CLEANUP_THRESHOLD)
-   em_txeof(txr);
-
while (!IFQ_DRV_IS_EMPTY(ifp-if_snd)) {
+   /* First cleanup if TX descriptors low */
+   if (txr-tx_avail= EM_TX_CLEANUP_THRESHOLD)
+   em_txeof(txr);
if (txr-tx_avail  EM_MAX_SCATTER) {
ifp-if_drv_flags |= IFF_DRV_OACTIVE;
break;
@@ -1411,8 +1410,7 @@
if (!drbr_empty(ifp, txr-br))
em_mq_start_locked(ifp, txr, NULL);
  #else
-   if (!IFQ_DRV_IS_EMPTY(ifp-if_snd))
-   em_start_locked(ifp, txr);
+   em_start_locked(ifp, txr);
  #endif
EM_TX_UNLOCK(txr);

@@ -1475,11 +1473,10 @@
struct ifnet*ifp = adapter-ifp;
struct tx_ring  *txr = adapter-tx_rings;
struct rx_ring  *rxr = adapter-rx_rings;
-   boolmore;
-

if (ifp-if_drv_flags  IFF_DRV_RUNNING) {
-   more = em_rxeof(rxr, adapter-rx_process_limit, NULL);
+   boolmore_rx;
+   more_rx = em_rxeof(rxr, adapter-rx_process_limit,

NULL);

EM_TX_LOCK(txr);
em_txeof(txr);
@@ -1487,12 +1484,10 @@
if (!drbr_empty(ifp, txr-br))
em_mq_start_locked(ifp, txr, NULL);
  #else
-   if (!IFQ_DRV_IS_EMPTY(ifp-if_snd))
-   em_start_locked(ifp, txr);
+   em_start_locked(ifp, txr);
  #endif
-   em_txeof(txr);
EM_TX_UNLOCK(txr);
-   if (more) {
+   if (more_rx || (ifp-if_drv_flags  IFF_DRV_OACTIVE))

{

taskqueue_enqueue(adapter-tq,

adapter-que_task);

return;
}
@@ -1604,7 +1599,6 @@
if (!IFQ_DRV_IS_EMPTY(ifp-if_snd))
em_start_locked(ifp, txr);
  #endif
-   em_txeof(txr);
E1000_WRITE_REG(adapter-hw, E1000_IMS, txr-ims);
EM_TX_UNLOCK(txr);
  

Re: em driver, 82574L chip, and possibly ASPM

2011-07-06 Thread Hooman Fazaeli

Could you please share the patch for FreeBSD 7?

On 2/7/2011 3:23 AM, Mike Tancsa wrote:

So far so good.  I would often get a hang on the level zero dumps to my
backup server Sunday AM, and it made it through!  So a good sign, but
not a definitive sign.

I have a PCIe em card that has this chipset as well and was showing the
same sort of problem in a customer's RELENG_7 box.  I will see if I can
get the customer to try the card in their box with the patch for
RELENG_7 as it would show this issue at least once a day until I pulled
the card for an older version

---Mike


On 2/4/2011 1:12 PM, Jack Vogel wrote:

Was curious too, but being more patient than you :)

Jack


On Fri, Feb 4, 2011 at 10:09 AM, Sean Bruno sean...@yahoo-inc.com wrote:


Any more data on this problem or do we have to wait a while?

Sean


On Wed, 2011-02-02 at 10:28 -0800, Mike Tancsa wrote:

On 2/2/2011 12:37 PM, Jack Vogel wrote:

So has everyone that wanted to get something  testing been able to do

so?

I have been testing in the back and will deploy to my production box
this afternoon.  As I am not able to reproduce it easily, it will be a
bit before I can say the issue is gone.  Jan however, was able to
trigger it with greater ease ?

 ---Mike


Jack


On Tue, Feb 1, 2011 at 7:03 PM, Mike Tancsa m...@sentex.net wrote:


On 2/1/2011 5:03 PM, Sean Bruno wrote:

On Tue, 2011-02-01 at 13:43 -0800, Jack Vogel wrote:

To those who are going to test, here is the if_em.c, based on head,
with my
changes, I have to leave for the afternoon, and have not had a

chance

to build
this, but it should work. I will check back in the later evening.

Any blatant problems Sean, feel free to fix them :)

Jack



I suspect that line 1490 should be:
   if (more_rx || (ifp->if_drv_flags & IFF_DRV_OACTIVE)) {



I have hacked up a RELENG_8 version which I think is correct including
the above change

http://www.tancsa.com/if_em-8.c



--- if_em.c.orig2011-02-01 21:47:14.0 -0500
+++ if_em.c 2011-02-01 21:47:19.0 -0500
@@ -30,7 +30,7 @@
   POSSIBILITY OF SUCH DAMAGE.




  
**/

-/*$FreeBSD: src/sys/dev/e1000/if_em.c,v 1.21.2.20 2011/01/22 01:37:53
jfv Exp $*/
+/*$FreeBSD$*/

  #ifdef HAVE_KERNEL_OPTION_HEADERS
  #include opt_device_polling.h
@@ -93,7 +93,7 @@


  /*

  *  Driver version:


  */

-char em_driver_version[] = 7.1.9;
+char em_driver_version[] = 7.1.9-test;



  /*

  *  PCI Device ID Table
@@ -927,11 +927,10 @@
 	if (!adapter->link_active)
 		return;
 
-	/* Call cleanup if number of TX descriptors low */
-	if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
-		em_txeof(txr);
-
 	while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
+		/* First cleanup if TX descriptors low */
+		if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
+			em_txeof(txr);
 		if (txr->tx_avail < EM_MAX_SCATTER) {
 			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
 			break;
@@ -1411,8 +1410,7 @@
 	if (!drbr_empty(ifp, txr->br))
 		em_mq_start_locked(ifp, txr, NULL);
 #else
-	if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
-		em_start_locked(ifp, txr);
+	em_start_locked(ifp, txr);
 #endif
 	EM_TX_UNLOCK(txr);

@@ -1475,11 +1473,10 @@
 	struct ifnet	*ifp = adapter->ifp;
 	struct tx_ring	*txr = adapter->tx_rings;
 	struct rx_ring	*rxr = adapter->rx_rings;
-	bool		more;
-
 
 	if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
-		more = em_rxeof(rxr, adapter->rx_process_limit, NULL);
+		bool	more_rx;
+		more_rx = em_rxeof(rxr, adapter->rx_process_limit, NULL);
 
 		EM_TX_LOCK(txr);
 		em_txeof(txr);
@@ -1487,12 +1484,10 @@
 		if (!drbr_empty(ifp, txr->br))
 			em_mq_start_locked(ifp, txr, NULL);
 #else
-		if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
-			em_start_locked(ifp, txr);
+		em_start_locked(ifp, txr);
 #endif
-		em_txeof(txr);
 		EM_TX_UNLOCK(txr);
-		if (more) {
+		if (more_rx || (ifp->if_drv_flags & IFF_DRV_OACTIVE)) {
 			taskqueue_enqueue(adapter->tq, &adapter->que_task);
 			return;
 		}
@@ -1604,7 +1599,6 @@
 	if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
 		em_start_locked(ifp, txr);
 #endif
-	em_txeof(txr);
 	E1000_WRITE_REG(&adapter->hw, E1000_IMS, txr->ims);
 	EM_TX_UNLOCK(txr);
 }
@@ -3730,17 +3724,17 @@
 		txr->queue_status = EM_QUEUE_HUNG;

 /*
- * If we have enough 

Re: Introducing netmap: line-rate packet send/receive at 10Gbit/s

2011-06-06 Thread Hooman Fazaeli


Thanks for the work.

Is source for driver patches available?

On 6/3/2011 3:01 AM, Luigi Rizzo wrote:

Hi,
we have recently worked on a project, called netmap, which lets
FreeBSD send/receive packets at line rate even at 10 Gbit/s with
very low CPU overhead: one core at 1.33 GHz does 14.88 Mpps with a
modified ixgbe driver, which gives plenty of CPU cycles to handle
multiple interfaces and/or do useful work (packet forwarding, analysis, etc.)
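
For context on the 14.88 Mpps figure, the arithmetic can be sketched as below. This assumes the figure refers to minimum-size 64-byte Ethernet frames and the usual 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter, inter-frame gap); the message itself does not state the frame size. The function names are made up for illustration.

```c
/* Assumed per-frame overhead on the wire, in bytes:
 * 7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap. */
#define SKETCH_WIRE_OVERHEAD_BYTES 20

/* Packets per second on a link of link_bps for frames of frame_bytes. */
double sketch_line_rate_pps(double link_bps, unsigned frame_bytes)
{
    return link_bps / (8.0 * (frame_bytes + SKETCH_WIRE_OVERHEAD_BYTES));
}

/* CPU cycles available per packet when one core handles the whole rate. */
double sketch_cycles_per_packet(double cpu_hz, double pps)
{
    return cpu_hz / pps;
}
```

With link_bps = 10e9 and frame_bytes = 64 this gives roughly 14.88 million packets per second, and a 1.33 GHz core then has a budget of roughly 89 cycles per packet.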

You can find full documentation and source code and even a picobsd image at

 http://info.iet.unipi.it/~luigi/netmap/

The system uses memory mapped packet buffers to reduce the cost of
data movements, but this would not be enough to make it useful or
novel.  Netmap uses many other small but important tricks to make
the system fast, safe and easy to use, and support transmission,
reception, and communication with the host stack.

You can see full details in the documentation at the above link.

Feedback welcome.

cheers
luigi
-+---
   Prof. Luigi RIZZO, ri...@iet.unipi.it  . Dip. di Ing. dell'Informazione
   http://www.iet.unipi.it/~luigi/. Universita` di Pisa
   TEL  +39-050-2211611   . via Diotisalvi 2
   Mobile   +39-338-6809875   . 56122 PISA (Italy)
-+---




broadcom 57710 support

2009-07-19 Thread Hooman Fazaeli

Does anyone know if there are any near-term plans to develop drivers for
network cards based on the Broadcom NetXtreme II 57710 10 GbE controller?

---
best regards
Hooman Fazaeli



  