Re: NFS on 10G interface terribly slow

2015-06-29 Thread Rick Macklem
I wrote:
 Gerrit Kuhn wrote:
  On Fri, 26 Jun 2015 20:42:08 -0400 (EDT) Rick Macklem
  rmack...@uoguelph.ca wrote about Re: NFS on 10G interface terribly slow:
  
  RM Btw, can you tell us what Intel chip(s) you're using?
  
  I have
  
  ix0@pci0:5:0:0: class=0x02 card=0x00028086 chip=0x15288086 rev=0x01
  hdr=0x00 vendor = 'Intel Corporation'
  device = 'Ethernet Controller 10-Gigabit X540-AT2'
  class  = network
  subclass   = ethernet
  
 Yea, I don't know how to decode this either.
I took a look at the driver and, if I read it correctly, most chips (including
all the X540 ones) use IXGBE_82599_SCATTER.

As such, you will be doing lots of m_defrag() calls, but since disabling TSO
didn't help, that doesn't seem to be the bottleneck.

rick

 I was actually interested in
 what chip Scott was using and getting wire speed.
 As noted in the other reply, since disabling TSO didn't help, you probably
 aren't affected by this issue.
 
 rick
 
  RM For example, from the ix driver:
  RM #define IXGBE_82598_SCATTER 100
  RM #define IXGBE_82599_SCATTER 32
  
  Hm, I cannot find out into which chipset number this translates for my
  device...
  
  RM Btw, it appears that the driver in head/current now sets
  RM if_hw_tsomaxsegcount, but the driver in stable/10 does not. This means
  RM that the 82599 chip will end up doing the m_defrag() calls for 10.x.
  
  So the next step could even be updating to -current...
  OTOH, I get the same (bad) results, no matter if TSO is enabled or
  disabled on the interface.
  
  
  cu
Gerrit
  ___
  freebsd-net@freebsd.org mailing list
  http://lists.freebsd.org/mailman/listinfo/freebsd-net
  To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
  


Re: NFS on 10G interface terribly slow

2015-06-29 Thread Carsten Aulbert
Hi Rick

On 06/29/2015 02:20 PM, Rick Macklem wrote:
 If the Solaris server is using ZFS, setting sync=disabled might help w.r.t.
 write performance. It is, however, somewhat dangerous w.r.t. loss of recently
 written data when the server crashes. (Server has told client data is safely
 on stable storage so client will not re-write the block(s) although data wasn't
 on stable storage and is lost.)
 (I'm not a ZFS guy, so I can't suggest more w.r.t. ZFS.)
 

The system on the other side uses SAM/QFS, i.e. there is no such option
for the file system per se (only the file system metadata is in a zvol
thus not a full featured zfs).

In parallel we are working also with Oracle to see where there may be a
matching knob to turn as we see about the same performance issues from a
Linux host (NFS client, Debian Jessie) with a Mellanox Technologies
MT27500 Family [ConnectX-3] controller.

Cheers

Carsten

-- 
Dr. Carsten Aulbert, Atlas cluster administration
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Callinstraße 38, 30167 Hannover, Germany
Tel: +49 511 762 17185, Fax: +49 511 762 17193

Re: NFS on 10G interface terribly slow

2015-06-29 Thread Mike Tancsa
On 6/29/2015 8:20 AM, Rick Macklem wrote:
 If the Solaris server is using ZFS, setting sync=disabled might help w.r.t.

On my FreeBSD ZFS server, this is a must for decent and consistent write
throughput.  Using FreeBSD as an iSCSI target and a Linux initiator, I
can saturate a 1G NIC no problem with sync disabled. It's barely usable
with the default sync=standard, as it's so bursty.

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: NFS on 10G interface terribly slow

2015-06-29 Thread Olivier Cochard-Labbé
On Mon, Jun 29, 2015 at 9:19 AM, Gerrit Kühn gerrit.ku...@aei.mpg.de
wrote:

 On Fri, 26 Jun 2015 20:42:08 -0400 (EDT) Rick Macklem
 rmack...@uoguelph.ca wrote about Re: NFS on 10G interface terribly slow:

 RM Btw, can you tell us what Intel chip(s) you're using?

 I have

 ix0@pci0:5:0:0: class=0x02 card=0x00028086 chip=0x15288086 rev=0x01
 hdr=0x00 vendor = 'Intel Corporation'
 device = 'Ethernet Controller 10-Gigabit X540-AT2'
 class  = network
 subclass   = ethernet

 RM For example, from the ix driver:
 RM #define IXGBE_82598_SCATTER 100
 RM #define IXGBE_82599_SCATTER 32

 Hm, I cannot find out into which chipset number this translates for my
 device...


Take the first four hex digits of the chip value (the PCI device ID), then
grep the driver sources for them:

grep 1528 /usr/src/sys/dev/ixgbe/*
/usr/src/sys/dev/ixgbe/ixgbe_type.h:#define IXGBE_DEV_ID_X540T  0x1528

=> So your chipset is an X540.
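
The split can also be done arithmetically: in pciconf output the chip value
carries the PCI device ID in its upper 16 bits and the vendor ID in its lower
16 bits. A minimal Python sketch (the function name is made up for
illustration):

```python
def split_pciconf_chip(chip):
    """Split the 32-bit pciconf 'chip' value into (device_id, vendor_id)."""
    device_id = (chip >> 16) & 0xFFFF  # upper half: PCI device ID
    vendor_id = chip & 0xFFFF          # lower half: PCI vendor ID
    return device_id, vendor_id


device_id, vendor_id = split_pciconf_chip(0x15288086)
# 0x8086 is Intel; 0x1528 is the value to grep for in the driver sources.
print(f"vendor=0x{vendor_id:04x} device=0x{device_id:04x}")
```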

Re: NFS on 10G interface terribly slow

2015-06-29 Thread Rick Macklem
Gerrit Kuhn wrote:
 On Fri, 26 Jun 2015 20:42:08 -0400 (EDT) Rick Macklem
 rmack...@uoguelph.ca wrote about Re: NFS on 10G interface terribly slow:
 
 RM Btw, can you tell us what Intel chip(s) you're using?
 
 I have
 
 ix0@pci0:5:0:0: class=0x02 card=0x00028086 chip=0x15288086 rev=0x01
 hdr=0x00 vendor = 'Intel Corporation'
 device = 'Ethernet Controller 10-Gigabit X540-AT2'
 class  = network
 subclass   = ethernet
 
 RM For example, from the ix driver:
 RM #define IXGBE_82598_SCATTER   100
 RM #define IXGBE_82599_SCATTER   32
 
 Hm, I cannot find out into which chipset number this translates for my
 device...
 
 RM Btw, it appears that the driver in head/current now sets
 RM if_hw_tsomaxsegcount, but the driver in stable/10 does not. This means
 RM that the 82599 chip will end up doing the m_defrag() calls for 10.x.
 
 So the next step could even be updating to -current...
 OTOH, I get the same (bad) results, no matter if TSO is enabled or
 disabled on the interface.
 
Since disabling TSO had no effect, I don't think updating would matter.

If you can test against a different NFS server, that might indicate whether
or not the Solaris server is the bottleneck.

If the Solaris server is using ZFS, setting sync=disabled might help w.r.t.
write performance. It is, however, somewhat dangerous w.r.t. loss of recently
written data when the server crashes. (Server has told client data is safely
on stable storage so client will not re-write the block(s) although data wasn't
on stable storage and is lost.)
(I'm not a ZFS guy, so I can't suggest more w.r.t. ZFS.)

rick

 
 cu
   Gerrit
 


[Bug 200323] BPF userland misuse can crash the system

2015-06-29 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200323

Ermal Luçi e...@pfsense.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|Open                        |Closed

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: NFS on 10G interface terribly slow

2015-06-29 Thread Rick Macklem
Gerrit Kuhn wrote:
 On Fri, 26 Jun 2015 20:42:08 -0400 (EDT) Rick Macklem
 rmack...@uoguelph.ca wrote about Re: NFS on 10G interface terribly slow:
 
 RM Btw, can you tell us what Intel chip(s) you're using?
 
 I have
 
 ix0@pci0:5:0:0: class=0x02 card=0x00028086 chip=0x15288086 rev=0x01
 hdr=0x00 vendor = 'Intel Corporation'
 device = 'Ethernet Controller 10-Gigabit X540-AT2'
 class  = network
 subclass   = ethernet
 
Yea, I don't know how to decode this either. I was actually interested in
what chip Scott was using and getting wire speed.
As noted in the other reply, since disabling TSO didn't help, you probably
aren't affected by this issue.

rick

 RM For example, from the ix driver:
 RM #define IXGBE_82598_SCATTER   100
 RM #define IXGBE_82599_SCATTER   32
 
 Hm, I cannot find out into which chipset number this translates for my
 device...
 
 RM Btw, it appears that the driver in head/current now sets
 RM if_hw_tsomaxsegcount, but the driver in stable/10 does not. This means
 RM that the 82599 chip will end up doing the m_defrag() calls for 10.x.
 
 So the next step could even be updating to -current...
 OTOH, I get the same (bad) results, no matter if TSO is enabled or
 disabled on the interface.
 
 
 cu
   Gerrit


netmap custom RSS and custom packet info

2015-06-29 Thread Slawa Olhovchenkov
Working with netmap on modern hardware, I am missing some features:

a) Some spare space in front of the packet (64/128/192/256 bytes) for
application data. For example: the application does some pre-analysis of
the packet, fills in a structure in this space and routes the packet (via
a NETMAP pipe) to another thread. The receiving thread gets the packet
together with the information about it and can process it without
additional overhead.

b) Custom RSS. RSS on modern NICs interoperates poorly with packet
analysis: packets from the same flow but in different directions are
placed in different queues, PPPoE-encapsulated packets are all placed in
queue 0, various tunneling protocols are not recognised, and so on. Could
NETMAP use a custom RSS hash from a user-provided loadable kernel module?
A function from this module could analyse the packet, strip tunnel
headers, compute a custom, direction-independent RSS hash, fill in a
structure prepended to the buffer (see above) and pass this information
to the application.

Is this possible? Would this be useful to anyone besides me?



Re: netmap custom RSS and custom packet info

2015-06-29 Thread Luigi Rizzo
On Mon, Jun 29, 2015 at 5:17 PM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:

 Working with netmap and modern hardware I am lacking some features:

 a) some spare space before packet (64/128/192/256 bytes) for
 application data. For example: application do some pre-analysig
 packet, filled structure in this space and routed packet (via NETMAP
 pipe) to other thread. Received thread got packet and linked
 inforamtion about this packet for processing w/o additional overhead.


spare space in front of the packet is something we have
been considering for a different purpose, namely better
support for encapsulation/decapsulation and things like
vhost-net header.

Note though that the annotation is transferred for free
only in the case of pipes or ports sharing the same memory
region; VALE ports would have to explicitly copy the
extra bytes, which is (moderately) expensive.

A quick and dirty way to support what you want is the following:
- in the kernel code, modify NMB(), PNMB() and the offset between
  the netmap_ring and the first buffer to add the extra space
  you want in front of the packet. You can possibly make this
  offset a sysctl-controlled value

- in netmap_vale.c, make a small change to the code that copies
  buffers so that it includes also the space before the actual packet.

That should be all.
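
To make the layout concrete, here is a toy Python model of buffers with
headroom in front of the packet. It is only a sketch, not netmap code: the
buffer size, the headroom size, the annotation layout and all names are
assumptions made for illustration.

```python
import struct

BUF_SIZE = 2048   # assumed buffer size, for illustration only
HEADROOM = 64     # assumed spare bytes in front of each packet
ANNOTATION = struct.Struct("<IHH")  # hypothetical layout: flow hash, ring, flags


class BufferPool:
    """Toy stand-in for a shared buffer region; not the netmap API."""

    def __init__(self, num_bufs):
        self.mem = bytearray(num_bufs * BUF_SIZE)

    def packet_offset(self, idx):
        # Each buffer starts at idx * BUF_SIZE; the packet starts HEADROOM later.
        return idx * BUF_SIZE + HEADROOM

    def write_packet(self, idx, payload):
        if len(payload) > BUF_SIZE - HEADROOM:
            raise ValueError("payload does not fit behind the headroom")
        off = self.packet_offset(idx)
        self.mem[off:off + len(payload)] = payload

    def annotate(self, idx, flow_hash, ring, flags):
        # The annotation is stored at the start of the buffer's headroom.
        ANNOTATION.pack_into(self.mem, idx * BUF_SIZE, flow_hash, ring, flags)

    def read_annotation(self, idx):
        return ANNOTATION.unpack_from(self.mem, idx * BUF_SIZE)

    def read_packet(self, idx, length):
        off = self.packet_offset(idx)
        return bytes(self.mem[off:off + length])


pool = BufferPool(num_bufs=4)
pool.write_packet(2, b"\x45\x00\x00\x14")
pool.annotate(2, flow_hash=0xDEADBEEF, ring=1, flags=0)
```

A thread that receives buffer index 2 can read both the packet and its
annotation from the same buffer, which is why nothing extra is copied when
the ports share one memory region.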



 b) custom RSS. Modern NIC have RSS poorly interoperable with packet
 analysing: packets from same flow, but different direction placed in
 different queue, PPPoE encapsulated packets placed in queue 0,
 different tunneling don't recognised and etc. May be NETMAP can be
 used custom RSS hashing from loadable kernel module, provideng by
 user? Function frm this module can be packet analysing, tunnel
 removing, custom RSS hashnig with direction-independly maner, filled
 some structure prepended to buffer (see above) and pass this
 information to application.


RSS is completely orthogonal to netmap and I strongly
suggest keeping it this way, using either the NIC-specific
tools to control RSS or some generic mechanism
(on Linux there is ethtool, and we should implement something
similar on FreeBSD as well).

cheers
luigi


 This is possible? This is useful not only to me?





-- 
-+---
 Prof. Luigi RIZZO, ri...@iet.unipi.it  . Dip. di Ing. dell'Informazione
 http://www.iet.unipi.it/~luigi/. Universita` di Pisa
 TEL  +39-050-2217533   . via Diotisalvi 2
 Mobile   +39-338-6809875   . 56122 PISA (Italy)
-+---

Re: NFS on 10G interface terribly slow

2015-06-29 Thread Scott Larson
 82599 in our case. One problem I do have is the stack likes to blow up
on occasion with the right combo of high load and high throughput while TSO
is enabled, possibly relating to the 10.x driver issue you've pointed out.
But when it comes to the throughput they'll blast 10G with no problem.


Scott Larson
Lead Systems Administrator
Wiredrive, https://www.wiredrive.com/
T 310 823 8238 x1106  |  M 310 904 8818

On Mon, Jun 29, 2015 at 5:22 AM, Rick Macklem rmack...@uoguelph.ca wrote:

 Gerrit Kuhn wrote:
  On Fri, 26 Jun 2015 20:42:08 -0400 (EDT) Rick Macklem
  rmack...@uoguelph.ca wrote about Re: NFS on 10G interface terribly
 slow:
 
  RM Btw, can you tell us what Intel chip(s) you're using?
 
  I have
 
  ix0@pci0:5:0:0: class=0x02 card=0x00028086 chip=0x15288086 rev=0x01
  hdr=0x00 vendor = 'Intel Corporation'
  device = 'Ethernet Controller 10-Gigabit X540-AT2'
  class  = network
  subclass   = ethernet
 
 Yea, I don't know how to decode this either. I was actually interested in
 what chip Scott was using and getting wire speed.
 As noted in the other reply, since disabling TSO didn't help, you probably
 aren't affected by this issue.

 rick

  RM For example, from the ix driver:
  RM #define IXGBE_82598_SCATTER   100
  RM #define IXGBE_82599_SCATTER   32
 
  Hm, I cannot find out into which chipset number this translates for my
  device...
 
  RM Btw, it appears that the driver in head/current now sets
  RM if_hw_tsomaxsegcount, but the driver in stable/10 does not. This
 means
  RM that the 82599 chip will end up doing the m_defrag() calls for 10.x.
 
  So the next step could even be updating to -current...
  OTOH, I get the same (bad) results, no matter if TSO is enabled or
  disabled on the interface.
 
 
  cu
Gerrit


Re: netmap custom RSS and custom packet info

2015-06-29 Thread Slawa Olhovchenkov
On Mon, Jun 29, 2015 at 06:05:41PM +0200, Luigi Rizzo wrote:

 On Mon, Jun 29, 2015 at 5:17 PM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:
 
  Working with netmap and modern hardware I am lacking some features:
 
  a) some spare space before packet (64/128/192/256 bytes) for
  application data. For example: application do some pre-analysig
  packet, filled structure in this space and routed packet (via NETMAP
  pipe) to other thread. Received thread got packet and linked
  inforamtion about this packet for processing w/o additional overhead.
 
 
 spare space in front of the packet is something we have
 been considering for a different purpose, namely better
 support for encapsulation/decapsulation and things like
 vhost-net header.

Adding more space (controlled by sysctl or ioctl) might satisfy both needs:
4-8-20 bytes for encapsulation and the rest for the application.

 Note though that the annotation is transferred for free
 only in the case of pipes or ports sharing the same memory
 region; VALE ports would have to explicitly copy the
 extra bytes, which is (moderately) expensive.

I think these bytes do not need to be transferred through VALE.
They are only packet-processing information, like tags, whereas passing
a packet through VALE is like sending it over a wire.

 A quick and dirty way to support what you want is the following:
 - in the kernel code, modify NMB(), PNMB() and the offset between
   the netmap_ring and the first buffer to add the extra space
   you want in front of the packet. You can possibly make this
   offset a sysctl-controlled value
 
 - in netmap_vale.c, make a small change to the code that copies
   buffers so that it includes also the space before the actual packet.
 
 That should be all.

Do you plan to do this?
I would rather not maintain a permanent private branch or patch set.

  b) custom RSS. Modern NIC have RSS poorly interoperable with packet
  analysing: packets from same flow, but different direction placed in
  different queue, PPPoE encapsulated packets placed in queue 0,
  different tunneling don't recognised and etc. May be NETMAP can be
  used custom RSS hashing from loadable kernel module, provideng by
  user? Function frm this module can be packet analysing, tunnel
  removing, custom RSS hashnig with direction-independly maner, filled
  some structure prepended to buffer (see above) and pass this
  information to application.
 
 
 RSS is completely orthogonal to netmap and I strongly
 suggest keeping it this way, using either the NIC-specific
 tools to control RSS or some generic mechanism
 (on Linux there is ethtool, and we should implement something
 similar on FreeBSD as well).

This is not true RSS. It is only a trick for reassigning RX packets to
different netmap rings. All the RSS mechanisms available in hardware are
completely unacceptable for this:

- they don't support different encapsulations (PPPoE, GRE, GTP, etc.)
- they put packets 1.2.3.4-5.6.7.8 and 5.6.7.8-1.2.3.4 on different rings

Producing a universal hashing/distribution mechanism is too complex. But
might a user-provided kernel module (kept in sync with the application)
be acceptable?

This would act like an ephemeral but permanent NETMAP pipe between the real
hardware RX rings/driver and the application-visible rings.

Re: netmap custom RSS and custom packet info

2015-06-29 Thread Navdeep Parhar

On 06/29/2015 08:17, Slawa Olhovchenkov wrote:
...

b) custom RSS. Modern NIC have RSS poorly interoperable with packet
analysing: packets from same flow, but different direction placed in
different queue, ...


This is default behavior because the default hash (Toeplitz) is not 
symmetrical.  There are modern NICs that do support other, symmetrical 
hashes.


Regards,
Navdeep


Re: netmap custom RSS and custom packet info

2015-06-29 Thread Adrian Chadd
Hi,

Turns out there is a class of symmetric RSS Toeplitz keys. Use Google
to find the paper. :)
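
The symmetry is easy to check in software. Below is a small Python sketch of
the Toeplitz hash over an IPv4 4-tuple. Any key whose bits repeat every 16
bits hashes both directions of a flow to the same value, because swapping the
addresses or the ports moves every input bit by a multiple of 16 positions.
The repeated 0x6d5a pattern is just one such key, picked for illustration,
and the function names are my own.

```python
import socket
import struct


def toeplitz_hash(key, data):
    """Toeplitz hash: XOR the 32-bit key window at every set input bit."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        byte, bit = divmod(i, 8)
        if data[byte] & (0x80 >> bit):
            # 32-bit window of the key starting at bit i (MSB first).
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def ipv4_tuple(src, dst, sport, dport):
    """Hash input for TCP/UDP over IPv4: src addr, dst addr, src port, dst port."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!HH", sport, dport))


# 40-byte key made of a repeating 16-bit pattern (illustrative choice).
SYMMETRIC_KEY = bytes([0x6D, 0x5A]) * 20

fwd = toeplitz_hash(SYMMETRIC_KEY, ipv4_tuple("1.2.3.4", "5.6.7.8", 1234, 80))
rev = toeplitz_hash(SYMMETRIC_KEY, ipv4_tuple("5.6.7.8", "1.2.3.4", 80, 1234))
print(hex(fwd), fwd == rev)
```

The price of such a key is that the hash mixes less input entropy than a
random key would, so queue distribution may be less even.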


-a


On 29 June 2015 at 10:19, Slawa Olhovchenkov s...@zxy.spb.ru wrote:
 On Mon, Jun 29, 2015 at 10:07:15AM -0700, Navdeep Parhar wrote:

 On 06/29/2015 08:17, Slawa Olhovchenkov wrote:
 ...
  b) custom RSS. Modern NIC have RSS poorly interoperable with packet
  analysing: packets from same flow, but different direction placed in
  different queue, ...

 This is default behavior because the default hash (Toeplitz) is not
 symmetrical.  There are modern NICs that do support other, symmetrical
 hashes.

 Anyway this is still hardware-depended.
 I am don't see symmetrical hashes for 1G/10G Intel cards.


Re: netmap custom RSS and custom packet info

2015-06-29 Thread Slawa Olhovchenkov
On Mon, Jun 29, 2015 at 10:29:14AM -0700, Adrian Chadd wrote:

 Hi,
 
 Turns out there are a class of symmetric RSS Toeplitz keys. Use google
 to find the paper. :)

How does this interoperate with PPPoE encapsulation?
Or with GRE/GTP/MPLS encapsulation?

 On 29 June 2015 at 10:19, Slawa Olhovchenkov s...@zxy.spb.ru wrote:
  On Mon, Jun 29, 2015 at 10:07:15AM -0700, Navdeep Parhar wrote:
 
  On 06/29/2015 08:17, Slawa Olhovchenkov wrote:
  ...
   b) custom RSS. Modern NIC have RSS poorly interoperable with packet
   analysing: packets from same flow, but different direction placed in
   different queue, ...
 
  This is default behavior because the default hash (Toeplitz) is not
  symmetrical.  There are modern NICs that do support other, symmetrical
  hashes.
 
  Anyway this is still hardware-depended.
  I am don't see symmetrical hashes for 1G/10G Intel cards.


Re: netmap custom RSS and custom packet info

2015-06-29 Thread Slawa Olhovchenkov
On Mon, Jun 29, 2015 at 10:07:15AM -0700, Navdeep Parhar wrote:

 On 06/29/2015 08:17, Slawa Olhovchenkov wrote:
 ...
  b) custom RSS. Modern NIC have RSS poorly interoperable with packet
  analysing: packets from same flow, but different direction placed in
  different queue, ...
 
 This is default behavior because the default hash (Toeplitz) is not 
 symmetrical.  There are modern NICs that do support other, symmetrical 
 hashes.

Anyway, this is still hardware-dependent.
I don't see symmetric hashes on 1G/10G Intel cards.


Re: netmap custom RSS and custom packet info

2015-06-29 Thread Luigi Rizzo
On Mon, Jun 29, 2015 at 6:22 PM, Slawa Olhovchenkov s...@zxy.spb.ru wrote:

 On Mon, Jun 29, 2015 at 06:05:41PM +0200, Luigi Rizzo wrote:

  On Mon, Jun 29, 2015 at 5:17 PM, Slawa Olhovchenkov s...@zxy.spb.ru
 wrote:
 
   Working with netmap and modern hardware I am lacking some features:
  
   a) some spare space before packet (64/128/192/256 bytes) for
   application data. For example: application do some pre-analysig
   packet, filled structure in this space and routed packet (via NETMAP
   pipe) to other thread. Received thread got packet and linked
   inforamtion about this packet for processing w/o additional overhead.
  
 
  spare space in front of the packet is something we have
  been considering for a different purpose, namely better
  support for encapsulation/decapsulation and things like
  vhost-net header.

 Adding more space (sysctl or ioctl controled may be satisfy both:
 4-8-20 bytes for encapsulation and rest for application).

  Note though that the annotation is transferred for free
  only in the case of pipes or ports sharing the same memory
  region; VALE ports would have to explicitly copy the
  extra bytes, which is (moderately) expensive.

 I think these bytes do not need to be transferred through VALE.
 They are only packet-processing information, like tags, whereas passing
 a packet through VALE is like sending it over a wire.



  A quick and dirty way to support what you want is the following:
  - in the kernel code, modify NMB(), PNMB() and the offset between
the netmap_ring and the first buffer to add the extra space
you want in front of the packet. You can possibly make this
offset a sysctl-controlled value
 
  - in netmap_vale.c, make a small change to the code that copies
buffers so that it includes also the space before the actual packet.
 
  That should be all.

 Do you plan to do this?
 I am don't like have permanenty private branch/patchs.


possibly in the long term yes, but before doing it
i want to design it properly so that it does not
look like a custom hack.


   b) custom RSS. Modern NIC have RSS poorly interoperable with packet
   analysing: packets from same flow, but different direction placed in
   different queue, PPPoE encapsulated packets placed in queue 0,
   different tunneling don't recognised and etc. May be NETMAP can be
   used custom RSS hashing from loadable kernel module, provideng by
   user? Function frm this module can be packet analysing, tunnel
   removing, custom RSS hashnig with direction-independly maner, filled
   some structure prepended to buffer (see above) and pass this
   information to application.
  
 
  RSS is completely orthogonal to netmap and I strongly
  suggest keeping it this way, using either the NIC-specific
  tools to control RSS or some generic mechanism
  (on Linux there is ethtool, and we should implement something
  similar on FreeBSD as well).

 This is not true RSS. This is only trick for reassigning RX packets to
 different netmap rings. All hardware avalable RSS mechanism is fully
 inacceptable for this:

 - don't support different encapsulation (PPPoE, GRE, GTP and etc)
 - give different rings for packet 1.2.3.4-5.6.7.8 and  5.6.7.8-1.2.3.4

 Producing unversal hashing/distributing mechanism is too complex. But
 using user-providing kernel module (syncing to application) may be
 acceptable?


 This is like ephemeral permanent NETMAP pipe between real hardware
 RX rings/driver and application visible rings.



this particular function would also need to deal with
notifications between the physical NIC and the exported
netmap rings, and i would probably leave it to userspace.

You should be able to do what you have in mind using the
programmable forwarding function that already exists
for VALE ports (at the cost of a memory copy, which could
be avoided when/if we decide to support VALE ports that
share the same memory region, hence using zero copy).

Don't hold your breath though.

cheers
luigi



-- 
-+---
 Prof. Luigi RIZZO, ri...@iet.unipi.it  . Dip. di Ing. dell'Informazione
 http://www.iet.unipi.it/~luigi/. Universita` di Pisa
 TEL  +39-050-2217533   . via Diotisalvi 2
 Mobile   +39-338-6809875   . 56122 PISA (Italy)
-+---

Re: netmap custom RSS and custom packet info

2015-06-29 Thread Adrian Chadd
Hi,

PPPoE will not be hashed according to RSS on the 1G/10G (igb, ixgbe)
Intel hardware. You're going to have to figure out some other method
for traffic redistribution.

If it's inside GRE, then it's IPv4/IPv6 and thus yes, you can do
symmetric hashing. But if it's raw PPPoE coming in, you're SoL.



-a


Re: netmap custom RSS and custom packet info

2015-06-29 Thread Slawa Olhovchenkov
On Mon, Jun 29, 2015 at 10:41:39AM -0700, Adrian Chadd wrote:

 Hi,
 
 PPPoE will not be hashed according to RSS on the 1g/10g (igb, ixgbe)
 intel hardware. you're going to have to figure out some other method
 for traffic redistribution.

I propose an ephemeral but permanent NETMAP RX pipe with a redistribution
function supplied by a user-loadable kernel module.

Hardware RxRing(i) => loadable hash => user-visible RxRing(j).

As I see it, the overhead is packet parsing and hash computation plus
swapping of slot indexes (zero-copy between the hardware ring and the
user-visible ring).

In the other (Tx) direction, the user-visible ring is mapped directly to
the hardware ring.
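
A toy Python model of the idea, only to show the data flow. Nothing here is
the netmap API: the slot dictionaries, the ring count and the CRC32-based
hash are all assumptions for illustration, and the flow tuple is taken as
already parsed (the real module would parse the packet and strip tunnel
headers first).

```python
import ipaddress
import zlib
from collections import deque

NUM_USER_RINGS = 4  # assumed number of application-visible rings


def symmetric_flow_key(src, dst, sport, dport):
    """Order the two endpoints so both directions of a flow give the same key."""
    a = (int(ipaddress.ip_address(src)), sport)
    b = (int(ipaddress.ip_address(dst)), dport)
    lo, hi = sorted((a, b))
    return f"{lo[0]}:{lo[1]}-{hi[0]}:{hi[1]}".encode()


def pick_ring(src, dst, sport, dport, num_rings=NUM_USER_RINGS):
    """Hypothetical 'loadable hash': CRC32 over the ordered endpoints."""
    return zlib.crc32(symmetric_flow_key(src, dst, sport, dport)) % num_rings


def redistribute(hw_ring, user_rings, free_bufs):
    """Move each received slot to a user ring by swapping buffer indexes only."""
    for slot in hw_ring:
        target = pick_ring(*slot["flow"])
        # Zero-copy hand-off: the user ring takes the filled buffer index,
        # and the hardware slot gets a spare buffer index in exchange.
        user_rings[target].append({"buf_idx": slot["buf_idx"], "flow": slot["flow"]})
        slot["buf_idx"] = free_bufs.popleft()


hw_ring = [
    {"buf_idx": 10, "flow": ("1.2.3.4", "5.6.7.8", 1234, 80)},
    {"buf_idx": 11, "flow": ("5.6.7.8", "1.2.3.4", 80, 1234)},
]
user_rings = [deque() for _ in range(NUM_USER_RINGS)]
redistribute(hw_ring, user_rings, free_bufs=deque([20, 21]))
```

Both directions of the flow end up on the same user-visible ring, and the
hardware ring keeps running with the spare buffers 20 and 21.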

 If it's inside GRE, then it's IPv4/IPv6 and thus yes, you can do
 symmetric hashing. But if it's raw pppoe coming in, you're SoL.

So only the tunnel IPs will be hashed, yes?
Not the sessions inside the tunnel.


Re: FreeBSD 10.1-REL - network unaccessible after high traffic

2015-06-29 Thread Adrian Chadd
hi,

I asked for the output of vmstat -z and vmstat -m in a loop. :)
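
Something along these lines would do. This is only a sketch: the log path,
the function names and the one-second interval are my assumptions; the two
vmstat invocations are the ones asked for above.

```python
import os
import subprocess
import time


def snapshot(commands):
    """Run each command once and return its output under a timestamp header."""
    chunks = []
    for cmd in commands:
        out = subprocess.run(cmd, capture_output=True, text=True).stdout
        chunks.append(f"=== {time.strftime('%H:%M:%S')} {' '.join(cmd)} ===\n{out}")
    return "".join(chunks)


def log_forever(path, commands, interval=1.0):
    """Append a snapshot to `path` every `interval` seconds until interrupted."""
    with open(path, "a") as log:
        while True:
            log.write(snapshot(commands))
            log.flush()
            os.fsync(log.fileno())  # push the sample to disk before sleeping
            time.sleep(interval)


VMSTAT_COMMANDS = [["vmstat", "-z"], ["vmstat", "-m"]]
```

Calling log_forever("/var/tmp/vmstat.log", VMSTAT_COMMANDS) starts the loop;
the last entries in the file are then the data from just before the machine
fell over.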


-a


On 29 June 2015 at 02:02, Csaba Banhalmi bim...@field.hu wrote:
 Hi All,

 vmstat 5 output when system freezes:
  procs    memory      page                      disks       faults          cpu
  r b w    avm   fre   flt  re  pi  po    fr    sr  ad0  ad1    in     sy    cs  us  sy  id
  0 0 0  8752M  126M  5663   0   0   0  4042   445   66    0  1219   7148  4870   3   2  95
  0 0 0  8650M  145M  2167   0   0   0  3501   447   79    0   974   4042  3578   1   1  98
  0 0 0  8374M  201M  3113   0   0   0  6790   441    5    0  1130   6670  3729   3   1  96
  0 0 0  8252M  220M  2632   0   0   0  4014   435    4    0   726  11653  2401   2   1  97
  0 0 0  8188M  224M  1625   0   0   0  2189   434    5    0   713   6714  2376   1   1  98
  0 0 0  7992M  233M  1504   0   0   0  2254   433    2    0   867   2890  2868   1   1  98
  4 0 0  8032M  216M  2145   0   0   0  1995   435   18    0   526   3769  2048   1   1  98
  0 0 0  8180M  195M  1949   0   0   0  1741   435   50    0   593   3441  2363   1   1  98
  0 0 0  8186M  178M  2859   0   0   0  2525   436    6    0   499   3313  1733   2   1  97
  1 0 0  8410M  146M  2521   0   0   0  1764   440   11    0   736  67271  2121   4   2  94
  0 0 0  8182M  205M  2910   0   0   0  6378   927    8    0   495  16043  1775   1   1  98
  1 1 0  7944M  210M  3009   0   0   0  3696   438    8    0   522   4247  1963   2   1  97
  0 0 0  8091M  169M  7529   0   0   0  3601   436  105    0  1359  75290  4400   9   3  88
  0 0 0  8121M  141M  4607   0   0   0  3288   444   62    0   949  12169  3268   5   1  94
  0 0 0  8044M  201M  1782   0   0   0  4954  1795    9    0   446   3025  1927   1   1  99
  0 0 0  7916M  222M  1296   0   0   0  2671   438    5    0   525   2984  1920   1   1  98
  1 0 0  7870M  230M   888   0   0   0  1677   432    8    0   473   6424  2126   1   1  99
  0 0 0  7968M  228M  3375   0   0   0  2625   433   51    0   768   4100  2852   3   1  96
  0 0 0  8238M  194M  7586   0   0   0  4758   436   88    0  1026   9631  3908   4   2  94
  0 0 0  8293M  185M  3253   0   0   0  2362   437   52    0   747   4475  3105   2   1  97

 I increased vm.v_free_min, but it did not help. This was a different kind
 of freeze: the system was unreachable even through IPMI and needed a hard
 reset.

 Regards,
 Csaba



 On 2015-06-12 at 20:17, Adrian Chadd wrote:

 On 12 June 2015 at 10:57, Christopher Forgeron csforge...@gmail.com
 wrote:

 I agree it shouldn't run out of memory. Here's what mine does under
 network
 load, or rsync load:

 2 0 9   1822M  1834M   0   0   0   0   14   8   0   0 22750  724 136119  0 23 77
 0 0 9   1822M  1823M   0   0   0   0    0   8   0   0 44317  347 138151  0 16 84
 0 0 9   1822M  1761M   0   0   0   0   17   8   0   0 23818  820  92198  0 12 88
 0 0 9   1822M  1727M   0   0   0   0   14   8   0   0 40768  634 126688  0 17 83
 0 0 9   1822M  8192B   0   8   0   0   15   3   3   0  9236  305  57149  0 33 67


 That's with a 5 second vmstat output. After the 8KiB, the system is
 nearly
 completely brain-dead and needs a hard power-off.


 I've seen it go from 6 GiB free to 8KiB in 5 sec as well. Currently my
 large
 machines are set to 12 GiB free to keep them from crashing, from what I
 presume is just network load due to lots of iSCSI / NFS traffic on my
 10GbE network.


 I haven't had time to type this up for the list yet, but I'm putting it
 here
 just to make sure people know it's real.

 Hi,

 Then something is leaking or  holding onto memory when it shouldn't be.

 Try doing vmstat -z and vmstat -m in a one second loop, post the data
 just before it falls over.


 -adrian



Re: Same NIC name to MAC mapping on FreeBSD

2015-06-29 Thread Paul S.
On my production systems, I've never seen it deviate without hardware 
changes.


Are you seeing otherwise?

On 6/29/2015 04:23 PM, Wei Hu wrote:

Hi,

On a FreeBSD system with multiple NICs, i.e., multiple MAC addresses, is there a
way to keep the same network interface name to MAC address mapping across
reboots? It seems that on Linux a udev rule can help achieve this. Is there
anything similar on FreeBSD?

Thanks,
Wei

Re: netmap custom RSS and custom packet info

2015-06-29 Thread Slawa Olhovchenkov
On Mon, Jun 29, 2015 at 10:29:14AM -0700, Adrian Chadd wrote:

 Hi,
 
 Turns out there is a class of symmetric RSS Toeplitz keys. Use google
 to find the paper. :)

Is anyone working on using different RSS keys and hash fields (selecting
L2/L3/L4 or just L3 hash, for example) in FreeBSD?


[Bug 159294] [em] em watchdog timeouts

2015-06-29 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=159294

--- Comment #3 from Kurt Jaeger p...@freebsd.org ---
I still have a box with those interfaces, but I have avoided using em4 and em5
on this box.

It now says:

dev.em.4.%desc: Intel(R) PRO/1000 Legacy Network Connection 1.0.6

Hmm, if I have to reproduce the problem, this will take some time.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 159294] [em] em watchdog timeouts

2015-06-29 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=159294

--- Comment #4 from Sean Bruno sbr...@freebsd.org ---
(In reply to Kurt Jaeger from comment #3)
I'll try to do the same here with my ATOM test box.



[Bug 175734] no ethernet detected on system with EG20T PCH chipset ATOM E6xx series

2015-06-29 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=175734

Sean Bruno sbr...@freebsd.org changed:

   What|Removed |Added

   Keywords|IntelNetworking |



[Bug 159294] [em] em watchdog timeouts

2015-06-29 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=159294

--- Comment #2 from Sean Bruno sbr...@freebsd.org ---
(In reply to pi from comment #0)
There have been a lot of changes to em(4) handling in FreeBSD.

I have not, however, touched lem(4) which is what is controlling the 82541EI. 
Does this still happen for you?  I can take a look at the watchdog handling and
see if it can be fixed with the same code that em(4) has used.



[Bug 193620] Problem with igb multiqueue together with pf

2015-06-29 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193620

Sean Bruno sbr...@freebsd.org changed:

   What|Removed |Added

 Status|New |In Progress

--- Comment #1 from Sean Bruno sbr...@freebsd.org ---
Setting hw.igb.enable_msix=0 disables MSI-X, which *does* disable multiqueue.
You can, however, set hw.igb.num_queues=1 to keep MSI-X and use only one queue.

Can you test this with 10.1-RELEASE, and with the 10.2 beta when available?



Re: NFS on 10G interface terribly slow

2015-06-29 Thread Rick Macklem
Scott Larson wrote:
 82599 in our case. One problem I do have is the stack likes to blow up
 on occasion with the right combo of high load and high throughput while TSO
 is enabled, possibly relating to the 10.x driver issue you've pointed out.
 But when it comes to the throughput they'll blast 10G with no problem.
 
Thanks for the info. So long as your mbuf cluster pool is large enough, I
think the m_defrag() calls will just result in increased CPU overheads and
probably don't introduce much delay.

I have no idea why the stack would blow up sometimes.
If you can catch the backtrace for one of these and post it, it might become
obvious. (Or you could just try increasing KSTACK_PAGES in
sys/amd64/include/param.h and see if the stack still blows up. Alternatively,
I think you can set KSTACK_PAGES in your kernel config file.)

rick

 
 Scott Larson
 Lead Systems Administrator, Wiredrive
 https://www.wiredrive.com/
 
 On Mon, Jun 29, 2015 at 5:22 AM, Rick Macklem rmack...@uoguelph.ca wrote:
 
  Gerrit Kuhn wrote:
   On Fri, 26 Jun 2015 20:42:08 -0400 (EDT) Rick Macklem
   rmack...@uoguelph.ca wrote about Re: NFS on 10G interface terribly
  slow:
  
   RM Btw, can you tell us what Intel chip(s) you're using?
  
   I have
  
   ix0@pci0:5:0:0: class=0x02 card=0x00028086 chip=0x15288086 rev=0x01
   hdr=0x00 vendor = 'Intel Corporation'
   device = 'Ethernet Controller 10-Gigabit X540-AT2'
   class  = network
   subclass   = ethernet
  
  Yea, I don't know how to decode this either. I was actually interested in
  what chip Scott was using and getting wire speed.
  As noted in the other reply, since disabling TSO didn't help, you probably
  aren't affected by this issue.
 
  rick
 
   RM For example, from the ix driver:
   RM #define IXGBE_82598_SCATTER   100
   RM #define IXGBE_82599_SCATTER   32
  
   Hm, I cannot find out into which chipset number this translates for my
   device...
  
   RM Btw, it appears that the driver in head/current now sets
   RM if_hw_tsomaxsegcount, but the driver in stable/10 does not. This
  means
   RM that the 82599 chip will end up doing the m_defrag() calls for 10.x.
  
   So the next step could even be updating to -current...
   OTOH, I get the same (bad) results, no matter if TSO is enabled or
   disabled on the interface.
  
  
   cu
 Gerrit


Re: NFS on 10G interface terribly slow

2015-06-29 Thread Gerrit Kühn
On Fri, 26 Jun 2015 19:58:42 -0400 (EDT) Rick Macklem
rmack...@uoguelph.ca wrote about Re: NFS on 10G interface terribly slow:


RM The default (auto-tuned) value is reported by nfsstat -m.
RM It can be set with a mount option (should be something in man
RM mount_nfs). If you are doing a test with 1 megabyte writes, I'd set
RM it to at least 1 megabyte. (Basically, writing will be slower for
RM write(2) syscalls that are larger than wcommitsize. After mav@'s patch,
RM the difference isn't nearly as noticeable. His other commit makes the
RM auto-tuned value more reasonable.)
RM 
RM If you set it large enough with the wcommitsize=N mount option, you
RM don't need the updates in stable/10.

Ok, I set it way over 1MB now:

hellpool:/samqfs/K1/Gerrit on /net/hellpool
nfsv3,tcp,resvport,hard,cto,lockd,rdirplus,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=8192,readahead=1,wcommitsize=2048576,timeout=120,retrans=2

However, this still gives me the same bad write performance:

root@crest: # dd if=/dev/zero of=/net/hellpool/Z bs=1024k count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 22.939049 secs (45711398 bytes/sec)


So I guess I can postpone the update for now, and look for some other
reason for this instead.
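
For scale, the dd figure above can be converted into more familiar units. This is only a minimal Python sketch: the byte count and elapsed time are copied from the dd output, while the 10 Gbit/s line rate is a nominal figure (an assumption that ignores all protocol overhead).

```python
# Figures copied from the dd run quoted above.
BYTES_WRITTEN = 1_048_576_000
ELAPSED_SECONDS = 22.939049

# Assumption: nominal 10GbE line rate, ignoring Ethernet/IP/TCP/NFS overhead.
LINK_BITS_PER_SECOND = 10e9


def throughput(nbytes, seconds):
    """Return the transfer rate in a few common units."""
    bytes_per_sec = nbytes / seconds
    return {
        "bytes_per_sec": bytes_per_sec,
        "mib_per_sec": bytes_per_sec / 2**20,
        "mbit_per_sec": bytes_per_sec * 8 / 1e6,
    }


rate = throughput(BYTES_WRITTEN, ELAPSED_SECONDS)
link_share = rate["mbit_per_sec"] * 1e6 / LINK_BITS_PER_SECOND

print(f"{rate['mib_per_sec']:.1f} MiB/s, {rate['mbit_per_sec']:.0f} Mbit/s, "
      f"{link_share:.1%} of the nominal link rate")
```

Under those assumptions the write test uses only a few percent of the link, which is consistent with looking for a cause other than wcommitsize.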


cu
  Gerrit


Same NIC name to MAC mapping on FreeBSD

2015-06-29 Thread Wei Hu
Hi,

On a FreeBSD system with multiple NICs, i.e., multiple MAC addresses, is there a
way to keep the same network interface name to MAC address mapping across
reboots? It seems that on Linux a udev rule can help achieve this. Is there
anything similar on FreeBSD?

Thanks,
Wei


Re: FreeBSD 10.1-REL - network unaccessible after high traffic

2015-06-29 Thread Csaba Banhalmi

Hi All,

vmstat 5 output when system freezes:
 procs     memory     page                          disks    faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr   sr ad0 ad1   in    sy   cs us sy id
 0 0 0   8752M   126M  5663   0   0   0  4042  445  66   0 1219  7148 4870  3  2 95
 0 0 0   8650M   145M  2167   0   0   0  3501  447  79   0  974  4042 3578  1  1 98
 0 0 0   8374M   201M  3113   0   0   0  6790  441   5   0 1130  6670 3729  3  1 96
 0 0 0   8252M   220M  2632   0   0   0  4014  435   4   0  726 11653 2401  2  1 97
 0 0 0   8188M   224M  1625   0   0   0  2189  434   5   0  713  6714 2376  1  1 98
 0 0 0   7992M   233M  1504   0   0   0  2254  433   2   0  867  2890 2868  1  1 98
 4 0 0   8032M   216M  2145   0   0   0  1995  435  18   0  526  3769 2048  1  1 98
 0 0 0   8180M   195M  1949   0   0   0  1741  435  50   0  593  3441 2363  1  1 98
 0 0 0   8186M   178M  2859   0   0   0  2525  436   6   0  499  3313 1733  2  1 97
 1 0 0   8410M   146M  2521   0   0   0  1764  440  11   0  736 67271 2121  4  2 94
 0 0 0   8182M   205M  2910   0   0   0  6378  927   8   0  495 16043 1775  1  1 98
 1 1 0   7944M   210M  3009   0   0   0  3696  438   8   0  522  4247 1963  2  1 97
 0 0 0   8091M   169M  7529   0   0   0  3601  436 105   0 1359 75290 4400  9  3 88
 0 0 0   8121M   141M  4607   0   0   0  3288  444  62   0  949 12169 3268  5  1 94
 0 0 0   8044M   201M  1782   0   0   0  4954 1795   9   0  446  3025 1927  1  1 99
 0 0 0   7916M   222M  1296   0   0   0  2671  438   5   0  525  2984 1920  1  1 98
 1 0 0   7870M   230M   888   0   0   0  1677  432   8   0  473  6424 2126  1  1 99
 0 0 0   7968M   228M  3375   0   0   0  2625  433  51   0  768  4100 2852  3  1 96
 0 0 0   8238M   194M  7586   0   0   0  4758  436  88   0 1026  9631 3908  4  2 94
 0 0 0   8293M   185M  3253   0   0   0  2362  437  52   0  747  4475 3105  2  1 97


I increased vm.v_free_min, but it did not help. This time the freeze was
different: the system was unreachable even through IPMI and needed a hard
reset.


Regards,
Csaba


On 2015.06.12 at 20:17, Adrian Chadd wrote:

On 12 June 2015 at 10:57, Christopher Forgeron csforge...@gmail.com wrote:

I agree it shouldn't run out of memory. Here's what mine does under network
load, or rsync load:

2 0 9   1822M  1834M   0   0   0   0   14   8   0   0 22750  724 136119  0 23 77
0 0 9   1822M  1823M   0   0   0   0    0   8   0   0 44317  347 138151  0 16 84
0 0 9   1822M  1761M   0   0   0   0   17   8   0   0 23818  820  92198  0 12 88
0 0 9   1822M  1727M   0   0   0   0   14   8   0   0 40768  634 126688  0 17 83
0 0 9   1822M  8192B   0   8   0   0   15   3   3   0  9236  305  57149  0 33 67


That's with a 5 second vmstat output. After the 8KiB, the system is nearly
completely brain-dead and needs a hard power-off.
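
The collapse described here can be spotted mechanically from the "fre" column. The following is a rough Python sketch, not a monitoring tool: the sample values are the five "fre" fields from the output above, while the unit suffix table and the 50% drop threshold are assumptions made for illustration.

```python
# Assumption: vmstat prints sizes with a single-letter binary suffix.
UNITS = {"B": 1, "K": 2**10, "M": 2**20, "G": 2**30}


def to_bytes(field):
    """Convert a vmstat size field such as '1834M' or '8192B' to bytes."""
    return int(field[:-1]) * UNITS[field[-1]]


def collapses(fre_samples, ratio=0.5):
    """Return (index, before, after) for each sample where free memory
    fell below `ratio` times the previous sample."""
    values = [to_bytes(sample) for sample in fre_samples]
    pairs = zip(values, values[1:])
    return [(i, before, after)
            for i, (before, after) in enumerate(pairs, start=1)
            if after < before * ratio]


# The "fre" column of the five samples quoted above (5 second interval).
fre = ["1834M", "1823M", "1761M", "1727M", "8192B"]

for index, before, after in collapses(fre):
    print(f"sample {index}: free memory fell from {before} to {after} bytes")
```

Only the last interval is flagged; the earlier samples shrink by a few percent each, which is why the failure gives so little warning at a 5 second sampling rate.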


I've seen it go from 6 GiB free to 8KiB in 5 sec as well. Currently my large
machines are set to 12 GiB free to keep them from crashing, from what I
presume is just network load due to lots of iSCSI / NFS traffic on my 10GbE
network.


I haven't had time to type this up for the list yet, but I'm putting it here
just to make sure people know it's real.


Hi,

Then something is leaking or  holding onto memory when it shouldn't be.

Try doing vmstat -z and vmstat -m in a one second loop, post the data
just before it falls over.
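
One possible way to run that loop is sketched below in Python. The two commands, vmstat -z and vmstat -m, are the ones named above; the function names and the log path in the usage note are hypothetical. Because the machine ends up needing a hard reset, the sketch flushes and fsyncs after every sample so the last snapshot has a chance of reaching the disk.

```python
import os
import subprocess
import time

# The two commands suggested above.
COMMANDS = (["vmstat", "-z"], ["vmstat", "-m"])


def run_command(argv):
    """Run one command and return whatever it printed on stdout."""
    return subprocess.run(argv, capture_output=True, text=True, check=False).stdout


def snapshot(runner=run_command, now=time.time):
    """Collect one timestamped sample of every command in COMMANDS."""
    parts = [f"=== {now():.0f} ==="]
    for argv in COMMANDS:
        parts.append("$ " + " ".join(argv))
        parts.append(runner(argv))
    return "\n".join(parts) + "\n"


def log_forever(path, interval=1.0):
    """Append a snapshot to `path` every `interval` seconds until killed."""
    with open(path, "a") as log:
        while True:
            log.write(snapshot())
            log.flush()
            os.fsync(log.fileno())  # push the sample to disk before the box dies
            time.sleep(interval)
```

Calling log_forever("/var/tmp/vmstat-trace.log") (a hypothetical path) before starting the load leaves the final samples at the end of the file after the reset.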


-adrian



Re: NFS on 10G interface terribly slow

2015-06-29 Thread Gerrit Kühn
On Fri, 26 Jun 2015 20:42:08 -0400 (EDT) Rick Macklem
rmack...@uoguelph.ca wrote about Re: NFS on 10G interface terribly slow:

RM Btw, can you tell us what Intel chip(s) you're using?

I have

ix0@pci0:5:0:0: class=0x02 card=0x00028086 chip=0x15288086 rev=0x01
hdr=0x00 vendor = 'Intel Corporation'
device = 'Ethernet Controller 10-Gigabit X540-AT2'
class  = network
subclass   = ethernet

RM For example, from the ix driver:
RM #define IXGBE_82598_SCATTER 100
RM #define IXGBE_82599_SCATTER 32

Hm, I cannot find out into which chipset number this translates for my
device...

RM Btw, it appears that the driver in head/current now sets
RM if_hw_tsomaxsegcount, but the driver in stable/10 does not. This means
RM that the 82599 chip will end up doing the m_defrag() calls for 10.x.

So the next step could even be updating to -current...
OTOH, I get the same (bad) results, no matter if TSO is enabled or
disabled on the interface.


cu
  Gerrit


Re: patm device on FreeBSD 9.2

2015-06-29 Thread Nomad Esst via freebsd-net


Yes, FreeBSD detects my card. When I ping the FreeBSD side from the Linux side,
netstat -s -p ip shows the arriving packets counted as having an incorrect
version number. When both sides run FreeBSD, packets are not received by the
other side at all; netstat does not even show them as received!
 

Re: netmap custom RSS and custom packet info

2015-06-29 Thread Adrian Chadd
On 29 June 2015 at 12:11, Slawa Olhovchenkov s...@zxy.spb.ru wrote:
 On Mon, Jun 29, 2015 at 10:29:14AM -0700, Adrian Chadd wrote:

 Hi,

 Turns out there is a class of symmetric RSS Toeplitz keys. Use google
 to find the paper. :)

 Is anyone working on using different RSS keys and hash fields (selecting
 L2/L3/L4 or just L3 hash, for example) in FreeBSD?

The -HEAD RSS stuff has a global set of (compiled in) config options
that say whether L2/L3 RSS hashing is enabled. It's more complicated
than that, as the alignment of RSS NIC config, expected RSS hash info
and the tcp/udp pcb table hashing has to align, or weird stuff
happens.
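
To make the "symmetric key" idea concrete, here is a small Python sketch of a Toeplitz hash over an IPv4/TCP 4-tuple. It is an illustration only and is not taken from the FreeBSD RSS code. The key used is an assumption: a 40-byte key built by repeating the 16-bit pattern 0x6d5a. Any key that repeats every 16 bits gives the same result for both directions of a flow, because swapping the two addresses moves bits by 32 positions and swapping the two ports moves them by 16.

```python
import socket
import struct

# Assumption: 40-byte key made of one 16-bit pattern repeated 20 times.
SYMMETRIC_KEY = bytes([0x6D, 0x5A]) * 20


def toeplitz_hash(key, data):
    """32-bit Toeplitz hash: for every set input bit (MSB first), XOR in
    the 32-bit window of the key that starts at that bit position."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def ipv4_tcp_tuple(src, dst, sport, dport):
    """Hash input: source address, destination address, then both ports."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!HH", sport, dport))


# Hypothetical flow, hashed in both directions.
forward = toeplitz_hash(SYMMETRIC_KEY, ipv4_tcp_tuple("10.0.0.1", "10.0.0.2", 12345, 2049))
reverse = toeplitz_hash(SYMMETRIC_KEY, ipv4_tcp_tuple("10.0.0.2", "10.0.0.1", 2049, 12345))
print(forward == reverse)
```

Both directions of the flow hash to the same value, so they would be steered to the same queue. The price is that the key carries far less entropy than a random one, which is one reason the key, the selected hash fields, and the pcb table hashing all have to be chosen together.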

It still needs a bunch more work. Unfortunately it's not on my "would
be useful for work" list right at the moment, so other things are taking
priority.



-adrian


RE: Same NIC name to MAC mapping on FreeBSD

2015-06-29 Thread Wei Hu

 -Original Message-
 From: owner-freebsd-...@freebsd.org [mailto:owner-freebsd-
 n...@freebsd.org] On Behalf Of Paul S.
 Sent: Monday, June 29, 2015 7:53 PM
 To: freebsd-net@freebsd.org
 Subject: Re: Same NIC name to MAC mapping on FreeBSD
 
 On my production systems, I've never seen it deviate without hardware
 changes.
 
 Are you seeing otherwise?
 

In Hyper-V, if, say, three NICs are assigned to the VM, I initially get the
following mapping:

hn0 - MAC 0
hn1 - MAC 1
hn2 - MAC 2

Then, if I remove the NIC with MAC 1 and reboot, I want the other two
interfaces to keep the same names instead of hn1 being reassigned to MAC 2.
Retaining such mappings is a requirement from a virtual appliance vendor. I am
wondering if there is any way to do this without asking the customer or
manually editing any config files.
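
One way to approach this, sketched here in Python as an illustration rather than an existing FreeBSD facility, is to save the MAC-to-name table once and compute the renames needed at each boot. The MAC addresses below are made up, the function names are hypothetical, and the sketch assumes the renames would be applied with "ifconfig <old> name <new>" early in boot. It goes through temporary names so that two interfaces swapping names do not collide.

```python
def rename_plan(current, wanted):
    """current: {ifname: mac} as seen after boot.
    wanted:  {mac: ifname} saved when the appliance was first set up.
    Returns a list of (old_name, new_name) renames in a safe order."""
    moves = [(name, wanted[mac])
             for name, mac in sorted(current.items())
             if mac in wanted and wanted[mac] != name]
    # Two stages via temporary names, so hn1 -> hn2 and hn2 -> hn1 cannot clash.
    stage1 = [(old, f"tmp{i}") for i, (old, _) in enumerate(moves)]
    stage2 = [(f"tmp{i}", new) for i, (_, new) in enumerate(moves)]
    return stage1 + stage2


def as_commands(plan):
    """Render the plan as ifconfig rename commands."""
    return [f"ifconfig {old} name {new}" for old, new in plan]


# Hypothetical MAC addresses standing in for MAC 0, MAC 1 and MAC 2.
wanted = {
    "00:15:5d:00:00:00": "hn0",
    "00:15:5d:00:00:01": "hn1",
    "00:15:5d:00:00:02": "hn2",
}
# After the NIC with MAC 1 is removed, the driver hands out hn0 and hn1.
current = {"hn0": "00:15:5d:00:00:00", "hn1": "00:15:5d:00:00:02"}

for command in as_commands(rename_plan(current, wanted)):
    print(command)
```

In this example only the interface holding MAC 2 is touched, and it ends up as hn2 again. The sketch does not handle the case where a wanted name is already held by an interface whose MAC is not in the saved table.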

Thanks,
Wei