[Bug 197997] [panic] ng_pppoe sometimes panics with trap 12 when server drops session

2018-07-28 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197997

Eugene Grosbein  changed:

 What       |Removed |Added
 -----------+--------+------------------
 Resolution |---     |Feedback Timeout
 CC         |        |eu...@freebsd.org
 Status     |New     |Closed

--- Comment #6 from Eugene Grosbein  ---
Feedback timeout over 3 years.



[Bug 200860] Failed do-not-fragment ping when using PPPoE over FTTX connection

2018-07-28 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200860

Eugene Grosbein  changed:

 What       |Removed |Added
 -----------+--------+------------------
 Status     |New     |Closed
 Resolution |---     |FIXED
 CC         |        |eu...@freebsd.org

--- Comment #1 from Eugene Grosbein  ---
Just use the net/mpd5 port/package, which has the requested RFC 4638 PPPoE
client support.
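
For illustration, a minimal mpd5.conf sketch of a PPPoE client negotiating
an MTU above 1492 via RFC 4638. The interface name and credentials are
placeholders, and the exact option names (in particular max-payload) should
be checked against the mpd5 documentation:

  default:
          load pppoe_client

  pppoe_client:
          create bundle static B1
          set iface route default
          set ipcp ranges 0.0.0.0/0 0.0.0.0/0

          create link static L1 pppoe
          set link action bundle B1
          set auth authname "user"        # placeholder credentials
          set auth password "secret"
          set link mtu 1500               # >1492 needs RFC 4638 support
          set link mru 1500
          set pppoe iface em0             # underlying Ethernet interface
          set pppoe service ""
          set pppoe max-payload 1500      # RFC 4638 PPP-Max-Payload tag
          open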



[Bug 184141] [ppp] [patch] Kernel PPPoE sends bad echo-req magic number on big endian machines

2018-07-28 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=184141

Eugene Grosbein  changed:

 What       |Removed          |Added
 -----------+-----------------+------------------
 Assignee   |j...@freebsd.org |n...@freebsd.org
 CC         |                 |eu...@freebsd.org

--- Comment #6 from Eugene Grosbein  ---
Reset assignee after 5 years of inactivity.



Re: [Bug 203856] [igb] PPPoE RX traffic is limited to one queue

2018-07-28 Thread k simon
   As a workaround: use an Ethernet card that supports SR-IOV, then add
all the VFs to a bridge and bind each VF to an mpd5 instance and a CPU
core. I have heard that the PPP protocol supports a round-robin
algorithm, but I have not tested it.


Simon
20180728

On 2018/7/28 02:02, bugzilla-nore...@freebsd.org wrote:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203856
> 
> --- Comment #31 from Eugene Grosbein  ---
> (In reply to Kurt Jaeger from comment #30)
> 
> This patch does not apply in any sense: it won't apply textually, and it
> was an (incomplete) attempt to solve a different problem in the first
> place: it tried to add a sysctl to disable flowid generation by the igb(4)
> driver based on the hardware flow id assigned by the chip (which is always
> zero for PPPoE). It was meaningless from the beginning.
> 
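A purely illustrative sketch (rx_queue_for is a hypothetical helper, not
igb(4) or netisr code) of why a constant hardware flowid pins all PPPoE RX
traffic to one queue: any "flowid modulo number-of-queues" fan-out maps
every packet to queue 0 when the flowid is always 0.

  #include <stdint.h>

  /*
   * Illustration only: with PPPoE the NIC cannot hash the encapsulated
   * IP header, so the reported flowid is constant (zero), and a
   * modulo-based queue selection always picks queue 0.
   */
  static inline unsigned int
  rx_queue_for(uint32_t flowid, unsigned int nqueues)
  {
          return flowid % nqueues;        /* 0 % n == 0: always queue 0 */
  }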


Re: PSPAT subsystem Implementation in FreeBSD - GSoC 2018

2018-07-28 Thread Sumit Lakra
Hello,

I tried some other, simpler tests today. First I tried intercepting the
packets from ip_output.c again and added some printf statements to track
the path of the packet (code).
As before, they were successfully intercepted and placed in the PSPAT
client queues, but the arbiter was unable to find them most of the time
(not always) when scanning the queues. As per my previous assumption, this
was probably due to the client threads returning early without any error
indication, which made it look as if the packet had already been
dispatched. So, to test it, I did this..
I couldn't make the client threads pause, as they apparently held some
non-sleepable locks, so I made them go through a really long loop before
returning, hoping this would give PSPAT enough time to pick the packets up
and dispatch them.. and bingo.. it worked. The packets no longer
disappeared from the PSPAT client queues and reached pspat_txqs_flush().

This could also be the reason why the packets with PROTO_LAYER2 tags
disappeared, although, as I mentioned in the previous mail, they were not
really good candidates for interception anyway.
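
As a point of reference only (not PSPAT's actual queue code, and the names
client_ring, client_enqueue and arbiter_dequeue are hypothetical): a
generic sketch of a client-to-arbiter mbuf handoff using FreeBSD's
buf_ring(9). The key property is that ownership of the mbuf passes to the
consumer on a successful enqueue, so the enqueuing side must not free or
reuse it afterwards.

  #include <sys/param.h>
  #include <sys/systm.h>
  #include <sys/kernel.h>
  #include <sys/malloc.h>
  #include <sys/buf_ring.h>
  #include <sys/mbuf.h>

  /* Sketch only: one multi-producer/single-consumer ring per client. */
  static struct buf_ring *client_ring;

  static void
  client_ring_init(void)
  {
          /* 1024 slots; M_DEVBUF is just a convenient malloc type here. */
          client_ring = buf_ring_alloc(1024, M_DEVBUF, M_WAITOK, NULL);
  }

  static int
  client_enqueue(struct mbuf *m)
  {
          /*
           * Returns ENOBUFS when the ring is full; on success the mbuf
           * now belongs to the arbiter and must not be touched here.
           */
          return (buf_ring_enqueue(client_ring, m));
  }

  static struct mbuf *
  arbiter_dequeue(void)
  {
          /* Single-consumer dequeue: call only from the arbiter thread. */
          return (buf_ring_dequeue_sc(client_ring));
  }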

Next, I uncommented the actual if_output() call in pspat_txqs_flush() to
dispatch the packets that were reaching this point, but somehow the
function call failed again (code).
In order to check whether the function was being called with the correct
parameters, I used some printf statements to inspect them (code)..
they were intact. But the function call kept failing when made by the
arbiter thread to dispatch packets. The exact same function, called with
the exact same arguments, fails when called from a thread other than the
client thread... Why does this happen??.. I can't figure it out!!

This makes my second assumption from the previous mail possibly correct
too, and this is probably why calling dummynet_send() from
pspat_txqs_flush() didn't work either.. Put simply, there is some
thread-specific state involved with the client threads, and they don't
like any other thread stepping into their shoes to dispatch their packets;
this is not restricted to dummynet/ipfw but may be true for the entire
network stack and many other functions.
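
One concrete difference between the client thread and the arbiter kthread
is per-thread context such as the current vnet (when the kernel is built
with VIMAGE); whether that is actually what fails here is not established.
A sketch of dispatching from a separate thread with the interface's vnet
set explicitly (pspat_dispatch_one is a hypothetical name, error handling
trimmed):

  #include <sys/param.h>
  #include <sys/systm.h>
  #include <sys/mbuf.h>
  #include <sys/socket.h>
  #include <net/if.h>
  #include <net/if_var.h>
  #include <net/vnet.h>

  /* Sketch only: push one queued mbuf out from the arbiter kthread. */
  static int
  pspat_dispatch_one(struct ifnet *ifp, struct mbuf *m,
      const struct sockaddr *dst, struct route *ro)
  {
          int error;

          CURVNET_SET(ifp->if_vnet);      /* borrow the interface's vnet */
          error = (*ifp->if_output)(ifp, m, dst, ro);
          CURVNET_RESTORE();
          return (error);
  }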

As I have said, I have already completed the PSPAT part and tested it to
be working well, but trying to make it work with the existing networking
subsystem is turning out to be increasingly complex. I have no idea how to
get around this problem, but will keep trying to come up with something.
Any help/ideas will be greatly appreciated.

Thanks and Regards,
Sumit

On Sat, Jul 28, 2018 at 1:52 AM, Sumit Lakra wrote:

> Hello,
>
> I tried the sysctl and it worked, in that I was able to intercept the
> packets with DIR == DIR_OUT | PROTO_LAYER2, but I am beginning to face
> some other increasingly difficult and unanticipated problems in trying
> to make the PSPAT code work with the present networking system. As you
> mentioned you are a bit busy now, I was hoping maybe Alexander would be
> able to help me a little here. It would be good to hear a different
> viewpoint as well. Also, there are issues I am facing which I believe
> even you may not be aware of, hence I am also sending this mail to the
> mailing lists in the hope of getting additional opinions from other
> experts on dummynet/ipfw and the FreeBSD network stack.
>
> PSPAT WIP branch - https://github.com/theGodlessLakra/freebsd-pspat/tree/pspat-temp
>
> Firstly, as per our previous ideas, the plan was to intercept the
> packets from dummynet... pass them through PSPAT... and finally dispatch
> them from the dispatcher queue via the arbiter or a dedicated dispatcher
> thread, using functions like ip_output() or ether_output_frame(), similar
> to dummynet_send(). I had already spent a good deal of time trying to get
> this working, but it failed every time and resulted in kernel panics. My
> first thought was that the packets are not complete enough for these
> functions. (net.link.ether.ipfw worked, but it also resulted in an error
> when sending the packet to ether_output_frame().) So, in order to test
> this, I wrote a simple commit to check whether these packets can really
> be sent to these functions without making them go through PSPAT at all.
> Turns out, they failed.
>
> The first one can be seen here..
> sending DIR_OUT packets to ip_output() directly from dummynet_io(), with
> nothing to do with PSPAT, failed.
> The second one can be seen here..
> a similar failure with DIR_OUT | PROTO_LAYER2 packets. Both of these
> attempts resulted in kernel panics.

Re: 9k jumbo clusters

2018-07-28 Thread Adrian Chadd
On Fri, 27 Jul 2018 at 15:19, John-Mark Gurney  wrote:

> Ryan Moeller wrote this message on Fri, Jul 27, 2018 at 12:45 -0700:
> > There is a long-standing issue with 9k mbuf jumbo clusters in FreeBSD.
> > For example:
> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183381
> > https://lists.freebsd.org/pipermail/freebsd-net/2013-March/034890.html
> >
> > This comment suggests the 16k pool does not have the fragmentation
> problem:
> > https://reviews.freebsd.org/D11560#239462
> > I'm curious whether that has been confirmed.
> >
> > Is anyone working on the pathological case with 9k jumbo clusters in the
> > physical memory allocator?  There was an interesting discussion started a
> > few years ago but I'm not sure what ever came of it:
> > http://docs.freebsd.org/cgi/mid.cgi?21225.20047.947384.390241
> >
> > I have seen some work in the direction of avoiding larger than page size
> > jumbo clusters in 12-CURRENT.  Many existing drivers avoid the 9k cluster
> > size already.  The code for larger cluster sizes in iflib is #ifdef'd out
> > so it maxes out at the page size jumbo clusters until
> "CONTIGMALLOC_WORKS"
> > (apparently it doesn't).
> >
> > With all the changes due to iflib, is there any chance some of this will
> > get MFC'd to address the serious problem that remains in 11-STABLE?
> >
> > Otherwise, would it be feasible to disable the use of the 9k cluster pool
> > in at least some of the popular NIC drivers as a solution for the stable
> > branches?
> >
> > Finally, I have studied some of the driver code in 11-STABLE and posted
> the
> > gist of my notes in relation to this problem.  If anyone spots a mistake
> or
> > has something else to contribute, comments on the gist would be greatly
> > appreciated!
> > https://gist.github.com/freqlabs/eba9b755f17a223260246becfbb150a1
>
> Drivers need to be fixed to use 4k pages instead of clusters.  I really hope
> no one is using a card that can't do 4k pages, or if they are, then they
> should get a real card that can do scatter/gather on 4k pages for jumbo
> frames..


Yeah but it's 2018 and your server has like minimum a dozen million 4k
pages.

So if you're doing stuff like lots of network packet kerchunking why not
have specialised allocator paths that can do things like "hey, always give
me 64k physical contig pages for storage/mbufs because you know what?
they're going to be allocated/freed together always."

There was always a race between bus bandwidth, memory bandwidth and
bus/memory latencies. I'm not currently on the disk/packet pushing side of
things, but the last couple of times I was, it was at different points in that
4d space and almost every single time there was a benefit from having a
couple of specialised allocators so you didn't have to try and manage a few
dozen million 4k pages based on your changing workload.

I enjoy the 4k page size management stuff for my 128MB routers. Your 128G
server has a lot of 4k pages. It's a bit silly.




-adrian


Re: 9k jumbo clusters

2018-07-28 Thread John-Mark Gurney
Adrian Chadd wrote this message on Sat, Jul 28, 2018 at 13:33 -0700:
> Yeah but it's 2018 and your server has like minimum a dozen million 4k
> pages.
> 
> So if you're doing stuff like lots of network packet kerchunking why not
> have specialised allocator paths that can do things like "hey, always give
> me 64k physical contig pages for storage/mbufs because you know what?
> they're going to be allocated/freed together always."
> 
> There was always a race between bus bandwidth, memory bandwidth and
> bus/memory latencies. I'm not currently on the disk/packet pushing side of
> things, but the last couple of times I was, it was at different points in that
> 4d space and almost every single time there was a benefit from having a
> couple of specialised allocators so you didn't have to try and manage a few
> dozen million 4k pages based on your changing workload.
> 
> I enjoy the 4k page size management stuff for my 128MB routers. Your 128G
> server has a lot of 4k pages. It's a bit silly.

We do:
$ vmstat -z
ITEM                  SIZE   LIMIT  USED  FREE        REQ  FAIL  SLEEP
[...]
8192:                 8192,      0,   67,  109, 398203019,    0,     0
16384:               16384,      0,   65,   41,  74103020,    0,     0
32768:               32768,      0,   61,   28,  41981659,    0,     0
65536:               65536,      0,   17,   23,  26127059,    0,     0
[...]
mbuf_jumbo_page:      4096, 509295,    0,   64, 183536214,    0,     0
mbuf_jumbo_9k:        9216, 150902,    0,    0,         0,    0,     0
mbuf_jumbo_16k:      16384,  84882,    0,    0,         0,    0,     0
[...]

And I know you know the problem is that over time memory is fragmented,
so if suddenly you need more jumbo frames than you already have, you're
SOL...

Page-size allocations will always be available...

Fixing drivers to fall back to 4k allocations (or always use 4k
allocations) is a lot simpler than doing magic work to free pages, etc.
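
A minimal sketch of that fallback in a driver RX refill path (illustrative
only, not taken from any particular driver; rx_refill_buf is a hypothetical
name): try the 9k zone first and use a page-sized cluster when it fails. A
driver doing this must also program its RX descriptors for the smaller
buffers and rely on scatter/gather across several of them for a full jumbo
frame.

  #include <sys/param.h>
  #include <sys/systm.h>
  #include <sys/mbuf.h>

  /* Sketch only: prefer a 9k cluster, fall back to a page-sized one. */
  static struct mbuf *
  rx_refill_buf(void)
  {
          struct mbuf *m;

          m = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, MJUM9BYTES);
          if (m == NULL)
                  m = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, MJUMPAGESIZE);
          return (m);
  }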

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 "All that I will do, has been done, All that I have, has not."


Re: 9k jumbo clusters

2018-07-28 Thread Garrett Wollman
In article <20180729011153.gd2...@funkthat.com> j...@funkthat.com
writes:

>And I know you know the problem is that over time memory is fragmented,
>so if suddenly you need more jumbo frames than you already have, you're
>SOL...

This problem instantly disappears if you preallocate several gigabytes
of contiguous physical memory at boot time.  And if you're doing
something network-intensive and care about performance, you probably
don't mind doing that.
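
A sketch of the idea -- reserving physically contiguous memory early at
boot, before it has had a chance to fragment -- might look roughly like the
following (the size, malloc type, and function names here are arbitrary,
and actually feeding the arena to the mbuf jumbo zones is the part left
out):

  #include <sys/param.h>
  #include <sys/systm.h>
  #include <sys/kernel.h>
  #include <sys/malloc.h>

  static MALLOC_DEFINE(M_JUMBOARENA, "jumboarena", "boot-time contig arena");

  static void *jumbo_arena;
  #define JUMBO_ARENA_SIZE    (512UL * 1024 * 1024)   /* example size */

  /* Sketch only: grab contiguous physical memory before it fragments. */
  static void
  jumbo_arena_init(void *arg __unused)
  {
          jumbo_arena = contigmalloc(JUMBO_ARENA_SIZE, M_JUMBOARENA,
              M_NOWAIT, 0, ~(vm_paddr_t)0, PAGE_SIZE, 0);
          if (jumbo_arena == NULL)
                  printf("jumbo arena: contigmalloc failed\n");
  }
  SYSINIT(jumboarena, SI_SUB_DRIVERS, SI_ORDER_FIRST, jumbo_arena_init, NULL);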

-GAWollman
