Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Andre Oppermann

On 20.08.2013 00:38, Peter Grehan wrote:

Hi Andre,

  (moving to the more appropriate freebsd-net)


I'm sorry for ambushing but this stuff has to be done.  I have provided
an alternative way of handling it and I'm happy to help you with your
use case to make it good for you and to prevent the mbuf system from
getting bloated and hackish again.


  Sure. I'm not really upset since my code wasn't too far along, but with any 
API, you never know
who consumers might be so it's always worth being proactive about announcing 
it's removal.


Agreed.  OTOH there wasn't any in-tree user of it and posting before such 
removals
always yields at least one obscure hand wavy use or potential use of it.  So 
nothing
ever happens.


Can you please describe your intended use of M_NOFREE to better understand
the shortcomings of the current mbuf systems and the additional advantages
of the M_NOFREE case?


  I was looking at something similar to Linux's vhost-net, where a guest's 
virtio ring would be
processed in-kernel. An mbuf chain with external buffers would be used to pass 
guest tx buffer/len
segments directly into FreeBSD drivers.


Is that like page flipping?


  The intent of M_NOFREE was to avoid small mbuf allocations/frees in what is a 
hot path. This code
was intended to run at 10/40G.

  Note this code isn't really generic - it would require interfaces to be 
'owned' by the guest,
except that direct PCI-level pass-through wouldn't be needed.


Do you have some example code showing how that is or can be done?


  If there's an alternative to M_NOFREE, I'd be more than happy to use that.


Set up your own (*ext_free) function and omit freeing of the mbuf itself.  Make
sure to properly track your mbufs to avoid leaking them.

--
Andre

___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Andre Oppermann

On 20.08.2013 05:13, Julian Elischer wrote:

On 8/20/13 6:38 AM, Peter Grehan wrote:

Hi Andre,

 (moving to the more appropriate freebsd-net)


I'm sorry for ambushing but this stuff has to be done.  I have provided
an alternative way of handling it and I'm happy to help you with your
use case to make it good for you and to prevent the mbuf system from
getting bloated and hackish again.


 Sure. I'm not really upset since my code wasn't too far along, but with any 
API, you never know
who consumers might be so it's always worth being proactive about announcing 
it's removal.


Can you please describe your intended use of M_NOFREE to better understand
the shortcomings of the current mbuf systems and the additional advantages
of the M_NOFREE case?


 I was looking at something similar to Linux's vhost-net, where a guest's 
virtio ring would be
processed in-kernel. An mbuf chain with external buffers would be used to pass 
guest tx buffer/len
segments directly into FreeBSD drivers.

 The intent of M_NOFREE was to avoid small mbuf allocations/frees in what is a 
hot path. This code
was intended to run at 10/40G.

 Note this code isn't really generic - it would require interfaces to be 
'owned' by the guest,
except that direct PCI-level pass-through wouldn't be needed.

 If there's an alternative to M_NOFREE, I'd be more than happy to use that.


I think an alternative would be a reference counted version.  we used to have 
that and NetBSD had a
quite sophisticated mbuf system where there were multiple owners of mbufs..
they wouldn't be freed until the last one freed it but I don't remember the 
details.


I just had a glance at the NetBSD mbuf system and it doesn't look convincing
either.  There may be some serious scalability issues with this ownership
thing.

--
Andre

___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Navdeep Parhar
On 08/21/13 08:08, Andre Oppermann wrote:
 On 20.08.2013 00:38, Peter Grehan wrote:
snip
 
   If there's an alternative to M_NOFREE, I'd be more than happy to use
 that.
 
 Set up your own (*ext_free) function and omit freeing of the mbuf
 itself.  Make
 sure to properly track your mbufs to avoid leaking them.
 

How is this an alternate to M_NOFREE?  Your suggestion indicates that
you may have removed M_NOFREE thinking it did something other than what
it actually did.  And this suggestion doesn't even work -- note the
uma_zfree(zone_mbuf, m) at the end of mb_free_ext that would run after
any custom ext_free.

To recap: M_NOFREE was the way to tell the mbuf subsystem that the mbuf
does not come from zone_mbuf.  Nothing more and nothing less.  The mbuf
was in other ways like any other mbuf and could have a pkthdr (or not),
internal storage (or not), external cluster (or not), etc. etc.

Regards,
Navdeep
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Peter Grehan

  If there's an alternative to M_NOFREE, I'd be more than happy to use
that.


Set up your own (*ext_free) function and omit freeing of the mbuf
itself.  Make sure to properly track your mbufs to avoid leaking them.


 Doesn't work: there's an unconditional free of the small mbuf. That's 
why I used M_NOFREE.


later,

Peter.

___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Andre Oppermann

On 21.08.2013 18:38, Navdeep Parhar wrote:

On 08/21/13 08:08, Andre Oppermann wrote:

On 20.08.2013 00:38, Peter Grehan wrote:

snip



   If there's an alternative to M_NOFREE, I'd be more than happy to use
that.


Set up your own (*ext_free) function and omit freeing of the mbuf
itself.  Make
sure to properly track your mbufs to avoid leaking them.



How is this an alternate to M_NOFREE?  Your suggestion indicates that
you may have removed M_NOFREE thinking it did something other than what
it actually did.  And this suggestion doesn't even work -- note the
uma_zfree(zone_mbuf, m) at the end of mb_free_ext that would run after
any custom ext_free.

To recap: M_NOFREE was the way to tell the mbuf subsystem that the mbuf
does not come from zone_mbuf.  Nothing more and nothing less.  The mbuf
was in other ways like any other mbuf and could have a pkthdr (or not),
internal storage (or not), external cluster (or not), etc. etc.


Mea culpa.  You're totally right.  I have mixed up my mental model with
another working tree I carry along for quite some time.  In it (*ext_free)
completely replaces mb_free_ext() making it very easy to keep the mbuf.
We should move that way hopefully sooner than later, simplifying a couple
of things including externally managed refcounts.

Sorry for the chaos and not getting it.

M_NOFREE is back with r254605.

--
Andre

___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Navdeep Parhar
On 08/21/13 11:18, Andre Oppermann wrote:
 On 21.08.2013 18:38, Navdeep Parhar wrote:
 On 08/21/13 08:08, Andre Oppermann wrote:
 On 20.08.2013 00:38, Peter Grehan wrote:
 snip

If there's an alternative to M_NOFREE, I'd be more than happy to use
 that.

 Set up your own (*ext_free) function and omit freeing of the mbuf
 itself.  Make
 sure to properly track your mbufs to avoid leaking them.


 How is this an alternate to M_NOFREE?  Your suggestion indicates that
 you may have removed M_NOFREE thinking it did something other than what
 it actually did.  And this suggestion doesn't even work -- note the
 uma_zfree(zone_mbuf, m) at the end of mb_free_ext that would run after
 any custom ext_free.

 To recap: M_NOFREE was the way to tell the mbuf subsystem that the mbuf
 does not come from zone_mbuf.  Nothing more and nothing less.  The mbuf
 was in other ways like any other mbuf and could have a pkthdr (or not),
 internal storage (or not), external cluster (or not), etc. etc.
 
 Mea culpa.  You're totally right.  I have mixed up my mental model with
 another working tree I carry along for quite some time.  In it (*ext_free)
 completely replaces mb_free_ext() making it very easy to keep the mbuf.
 We should move that way hopefully sooner than later, simplifying a couple
 of things including externally managed refcounts.
 
 Sorry for the chaos and not getting it.
 
 M_NOFREE is back with r254605.
 

I saw that, thanks!

I believe we need an extra patch to get M_NOFREE correct.  I've had it
forever in some of my internal repos but never committed it upstream
(just plain forgot).  Since this stuff is fresh in your mind, can you
review this:

diff -r cd78031b7885 sys/sys/mbuf.h
--- a/sys/sys/mbuf.hFri Aug 16 13:35:26 2013 -0700
+++ b/sys/sys/mbuf.hWed Aug 21 10:55:57 2013 -0700
@@ -535,6 +535,8 @@ m_free(struct mbuf *m)
 {
struct mbuf *n = m-m_next;

+   if ((m-m_flags  (M_PKTHDR|M_NOFREE)) == (M_PKTHDR|M_NOFREE))
+   m_tag_delete_chain(m, NULL);
if (m-m_flags  M_EXT)
mb_free_ext(m);
else if ((m-m_flags  M_NOFREE) == 0)

It prevents leaks of tags from M_NOFREE+pkthdr mbufs.

Regards,
Navdeep
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Andre Oppermann

On 21.08.2013 20:23, Navdeep Parhar wrote:

I believe we need an extra patch to get M_NOFREE correct.  I've had it
forever in some of my internal repos but never committed it upstream
(just plain forgot).  Since this stuff is fresh in your mind, can you
review this:

diff -r cd78031b7885 sys/sys/mbuf.h
--- a/sys/sys/mbuf.hFri Aug 16 13:35:26 2013 -0700
+++ b/sys/sys/mbuf.hWed Aug 21 10:55:57 2013 -0700
@@ -535,6 +535,8 @@ m_free(struct mbuf *m)
  {
struct mbuf *n = m-m_next;

+   if ((m-m_flags  (M_PKTHDR|M_NOFREE)) == (M_PKTHDR|M_NOFREE))
+   m_tag_delete_chain(m, NULL);
if (m-m_flags  M_EXT)
mb_free_ext(m);
else if ((m-m_flags  M_NOFREE) == 0)

It prevents leaks of tags from M_NOFREE+pkthdr mbufs.


Good point.  Looks correct.

But then I wonder if it is really a smart thing to allow single
mbufs (w/o M_EXT) to be M_NOFREE at the same time.  They easily
get lost.  If it doesn't have an external buffer attached there
is no way to know when and if it was freed.

If M_NOFREE is only allowed together with M_EXT then the tag chain
delete should happen in mb_ext_free() next to 'skipmbuf' instead.

Index: kern/uipc_mbuf.c
===
--- kern/uipc_mbuf.c(revision 254605)
+++ kern/uipc_mbuf.c(working copy)
@@ -283,11 +283,6 @@
KASSERT((m-m_flags  M_EXT) == M_EXT, (%s: M_EXT not set, __func__));
KASSERT(m-m_ext.ref_cnt != NULL, (%s: ref_cnt not set, __func__));

-   /*
-* check if the header is embedded in the cluster
-*/
-   skipmbuf = (m-m_flags  M_NOFREE);
-
/* Free attached storage if this mbuf is the only reference to it. */
if (*(m-m_ext.ref_cnt) == 1 ||
atomic_fetchadd_int(m-m_ext.ref_cnt, -1) == 1) {
@@ -328,8 +323,14 @@
(%s: unknown ext_type, __func__));
}
}
-   if (skipmbuf)
+
+   /*
+* Do not free if the mbuf is embedded in the cluster.
+*/
+   if (m-m_flags  M_NOFREE) {
+   m_tag_delete_chain(m, NULL);
return;
+   }

/*
 * Free this mbuf back to the mbuf zone with all m_ext

--
Andre

___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Navdeep Parhar
On 08/21/13 12:22, Andre Oppermann wrote:
 On 21.08.2013 20:23, Navdeep Parhar wrote:
 I believe we need an extra patch to get M_NOFREE correct.  I've had it
 forever in some of my internal repos but never committed it upstream
 (just plain forgot).  Since this stuff is fresh in your mind, can you
 review this:

 diff -r cd78031b7885 sys/sys/mbuf.h
 --- a/sys/sys/mbuf.hFri Aug 16 13:35:26 2013 -0700
 +++ b/sys/sys/mbuf.hWed Aug 21 10:55:57 2013 -0700
 @@ -535,6 +535,8 @@ m_free(struct mbuf *m)
   {
   struct mbuf *n = m-m_next;

 +if ((m-m_flags  (M_PKTHDR|M_NOFREE)) == (M_PKTHDR|M_NOFREE))
 +m_tag_delete_chain(m, NULL);
   if (m-m_flags  M_EXT)
   mb_free_ext(m);
   else if ((m-m_flags  M_NOFREE) == 0)

 It prevents leaks of tags from M_NOFREE+pkthdr mbufs.
 
 Good point.  Looks correct.
 
 But then I wonder if it is really a smart thing to allow single
 mbufs (w/o M_EXT) to be M_NOFREE at the same time.  They easily
 get lost.  If it doesn't have an external buffer attached there
 is no way to know when and if it was freed.
 
 If M_NOFREE is only allowed together with M_EXT then the tag chain
 delete should happen in mb_ext_free() next to 'skipmbuf' instead.
 
 Index: kern/uipc_mbuf.c
 ===
 --- kern/uipc_mbuf.c(revision 254605)
 +++ kern/uipc_mbuf.c(working copy)
 @@ -283,11 +283,6 @@
 KASSERT((m-m_flags  M_EXT) == M_EXT, (%s: M_EXT not set,
 __func__));
 KASSERT(m-m_ext.ref_cnt != NULL, (%s: ref_cnt not set,
 __func__));
 
 -   /*
 -* check if the header is embedded in the cluster
 -*/
 -   skipmbuf = (m-m_flags  M_NOFREE);
 -
 /* Free attached storage if this mbuf is the only reference to
 it. */
 if (*(m-m_ext.ref_cnt) == 1 ||
 atomic_fetchadd_int(m-m_ext.ref_cnt, -1) == 1) {
 @@ -328,8 +323,14 @@
 (%s: unknown ext_type, __func__));
 }
 }
 -   if (skipmbuf)
 +
 +   /*
 +* Do not free if the mbuf is embedded in the cluster.
 +*/
 +   if (m-m_flags  M_NOFREE) {
 +   m_tag_delete_chain(m, NULL);
 return;
 +   }

The problem with this is that the mbuf may already be gone if it was
embedded in the cluster that was just freed by the ext free.  That's why
skipmbuf is used to cache the M_NOFREE bit.
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Andre Oppermann

On 21.08.2013 21:40, Navdeep Parhar wrote:

On 08/21/13 12:22, Andre Oppermann wrote:

On 21.08.2013 20:23, Navdeep Parhar wrote:

I believe we need an extra patch to get M_NOFREE correct.  I've had it
forever in some of my internal repos but never committed it upstream
(just plain forgot).  Since this stuff is fresh in your mind, can you
review this:

diff -r cd78031b7885 sys/sys/mbuf.h
--- a/sys/sys/mbuf.hFri Aug 16 13:35:26 2013 -0700
+++ b/sys/sys/mbuf.hWed Aug 21 10:55:57 2013 -0700
@@ -535,6 +535,8 @@ m_free(struct mbuf *m)
   {
   struct mbuf *n = m-m_next;

+if ((m-m_flags  (M_PKTHDR|M_NOFREE)) == (M_PKTHDR|M_NOFREE))
+m_tag_delete_chain(m, NULL);
   if (m-m_flags  M_EXT)
   mb_free_ext(m);
   else if ((m-m_flags  M_NOFREE) == 0)

It prevents leaks of tags from M_NOFREE+pkthdr mbufs.


Good point.  Looks correct.

But then I wonder if it is really a smart thing to allow single
mbufs (w/o M_EXT) to be M_NOFREE at the same time.  They easily
get lost.  If it doesn't have an external buffer attached there
is no way to know when and if it was freed.

If M_NOFREE is only allowed together with M_EXT then the tag chain
delete should happen in mb_ext_free() next to 'skipmbuf' instead.

Index: kern/uipc_mbuf.c
===
--- kern/uipc_mbuf.c(revision 254605)
+++ kern/uipc_mbuf.c(working copy)
@@ -283,11 +283,6 @@
 KASSERT((m-m_flags  M_EXT) == M_EXT, (%s: M_EXT not set,
__func__));
 KASSERT(m-m_ext.ref_cnt != NULL, (%s: ref_cnt not set,
__func__));

-   /*
-* check if the header is embedded in the cluster
-*/
-   skipmbuf = (m-m_flags  M_NOFREE);
-
 /* Free attached storage if this mbuf is the only reference to
it. */
 if (*(m-m_ext.ref_cnt) == 1 ||
 atomic_fetchadd_int(m-m_ext.ref_cnt, -1) == 1) {
@@ -328,8 +323,14 @@
 (%s: unknown ext_type, __func__));
 }
 }
-   if (skipmbuf)
+
+   /*
+* Do not free if the mbuf is embedded in the cluster.
+*/
+   if (m-m_flags  M_NOFREE) {
+   m_tag_delete_chain(m, NULL);
 return;
+   }


The problem with this is that the mbuf may already be gone if it was
embedded in the cluster that was just freed by the ext free.  That's why
skipmbuf is used to cache the M_NOFREE bit.


Next try: ;)

Index: kern/uipc_mbuf.c
===
--- kern/uipc_mbuf.c(revision 254605)
+++ kern/uipc_mbuf.c(working copy)
@@ -284,9 +284,12 @@
KASSERT(m-m_ext.ref_cnt != NULL, (%s: ref_cnt not set, __func__));

/*
-* check if the header is embedded in the cluster
+* Do not free if the mbuf is embedded in the cluster
+* but make sure to get rid of any tag chains first.
 */
skipmbuf = (m-m_flags  M_NOFREE);
+   if (skipmbuf  (m-m_flags  M_PKTHDR))
+   m_tag_delete_chain(m, NULL);

/* Free attached storage if this mbuf is the only reference to it. */
if (*(m-m_ext.ref_cnt) == 1 ||

--
Andre

___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Navdeep Parhar
On 08/21/13 13:44, Andre Oppermann wrote:
 On 21.08.2013 21:40, Navdeep Parhar wrote:
 On 08/21/13 12:22, Andre Oppermann wrote:
 On 21.08.2013 20:23, Navdeep Parhar wrote:
 I believe we need an extra patch to get M_NOFREE correct.  I've had it
 forever in some of my internal repos but never committed it upstream
 (just plain forgot).  Since this stuff is fresh in your mind, can you
 review this:

 diff -r cd78031b7885 sys/sys/mbuf.h
 --- a/sys/sys/mbuf.hFri Aug 16 13:35:26 2013 -0700
 +++ b/sys/sys/mbuf.hWed Aug 21 10:55:57 2013 -0700
 @@ -535,6 +535,8 @@ m_free(struct mbuf *m)
{
struct mbuf *n = m-m_next;

 +if ((m-m_flags  (M_PKTHDR|M_NOFREE)) == (M_PKTHDR|M_NOFREE))
 +m_tag_delete_chain(m, NULL);
if (m-m_flags  M_EXT)
mb_free_ext(m);
else if ((m-m_flags  M_NOFREE) == 0)

 It prevents leaks of tags from M_NOFREE+pkthdr mbufs.

 Good point.  Looks correct.

 But then I wonder if it is really a smart thing to allow single
 mbufs (w/o M_EXT) to be M_NOFREE at the same time.  They easily
 get lost.  If it doesn't have an external buffer attached there
 is no way to know when and if it was freed.

 If M_NOFREE is only allowed together with M_EXT then the tag chain
 delete should happen in mb_ext_free() next to 'skipmbuf' instead.

 Index: kern/uipc_mbuf.c
 ===
 --- kern/uipc_mbuf.c(revision 254605)
 +++ kern/uipc_mbuf.c(working copy)
 @@ -283,11 +283,6 @@
  KASSERT((m-m_flags  M_EXT) == M_EXT, (%s: M_EXT not set,
 __func__));
  KASSERT(m-m_ext.ref_cnt != NULL, (%s: ref_cnt not set,
 __func__));

 -   /*
 -* check if the header is embedded in the cluster
 -*/
 -   skipmbuf = (m-m_flags  M_NOFREE);
 -
  /* Free attached storage if this mbuf is the only reference to
 it. */
  if (*(m-m_ext.ref_cnt) == 1 ||
  atomic_fetchadd_int(m-m_ext.ref_cnt, -1) == 1) {
 @@ -328,8 +323,14 @@
  (%s: unknown ext_type, __func__));
  }
  }
 -   if (skipmbuf)
 +
 +   /*
 +* Do not free if the mbuf is embedded in the cluster.
 +*/
 +   if (m-m_flags  M_NOFREE) {
 +   m_tag_delete_chain(m, NULL);
  return;
 +   }

 The problem with this is that the mbuf may already be gone if it was
 embedded in the cluster that was just freed by the ext free.  That's why
 skipmbuf is used to cache the M_NOFREE bit.
 
 Next try: ;)

It is most flexible to let M_NOFREE work without any assumptions
attached (must be M_EXT, etc.)  So I still prefer my patch to this.  If
you don't have any strong preferences may I commit that one instead?

Regards,
Navdeep
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Andre Oppermann

On 21.08.2013 22:52, Navdeep Parhar wrote:

On 08/21/13 13:44, Andre Oppermann wrote:

On 21.08.2013 21:40, Navdeep Parhar wrote:

On 08/21/13 12:22, Andre Oppermann wrote:

On 21.08.2013 20:23, Navdeep Parhar wrote:

I believe we need an extra patch to get M_NOFREE correct.  I've had it
forever in some of my internal repos but never committed it upstream
(just plain forgot).  Since this stuff is fresh in your mind, can you
review this:

diff -r cd78031b7885 sys/sys/mbuf.h
--- a/sys/sys/mbuf.hFri Aug 16 13:35:26 2013 -0700
+++ b/sys/sys/mbuf.hWed Aug 21 10:55:57 2013 -0700
@@ -535,6 +535,8 @@ m_free(struct mbuf *m)
{
struct mbuf *n = m-m_next;

+if ((m-m_flags  (M_PKTHDR|M_NOFREE)) == (M_PKTHDR|M_NOFREE))
+m_tag_delete_chain(m, NULL);
if (m-m_flags  M_EXT)
mb_free_ext(m);
else if ((m-m_flags  M_NOFREE) == 0)

It prevents leaks of tags from M_NOFREE+pkthdr mbufs.


Good point.  Looks correct.

But then I wonder if it is really a smart thing to allow single
mbufs (w/o M_EXT) to be M_NOFREE at the same time.  They easily
get lost.  If it doesn't have an external buffer attached there
is no way to know when and if it was freed.

If M_NOFREE is only allowed together with M_EXT then the tag chain
delete should happen in mb_ext_free() next to 'skipmbuf' instead.

Index: kern/uipc_mbuf.c
===
--- kern/uipc_mbuf.c(revision 254605)
+++ kern/uipc_mbuf.c(working copy)
@@ -283,11 +283,6 @@
  KASSERT((m-m_flags  M_EXT) == M_EXT, (%s: M_EXT not set,
__func__));
  KASSERT(m-m_ext.ref_cnt != NULL, (%s: ref_cnt not set,
__func__));

-   /*
-* check if the header is embedded in the cluster
-*/
-   skipmbuf = (m-m_flags  M_NOFREE);
-
  /* Free attached storage if this mbuf is the only reference to
it. */
  if (*(m-m_ext.ref_cnt) == 1 ||
  atomic_fetchadd_int(m-m_ext.ref_cnt, -1) == 1) {
@@ -328,8 +323,14 @@
  (%s: unknown ext_type, __func__));
  }
  }
-   if (skipmbuf)
+
+   /*
+* Do not free if the mbuf is embedded in the cluster.
+*/
+   if (m-m_flags  M_NOFREE) {
+   m_tag_delete_chain(m, NULL);
  return;
+   }


The problem with this is that the mbuf may already be gone if it was
embedded in the cluster that was just freed by the ext free.  That's why
skipmbuf is used to cache the M_NOFREE bit.


Next try: ;)


It is most flexible to let M_NOFREE work without any assumptions
attached (must be M_EXT, etc.)  So I still prefer my patch to this.  If
you don't have any strong preferences may I commit that one instead?


I don't mind having your patch.  Though I don't see how it possibly can't leak
mbufs if they are not M_EXT.

--
Andre

___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-21 Thread Navdeep Parhar
On 08/21/13 13:59, Andre Oppermann wrote:
 On 21.08.2013 22:52, Navdeep Parhar wrote:
snip
 It is most flexible to let M_NOFREE work without any assumptions
 attached (must be M_EXT, etc.)  So I still prefer my patch to this.  If
 you don't have any strong preferences may I commit that one instead?
 
 I don't mind having your patch.  Though I don't see how it possibly
 can't leak
 mbufs if they are not M_EXT.
 

Interfaces that are guaranteed to consume and free an mbuf (instead of
potentially queueing it somewhere) can be handed M_NOFREE mbufs with
inline data, and the caller knows that it is always safe to free the
underlying storage after the call.  Clearly, caution is required.  But
let's not completely prohibit use cases like these.

Regards,
Navdeep
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: M_NOFREE removal (was Re: svn commit: r254520 - in head/sys: kern sys)

2013-08-19 Thread Julian Elischer

On 8/20/13 6:38 AM, Peter Grehan wrote:

Hi Andre,

 (moving to the more appropriate freebsd-net)

I'm sorry for ambushing but this stuff has to be done.  I have 
provided

an alternative way of handling it and I'm happy to help you with your
use case to make it good for you and to prevent the mbuf system from
getting bloated and hackish again.


 Sure. I'm not really upset since my code wasn't too far along, but 
with any API, you never know who consumers might be so it's always 
worth being proactive about announcing it's removal.


Can you please describe your intended use of M_NOFREE to better 
understand
the shortcomings of the current mbuf systems and the additional 
advantages

of the M_NOFREE case?


 I was looking at something similar to Linux's vhost-net, where a 
guest's virtio ring would be processed in-kernel. An mbuf chain with 
external buffers would be used to pass guest tx buffer/len segments 
directly into FreeBSD drivers.


 The intent of M_NOFREE was to avoid small mbuf allocations/frees in 
what is a hot path. This code was intended to run at 10/40G.


 Note this code isn't really generic - it would require interfaces 
to be 'owned' by the guest, except that direct PCI-level 
pass-through wouldn't be needed.


 If there's an alternative to M_NOFREE, I'd be more than happy to 
use that.


I think an alternative would be a reference counted version.  we used 
to have that and NetBSD had a quite sophisticated mbuf system where 
there were multiple owners of mbufs..
they wouldn't be freed until the last one freed it but I don't 
remember the details.


later,

Peter.
___
freebsd-...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org




___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org