Big batch of KAUTH_GENERIC_ISSUSER removal

2012-02-17 Thread Elad Efrat
Hi,

The final version of what I would like to check in is available here:

http://www.netbsd.org/~elad/issuser.diff

(it's ~100k so I didn't attach it.)

It contains both earlier diffs, plus some similar, mechanical changes
to file-system code and changes to the kauth(9) man-page.

It will not be committed before 2/22 so take your time reviewing.

Elad


Reduce KAUTH_GENERIC_ISSUSER usage (batch 2)

2012-01-18 Thread Elad Efrat
Hi,

This is the second batch. Same as before, but this time changes that
affect only netsmb and lmc. Here the original checks are done through
definitions (smb_suser() and CHECK_CAP, respectively). For lmc I
believe it's a simple change, for netsmb I repalced the original
blanket call with four different authorization calls. (The kauth.h
part is omitted from the diff.)

Thanks,

Elad


batch2.diff
Description: Binary data


Veriexec "lock_state" misuse

2012-01-18 Thread Elad Efrat
Hi,

The current code confuses "lock_state" for something that it isn't. I
discussed this a while ago with hannken@ and blymn@ and came up with
the attached patch. We change the semantics to require locking and add
a KASSERT to make sure they're followed. Missing locking bits are
added in two places to comply.

I'm not seeking feedback on this one as there's more Veriexec locking
work pending. It's posted merely for the chance that someone might
want this in netbsd-6.

Elad


veriexec_lock_state.diff
Description: Binary data


Reduce KAUTH_GENERIC_ISSUSER usage (batch 1)

2012-01-18 Thread Elad Efrat
Hi,

Attached is a diff that reduces the use of KAUTH_GENERIC_ISSUSER. I
plan to commit it a week or so after the branch.

Most of it is mechanically replacing the above action with something
more meaningful, together with the necessary secmodel bits. To make
reviewing easier, below there's a list of files. If you see a file you
want to double-check on that list, see the diff. The files not on the
list are the back-end ones (kauth.h, secmodel_suser.c).

arch/amiga/dev/grf.c
arch/macppc/dev/ofb.c
arch/shark/ofw/vga_ofbus.c
arch/sparc/dev/tctrl.c
arch/sparc64/dev/gfb.c
dev/cons.c
dev/verified_exec.c
dev/dm/device-mapper.c
dev/ic/ct65550.c
dev/ic/midway.c
dev/pci/genfb_pci.c
dev/pci/machfb.c
dev/pci/pci_usrreq.c
dev/pci/pm2fb.c
dev/pci/r128fb.c
dev/pci/radeonfb.c
dev/pci/voodoofb.c
dev/pci/wcfb.c
dev/pci/voyager/voyagerfb.c
dev/tc/pxg.c
dev/wscons/wskbd.c
kern/kern_exec.c
kern/kern_fork.c
kern/sys_mqueue.c
kern/sysv_ipc.c
kern/sysv_msg.c
kern/sysv_sem.c
kern/sysv_shm.c
kern/uipc_sem.c
net/if_bridge.c
net/npf/npf.c
netinet6/in6.c
netinet6/ip6_output.c
netipsec/ipsec.c
sys/ipc.h

(Notice that I took care not to touch any file-system code. I know
people are actively working in this area and I will coordinate changes
with them to minimize interference.)

The purpose of the transition is to finish the secmodel implementation
and have a descriptive authorization KPI. Once that's in place, we can
adapt to a more type-safe interface if we want to because we will know
what context is needed for each action/request etc. So the focus for
the review is "is this the right action/request/context for this part
of the code?" If you find changes where the answer is "no," please
come up with a(n) solution/alternative.

Thanks,

Elad


batch1.diff
Description: Binary data


Re: secmodel_register(9) API

2011-12-05 Thread Elad Efrat
I don't understand what the problems are. In any case, this does not
violate anything. The language used to describe the supposed issues
with the new API is ridiculous.

What the new API allows is interaction between secmodels that are
built by people who are not part of NetBSD and don't want their
secmodel to become part of NetBSD but do want to take advantage of
features in secmodels provided by NetBSD.

Personally I don't care if this stays or not. All I can say is that I
have not seen a single argument worthy of consideration against it and
I would strongly recommend to leave it in.

Please don't CC me any further emails about this.

Elad

On Mon, Dec 5, 2011 at 1:33 PM, Jean-Yves Migeon
 wrote:
> (I am CC'ing Elad, as we both worked on it)
>
> On Mon, 5 Dec 2011 16:22:33 +0100, Christoph Badura wrote:
>>
>> That is by design.  The idea behind splitting the decision process into
>> separate secmodels is to decouple the models and the decision making.
>
>
> I can't see how -- secmodels are responsible for the decision making, so we
> cannot decouple these easily.
>
>>> So, when one wants to create sysctls that can only be changed when
>>> securelevel is below a certain level, you end up adding more hooks to
>>> secmodel_securelevel(9)...
>>
>>
>> That is intentional.  Every time a new control is added all the secmodels
>> need to be examined if they need updating.
>>
>>> which is problematic, because the sysctl does
>>> not necessarily belong to this secmodel (consider curtain and suser, as
>>> an exemple).
>>
>>
>> Uh, sysctls do not "belong to a secmodel".  Whatever that is supposed to
>> mean.
>
>
> Okay, let's put that differently: enabling/disabling a security feature like
> curtain has nothing to do with secmodel_suser(9) itself. Curtain is a
> feature that says: "only owners of an object can query information about
> it." So securelevel, suser, or bsd44... do not intervene in this process,
> _except_ when you are root (pragmatism always add exceptions). That's it.
>
> But the "am I root?" evaluation is more an exception than the rule (a weak
> dependency). Turning curtain into a full fledged suser extension because
> there is one slight divergence is problematic.
>
>>> error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
>>>    cred, &isroot, NULL);
>>
>>
>>> This one asks the suser module if the user cred is assumed to be
>>> root when evaluated by suser module. If the module is present, it
>>> will respond. If absent, the call will return an error.
>>
>>
>> This so-called "API" is a complete perversion of the kauth system.  It
>> pulls the implementation details that the other secmodel is supposed
>> to hide and abstract away out into unrelated code again.
>
>
> That's weird: what you describe here is the situation before this patch:
> curtain and usermount were "planted" inside a secmodel they do not belong to
> (secmodel_suser is about making decisions about root's rights, not ordinary
> users).
>
>> This is adding
>> back all the interdependencies and knowledge about internals that kauth
>> is supposed to abstract and hide.
>
>
> That's precisely what we are trying to avoid.
>
> Knowledge about internals were the de facto standard before: if you wanted
> to have an extension which implements a policy ("users can only see state of
> their objects") with an explicit dependency on implementation internals ("is
> that user privileged?"), you had to put the extension inside the secmodel(9)
> that implements the privileged evaluation.
>
>> The problem with this is that it doesn't scale.  You are just arbitrarily
>> moving the code around that needs to be modified when new actions need to
>> be authorized.
>
>
> No; we are allowing secmodel(9) to expose evaluation logic ("a question") to
> other secmodels, so they can query for a result that depends on their
> internals.
>
> The ultimate decision remains in the hand of the one making the query.
> Asking a question in the form of "do you allow that action to happen?" is a
> plain miss-use of this API; that should indeed be done by kauth(9).
>
>> It doesn't scale in the case when a new secmodel is introduced that
>> needs to have a say on this action.
>
>
> The evaluation API is not designed to accept hooks, nor shall it be. It's
> meant for a secmodel(9) to query "safely" (as is: no symbol/run-time
> dependency on code being loaded) a secmodel for internal state ("are you in
> this state?") or internal logic ("given these credentials, do you consider
> them privileged or not?").
>
>> And it doesn't scale when bsd44
>> or suser is replace with a secmodel that defines "privilege" other than
>> "is-root".
>
>
> It should not. If there's another secmodel that appears later that redefines
> "privilege", that secmodel will have to expose evaluation callback and let
> the curtain extension handle it.
>
> For the "is-root" case, the evaluation will return an error if the
> secmodel_suser(9) is not loaded. The curtain extension will then make the
> 

Re: secmodel order of initialization

2010-04-08 Thread Elad Efrat

Thor Lancelot Simon wrote:

I'm trying to make secmodel_overlay work in my netbsd-5 tree.  It appears
to have never been adapted when secmodel_securelevel was split out of
secmodel_bsd44.


At some point I think I had a different idea about how to implement what
"overlay" was supposed to do, so it's possible that it's not needed
anymore.

It would be better if you stated what you are interested in doing first.

-e.


Re: lower-case kernel option names

2010-04-08 Thread Elad Efrat

Masao Uebayashi wrote:

On Fri, Apr 9, 2010 at 1:16 AM, Thor Lancelot Simon  wrote:

I just confused myself considerably because this config file fragment
didn't work:

   no options SECMODEL_BSD44
   options SECMODEL_OVERLAY

It turns out these kernel options (in sys/conf/std) have *lowercase* names.

Why?  Shouldn't I change them?


Quick grep shows that only secmodel_* flags are lower-cased.  I think
they should be changed.


Go for it.

-e.



Re: kauth and socket calls (esp. bind())

2010-04-08 Thread Elad Efrat

Thor Lancelot Simon wrote:

According to kauth(9):

 Listeners might sleep, so no locks can be held when calling
 an authorization wrapper.

According to uipc_socket.c:sobind():

 solock(so);
 error = (*so->so_proto->pr_usrreq)(so, PRU_BIND, NULL, nam,
  NULL, l);
 sounlock(so);

According to in_pcb.c:in_pcbbind():

 kauth_authorize_network(cred, KAUTH_NETWORK_BIND,
 KAUTH_REQ_NETWORK_BIND_PRIVPORT, so,
 sin, NULL)

Um.  Is it the documentation or the code which should be corrected?


The idea is to encourage developers to structure code so that kauth(9)
calls are made with ideally no locks etc. held, but like the man-page
states, kauth(9) is under development.


I'm not sure I grasp how things like the filesystem or device scopes could
even really work if you can't make kauth calls with locks held.


Which is why kauth(9) isn't yet fully integrated. (See e.g. tmpfs code
as the only file-system using kauth(9)) Perhaps you can step up to the
plate and address those issues.

-e.



Re: [gsoc] syscall/libc fuzzer proposal

2010-03-20 Thread Elad Efrat
On Sat, Mar 20, 2010 at 3:24 PM, David Holland  wrote:
> On Sat, Mar 20, 2010 at 01:54:49PM -0400, Elad Efrat wrote:
>> Thor Lancelot Simon wrote:
>>> If not, I don't think this adds any benefit to your proposal and is likely
>>> to simply be a distraction; I'd urge you in that case to drop it.
>>
>> Strongly seconded. There are so many great ways to improve NetBSD and
>> wasting time and money on fuzzing is about as suboptimal as it gets.
>
> Um.
>
> First of all, that's not what Thor said;

Sorry? Are you saying that me agreeing with Thor that unless this
proposal shows some clear advantage over what we already have --
specifically Coverity Scan -- it should probably be dropped is not
what Thor said?

> second of all, you really
> should not be telling potential gsoc students that their project ideas
> are flatly worthless, even if your judgment were correct;

I said exactly what I think and I'll repeat it again: there are many
ways to improve NetBSD. Wasting both time and money, even if it's
someone else's, on things that aren't likely to benefit NetBSD in the
long term is a waste. There's a list of projects NetBSD's interested
in, and when someone proposes a project not on the list it should be
reviewed.

What I said is my opinion. I don't decide which projects are selected,
nor do I participate or plan to participate; it's an honest, objective
opinion.

> and third,
> I'm rather surprised that anyone who claims to work on security would
> call testing and analysis tools worthless.

I don't *claim* anything, David; I *work*, at least as opposed to,
say, assigning bugs to me, claiming for years I'll do something about
them (together with many other grand ideas) and instead fix, I dunno,
whitespace and grammar issues. Take your preaching elsewhere; I
couldn't care less.

As for the issue at hand, well, I suggest you look at what the
proposal is, what we already have for years, and draw your own
conclusions.

-e.


Re: [gsoc] syscall/libc fuzzer proposal

2010-03-20 Thread Elad Efrat
On Sat, Mar 20, 2010 at 2:31 PM, Jordan Gordeev  wrote:
> On 3/20/10 7:54 PM, Elad Efrat wrote:
>>
>> Strongly seconded. There are so many great ways to improve NetBSD and
>> wasting time and money on fuzzing is about as suboptimal as it gets.
>
> Please, list some of them.

Sure.

We need to finish abstracting stuff to kauth(9), and then might
consider changing to type-safe KPIs. There's also fileassoc(9) that
can solve a lot of problems (most recently the PaX ELF flags thing) by
providing file-system independent metadata storage -- completing its
kernel parts and perhaps providing a way to make it persistent is
important. I'd also like to see Veriexec gaining digital signature
support, now that NetPGP is in the base (and there's the per-page
fingerprints feature that needs to be somewhat debugged and enabled),
as well as a capabilities secmodel being added, even if as an
experiment. We just recently discussed a centralized daemon to
correlate security events and act upon them; this is probably not that
hard to implement. (Especially if you "modularize out" the correlation
bit :) Personally I'd also like to see PaX UDEREF done for us as well.

Of course, our network stack needs tons of improvements; we started
that a while ago by making socket options use real types rather than
mbufs and we still have a long way to go in that regard. I'd like to
see some of the improvements OpenBSD/FreeBSD added being looked at -
multiple routing tables, virtualization of the stack, etc., even if
it's to end up saying "not worth it."

Then there's also the issue of clean interfaces for passing data
to/from the kernel. Kvm(3) is a mess, and I'd like to see at least a
move towards sysctl, but preferably this needs to be solved in a
proplib-kinda way (I don't think "performance" is a goal for these
reporting tools).

Every one of the above has long-term prospects much better than
working on "fuzzing," and this is just a quick "OTOH" list; I'm sure
the folks who are doing work in UVM and the file-systems can list tons
of other ways to strengthen the foundations rather than introduce
"fuzzing" tools, even if we're looking to encapsulate something as a
GSoC project.

-e.


Re: [gsoc] syscall/libc fuzzer proposal

2010-03-20 Thread Elad Efrat

Thor Lancelot Simon wrote:

On Sat, Mar 20, 2010 at 05:32:28PM +0200, Mateusz Kocielski wrote:

As a part of my work I would like to write a translator for C language and a
small library. Their goal would be to detect integer overflows, stack overflows,
problems with static array indexing, etc (when such occur during the program
execution). It will enable me to uncover more bugs in the software.


What is the benefit of this when compared to existing static-analysis
tools such as Coverity Scan, splint, or the Clang static analyzer?  Will
this cover any cases they don't?  If so, which ones?

If not, I don't think this adds any benefit to your proposal and is likely
to simply be a distraction; I'd urge you in that case to drop it.


Strongly seconded. There are so many great ways to improve NetBSD and
wasting time and money on fuzzing is about as suboptimal as it gets.

-e.



Re: IPSEC (both stacks) slight adaptation to kauth(9)

2009-12-31 Thread Elad Efrat

Withdrawn.

-e.

Michael van Elst wrote:

e...@netbsd.org (Elad Efrat) writes:


The attached diff addresses this last abuse for uidinfo for
authorization by doing the following:
 

  1. Reorganize the switch statements so they are easier to understand.
 They differ only slightly, and as the networking stacks have enough
 duplicated code as it is, this is a step in the right direction if
 we are to eventually clean them up.
 
As discussed yesterday, your 'beautified' code just muddles the

separation between privileged and unprivileged case. And if you
think that this is an optimization (and I disagree), it should be
applied in a separate change.
 
 

  2. Remove the 'priv' field of the SP PCB structures from both IPSEC
 and FAST_IPSEC. Isolate it to the relevant context, and retrieve
 its value in runtime and don't cache it.
 
I think the cached value is there for a reason. Replacing it

with runtime checks silently changes semantics and adds significant
overhead to each outgoing IP packet.
 


  3. Replace uid comparison for privileged/unprivileged distinction with
 kauth(9) calls. For now, these are done on the generic scope as I
 have other changes in the pipe; once committed, these will be
 changed to use the network scope.


That's the one thing you should have done...





IPSEC (both stacks) slight adaptation to kauth(9)

2009-12-30 Thread Elad Efrat

Hi,

Our IPSEC stacks both have a SP PCB structure with a 'priv' field,
indicating whether this SP is "privileged" or not. The SP, which is
attached to a socket, gets this value set by by querying the socket's
uid, from the uidinfo field in the socket. In turn, the 'priv' field
is used in authorization. (If the uid is zero, it's privileged, and
vice versa.)

This of course doesn't fit with our transition to centralized
authorization and support for changing security policy (e.g. a MAC
policy that changed will not have any effect if the authorization is
done internally against a cached decision.)

The attached diff addresses this last abuse for uidinfo for
authorization by doing the following:

  1. Reorganize the switch statements so they are easier to understand.
 They differ only slightly, and as the networking stacks have enough
 duplicated code as it is, this is a step in the right direction if
 we are to eventually clean them up.

  2. Remove the 'priv' field of the SP PCB structures from both IPSEC
 and FAST_IPSEC. Isolate it to the relevant context, and retrieve
 its value in runtime and don't cache it.

  3. Replace uid comparison for privileged/unprivileged distinction with
 kauth(9) calls. For now, these are done on the generic scope as I
 have other changes in the pipe; once committed, these will be
 changed to use the network scope.

Please review. :)

Thanks,

-e.

Index: netipsec/ipsec.h
===
RCS file: /cvsroot/src/sys/netipsec/ipsec.h,v
retrieving revision 1.24
diff -u -p -r1.24 ipsec.h
--- netipsec/ipsec.h10 May 2009 02:13:07 -  1.24
+++ netipsec/ipsec.h31 Dec 2009 04:20:09 -
@@ -116,7 +116,6 @@ struct ipsecrequest {
 struct inpcbpolicy {
struct secpolicy *sp_in;
struct secpolicy *sp_out;
-   int priv;   /* privileged socket ? */
 
 #ifdef __NetBSD__
/* cached policy */
Index: netipsec/ipsec.c
===
RCS file: /cvsroot/src/sys/netipsec/ipsec.c,v
retrieving revision 1.46
diff -u -p -r1.46 ipsec.c
--- netipsec/ipsec.c30 Jul 2009 14:41:59 -  1.46
+++ netipsec/ipsec.c31 Dec 2009 04:20:11 -
@@ -513,8 +513,9 @@ ipsec_getpolicybysock(struct mbuf *m, u_
 {
struct inpcbpolicy *pcbsp = NULL;
struct secpolicy *currsp = NULL;/* policy on socket */
-   struct secpolicy *sp;
-   int af;
+   struct secpolicy *sp = NULL; /* XXX gcc */
+   struct socket *so;
+   int af, unpriv;
 
IPSEC_ASSERT(m != NULL, ("ipsec_getpolicybysock: null mbuf"));
IPSEC_ASSERT(inp != NULL, ("ipsec_getpolicybysock: null inpcb"));
@@ -524,6 +525,7 @@ ipsec_getpolicybysock(struct mbuf *m, u_
 
IPSEC_ASSERT(PCB_SOCKET(inp) != NULL,
("ipsec_getppolicybysock: null socket\n"));
+   so = PCB_SOCKET(inp);
 
/* XXX FIXME inpcb/in6pcb  vs socket*/
af = PCB_FAMILY(inp);
@@ -578,61 +580,52 @@ ipsec_getpolicybysock(struct mbuf *m, u_
}
IPSEC_ASSERT(currsp != NULL, ("ipsec_getpolicybysock: null currsp"));
 
-   if (pcbsp->priv) {  /* when privilieged socket */
-   switch (currsp->policy) {
-   case IPSEC_POLICY_BYPASS:
-   case IPSEC_POLICY_IPSEC:
-   currsp->refcnt++;
-   sp = currsp;
-   break;
+   unpriv = kauth_authorize_generic(so->so_cred, KAUTH_GENERIC_ISSUSER, 
NULL);
+   if (unpriv || currsp->policy == IPSEC_POLICY_ENTRUST) {
+   sp = KEY_ALLOCSP(&currsp->spidx, dir);
+   if (sp == NULL)
+   sp = KEY_ALLOCSP_DEFAULT(af);
 
-   case IPSEC_POLICY_ENTRUST:
-   /* look for a policy in SPD */
-   sp = KEY_ALLOCSP(&currsp->spidx, dir);
-   if (sp == NULL) /* no SP found */
-   sp = KEY_ALLOCSP_DEFAULT(af);
-   break;
+   return sp;
+   }
 
-   default:
+   switch (currsp->policy) {
+   case IPSEC_POLICY_BYPASS:
+   unpriv = kauth_authorize_generic(so->so_cred,
+   KAUTH_GENERIC_ISSUSER, NULL);
+   if (unpriv) {
ipseclog((LOG_ERR, "ipsec_getpolicybysock: "
- "Invalid policy for PCB %d\n", 
currsp->policy));
+  "Illegal policy for non-priviliged defined %d\n",
+   currsp->policy));
*error = EINVAL;
return NULL;
}
-   } else {/* unpriv, SPD has policy */
-   sp = KEY_ALLOCSP(&currsp->spidx, dir);
-   if (sp == NULL) {   /* no SP found */
-   switch (currsp->polic

Socket reuse policy adaptation to kauth(9)

2009-12-30 Thread Elad Efrat

Hi,

The attached diff moves the socket reuse policy to a kauth(9) listener.

It has been tested and reviewed, but just in case I'd like to see if
someone else has any comments.

Please review. :)

Thanks,

-e.

Index: sys/netinet/in_pcb.c
===
RCS file: /cvsroot/src/sys/netinet/in_pcb.c,v
retrieving revision 1.137
diff -u -p -r1.137 in_pcb.c
--- sys/netinet/in_pcb.c12 May 2009 22:22:46 -  1.137
+++ sys/netinet/in_pcb.c30 Dec 2009 21:19:13 -
@@ -155,6 +155,34 @@ intlowportmin  = IPPORT_RESERVEDMIN;
 intlowportmax  = IPPORT_RESERVEDMAX;
 
 static struct pool inpcb_pool;
+static kauth_listener_t in_listener;
+
+static int
+in_listener_cb(kauth_cred_t cred, kauth_action_t action, void *cookie,
+void *arg0, void *arg1, void *arg2, void *arg3)
+{
+   struct socket *so, *current_so;
+   enum kauth_network_req req;
+   int result;
+
+   result = KAUTH_RESULT_DEFER;
+   req = (enum kauth_network_req)arg0;
+
+   if ((action != KAUTH_NETWORK_BIND) ||
+   (req != KAUTH_REQ_NETWORK_BIND_REUSEPORT))
+   return result;
+
+   /* XXX-elad: Make sure this is an IPv4 socket. */
+
+   so = arg1;
+   current_so = arg3;
+
+   if (kauth_cred_geteuid(so->so_cred) ==
+   kauth_cred_geteuid(current_so->so_cred))
+   result = KAUTH_RESULT_ALLOW;
+
+   return result;
+}
 
 static int
 inpcb_poolinit(void)
@@ -162,6 +190,10 @@ inpcb_poolinit(void)
 
pool_init(&inpcb_pool, sizeof(struct inpcb), 0, 0, 0, "inpcbpl", NULL,
IPL_NET);
+
+   in_listener = kauth_listen_scope(KAUTH_SCOPE_NETWORK, in_listener_cb,
+   NULL);
+
return 0;
 }
 
@@ -378,9 +410,9 @@ in_pcbbind_port(struct inpcb *inp, struc
return (EADDRINUSE);
 #endif
 
-   /* XXX-kauth */
-   if (so->so_uidinfo->ui_uid && 
!IN_MULTICAST(sin->sin_addr.s_addr)) {
+   if (!IN_MULTICAST(sin->sin_addr.s_addr)) {
t = in_pcblookup_port(table, sin->sin_addr, 
sin->sin_port, 1);
+
/*
 * XXX: investigate ramifications of loosening this
 *  restriction so that as long as both ports have
@@ -389,11 +421,20 @@ in_pcbbind_port(struct inpcb *inp, struc
if (t &&
(!in_nullhost(sin->sin_addr) ||
 !in_nullhost(t->inp_laddr) ||
-(t->inp_socket->so_options & SO_REUSEPORT) == 0)
-   && (so->so_uidinfo->ui_uid != 
t->inp_socket->so_uidinfo->ui_uid)) {
-   return (EADDRINUSE);
+(t->inp_socket->so_options & SO_REUSEPORT) == 0)) {
+   /*
+* Check if allowed to overrule the "in use"
+* policy.
+*/
+   error = kauth_authorize_network(so->so_cred,
+   KAUTH_NETWORK_BIND,
+   KAUTH_REQ_NETWORK_BIND_REUSEPORT, so, sin,
+   t->inp_socket);
+   if (error)
+   return (EADDRINUSE);
}
}
+
t = in_pcblookup_port(table, sin->sin_addr, sin->sin_port, 
wild);
if (t && (reuseport & t->inp_socket->so_options) == 0)
return (EADDRINUSE);
Index: sys/secmodel/suser/secmodel_suser.c
===
RCS file: /cvsroot/src/sys/secmodel/suser/secmodel_suser.c,v
retrieving revision 1.34
diff -u -p -r1.34 secmodel_suser.c
--- sys/secmodel/suser/secmodel_suser.c 29 Dec 2009 04:25:30 -  1.34
+++ sys/secmodel/suser/secmodel_suser.c 30 Dec 2009 21:19:13 -
@@ -622,6 +622,7 @@ secmodel_suser_network_cb(kauth_cred_t c
switch (req) {
case KAUTH_REQ_NETWORK_BIND_PORT:
case KAUTH_REQ_NETWORK_BIND_PRIVPORT:
+   case KAUTH_REQ_NETWORK_BIND_REUSEPORT:
if (isroot)
result = KAUTH_RESULT_ALLOW;
break;
Index: sys/sys/kauth.h
===
RCS file: /cvsroot/src/sys/sys/kauth.h,v
retrieving revision 1.64
diff -u -p -r1.64 kauth.h
--- sys/sys/kauth.h 24 Dec 2009 19:02:07 -  1.64
+++ sys/sys/kauth.h 30 Dec 2009 21:19:14 -
@@ -226,6 +226,7 @@ enum kauth_network_req {
KAUTH_REQ_NETWORK_INTERFACE_SLIP_ADD,
KAUTH_REQ_NETWORK_INTERFACE_STRIP_ADD,
KAUTH_REQ_NETWORK_INTERFACE_TUN_ADD,
+   KAUTH_REQ_NETWORK_BIND_REUSEPORT,
 };
 
 /*


Re: tmpfs module and pool allocator

2009-12-30 Thread Elad Efrat



I've "fixed it" by explicitly calling mutex_destroy() in
tmpfs_modcmd():MODULE_CMD_FINI, but I think the right fix would be to
set a flag indicating that the pool allocator used is not a "system
allocator" (or that it is a "custom allocator", or whatever) and
should be destroyed as well, or at least its mutex...


Matt@ suggested we use a reference count on the pool allocator instead
of the PA_INITIALIZED flag, and when it drops to zero, pool_destroy can
destroy the mutex as well.

I've attached a diff that supposedly implements it... But I don't know
if it's correct. It fixes the problem for me, on a GEENRIC kernel built
with DIAGNOSTIC and LOCKDEBUG.

Please have a look and let me know if it's okay to commit.

Thanks,

-e.
Index: sys/pool.h
===
RCS file: /cvsroot/src/sys/sys/pool.h,v
retrieving revision 1.67
diff -u -p -r1.67 pool.h
--- sys/pool.h  15 Oct 2009 20:50:12 -  1.67
+++ sys/pool.h  30 Dec 2009 17:58:17 -
@@ -64,7 +64,7 @@ struct pool_allocator {
kmutex_tpa_lock;
TAILQ_HEAD(, pool) pa_list; /* list of pools using this allocator */
int pa_flags;
-#definePA_INITIALIZED  0x01
+   uint32_tpa_refcnt;  /* number of pools using this allocator 
*/
int pa_pagemask;
int pa_pageshift;
struct vm_map *pa_backingmap;
Index: kern/subr_pool.c
===
RCS file: /cvsroot/src/sys/kern/subr_pool.c,v
retrieving revision 1.177
diff -u -p -r1.177 subr_pool.c
--- kern/subr_pool.c20 Oct 2009 17:24:22 -  1.177
+++ kern/subr_pool.c30 Dec 2009 17:58:18 -
@@ -104,6 +104,9 @@ static struct pool  *drainpp;
 static kmutex_t pool_head_lock;
 static kcondvar_t pool_busy;
 
+/* This lock protects initialization of a potentially shared pool allocator */
+static kmutex_t pool_allocator_lock;
+
 typedef uint32_t pool_item_bitmap_t;
 #defineBITMAP_SIZE (CHAR_BIT * sizeof(pool_item_bitmap_t))
 #defineBITMAP_MASK (BITMAP_SIZE - 1)
@@ -604,6 +607,8 @@ pool_subsystem_init(void)
 
pool_init(&cache_cpu_pool, sizeof(pool_cache_cpu_t), coherency_unit,
0, 0, "pcachecpu", &pool_allocator_nointr, IPL_NONE);
+
+   mutex_init(&pool_allocator_lock, MUTEX_DEFAULT, IPL_NONE);
 }
 
 /*
@@ -650,7 +655,8 @@ pool_init(struct pool *pp, size_t size, 
palloc = &pool_allocator_nointr_fullpage;
}   
 #endif /* POOL_SUBPAGE */
-   if ((palloc->pa_flags & PA_INITIALIZED) == 0) {
+   mutex_enter(&pool_allocator_lock);
+   if (palloc->pa_refcnt++ == 0) {
if (palloc->pa_pagesz == 0)
palloc->pa_pagesz = PAGE_SIZE;
 
@@ -663,8 +669,8 @@ pool_init(struct pool *pp, size_t size, 
if (palloc->pa_backingmapptr != NULL) {
pa_reclaim_register(palloc);
}
-   palloc->pa_flags |= PA_INITIALIZED;
}
+   mutex_exit(&pool_allocator_lock);
 
if (align == 0)
align = ALIGN(1);
@@ -892,6 +898,11 @@ pool_destroy(struct pool *pp)
TAILQ_REMOVE(&pp->pr_alloc->pa_list, pp, pr_alloc_list);
mutex_exit(&pp->pr_alloc->pa_lock);
 
+   mutex_enter(&pool_allocator_lock);
+   if (--pp->pr_alloc->pa_refcnt == 0)
+   mutex_destroy(&pp->pr_alloc->pa_lock);
+   mutex_exit(&pool_allocator_lock);
+
mutex_enter(&pp->pr_lock);
 
KASSERT(pp->pr_cache == NULL);


Uid comparison in sbreserve()

2009-12-30 Thread Elad Efrat
Hi,

I think at least one part of the kern/uipc_socket2.c:sbreserve() code
is wrong, where it compares uids before applying a resource limit.

The relevant code is

600 if (kauth_cred_geteuid(l->l_cred) == so->so_uidinfo->ui_uid)
601 maxcc = l->l_proc->p_rlimit[RLIMIT_SBSIZE].rlim_cur;
602 else
603 maxcc = RLIM_INFINITY;

where 'l' is 'curlwp' in this context, and 'so' is the passed socket.

The evolution of this particular test boils down to two commits;
revision 1.59, introducing the following:

+   if (p && p->p_ucred->cr_uid == so->so_uid)
+   maxcc = p->p_rlimit[RLIMIT_SBSIZE].rlim_cur;
+   else
+   maxcc = RLIM_INFINITY;

and later on, once we always had a lwp context, revision 1.88:

-   if (l && kauth_cred_geteuid(l->l_cred) == 
so->so_uidinfo->ui_uid)
+   if (kauth_cred_geteuid(l->l_cred) == so->so_uidinfo->ui_uid)

I think the first change here was an oversight, because the logic
doesn't seem right to me. The process resource limits are only used
when both the process is not NULL and the uid from the process'
credentials matches the uid on the socket. What if the process is not
NULL, but the uids don't match? I don't think it's intended to fall
back on RLIM_INFINITY for such a case...

It seems that the original logic should have been

if (p != NULL)
use resource limits from p;
else
use infinity;

as in those days a NULL process pointer would mean the request is
coming from "the kernel." (Other networking code had that assumption
too in a lot of credential checks with a rather inconsistent policy on
what to do when the process or lwp context is NULL.) Once 'p' can
never be NULL, the logic should have obviously be rewritten

use resource limits from p;

without comparing any uids.

Approaching it differently, let's say the uids match, so we use the
limits from the process. What do we do if they don't match? Unless we
can somehow "guarantee" that whenever they don't match one of them is
always root, using infinity is wrong for such a case... And I don't
see how we can guarantee such a thing.

Therefore, I'd like to remove the uid comparison and always use the
limits from the process. :)

Thoughts?

-e.