Re: [RFC PATCH v3] audit: use proper refcount locking on audit_sock

2016-12-13 Thread Cong Wang
On Tue, Dec 13, 2016 at 8:00 PM, Richard Guy Briggs  wrote:
> On 2016-12-13 16:19, Cong Wang wrote:
>> On Tue, Dec 13, 2016 at 7:03 AM, Richard Guy Briggs  wrote:
>> > @@ -1283,8 +1299,10 @@ static void __net_exit audit_net_exit(struct net 
>> > *net)
>> >  {
>> > struct audit_net *aunet = net_generic(net, audit_net_id);
>> > struct sock *sock = aunet->nlsk;
>> > +   mutex_lock(_cmd_mutex);
>> > if (sock == audit_sock)
>> > auditd_reset();
>> > +   mutex_unlock(_cmd_mutex);
>>
>> This still doesn't look correct to me, b/c here we release the audit_sock
>> refcnt twice:
>>
>> 1) inside audit_reset()
>
> The audit_reset() refcount decrement corresponds to a setting of
> audit_sock only if audit_sock is still non-NULL.
>

Hmm, thinking about it again, looks like the sock == audit_sock
and audit_sock != NULL checks can guarantee we are safe. So,

Reviewed-by: Cong Wang 


Re: [RFC PATCH v3] audit: use proper refcount locking on audit_sock

2016-12-13 Thread Richard Guy Briggs
On 2016-12-13 16:19, Cong Wang wrote:
> On Tue, Dec 13, 2016 at 7:03 AM, Richard Guy Briggs  wrote:
> > @@ -1283,8 +1299,10 @@ static void __net_exit audit_net_exit(struct net 
> > *net)
> >  {
> > struct audit_net *aunet = net_generic(net, audit_net_id);
> > struct sock *sock = aunet->nlsk;
> > +   mutex_lock(_cmd_mutex);
> > if (sock == audit_sock)
> > auditd_reset();
> > +   mutex_unlock(_cmd_mutex);
> 
> This still doesn't look correct to me, b/c here we release the audit_sock
> refcnt twice:
> 
> 1) inside audit_reset()

The audit_reset() refcount decrement corresponds to a setting of
audit_sock only if audit_sock is still non-NULL.

> 2) netlink_kernel_release()

This refcount decrement corresponds to netlink_kernel_create().

- RGB

--
Richard Guy Briggs 
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635


Re: [RFC PATCH v3] audit: use proper refcount locking on audit_sock

2016-12-13 Thread Cong Wang
On Tue, Dec 13, 2016 at 7:03 AM, Richard Guy Briggs  wrote:
> @@ -1283,8 +1299,10 @@ static void __net_exit audit_net_exit(struct net *net)
>  {
> struct audit_net *aunet = net_generic(net, audit_net_id);
> struct sock *sock = aunet->nlsk;
> +   mutex_lock(_cmd_mutex);
> if (sock == audit_sock)
> auditd_reset();
> +   mutex_unlock(_cmd_mutex);

This still doesn't look correct to me, b/c here we release the audit_sock
refcnt twice:

1) inside audit_reset()
2) netlink_kernel_release()


Re: [RFC PATCH v3] audit: use proper refcount locking on audit_sock

2016-12-13 Thread Paul Moore
On Tue, Dec 13, 2016 at 10:03 AM, Richard Guy Briggs  wrote:
> Resetting audit_sock appears to be racy.
>
> audit_sock was being copied and dereferenced without using a refcount on
> the source sock.
>
> Bump the refcount on the underlying sock when we store a refrence in
> audit_sock and release it when we reset audit_sock.  audit_sock
> modification needs the audit_cmd_mutex.
>
> See: https://lkml.org/lkml/2016/11/26/232
>
> Thanks to Eric Dumazet  and Cong Wang
>  on ideas how to fix it.
>
> Signed-off-by: Richard Guy Briggs 
> ---
> There has been a lot of change in the audit code that is about to go
> upstream to address audit queue issues.  This patch is based on the
> source tree: git://git.infradead.org/users/pcmoore/audit#next
> ---
>  kernel/audit.c |   28 +++-
>  1 files changed, 23 insertions(+), 5 deletions(-)

This looks more reasonable.  I still wonder about synchronization
between threads changing the audit_* connection variables and the
kauditd_thread, but I guess we can treat that as another issue; this
patch fixes a bug and is worth merging now.

I'm building a test kernel right now, assuming nothing blows up I'll
push this patch with the rest of the audit patches tomorrow; if
something bad happens, this is going to miss the first audit pull
request.

> diff --git a/kernel/audit.c b/kernel/audit.c
> index f20eee0..3bb4126 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -446,14 +446,19 @@ static void kauditd_retry_skb(struct sk_buff *skb)
>   * Description:
>   * Break the auditd/kauditd connection and move all the records in the retry
>   * queue into the hold queue in case auditd reconnects.
> + * The audit_cmd_mutex must be held when calling this function.
>   */

Don't resend, but in the future please start comments like this on the
previous line.


[RFC PATCH v3] audit: use proper refcount locking on audit_sock

2016-12-13 Thread Richard Guy Briggs
Resetting audit_sock appears to be racy.

audit_sock was being copied and dereferenced without using a refcount on
the source sock.

Bump the refcount on the underlying sock when we store a refrence in
audit_sock and release it when we reset audit_sock.  audit_sock
modification needs the audit_cmd_mutex.

See: https://lkml.org/lkml/2016/11/26/232

Thanks to Eric Dumazet  and Cong Wang
 on ideas how to fix it.

Signed-off-by: Richard Guy Briggs 
---
There has been a lot of change in the audit code that is about to go
upstream to address audit queue issues.  This patch is based on the
source tree: git://git.infradead.org/users/pcmoore/audit#next
---
 kernel/audit.c |   28 +++-
 1 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index f20eee0..3bb4126 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -446,14 +446,19 @@ static void kauditd_retry_skb(struct sk_buff *skb)
  * Description:
  * Break the auditd/kauditd connection and move all the records in the retry
  * queue into the hold queue in case auditd reconnects.
+ * The audit_cmd_mutex must be held when calling this function.
  */
 static void auditd_reset(void)
 {
struct sk_buff *skb;
 
/* break the connection */
+   if (audit_sock) {
+   sock_put(audit_sock);
+   audit_sock = NULL;
+   }
audit_pid = 0;
-   audit_sock = NULL;
+   audit_nlk_portid = 0;
 
/* flush all of the retry queue to the hold queue */
while ((skb = skb_dequeue(_retry_queue)))
@@ -579,7 +584,9 @@ static int kauditd_thread(void *dummy)
 
auditd = 0;
if (AUDITD_BAD(rc, reschedule)) {
+   mutex_lock(_cmd_mutex);
auditd_reset();
+   mutex_unlock(_cmd_mutex);
reschedule = 0;
}
} else
@@ -594,7 +601,9 @@ static int kauditd_thread(void *dummy)
auditd = 0;
if (AUDITD_BAD(rc, reschedule)) {
kauditd_hold_skb(skb);
+   mutex_lock(_cmd_mutex);
auditd_reset();
+   mutex_unlock(_cmd_mutex);
reschedule = 0;
} else
/* temporary problem (we hope), queue
@@ -623,7 +632,9 @@ quick_loop:
if (rc) {
auditd = 0;
if (AUDITD_BAD(rc, reschedule)) {
+   mutex_lock(_cmd_mutex);
auditd_reset();
+   mutex_unlock(_cmd_mutex);
reschedule = 0;
}
 
@@ -1010,11 +1021,16 @@ static int audit_receive_msg(struct sk_buff *skb, 
struct nlmsghdr *nlh)
}
if (audit_enabled != AUDIT_OFF)
audit_log_config_change("audit_pid", new_pid, 
audit_pid, 1);
-   audit_pid = new_pid;
-   audit_nlk_portid = NETLINK_CB(skb).portid;
-   audit_sock = skb->sk;
-   if (!new_pid)
+   if (new_pid) {
+   if (audit_sock)
+   sock_put(audit_sock);
+   audit_pid = new_pid;
+   audit_nlk_portid = NETLINK_CB(skb).portid;
+   sock_hold(skb->sk);
+   audit_sock = skb->sk;
+   } else {
auditd_reset();
+   }
wake_up_interruptible(_wait);
}
if (s.mask & AUDIT_STATUS_RATE_LIMIT) {
@@ -1283,8 +1299,10 @@ static void __net_exit audit_net_exit(struct net *net)
 {
struct audit_net *aunet = net_generic(net, audit_net_id);
struct sock *sock = aunet->nlsk;
+   mutex_lock(_cmd_mutex);
if (sock == audit_sock)
auditd_reset();
+   mutex_unlock(_cmd_mutex);
 
RCU_INIT_POINTER(aunet->nlsk, NULL);
synchronize_net();
-- 
1.7.1