Re: [PATCH 3/3] audit: drop audit_cmd_lock in AUDIT_USER family of cases

2013-12-08 Thread Toshiyuki Okajima
On Wed,  4 Dec 2013 21:45:56 -0500
Richard Guy Briggs  wrote:

> We do not need to hold the audit_cmd_mutex for this family of cases.  The
> possible exception to this is the call to audit_filter_user(), so drop the 
> lock
> immediately after.  To help in fixing the race we are trying to avoid, make
> sure that nothing called by audit_filter_user() calls audit_log_start().  In
> particular, watch out for *_audit_rule_match().
> 
> This fix will take care of systemd and anything USING audit.  It still means
> that we could race with something configuring audit and auditd shutting down.
> 
> Signed-off-by: Richard Guy Briggs 
> Signed-off-by: Richard Guy Briggs 

When I have tested a patch with my reproducer, 
a hangup by this problem doesn't occur. 
 reproducer 
 # auditctl -a exit,always -S all
 # reboot
/

So, this fix looks good to me.

Please add:

Reported-by: toshi.okaj...@jp.fujitsu.com
Tested-by: toshi.okaj...@jp.fujitsu.com

Thanks.

> ---
>  kernel/audit.c |2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 4689012..4cbc945 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -713,6 +713,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct 
> nlmsghdr *nlh)
>   if (err)
>   break;
>   }
> + mutex_unlock(_cmd_mutex);
>   audit_log_common_recv_msg(, msg_type);
>   if (msg_type != AUDIT_USER_TTY)
>   audit_log_format(ab, " msg='%.1024s'",
> @@ -729,6 +730,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct 
> nlmsghdr *nlh)
>   }
>   audit_set_pid(ab, NETLINK_CB(skb).portid);
>   audit_log_end(ab);
> + mutex_lock(_cmd_mutex);
>   }
>   break;
>   case AUDIT_ADD_RULE:
> -- 
> 1.7.1
> 


-- 
Toshiyuki Okajima 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] audit: drop audit_cmd_lock in AUDIT_USER family of cases

2013-12-08 Thread Toshiyuki Okajima
On Wed,  4 Dec 2013 21:45:56 -0500
Richard Guy Briggs r...@redhat.com wrote:

 We do not need to hold the audit_cmd_mutex for this family of cases.  The
 possible exception to this is the call to audit_filter_user(), so drop the 
 lock
 immediately after.  To help in fixing the race we are trying to avoid, make
 sure that nothing called by audit_filter_user() calls audit_log_start().  In
 particular, watch out for *_audit_rule_match().
 
 This fix will take care of systemd and anything USING audit.  It still means
 that we could race with something configuring audit and auditd shutting down.
 
 Signed-off-by: Richard Guy Briggs r...@tricolour.ca
 Signed-off-by: Richard Guy Briggs r...@redhat.com

When I have tested a patch with my reproducer, 
a hangup by this problem doesn't occur. 
 reproducer 
 # auditctl -a exit,always -S all
 # reboot
/

So, this fix looks good to me.

Please add:

Reported-by: toshi.okaj...@jp.fujitsu.com
Tested-by: toshi.okaj...@jp.fujitsu.com

Thanks.

 ---
  kernel/audit.c |2 ++
  1 files changed, 2 insertions(+), 0 deletions(-)
 
 diff --git a/kernel/audit.c b/kernel/audit.c
 index 4689012..4cbc945 100644
 --- a/kernel/audit.c
 +++ b/kernel/audit.c
 @@ -713,6 +713,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct 
 nlmsghdr *nlh)
   if (err)
   break;
   }
 + mutex_unlock(audit_cmd_mutex);
   audit_log_common_recv_msg(ab, msg_type);
   if (msg_type != AUDIT_USER_TTY)
   audit_log_format(ab,  msg='%.1024s',
 @@ -729,6 +730,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct 
 nlmsghdr *nlh)
   }
   audit_set_pid(ab, NETLINK_CB(skb).portid);
   audit_log_end(ab);
 + mutex_lock(audit_cmd_mutex);
   }
   break;
   case AUDIT_ADD_RULE:
 -- 
 1.7.1
 


-- 
Toshiyuki Okajima toshi.okaj...@jp.fujitsu.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RESEND][BUG][PATCH V3] audit: audit_log_start running on auditd should not stop

2013-12-04 Thread Toshiyuki Okajima
Hi.

Please apply this patch because the problem that audit_receive called by auditd 
process is hung up by other process (systemd) which has already called it is 
fixed.

This patch fixes the problem that auditd hangs up by itself.

Thanks.

---
The backlog cannot be consumed when audit_log_start is running on auditd
even if audit_log_start calls wait_for_auditd to consume it.
The situation is the deadlock because only auditd can consume the backlog.
If the other process needs to send the backlog, it can be also stopped 
by the deadlock.

So, audit_log_start running on auditd should not stop.

You can see the deadlock with the following reproducer:
 # auditctl -a exit,always -S all
 # reboot

Signed-off-by: Toshiyuki Okajima 
Reviewed-by: gaof...@cn.fujitsu.com
---
 kernel/audit.c |   14 --
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 7b0e23a..29cfc94 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1095,7 +1095,8 @@ struct audit_buffer *audit_log_start(struct audit_context 
*ctx, gfp_t gfp_mask,
struct audit_buffer *ab = NULL;
struct timespec t;
unsigned intuninitialized_var(serial);
-   int reserve;
+   int reserve = 5; /* Allow atomic callers to go up to five
+   entries over the normal backlog limit */
unsigned long timeout_start = jiffies;
 
if (audit_initialized != AUDIT_INITIALIZED)
@@ -1104,11 +1105,12 @@ struct audit_buffer *audit_log_start(struct 
audit_context *ctx, gfp_t gfp_mask,
if (unlikely(audit_filter_type(type)))
return NULL;
 
-   if (gfp_mask & __GFP_WAIT)
-   reserve = 0;
-   else
-   reserve = 5; /* Allow atomic callers to go up to five
-   entries over the normal backlog limit */
+   if (gfp_mask & __GFP_WAIT) {
+   if (audit_pid && audit_pid == current->pid)
+   gfp_mask &= ~__GFP_WAIT;
+   else
+   reserve = 0;
+   }
 
while (audit_backlog_limit
   && skb_queue_len(_skb_queue) > audit_backlog_limit + 
reserve) {
-- 
1.5.5.6
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RESEND][BUG][PATCH V3] audit: audit_log_start running on auditd should not stop

2013-12-04 Thread Toshiyuki Okajima
Hi.

Please apply this patch because the problem that audit_receive called by auditd 
process is hung up by other process (systemd) which has already called it is 
fixed.

This patch fixes the problem that auditd hangs up by itself.

Thanks.

---
The backlog cannot be consumed when audit_log_start is running on auditd
even if audit_log_start calls wait_for_auditd to consume it.
The situation is the deadlock because only auditd can consume the backlog.
If the other process needs to send the backlog, it can be also stopped 
by the deadlock.

So, audit_log_start running on auditd should not stop.

You can see the deadlock with the following reproducer:
 # auditctl -a exit,always -S all
 # reboot

Signed-off-by: Toshiyuki Okajima toshi.okaj...@jp.fujitsu.com
Reviewed-by: gaof...@cn.fujitsu.com
---
 kernel/audit.c |   14 --
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 7b0e23a..29cfc94 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1095,7 +1095,8 @@ struct audit_buffer *audit_log_start(struct audit_context 
*ctx, gfp_t gfp_mask,
struct audit_buffer *ab = NULL;
struct timespec t;
unsigned intuninitialized_var(serial);
-   int reserve;
+   int reserve = 5; /* Allow atomic callers to go up to five
+   entries over the normal backlog limit */
unsigned long timeout_start = jiffies;
 
if (audit_initialized != AUDIT_INITIALIZED)
@@ -1104,11 +1105,12 @@ struct audit_buffer *audit_log_start(struct 
audit_context *ctx, gfp_t gfp_mask,
if (unlikely(audit_filter_type(type)))
return NULL;
 
-   if (gfp_mask  __GFP_WAIT)
-   reserve = 0;
-   else
-   reserve = 5; /* Allow atomic callers to go up to five
-   entries over the normal backlog limit */
+   if (gfp_mask  __GFP_WAIT) {
+   if (audit_pid  audit_pid == current-pid)
+   gfp_mask = ~__GFP_WAIT;
+   else
+   reserve = 0;
+   }
 
while (audit_backlog_limit
skb_queue_len(audit_skb_queue)  audit_backlog_limit + 
reserve) {
-- 
1.5.5.6
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop

2013-10-28 Thread Toshiyuki Okajima
Hi.

(2013/10/26 0:12), Eric Paris wrote:
> On Fri, 2013-10-25 at 10:36 +0900, Toshiyuki Okajima wrote:
> 
>> systemd|auditd
>> ---+---
>> ...|
>> -> audit_receive   |...
>>-> mutex_lock(_cmd_mutex) |-> audit_receive
>>   ... -> audit_log_start   |   -> 
>> mutex_lock(_cmd_mutex)
>>  -> wait_for_auditd|  // wait for systemd
>> -> schedule_timeout(60*HZ) |
> 

> Ugggh, definitely a problem.  Adding a similar hack to systemd really
> does not seem like an acceptable answer.  It seems to me that in
I think so, too. We should fix it against the various cases.

> audit_receive_msg()
> 
> case AUDIT_USER:
> case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
> case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
> 
> we do not need to hold the audit_cmd_mutex.  So a quick and dirty patch
> should be to just drop the mutex there (and we need to verify there
> aren't issues running the audit_filter_user() without the lock).  That
> will take care of systemd and anything USING audit.  It still means that
> you could race with something configuring audit and auditd shutting
> down.  Seems like a good quick and dirty 'fix' while we work on a better
> fix...
> 
> To take care of that I think maybe we could drop the cmd_mutex every
> time we call audit_log_start.  That's not necessarily going to be
> pretty.  Maybe make a new switch at the top of the function which knows
> which operations we are going to have to allocate an audit_buffer.  Drop
> the lock, allocate the buffer, then retake the lock to finish running
> audit_receive_msg()
> 

> Maybe that second option isn't so hard and we can go directly after that
> instead of just dealing with userspace audit messages?
> 
> Thoughts?
Does it mean that we can also fix the problem only in the userspace? 

Even if we fix userspace process (auditd, readahead-collector and systemd) 
only, 
the problem would happen again if a new userspace audit process is implemented.
Therefore, I think we should fix only in the kernel.
Sorry, but I don't have clear method to fix it.

Regards,
Toshiyuki Okajima

> 
> 
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop

2013-10-28 Thread Toshiyuki Okajima
Hi.

(2013/10/26 0:12), Eric Paris wrote:
 On Fri, 2013-10-25 at 10:36 +0900, Toshiyuki Okajima wrote:
 
 systemd|auditd
 ---+---
 ...|
 - audit_receive   |...
- mutex_lock(audit_cmd_mutex) |- audit_receive
   ... - audit_log_start   |   - 
 mutex_lock(audit_cmd_mutex)
  - wait_for_auditd|  // wait for systemd
 - schedule_timeout(60*HZ) |
 

 Ugggh, definitely a problem.  Adding a similar hack to systemd really
 does not seem like an acceptable answer.  It seems to me that in
I think so, too. We should fix it against the various cases.

 audit_receive_msg()
 
 case AUDIT_USER:
 case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
 case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
 
 we do not need to hold the audit_cmd_mutex.  So a quick and dirty patch
 should be to just drop the mutex there (and we need to verify there
 aren't issues running the audit_filter_user() without the lock).  That
 will take care of systemd and anything USING audit.  It still means that
 you could race with something configuring audit and auditd shutting
 down.  Seems like a good quick and dirty 'fix' while we work on a better
 fix...
 
 To take care of that I think maybe we could drop the cmd_mutex every
 time we call audit_log_start.  That's not necessarily going to be
 pretty.  Maybe make a new switch at the top of the function which knows
 which operations we are going to have to allocate an audit_buffer.  Drop
 the lock, allocate the buffer, then retake the lock to finish running
 audit_receive_msg()
 

 Maybe that second option isn't so hard and we can go directly after that
 instead of just dealing with userspace audit messages?
 
 Thoughts?
Does it mean that we can also fix the problem only in the userspace? 

Even if we fix userspace process (auditd, readahead-collector and systemd) 
only, 
the problem would happen again if a new userspace audit process is implemented.
Therefore, I think we should fix only in the kernel.
Sorry, but I don't have clear method to fix it.

Regards,
Toshiyuki Okajima

 
 
 



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop

2013-10-24 Thread Toshiyuki Okajima
Hi.

Thanks for your interest.

(2013/10/24 4:55), Richard Guy Briggs wrote:
> On Tue, Oct 15, 2013 at 02:30:34PM +0800, Gao feng wrote:
>> Hi Toshiyuki-san,
> 
> Toshiuki and Gao,
> 
>> On 10/15/2013 12:43 PM, Toshiyuki Okajima wrote:
>>> The backlog cannot be consumed when audit_log_start is running on auditd
>>> even if audit_log_start calls wait_for_auditd to consume it.
>>> The situation is a deadlock because only auditd can consume the backlog.
>>> If the other process needs to send the backlog, it can be also stopped 
>>> by the deadlock.
>>>
>>> So, audit_log_start running on auditd should not stop.
>>>
>>> You can see the deadlock with the following reproducer:
>>>  # auditctl -a exit,always -S all
>>>  # reboot
> 
>> Hmm, I see, There may be other code paths that auditd can call 
>> audit_log_start except
>> audit_log_config_change. so it's better to handle this problem in 
>> audit_log_start.
>>
>> but current task is only meaningful when gfp_mask & __GFP_WAIT is true.
>> so maybe the below patch is what you want.
> 
> I have been following this thread with interest.  I like the general
> evolution of this patch.  The first patch was a bit too abrupt, dropping
> too much, but this one makes much more sense.  I would be tempted to
> make the reserve even bigger.
> 
> I see that you should be using a kernel that has included commit
> 8ac1c8d5 (which made it into v3.12-rc3)
>   audit: fix endless wait in audit_log_start()
> That was an obvious bug, but I was still concerned about the cause of
> the initial wait.  There are other fixes and ideas in the works that
> should alleviate some of the pressure to make the service more usable.
>   https://lkml.org/lkml/2013/9/18/453
> 

> I have tested with and without this v3 patch and I don't see any
> significant difference with the reproducer provided above.  I'm also
> testing with a reproducer of the endless wait bug (readahead-collector).
> 
> What are your expected results?  What are your actual results in each
> case?  How are they different?
OK. I try to describe.

1) I found that the system cannot reboot smoothly because auditd daemon can 
stop for 60 seconds. 

2) As the result of my investigation of this problem, I found that 
audit_log_start executed
on auditd called wait_for_auditd and stopped for "audit_backlog_wait_time". 
Calling wait_for_auditd is to let auditd consume the backlog during this sleep.
However, when wait_for_auditd is called on auditd, the backlog cannot be 
consumed during this time.
In addition, audit_log_start called by other processes can also stop while 
auditd is stopping.
So, wait_for_auditd called on auditd is meaningless.

3) Therefore, I made the v3 patch that audit_log_start executed on auditd 
doesn't call wait_for_auditd. 

4) On my environment, I couldn't see the situation that auditd stopped for 60 
seconds by using
this patch. 

5) So, I thought this problem could be fixed.


But after I received your comment, I encountered the other problem that auditd 
can stop for 
60 seconds when I tried this patch on the other environment installed "systemd".
So, I think you encountered the other problem that was the same as mine.

The following situation was occurred:
zz
PID: 1  TASK: 88007c7d8000  CPU: 0   COMMAND: "systemd"
 #0 [88007c7758d8] __schedule at 815f231a
 #1 [88007c775950] schedule at 815f2a79
 #2 [88007c775960] schedule_timeout at 815f0f93
 #3 [88007c775a10] audit_log_start at 810d7d09
 #4 [88007c775ab0] audit_log_common_recv_msg at 810d8068
 #5 [88007c775b00] audit_receive_msg at 810d8ca0
 #6 [88007c775bb0] audit_receive at 810d8f98
 #7 [88007c775be0] netlink_unicast at 8150bf31
 #8 [88007c775c30] netlink_sendmsg at 8150c2b1
 #9 [88007c775cc0] sock_sendmsg at 814ca3fc
#10 [88007c775e50] sys_sendto at 814ccd6d
#11 [88007c775f80] system_call_fastpath at 815fc259
RIP: 7fbee5653543  RSP: 7fff23d684b8  RFLAGS: 0246
RAX: 002c  RBX: 815fc259  RCX: 
RDX: 0078  RSI: 7fff23d5d8c0  RDI: 0003
RBP: 0003   R8: 7fff23d5d8b0   R9: 000c
R10:   R11: 0246  R12: 0050
R13: 7fff23d66380  R14: 046b  R15: 0067
ORIG_RAX: 002c  CS: 0033  SS: 002b

PID: 488TASK: 880036edcbf0  CPU: 1   COMMAND: "auditd"
 #0 [88007ba93a98] __schedule at 815f231a
 #1 [88007ba93b10] schedule at 815

Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop

2013-10-24 Thread Toshiyuki Okajima
Hi.

Thanks for your interest.

(2013/10/24 4:55), Richard Guy Briggs wrote:
 On Tue, Oct 15, 2013 at 02:30:34PM +0800, Gao feng wrote:
 Hi Toshiyuki-san,
 
 Toshiuki and Gao,
 
 On 10/15/2013 12:43 PM, Toshiyuki Okajima wrote:
 The backlog cannot be consumed when audit_log_start is running on auditd
 even if audit_log_start calls wait_for_auditd to consume it.
 The situation is a deadlock because only auditd can consume the backlog.
 If the other process needs to send the backlog, it can be also stopped 
 by the deadlock.

 So, audit_log_start running on auditd should not stop.

 You can see the deadlock with the following reproducer:
  # auditctl -a exit,always -S all
  # reboot
 
 Hmm, I see, There may be other code paths that auditd can call 
 audit_log_start except
 audit_log_config_change. so it's better to handle this problem in 
 audit_log_start.

 but current task is only meaningful when gfp_mask  __GFP_WAIT is true.
 so maybe the below patch is what you want.
 
 I have been following this thread with interest.  I like the general
 evolution of this patch.  The first patch was a bit too abrupt, dropping
 too much, but this one makes much more sense.  I would be tempted to
 make the reserve even bigger.
 
 I see that you should be using a kernel that has included commit
 8ac1c8d5 (which made it into v3.12-rc3)
   audit: fix endless wait in audit_log_start()
 That was an obvious bug, but I was still concerned about the cause of
 the initial wait.  There are other fixes and ideas in the works that
 should alleviate some of the pressure to make the service more usable.
   https://lkml.org/lkml/2013/9/18/453
 

 I have tested with and without this v3 patch and I don't see any
 significant difference with the reproducer provided above.  I'm also
 testing with a reproducer of the endless wait bug (readahead-collector).
 
 What are your expected results?  What are your actual results in each
 case?  How are they different?
OK. I try to describe.

1) I found that the system cannot reboot smoothly because auditd daemon can 
stop for 60 seconds. 

2) As the result of my investigation of this problem, I found that 
audit_log_start executed
on auditd called wait_for_auditd and stopped for audit_backlog_wait_time. 
Calling wait_for_auditd is to let auditd consume the backlog during this sleep.
However, when wait_for_auditd is called on auditd, the backlog cannot be 
consumed during this time.
In addition, audit_log_start called by other processes can also stop while 
auditd is stopping.
So, wait_for_auditd called on auditd is meaningless.

3) Therefore, I made the v3 patch that audit_log_start executed on auditd 
doesn't call wait_for_auditd. 

4) On my environment, I couldn't see the situation that auditd stopped for 60 
seconds by using
this patch. 

5) So, I thought this problem could be fixed.


But after I received your comment, I encountered the other problem that auditd 
can stop for 
60 seconds when I tried this patch on the other environment installed systemd.
So, I think you encountered the other problem that was the same as mine.

The following situation was occurred:
zz
PID: 1  TASK: 88007c7d8000  CPU: 0   COMMAND: systemd
 #0 [88007c7758d8] __schedule at 815f231a
 #1 [88007c775950] schedule at 815f2a79
 #2 [88007c775960] schedule_timeout at 815f0f93
 #3 [88007c775a10] audit_log_start at 810d7d09
 #4 [88007c775ab0] audit_log_common_recv_msg at 810d8068
 #5 [88007c775b00] audit_receive_msg at 810d8ca0
 #6 [88007c775bb0] audit_receive at 810d8f98
 #7 [88007c775be0] netlink_unicast at 8150bf31
 #8 [88007c775c30] netlink_sendmsg at 8150c2b1
 #9 [88007c775cc0] sock_sendmsg at 814ca3fc
#10 [88007c775e50] sys_sendto at 814ccd6d
#11 [88007c775f80] system_call_fastpath at 815fc259
RIP: 7fbee5653543  RSP: 7fff23d684b8  RFLAGS: 0246
RAX: 002c  RBX: 815fc259  RCX: 
RDX: 0078  RSI: 7fff23d5d8c0  RDI: 0003
RBP: 0003   R8: 7fff23d5d8b0   R9: 000c
R10:   R11: 0246  R12: 0050
R13: 7fff23d66380  R14: 046b  R15: 0067
ORIG_RAX: 002c  CS: 0033  SS: 002b

PID: 488TASK: 880036edcbf0  CPU: 1   COMMAND: auditd
 #0 [88007ba93a98] __schedule at 815f231a
 #1 [88007ba93b10] schedule at 815f2a79
 #2 [88007ba93b20] schedule_preempt_disabled at 815f2d0e
 #3 [88007ba93b30] __mutex_lock_slowpath at 815f1903
 #4 [88007ba93b90] mutex_lock at 815f143a
 #5 [88007ba93bb0] audit_receive at 810d8f71
 #6 [88007ba93be0] netlink_unicast at 8150bf31
 #7 [88007ba93c30] netlink_sendmsg

[BUG][PATCH V3] audit: audit_log_start running on auditd should not stop

2013-10-15 Thread Toshiyuki Okajima
The backlog cannot be consumed when audit_log_start is running on auditd
even if audit_log_start calls wait_for_auditd to consume it.
The situation is the deadlock because only auditd can consume the backlog.
If the other process needs to send the backlog, it can be also stopped 
by the deadlock.

So, audit_log_start running on auditd should not stop.

You can see the deadlock with the following reproducer:
 # auditctl -a exit,always -S all
 # reboot

Signed-off-by: Toshiyuki Okajima 
Cc: gaof...@cn.fujitsu.com
---
 kernel/audit.c |   14 --
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 7b0e23a..29cfc94 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1095,7 +1095,8 @@ struct audit_buffer *audit_log_start(struct audit_context 
*ctx, gfp_t gfp_mask,
struct audit_buffer *ab = NULL;
struct timespec t;
unsigned intuninitialized_var(serial);
-   int reserve;
+   int reserve = 5; /* Allow atomic callers to go up to five
+   entries over the normal backlog limit */
unsigned long timeout_start = jiffies;
 
if (audit_initialized != AUDIT_INITIALIZED)
@@ -1104,11 +1105,12 @@ struct audit_buffer *audit_log_start(struct 
audit_context *ctx, gfp_t gfp_mask,
if (unlikely(audit_filter_type(type)))
return NULL;
 
-   if (gfp_mask & __GFP_WAIT)
-   reserve = 0;
-   else
-   reserve = 5; /* Allow atomic callers to go up to five
-   entries over the normal backlog limit */
+   if (gfp_mask & __GFP_WAIT) {
+   if (audit_pid && audit_pid == current->pid)
+   gfp_mask &= ~__GFP_WAIT;
+   else
+   reserve = 0;
+   }
 
while (audit_backlog_limit
   && skb_queue_len(_skb_queue) > audit_backlog_limit + 
reserve) {
-- 
1.5.5.6
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop

2013-10-15 Thread Toshiyuki Okajima
Hi Gao-san,

(2013/10/15 15:30), Gao feng wrote:
> Hi Toshiyuki-san,
> On 10/15/2013 12:43 PM, Toshiyuki Okajima wrote:
>> The backlog cannot be consumed when audit_log_start is running on auditd
>> even if audit_log_start calls wait_for_auditd to consume it.
>> The situation is a deadlock because only auditd can consume the backlog.
>> If the other process needs to send the backlog, it can be also stopped 
>> by the deadlock.
>>
>> So, audit_log_start running on auditd should not stop.
>>
>> You can see the deadlock with the following reproducer:
>>  # auditctl -a exit,always -S all
>>  # reboot
>>
>> Signed-off-by: Toshiyuki Okajima 
>> Cc: gaof...@cn.fujitsu.com
>> ---
>>  kernel/audit.c |3 +++
>>  1 files changed, 3 insertions(+), 0 deletions(-)
>>
>> diff --git a/kernel/audit.c b/kernel/audit.c
>> index 7b0e23a..ce1fb38 100644
>> --- a/kernel/audit.c
>> +++ b/kernel/audit.c
>> @@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct 
>> audit_context *ctx, gfp_t gfp_mask,
>>  int reserve;
>>  unsigned long timeout_start = jiffies;
>>  
>> +if (audit_pid && audit_pid == current->pid)
>> +gfp_mask &= ~__GFP_WAIT;
>> +
>>  if (audit_initialized != AUDIT_INITIALIZED)
>>  return NULL;
>>  
>>
> 

> Hmm, I see, There may be other code paths that auditd can call 
> audit_log_start except
Yeah. I found it when I reviewed the code paths to audit_log_start after I got 
your comment.

sys_sendto
-> sock_sendmsg
   -> netlink_unicast
  -> audit_receive
 -> audit_receive_skb  
-> audit_receive_msg
   -> audit_log_config_change
  -> audit_log_start
   -> audit_log_common_recv_msg ***
  -> audit_log_start***

> audit_log_config_change. so it's better to handle this problem in 
> audit_log_start.
> 
> but current task is only meaningful when gfp_mask & __GFP_WAIT is true.
> so maybe the below patch is what you want.
OK. 

I considered the code size, but I didn't consider the execution speed.
audit_log_start doesn't always need to execute "if (audit_pid && audit_pid == 
current->pid)"! 

I will send your revised patch later.

Thanks,
Toshiyuki Okajima

> 
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 7b0e23a..10b4545 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -1095,7 +1095,9 @@ struct audit_buffer *audit_log_start(struct 
> audit_context
> struct audit_buffer *ab = NULL;
> struct timespec t;
> unsigned intuninitialized_var(serial);
> -   int reserve;
> +   int reserve = 5; /* Allow atomic callers to go up to five
> +   entries over the normal backlog limit */
> +
> unsigned long timeout_start = jiffies;
> 
> if (audit_initialized != AUDIT_INITIALIZED)
> @@ -1104,11 +1106,12 @@ struct audit_buffer *audit_log_start(struct 
> audit_contex
> if (unlikely(audit_filter_type(type)))
> return NULL;
> 
> -   if (gfp_mask & __GFP_WAIT)
> -   reserve = 0;
> -   else
> -   reserve = 5; /* Allow atomic callers to go up to five
> -   entries over the normal backlog limit */
> +   if (gfp_mask & __GFP_WAIT) {
> +   if (audit_pid && audit_pid == current->pid)
> +   gfp_mask &= ~__GFP_WAIT;
> +   else
> +   reserve = 0;
> +   }
> 
> while (audit_backlog_limit
>&& skb_queue_len(_skb_queue) > audit_backlog_limit + 
> reserv
> 
> 
> Thanks
> 
> 




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop

2013-10-15 Thread Toshiyuki Okajima
Hi Gao-san,

(2013/10/15 15:30), Gao feng wrote:
 Hi Toshiyuki-san,
 On 10/15/2013 12:43 PM, Toshiyuki Okajima wrote:
 The backlog cannot be consumed when audit_log_start is running on auditd
 even if audit_log_start calls wait_for_auditd to consume it.
 The situation is a deadlock because only auditd can consume the backlog.
 If the other process needs to send the backlog, it can be also stopped 
 by the deadlock.

 So, audit_log_start running on auditd should not stop.

 You can see the deadlock with the following reproducer:
  # auditctl -a exit,always -S all
  # reboot

 Signed-off-by: Toshiyuki Okajima toshi.okaj...@jp.fujitsu.com
 Cc: gaof...@cn.fujitsu.com
 ---
  kernel/audit.c |3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

 diff --git a/kernel/audit.c b/kernel/audit.c
 index 7b0e23a..ce1fb38 100644
 --- a/kernel/audit.c
 +++ b/kernel/audit.c
 @@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct 
 audit_context *ctx, gfp_t gfp_mask,
  int reserve;
  unsigned long timeout_start = jiffies;
  
 +if (audit_pid  audit_pid == current-pid)
 +gfp_mask = ~__GFP_WAIT;
 +
  if (audit_initialized != AUDIT_INITIALIZED)
  return NULL;
  

 

 Hmm, I see, There may be other code paths that auditd can call 
 audit_log_start except
Yeah. I found it when I reviewed the code paths to audit_log_start after I got 
your comment.

sys_sendto
- sock_sendmsg
   - netlink_unicast
  - audit_receive
 - audit_receive_skb  
- audit_receive_msg
   - audit_log_config_change
  - audit_log_start
   - audit_log_common_recv_msg ***
  - audit_log_start***

 audit_log_config_change. so it's better to handle this problem in 
 audit_log_start.
 
 but current task is only meaningful when gfp_mask  __GFP_WAIT is true.
 so maybe the below patch is what you want.
OK. 

I considered the code size, but I didn't consider the execution speed.
audit_log_start doesn't always need to execute if (audit_pid  audit_pid == 
current-pid)! 

I will send your revised patch later.

Thanks,
Toshiyuki Okajima

 
 diff --git a/kernel/audit.c b/kernel/audit.c
 index 7b0e23a..10b4545 100644
 --- a/kernel/audit.c
 +++ b/kernel/audit.c
 @@ -1095,7 +1095,9 @@ struct audit_buffer *audit_log_start(struct 
 audit_context
 struct audit_buffer *ab = NULL;
 struct timespec t;
 unsigned intuninitialized_var(serial);
 -   int reserve;
 +   int reserve = 5; /* Allow atomic callers to go up to five
 +   entries over the normal backlog limit */
 +
 unsigned long timeout_start = jiffies;
 
 if (audit_initialized != AUDIT_INITIALIZED)
 @@ -1104,11 +1106,12 @@ struct audit_buffer *audit_log_start(struct 
 audit_contex
 if (unlikely(audit_filter_type(type)))
 return NULL;
 
 -   if (gfp_mask  __GFP_WAIT)
 -   reserve = 0;
 -   else
 -   reserve = 5; /* Allow atomic callers to go up to five
 -   entries over the normal backlog limit */
 +   if (gfp_mask  __GFP_WAIT) {
 +   if (audit_pid  audit_pid == current-pid)
 +   gfp_mask = ~__GFP_WAIT;
 +   else
 +   reserve = 0;
 +   }
 
 while (audit_backlog_limit
 skb_queue_len(audit_skb_queue)  audit_backlog_limit + 
 reserv
 
 
 Thanks
 
 




--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[BUG][PATCH V3] audit: audit_log_start running on auditd should not stop

2013-10-15 Thread Toshiyuki Okajima
The backlog cannot be consumed when audit_log_start is running on auditd
even if audit_log_start calls wait_for_auditd to consume it.
The situation is the deadlock because only auditd can consume the backlog.
If the other process needs to send the backlog, it can be also stopped 
by the deadlock.

So, audit_log_start running on auditd should not stop.

You can see the deadlock with the following reproducer:
 # auditctl -a exit,always -S all
 # reboot

Signed-off-by: Toshiyuki Okajima toshi.okaj...@jp.fujitsu.com
Cc: gaof...@cn.fujitsu.com
---
 kernel/audit.c |   14 --
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 7b0e23a..29cfc94 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1095,7 +1095,8 @@ struct audit_buffer *audit_log_start(struct audit_context 
*ctx, gfp_t gfp_mask,
struct audit_buffer *ab = NULL;
struct timespec t;
unsigned intuninitialized_var(serial);
-   int reserve;
+   int reserve = 5; /* Allow atomic callers to go up to five
+   entries over the normal backlog limit */
unsigned long timeout_start = jiffies;
 
if (audit_initialized != AUDIT_INITIALIZED)
@@ -1104,11 +1105,12 @@ struct audit_buffer *audit_log_start(struct 
audit_context *ctx, gfp_t gfp_mask,
if (unlikely(audit_filter_type(type)))
return NULL;
 
-   if (gfp_mask  __GFP_WAIT)
-   reserve = 0;
-   else
-   reserve = 5; /* Allow atomic callers to go up to five
-   entries over the normal backlog limit */
+   if (gfp_mask  __GFP_WAIT) {
+   if (audit_pid  audit_pid == current-pid)
+   gfp_mask = ~__GFP_WAIT;
+   else
+   reserve = 0;
+   }
 
while (audit_backlog_limit
skb_queue_len(audit_skb_queue)  audit_backlog_limit + 
reserve) {
-- 
1.5.5.6
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[BUG][PATCH] audit: audit_log_start running on auditd should not stop

2013-10-14 Thread Toshiyuki Okajima
The backlog cannot be consumed when audit_log_start is running on auditd
even if audit_log_start calls wait_for_auditd to consume it.
The situation is a deadlock because only auditd can consume the backlog.
If the other process needs to send the backlog, it can be also stopped 
by the deadlock.

So, audit_log_start running on auditd should not stop.

You can see the deadlock with the following reproducer:
 # auditctl -a exit,always -S all
 # reboot

Signed-off-by: Toshiyuki Okajima 
Cc: gaof...@cn.fujitsu.com
---
 kernel/audit.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 7b0e23a..ce1fb38 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct audit_context 
*ctx, gfp_t gfp_mask,
int reserve;
unsigned long timeout_start = jiffies;
 
+   if (audit_pid && audit_pid == current->pid)
+   gfp_mask &= ~__GFP_WAIT;
+
if (audit_initialized != AUDIT_INITIALIZED)
return NULL;
 
-- 
1.5.5.6
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[BUG][PATCH] audit: audit_log_start running on auditd should not stop

2013-10-14 Thread Toshiyuki Okajima
The backlog cannot be consumed when audit_log_start is running on auditd
even if audit_log_start calls wait_for_auditd to consume it.
The situation is a deadlock because only auditd can consume the backlog.
If the other process needs to send the backlog, it can be also stopped 
by the deadlock.

So, audit_log_start running on auditd should not stop.

You can see the deadlock with the following reproducer:
 # auditctl -a exit,always -S all
 # reboot

Signed-off-by: Toshiyuki Okajima toshi.okaj...@jp.fujitsu.com
Cc: gaof...@cn.fujitsu.com
---
 kernel/audit.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 7b0e23a..ce1fb38 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct audit_context 
*ctx, gfp_t gfp_mask,
int reserve;
unsigned long timeout_start = jiffies;
 
+   if (audit_pid  audit_pid == current-pid)
+   gfp_mask = ~__GFP_WAIT;
+
if (audit_initialized != AUDIT_INITIALIZED)
return NULL;
 
-- 
1.5.5.6
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG][PATCH][RFC] audit: hang up in audit_log_start executed on auditd

2013-10-11 Thread Toshiyuki Okajima (smtp-b.css)
Gao-san,

(2013/10/11 18:33), Gao feng wrote:
> On 10/11/2013 09:36 AM, Toshiyuki Okajima wrote:
>> Hi. 
>>
>> The following reproducer causes auditd daemon hang up.
>> (But the hang up is released after the audit_backlog_wait_time passes.)
>>  # auditctl -a exit,always -S all
>>  # reboot
>>
>>
>> I reproduced the hangup on KVM, and then got a crash dump.
>> After I analyzed the dump, I found auditd daemon hung up in audit_log_start. 
>> (I have confirmed it on linux-3.12-rc4.)
>>
>> Like this:
>> crash> bt 1426
>> PID: 1426   TASK: 88007b63e040  CPU: 1   COMMAND: "auditd"
>>  #0 [88007cb93918] __schedule at 8155d980
>>  #1 [88007cb939b0] schedule at 8155de99
>>  #2 [88007cb939c0] schedule_timeout at 8155b840
>>  #3 [88007cb93a60] audit_log_start at 810d3ce5
>>  #4 [88007cb93b20] audit_log_config_change at 810d3ece
>>  #5 [88007cb93b60] audit_receive_msg at 810d4fd6
>>  #6 [88007cb93c00] audit_receive at 810d5173
>>  #7 [88007cb93c30] netlink_unicast at 814c5269
>>  #8 [88007cb93c90] netlink_sendmsg at 814c6386
>>  #9 [88007cb93d20] sock_sendmsg at 814813c0
>> #10 [88007cb93e30] SYSC_sendto at 81481524
>> #11 [88007cb93f70] sys_sendto at 8148157e
>> #12 [88007cb93f80] system_call_fastpath at 81568052
>> RIP: 7f5c47f7fba3  RSP: 7fffcf21a118  RFLAGS: 00010202
>> RAX: 002c  RBX: 81568052  RCX: 
>> RDX: 0030  RSI: 7fffcf21e7d0  RDI: 0003
>> RBP: 7fffcf21e7d0   R8: 7fffcf21a130   R9: 000c
>> R10:   R11: 0293  R12: 8148157e
>> R13: 88007cb93f78  R14: 0020  R15: 0030
>> ORIG_RAX: 002c  CS: 0033  SS: 002b
>>
>>
>> The reason is that auditd daemon itself cannot consume its backlog 
>> while audit_log_start is calling schedule_timeout on auditd daemon.  
>> So, that is a deadlock!
>>
>> Therefore, I think audit_log_start shouldn't handle auditd's backlog
>> when auditd daemon executes audit_log_start.
>>
>> For example, I made the following fix patch.
>> --
>> auditd daemon can execute the audit_log_start, and then it can cause 
>> a hang up because only auditd daemon can consume the backlog.
>> So, audit_log_start executed by auditd daemon should not handle the backlog 
>> in case auditd daemon hangs up (while wait_for_auditd is calling).
>>
>> Signed-off-by: Toshiyuki Okajima 
>> ---
>>  kernel/audit.c |3 +++
>>  1 files changed, 3 insertions(+), 0 deletions(-)
>>
>> diff --git a/kernel/audit.c b/kernel/audit.c
>> index 7b0e23a..86c389e 100644
>> --- a/kernel/audit.c
>> +++ b/kernel/audit.c
>> @@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct 
>> audit_context *ctx, gfp_t gfp_mask,
>>  int reserve;
>>  unsigned long timeout_start = jiffies;
>>  
>> +if (audit_pid && (audit_pid == current->pid))
>> +return NULL;
>> +
> 
> audit_log_start can be called in interrupt context, such as iptables AUDIT 
> module,
> we can't use current here.
> please try the patch below.
> 
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 7b0e23a..1f35f3d 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -270,9 +270,13 @@ static int audit_log_config_change(char *function_name, 
> int new, int old,
>int allow_changes)
>  {
> struct audit_buffer *ab;
> +   gfp_t gfp_mask = GFP_KERNEL;
> int rc = 0;
> 
> -   ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
> +   if (audit_pid && audit_pid == current->pid)
> +   gfp_mask = GFP_ATOMIC;
> +
> +   ab = audit_log_start(NULL, gfp_mask, AUDIT_CONFIG_CHANGE);
> if (unlikely(!ab))
> return rc;
> audit_log_format(ab, "%s=%d old=%d", function_name, new, old);
> 
> 

Thanks for your suggestion. I will try it.

Regards,
Toshiyuki Okajima


> Thanks
> 
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG][PATCH][RFC] audit: hang up in audit_log_start executed on auditd

2013-10-11 Thread Toshiyuki Okajima (smtp-b.css)
Gao-san,

(2013/10/11 18:33), Gao feng wrote:
 On 10/11/2013 09:36 AM, Toshiyuki Okajima wrote:
 Hi. 

 The following reproducer causes auditd daemon hang up.
 (But the hang up is released after the audit_backlog_wait_time passes.)
  # auditctl -a exit,always -S all
  # reboot


 I reproduced the hangup on KVM, and then got a crash dump.
 After I analyzed the dump, I found auditd daemon hung up in audit_log_start. 
 (I have confirmed it on linux-3.12-rc4.)

 Like this:
 crash bt 1426
 PID: 1426   TASK: 88007b63e040  CPU: 1   COMMAND: auditd
  #0 [88007cb93918] __schedule at 8155d980
  #1 [88007cb939b0] schedule at 8155de99
  #2 [88007cb939c0] schedule_timeout at 8155b840
  #3 [88007cb93a60] audit_log_start at 810d3ce5
  #4 [88007cb93b20] audit_log_config_change at 810d3ece
  #5 [88007cb93b60] audit_receive_msg at 810d4fd6
  #6 [88007cb93c00] audit_receive at 810d5173
  #7 [88007cb93c30] netlink_unicast at 814c5269
  #8 [88007cb93c90] netlink_sendmsg at 814c6386
  #9 [88007cb93d20] sock_sendmsg at 814813c0
 #10 [88007cb93e30] SYSC_sendto at 81481524
 #11 [88007cb93f70] sys_sendto at 8148157e
 #12 [88007cb93f80] system_call_fastpath at 81568052
 RIP: 7f5c47f7fba3  RSP: 7fffcf21a118  RFLAGS: 00010202
 RAX: 002c  RBX: 81568052  RCX: 
 RDX: 0030  RSI: 7fffcf21e7d0  RDI: 0003
 RBP: 7fffcf21e7d0   R8: 7fffcf21a130   R9: 000c
 R10:   R11: 0293  R12: 8148157e
 R13: 88007cb93f78  R14: 0020  R15: 0030
 ORIG_RAX: 002c  CS: 0033  SS: 002b


 The reason is that auditd daemon itself cannot consume its backlog 
 while audit_log_start is calling schedule_timeout on auditd daemon.  
 So, that is a deadlock!

 Therefore, I think audit_log_start shouldn't handle auditd's backlog
 when auditd daemon executes audit_log_start.

 For example, I made the following fix patch.
 --
 auditd daemon can execute the audit_log_start, and then it can cause 
 a hang up because only auditd daemon can consume the backlog.
 So, audit_log_start executed by auditd daemon should not handle the backlog 
 in case auditd daemon hangs up (while wait_for_auditd is calling).

 Signed-off-by: Toshiyuki Okajima toshi.okaj...@jp.fujitsu.com
 ---
  kernel/audit.c |3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

 diff --git a/kernel/audit.c b/kernel/audit.c
 index 7b0e23a..86c389e 100644
 --- a/kernel/audit.c
 +++ b/kernel/audit.c
 @@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct 
 audit_context *ctx, gfp_t gfp_mask,
  int reserve;
  unsigned long timeout_start = jiffies;
  
 +if (audit_pid  (audit_pid == current-pid))
 +return NULL;
 +
 
 audit_log_start can be called in interrupt context, such as iptables AUDIT 
 module,
 we can't use current here.
 please try the patch below.
 
 diff --git a/kernel/audit.c b/kernel/audit.c
 index 7b0e23a..1f35f3d 100644
 --- a/kernel/audit.c
 +++ b/kernel/audit.c
 @@ -270,9 +270,13 @@ static int audit_log_config_change(char *function_name, 
 int new, int old,
int allow_changes)
  {
 struct audit_buffer *ab;
 +   gfp_t gfp_mask = GFP_KERNEL;
 int rc = 0;
 
 -   ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
 +   if (audit_pid  audit_pid == current-pid)
 +   gfp_mask = GFP_ATOMIC;
 +
 +   ab = audit_log_start(NULL, gfp_mask, AUDIT_CONFIG_CHANGE);
 if (unlikely(!ab))
 return rc;
 audit_log_format(ab, %s=%d old=%d, function_name, new, old);
 
 

Thanks for your suggestion. I will try it.

Regards,
Toshiyuki Okajima


 Thanks
 
 



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[BUG][PATCH][RFC] audit: hang up in audit_log_start executed on auditd

2013-10-10 Thread Toshiyuki Okajima
Hi. 

The following reproducer causes auditd daemon hang up.
(But the hang up is released after the audit_backlog_wait_time passes.)
 # auditctl -a exit,always -S all
 # reboot


I reproduced the hangup on KVM, and then got a crash dump.
After I analyzed the dump, I found auditd daemon hung up in audit_log_start. 
(I have confirmed it on linux-3.12-rc4.)

Like this:
crash> bt 1426
PID: 1426   TASK: 88007b63e040  CPU: 1   COMMAND: "auditd"
 #0 [88007cb93918] __schedule at 8155d980
 #1 [88007cb939b0] schedule at 8155de99
 #2 [88007cb939c0] schedule_timeout at 8155b840
 #3 [88007cb93a60] audit_log_start at 810d3ce5
 #4 [88007cb93b20] audit_log_config_change at 810d3ece
 #5 [88007cb93b60] audit_receive_msg at 810d4fd6
 #6 [88007cb93c00] audit_receive at 810d5173
 #7 [88007cb93c30] netlink_unicast at 814c5269
 #8 [88007cb93c90] netlink_sendmsg at 814c6386
 #9 [88007cb93d20] sock_sendmsg at 814813c0
#10 [88007cb93e30] SYSC_sendto at 81481524
#11 [88007cb93f70] sys_sendto at 8148157e
#12 [88007cb93f80] system_call_fastpath at 81568052
RIP: 7f5c47f7fba3  RSP: 7fffcf21a118  RFLAGS: 00010202
RAX: 002c  RBX: 81568052  RCX: 
RDX: 0030  RSI: 7fffcf21e7d0  RDI: 0003
RBP: 7fffcf21e7d0   R8: 7fffcf21a130   R9: 000c
R10:   R11: 0293  R12: 8148157e
R13: 88007cb93f78  R14: 0020  R15: 0030
ORIG_RAX: 002c  CS: 0033  SS: 002b


The reason is that auditd daemon itself cannot consume its backlog 
while audit_log_start is calling schedule_timeout on auditd daemon.  
So, that is a deadlock!

Therefore, I think audit_log_start shouldn't handle auditd's backlog
when auditd daemon executes audit_log_start.

For example, I made the following fix patch.
--
auditd daemon can execute the audit_log_start, and then it can cause 
a hang up because only auditd daemon can consume the backlog.
So, audit_log_start executed by auditd daemon should not handle the backlog 
in case auditd daemon hangs up (while wait_for_auditd is calling).

Signed-off-by: Toshiyuki Okajima 
---
 kernel/audit.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 7b0e23a..86c389e 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct audit_context 
*ctx, gfp_t gfp_mask,
int reserve;
unsigned long timeout_start = jiffies;
 
+   if (audit_pid && (audit_pid == current->pid))
+   return NULL;
+
if (audit_initialized != AUDIT_INITIALIZED)
return NULL;
 
-- 
1.5.5.6
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[BUG][PATCH][RFC] audit: hang up in audit_log_start executed on auditd

2013-10-10 Thread Toshiyuki Okajima
Hi. 

The following reproducer causes auditd daemon hang up.
(But the hang up is released after the audit_backlog_wait_time passes.)
 # auditctl -a exit,always -S all
 # reboot


I reproduced the hangup on KVM, and then got a crash dump.
After I analyzed the dump, I found auditd daemon hung up in audit_log_start. 
(I have confirmed it on linux-3.12-rc4.)

Like this:
crash bt 1426
PID: 1426   TASK: 88007b63e040  CPU: 1   COMMAND: auditd
 #0 [88007cb93918] __schedule at 8155d980
 #1 [88007cb939b0] schedule at 8155de99
 #2 [88007cb939c0] schedule_timeout at 8155b840
 #3 [88007cb93a60] audit_log_start at 810d3ce5
 #4 [88007cb93b20] audit_log_config_change at 810d3ece
 #5 [88007cb93b60] audit_receive_msg at 810d4fd6
 #6 [88007cb93c00] audit_receive at 810d5173
 #7 [88007cb93c30] netlink_unicast at 814c5269
 #8 [88007cb93c90] netlink_sendmsg at 814c6386
 #9 [88007cb93d20] sock_sendmsg at 814813c0
#10 [88007cb93e30] SYSC_sendto at 81481524
#11 [88007cb93f70] sys_sendto at 8148157e
#12 [88007cb93f80] system_call_fastpath at 81568052
RIP: 7f5c47f7fba3  RSP: 7fffcf21a118  RFLAGS: 00010202
RAX: 002c  RBX: 81568052  RCX: 
RDX: 0030  RSI: 7fffcf21e7d0  RDI: 0003
RBP: 7fffcf21e7d0   R8: 7fffcf21a130   R9: 000c
R10:   R11: 0293  R12: 8148157e
R13: 88007cb93f78  R14: 0020  R15: 0030
ORIG_RAX: 002c  CS: 0033  SS: 002b


The reason is that auditd daemon itself cannot consume its backlog 
while audit_log_start is calling schedule_timeout on auditd daemon.  
So, that is a deadlock!

Therefore, I think audit_log_start shouldn't handle auditd's backlog
when auditd daemon executes audit_log_start.

For example, I made the following fix patch.
--
auditd daemon can execute the audit_log_start, and then it can cause 
a hang up because only auditd daemon can consume the backlog.
So, audit_log_start executed by auditd daemon should not handle the backlog 
in case auditd daemon hangs up (while wait_for_auditd is calling).

Signed-off-by: Toshiyuki Okajima toshi.okaj...@jp.fujitsu.com
---
 kernel/audit.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 7b0e23a..86c389e 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct audit_context 
*ctx, gfp_t gfp_mask,
int reserve;
unsigned long timeout_start = jiffies;
 
+   if (audit_pid  (audit_pid == current-pid))
+   return NULL;
+
if (audit_initialized != AUDIT_INITIALIZED)
return NULL;
 
-- 
1.5.5.6
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[BUG] Failed to open a file after setgid

2013-07-03 Thread Toshiyuki Okajima
Hi guys!

I encountered that a file cannot be opened even if the process has a valid 
access authority for the file on linux-3.10. 

This problem is caused since the group list which getgroups() returns includes 
wrong ID as the group ID after setgid(). These groups include egid of the 
process 
before calling setgid().

Since man 3p getgroups says as follows, including egid of process itself is
no problem.

==
The getgroups() function shall fill in the array grouplist with the current
supplementary group IDs of  the  calling  process.  It  is 
implementation-defined
whether getgroups() also returns the effective group ID in the grouplist array.
==

However, it should be the current egid after setgid() instead of the past egid.
So, I consider this behavior is a bug. How do you think?

Here is the step to reproduce:

1. Create a file and set its file access permission/attribute as follows:
 # touch /dummy
 # chmod 505 /dummy
 # chown 0:0 /dummy
 # ls -lnd /
 drwxrwxrwx. 29 0 0 4096 Jul  4 11:46 /
 # ls -ln /dummy
 -r-x---r-x 1 0 0 0 Jul  4 11:46 /dummy
 # id
 uid=0(root) gid=0(root) groups=0(root),3(sys),4(adm),10(wheel)
 # su - tmpuser
 $ id
 uid=1000(tmpuser) gid=100(users) groups=100(users),99(nobody)

2. Run the following program as the privileged user (root)
  # cd /tmp
  # cat source_ng.c
---
#include 
#include 
#include 
#include 
#include 
#include 
#include 
int main(void)
{
int gid = setgid(100);
int uid = setuid(1000);
int fd;
int groups;
int i;
gid_t gids[10];

groups = getgroups(10, gids);
for (i = 0; i < groups; i++)
printf("[group %d] %d\n", i, gids[i]);
fd = open("/dummy", O_RDONLY);
if (fd < 0) {
fprintf(stderr, "open: %s(%d)\n", strerror(errno), errno);
return 1;
}
return 0;
}
  ---
  # make source_ng
  # /tmp/source_ng

  Additional information: 
The program terminates successfully when tmpuser runs it directly. 
  # su - tmpuser
  $ /tmp/source_ng
  $ echo $?
  0  


This problem happens since root group (0) is still included in the group list
which getgroups() returns. Then, the process is regarded as belonging to these 
groups and prohibited to access to the file.

Expected Result:
This program returns without any errors.
(getgroups() returns the group list including the primary group of the current 
user but not the one of the past user after setgid()) 
  # /tmp/source_ng
  [group 0] 100 
  [group 1] 3   
  [group 2] 4
  [group 3] 10

Actual Result:
This program returns with error (EACCES).
(getgroups() returns the group list including the primary group of the past
user but not the one of the current user after setgid()) 
  # /tmp/source_ng
  [group 0] 0   // I expect the value is the current egid but not the past egid.
  [group 1] 3   
  [group 2] 4
  [group 3] 10
  open: Permission denied(13)

Or, setgid() has not a bug but the parameter which the pam library and 
so on give setgroups() including the primary group is wrong? 

Thanks,
toshi.okajima
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[BUG] Failed to open a file after setgid

2013-07-03 Thread Toshiyuki Okajima
Hi guys!

I encountered that a file cannot be opened even if the process has a valid 
access authority for the file on linux-3.10. 

This problem is caused since the group list which getgroups() returns includes 
wrong ID as the group ID after setgid(). These groups include egid of the 
process 
before calling setgid().

Since man 3p getgroups says as follows, including egid of process itself is
no problem.

==
The getgroups() function shall fill in the array grouplist with the current
supplementary group IDs of  the  calling  process.  It  is 
implementation-defined
whether getgroups() also returns the effective group ID in the grouplist array.
==

However, it should be the current egid after setgid() instead of the past egid.
So, I consider this behavior is a bug. How do you think?

Here is the step to reproduce:

1. Create a file and set its file access permission/attribute as follows:
 # touch /dummy
 # chmod 505 /dummy
 # chown 0:0 /dummy
 # ls -lnd /
 drwxrwxrwx. 29 0 0 4096 Jul  4 11:46 /
 # ls -ln /dummy
 -r-x---r-x 1 0 0 0 Jul  4 11:46 /dummy
 # id
 uid=0(root) gid=0(root) groups=0(root),3(sys),4(adm),10(wheel)
 # su - tmpuser
 $ id
 uid=1000(tmpuser) gid=100(users) groups=100(users),99(nobody)

2. Run the following program as the privileged user (root)
  # cd /tmp
  # cat source_ng.c
---
#include unistd.h
#include sys/types.h
#include sys/stat.h
#include fcntl.h
#include stdio.h
#include string.h
#include errno.h
int main(void)
{
int gid = setgid(100);
int uid = setuid(1000);
int fd;
int groups;
int i;
gid_t gids[10];

groups = getgroups(10, gids);
for (i = 0; i  groups; i++)
printf([group %d] %d\n, i, gids[i]);
fd = open(/dummy, O_RDONLY);
if (fd  0) {
fprintf(stderr, open: %s(%d)\n, strerror(errno), errno);
return 1;
}
return 0;
}
  ---
  # make source_ng
  # /tmp/source_ng

  Additional information: 
The program terminates successfully when tmpuser runs it directly. 
  # su - tmpuser
  $ /tmp/source_ng
  $ echo $?
  0  


This problem happens since root group (0) is still included in the group list
which getgroups() returns. Then, the process is regarded as belonging to these 
groups and prohibited to access to the file.

Expected Result:
This program returns without any errors.
(getgroups() returns the group list including the primary group of the current 
user but not the one of the past user after setgid()) 
  # /tmp/source_ng
  [group 0] 100 
  [group 1] 3   
  [group 2] 4
  [group 3] 10

Actual Result:
This program returns with error (EACCES).
(getgroups() returns the group list including the primary group of the past
user but not the one of the current user after setgid()) 
  # /tmp/source_ng
  [group 0] 0   // I expect the value is the current egid but not the past egid.
  [group 1] 3   
  [group 2] 4
  [group 3] 10
  open: Permission denied(13)

Or, setgid() has not a bug but the parameter which the pam library and 
so on give setgroups() including the primary group is wrong? 

Thanks,
toshi.okajima
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/