Re: [PATCH] audit: optionally print warning after waiting to enqueue record

2020-06-18 Thread Richard Guy Briggs
On 2020-06-18 23:48, Max Englander wrote:
> On Wed, Jun 17, 2020 at 09:06:27PM -0400, Paul Moore wrote:
> > On Wed, Jun 17, 2020 at 6:54 PM Max Englander wrote:
> > > On Wed, Jun 17, 2020 at 02:47:19PM -0400, Paul Moore wrote:
> > > > On Tue, Jun 16, 2020 at 12:58 AM Max Englander wrote:
> > > > >
> > > > > In environments where security is prioritized, users may set
> > > > > --backlog_wait_time to a high value in order to reduce the likelihood
> > > > > that any audit event is lost, even though doing so may result in
> > > > > unpredictable performance if the kernel schedules a timeout when the
> > > > > backlog limit is exceeded. For these users, the next best thing to
> > > > > predictable performance is the ability to quickly detect and react to
> > > > > degraded performance. This patch proposes to aid the detection of kernel
> > > > > audit subsystem pauses through the following changes:
> > > > >
> > > > > Add a variable named audit_backlog_warn_time. Enforce the value of this
> > > > > variable to be no less than zero, and no more than the value of
> > > > > audit_backlog_wait_time.
> > > > >
> > > > > If audit_backlog_warn_time is greater than zero and if the total time
> > > > > spent waiting to enqueue an audit record is greater than or equal to
> > > > > audit_backlog_warn_time, then print a warning with the total time
> > > > > spent waiting.
> > > > >
> > > > > An example configuration:
> > > > >
> > > > > auditctl --backlog_warn_time 50
> > > > >
> > > > > An example warning message:
> > > > >
> > > > > audit: sleep_time=52 >= audit_backlog_warn_time=50
> > > > >
> > > > > Tested on Ubuntu 18.04.04 using complementary changes to the audit
> > > > > userspace: https://github.com/linux-audit/audit-userspace/pull/131.
> > > > >
> > > > > Signed-off-by: Max Englander 
> > > > > ---
> > > > >  include/uapi/linux/audit.h |  7 ++-
> > > > >  kernel/audit.c | 35 +++
> > > > >  2 files changed, 41 insertions(+), 1 deletion(-)
> > > >
> > > > If an admin is prioritizing security, aka don't lose any audit
> > > > records, and there is a concern over variable system latency due to an
> > > > audit queue backlog, why not simply disable the backlog limit?
> > > >
> > > > --
> > > > paul moore
> > > > www.paul-moore.com
> > >
> > > That’s good in some cases, but in other cases unbounded growth of the
> > > backlog could result in memory issues. If the kernel runs out of memory
> > > it would drop the audit event or possibly have other problems. It could
> > > also consume memory in a way that starves user workloads or causes
> > > them to be killed by the OOMKiller.
> > >
> > > To refine my motivating use case a bit, if a Kubernetes admin wants to
> > > prioritize security, and also avoid unbounded growth of the audit
> > > backlog, they may set -b and --backlog_wait_time in a way that limits
> > > kernel memory usage and reduces the likelihood that any audit event is
> > > lost. Occasional performance degradation may be acceptable to the admin,
> > > but they would like a way to be alerted to prolonged kernel pauses, so
> > > that they can investigate and take corrective action (increase backlog,
> > > increase server capacity, move some workloads to other servers, etc.).
> > >
> > > To state it another way: the kernel currently can be configured to print a
> > > message when the backlog limit is exceeded and it must discard the audit
> > > event. This is a useful message for admins, which they can address with
> > > corrective action. I think a message similar to the one proposed by this
> > > patch would be equally useful when the backlog limit is exceeded and the
> > > kernel is configured to wait for the backlog to drain. Admins could
> > > address that message in the same way, but without the cost of lost audit
> > > events.
> > 
> > I'm still struggling to understand how this is any better than
> > disabling the backlog limit, or setting it very high, and simply
> > monitoring the size of the audit backlog.  This way the admin
> > doesn't have to worry about the latency issues of a full backlog,
> > while still being able to trigger actions based on the state of the
> > backlog.  The userspace tooling/scripting to watch the backlog size
> > would be trivial, and would arguably provide much better visibility
> > into the backlog state than a single warning threshold in the kernel.
> > 
> > -- 
> > paul moore
> > www.paul-moore.com
> 
> Removing the backlog limit entirely could lead to the memory issues I
> mentioned above (lost audit events, out-of-memory errors), and would
> effectively make the backlog limit a function of free memory. Setting
> the backlog limit higher won’t necessarily prevent it from being
> exceeded on very busy systems where the rate of audit data generation
> can, for long periods of time, outpace the ability of auditd or a
> drop-in replacement to consume it. 
> 
> 

Re: Error handling of auditctl -w

2020-06-18 Thread Lenny Bruzenak

On 6/16/20 2:00 PM, Stefan Tauner wrote:


> Hi,
>
> I was wondering why my auditctl executions do not print any errors but
> apparently didn't do anything. After checking the return value (which
> was 255) I looked at the code and noticed that audit_setup_perms() and
> audit_update_watch_perms() have virtually no user-visible error reporting.


Are these in a rules file? I was thinking that perhaps a previous "-i"
(i.e. auditctl -i) preceded your rule; however, the man page says it
always returns a success code.

So probably not that, unless the man page doesn't match reality. Can you
provide a little more detail (e.g. parameters)?


Thx,

LCB

--
Lenny Bruzenak
MagitekLTD

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit



Re: [PATCH] audit: optionally print warning after waiting to enqueue record

2020-06-18 Thread Max Englander
On Wed, Jun 17, 2020 at 09:06:27PM -0400, Paul Moore wrote:
> On Wed, Jun 17, 2020 at 6:54 PM Max Englander  wrote:
> > On Wed, Jun 17, 2020 at 02:47:19PM -0400, Paul Moore wrote:
> > > On Tue, Jun 16, 2020 at 12:58 AM Max Englander wrote:
> > > >
> > > > In environments where security is prioritized, users may set
> > > > --backlog_wait_time to a high value in order to reduce the likelihood
> > > > that any audit event is lost, even though doing so may result in
> > > > unpredictable performance if the kernel schedules a timeout when the
> > > > backlog limit is exceeded. For these users, the next best thing to
> > > > predictable performance is the ability to quickly detect and react to
> > > > degraded performance. This patch proposes to aid the detection of kernel
> > > > audit subsystem pauses through the following changes:
> > > >
> > > > Add a variable named audit_backlog_warn_time. Enforce the value of this
> > > > variable to be no less than zero, and no more than the value of
> > > > audit_backlog_wait_time.
> > > >
> > > > If audit_backlog_warn_time is greater than zero and if the total time
> > > > spent waiting to enqueue an audit record is greater than or equal to
> > > > audit_backlog_warn_time, then print a warning with the total time
> > > > spent waiting.
> > > >
> > > > An example configuration:
> > > >
> > > > auditctl --backlog_warn_time 50
> > > >
> > > > An example warning message:
> > > >
> > > > audit: sleep_time=52 >= audit_backlog_warn_time=50
> > > >
> > > > Tested on Ubuntu 18.04.04 using complementary changes to the audit
> > > > userspace: https://github.com/linux-audit/audit-userspace/pull/131.
> > > >
> > > > Signed-off-by: Max Englander 
> > > > ---
> > > >  include/uapi/linux/audit.h |  7 ++-
> > > >  kernel/audit.c | 35 +++
> > > >  2 files changed, 41 insertions(+), 1 deletion(-)
> > >
> > > If an admin is prioritizing security, aka don't lose any audit
> > > records, and there is a concern over variable system latency due to an
> > > audit queue backlog, why not simply disable the backlog limit?
> > >
> > > --
> > > paul moore
> > > www.paul-moore.com
> >
> > That’s good in some cases, but in other cases unbounded growth of the
> > backlog could result in memory issues. If the kernel runs out of memory
> > it would drop the audit event or possibly have other problems. It could
> > also consume memory in a way that starves user workloads or causes
> > them to be killed by the OOMKiller.
> >
> > To refine my motivating use case a bit, if a Kubernetes admin wants to
> > prioritize security, and also avoid unbounded growth of the audit
> > backlog, they may set -b and --backlog_wait_time in a way that limits
> > kernel memory usage and reduces the likelihood that any audit event is
> > lost. Occasional performance degradation may be acceptable to the admin,
> > but they would like a way to be alerted to prolonged kernel pauses, so
> > that they can investigate and take corrective action (increase backlog,
> > increase server capacity, move some workloads to other servers, etc.).
> >
> > To state it another way: the kernel currently can be configured to print a
> > message when the backlog limit is exceeded and it must discard the audit
> > event. This is a useful message for admins, which they can address with
> > corrective action. I think a message similar to the one proposed by this
> > patch would be equally useful when the backlog limit is exceeded and the
> > kernel is configured to wait for the backlog to drain. Admins could
> > address that message in the same way, but without the cost of lost audit
> > events.
> 
> I'm still struggling to understand how this is any better than
> disabling the backlog limit, or setting it very high, and simply
> monitoring the size of the audit backlog.  This way the admin
> doesn't have to worry about the latency issues of a full backlog,
> while still being able to trigger actions based on the state of the
> backlog.  The userspace tooling/scripting to watch the backlog size
> would be trivial, and would arguably provide much better visibility
> into the backlog state than a single warning threshold in the kernel.
> 
> -- 
> paul moore
> www.paul-moore.com

Removing the backlog limit entirely could lead to the memory issues I
mentioned above (lost audit events, out-of-memory errors), and would
effectively make the backlog limit a function of free memory. Setting
the backlog limit higher won’t necessarily prevent it from being
exceeded on very busy systems where the rate of audit data generation
can, for long periods of time, outpace the ability of auditd or a
drop-in replacement to consume it. 

The combination of backlog limit and wait time, on the other hand, sets
a bound on memory while all but ensuring the preservation of audit
events. The fact that latency can arise from using this combination is,
for me, an acceptable cost for the predictable 

Re: [PATCH] audit: optionally print warning after waiting to enqueue record

2020-06-18 Thread Max Englander
On Thu, Jun 18, 2020 at 09:39:08AM -0400, Steve Grubb wrote:
> On Wednesday, June 17, 2020 6:54:16 PM EDT Max Englander wrote:
> > On Wed, Jun 17, 2020 at 02:47:19PM -0400, Paul Moore wrote:
> > > > On Tue, Jun 16, 2020 at 12:58 AM Max Englander wrote:
> > > > In environments where security is prioritized, users may set
> > > > --backlog_wait_time to a high value in order to reduce the likelihood
> > > > that any audit event is lost, even though doing so may result in
> > > > unpredictable performance if the kernel schedules a timeout when the
> > > > backlog limit is exceeded. For these users, the next best thing to
> > > > predictable performance is the ability to quickly detect and react to
> > > > > degraded performance. This patch proposes to aid the detection of kernel
> > > > > audit subsystem pauses through the following changes:
> > > > 
> > > > Add a variable named audit_backlog_warn_time. Enforce the value of this
> > > > variable to be no less than zero, and no more than the value of
> > > > audit_backlog_wait_time.
> > > > 
> > > > If audit_backlog_warn_time is greater than zero and if the total time
> > > > spent waiting to enqueue an audit record is greater than or equal to
> > > > audit_backlog_warn_time, then print a warning with the total time
> > > > spent waiting.
> > > > 
> > > > An example configuration:
> > > > auditctl --backlog_warn_time 50
> > > > 
> > > > An example warning message:
> > > > audit: sleep_time=52 >= audit_backlog_warn_time=50
> > > > 
> > > > Tested on Ubuntu 18.04.04 using complementary changes to the audit
> > > > userspace: https://github.com/linux-audit/audit-userspace/pull/131.
> > > > 
> > > > Signed-off-by: Max Englander 
> > > > ---
> > > > 
> > > >  include/uapi/linux/audit.h |  7 ++-
> > > >  kernel/audit.c | 35 +++
> > > >  2 files changed, 41 insertions(+), 1 deletion(-)
> > > 
> > > > If an admin is prioritizing security, aka don't lose any audit
> > > records, and there is a concern over variable system latency due to an
> > > audit queue backlog, why not simply disable the backlog limit?
> > 
> > That’s good in some cases, but in other cases unbounded growth of the
> > backlog could result in memory issues. If the kernel runs out of memory
> > it would drop the audit event or possibly have other problems. It could
> > also consume memory in a way that starves user workloads or causes
> > them to be killed by the OOMKiller.
> 
> The kernel cannot grow the backlog unbounded. If you do nothing, the backlog 
> is 64 - which is too small to really use. Otherwise, you set the backlog to a 
> finite number with the -b option.
> 
> > To refine my motivating use case a bit, if a Kubernetes admin wants to
> > prioritize security, and also avoid unbounded growth of the audit
> > backlog, they may set -b and --backlog_wait_time in a way that limits
> > kernel memory usage and reduces the likelihood that any audit event is
> > lost. Occasional performance degradation may be acceptable to the admin,
> > but they would like a way to be alerted to prolonged kernel pauses, so
> > that they can investigate and take corrective action (increase backlog,
> > increase server capacity, move some workloads to other servers, etc.).
> > 
> > To state it another way: the kernel currently can be configured to print a
> > message when the backlog limit is exceeded and it must discard the audit
> > event. This is a useful message for admins, which they can address with
> > corrective action. I think a message similar to the one proposed by this
> > patch would be equally useful when the backlog limit is exceeded and the
> > kernel is configured to wait for the backlog to drain. Admins could
> > address that message in the same way, but without the cost of lost audit
> > events.
> 
> If backlog wait time is exceeded, that could be a useful warning if that does 
> not exist. I don't know how often that could happen...and of course without a 
> warning we don't know if it happens or why it happens.
  
What you’re describing already exists, if I’m reading your words right.
In the event that the backlog wait time limit is exceeded, the -f flag
is consulted, and, if the value of -f is 1, then an error message
stating that the backlog limit is exceeded is printed. This is also true
when the backlog wait time is zero.

What I am suggesting is that even if the backlog wait time is not
exceeded, it would be useful for the kernel to report when backlog
waiting occurs as a way to help identify degraded kernel performance.
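
The rule described in the patch summary can be sketched in userspace C as
follows. This is only an illustration under stated assumptions, not the code
from the patch: the helper names warn_time_valid() and should_warn() are
hypothetical, and every value is assumed to be expressed in the same time unit.

```c
#include <stdbool.h>

/* Hypothetical helper: the proposed setting must be no less than zero and
 * no more than the configured backlog wait time. */
static bool warn_time_valid(long warn_time, long wait_time)
{
	return warn_time >= 0 && warn_time <= wait_time;
}

/* Hypothetical helper: warn only if the feature is enabled (warn_time > 0)
 * and the total time spent waiting has reached the threshold. */
static bool should_warn(long sleep_time, long warn_time)
{
	return warn_time > 0 && sleep_time >= warn_time;
}
```

With the values from the example warning message, should_warn(52, 50) is
true, while a warn time of 0 never produces a warning.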

> I also wish we had metrics on the backlog such as max used. That might help 
> admins tune the size of the backlog.
> 
> -Steve
> 
> 


Re: [PATCH 2/2] integrity: Add errno field in audit message

2020-06-18 Thread Mimi Zohar
On Wed, 2020-06-17 at 13:44 -0700, Lakshmi Ramasubramanian wrote:
> Error code is not included in the audit messages logged by
> the integrity subsystem. Add "errno" field in the audit messages
> logged by the integrity subsystem and set the value to the error code
> passed to integrity_audit_msg() in the "result" parameter.
> 
> Sample audit messages:
> 
> [6.284329] audit: type=1804 audit(1591756723.627:2): pid=1 uid=0 
> auid=4294967295 ses=4294967295 subj=kernel op=add_boot_aggregate 
> cause=alloc_entry comm="swapper/0" name="boot_aggregate" res=0 errno=-12
> 
> [8.085456] audit: type=1802 audit(1592005947.297:9): pid=1 uid=0 
> auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 
> op=policy_update cause=completed comm="systemd" res=1 errno=0
> 
> Signed-off-by: Lakshmi Ramasubramanian 
> Suggested-by: Steve Grubb 
> ---
>  security/integrity/integrity_audit.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/security/integrity/integrity_audit.c b/security/integrity/integrity_audit.c
> index 5109173839cc..a265024f82f3 100644
> --- a/security/integrity/integrity_audit.c
> +++ b/security/integrity/integrity_audit.c
> @@ -53,6 +53,6 @@ void integrity_audit_msg(int audit_msgno, struct inode *inode,
>   audit_log_untrustedstring(ab, inode->i_sb->s_id);
>   audit_log_format(ab, " ino=%lu", inode->i_ino);
>   }
> - audit_log_format(ab, " res=%d", !result);
> + audit_log_format(ab, " res=%d errno=%d", !result, result);
>   audit_log_end(ab);
>  }

For the reasons that I mentioned previously, unless others are willing
to add their Reviewed-by tag not for the audit aspect in particular,
but IMA itself, I'm not comfortable making this change all at once.

Previously I suggested making the existing integrity_audit_msg() a
wrapper for a new function with errno.  Steve said, "We normally do
not like to have fields that swing in and out ...", but said setting
errno to 0 is fine.  The original integrity_audit_msg() function would
call the new function with errno set to 0.
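
A rough userspace sketch of that wrapper arrangement, for illustration only.
Both function names and the buffer-based signatures are hypothetical
stand-ins: the real integrity_audit_msg() takes more parameters and writes to
an audit buffer rather than a char array.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical new function: takes the error code as its own parameter and
 * always emits both fields, so "errno" never swings in and out. */
static int sketch_audit_msg_errno(char *buf, size_t len, int result, int err)
{
	return snprintf(buf, len, "res=%d errno=%d", !result, err);
}

/* Hypothetical stand-in for the existing entry point, reduced to a wrapper:
 * callers stay unchanged and the record simply carries errno=0. */
static int sketch_audit_msg(char *buf, size_t len, int result)
{
	return sketch_audit_msg_errno(buf, len, result, 0);
}
```

Existing call sites would keep using the wrapper, and only converted call
sites would pass a real error code to the new function.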

Mimi



Re: [PATCH 2/2] integrity: Add errno field in audit message

2020-06-18 Thread Mimi Zohar
On Thu, 2020-06-18 at 11:05 -0700, Lakshmi Ramasubramanian wrote:
> On 6/18/20 10:41 AM, Mimi Zohar wrote:
> 
> > 
> > For the reasons that I mentioned previously, unless others are willing
> > to add their Reviewed-by tag not for the audit aspect in particular,
> > but IMA itself, I'm not comfortable making this change all at once.
> > 
> > Previously I suggested making the existing integrity_audit_msg() a
> > wrapper for a new function with errno.  Steve said, "We normally do
> > not like to have fields that swing in and out ...", but said setting
> > errno to 0 is fine.  The original integrity_audit_msg() function would
> > call the new function with errno set to 0.
> 
> If the original integrity_audit_msg() always calls the new function with 
> errno set to 0, there would be audit messages where "res" field is set 
> to "0" (fail) because "result" was non-zero, but errno set to "0" 
> (success). Wouldn't this be confusing?
> 
> In PATCH 1/2 I've made changes to make the "result" parameter to 
> integrity_audit_msg() consistent - i.e., it is always an error code (0 
> for success and a negative value for error). Would that address your 
> concerns?

You're overloading "res" to imply errno.  Define a new parameter
specifically for errno.

Mimi



Re: [PATCH 2/2] integrity: Add errno field in audit message

2020-06-18 Thread Lakshmi Ramasubramanian

On 6/18/20 10:41 AM, Mimi Zohar wrote:



> For the reasons that I mentioned previously, unless others are willing
> to add their Reviewed-by tag not for the audit aspect in particular,
> but IMA itself, I'm not comfortable making this change all at once.
>
> Previously I suggested making the existing integrity_audit_msg() a
> wrapper for a new function with errno.  Steve said, "We normally do
> not like to have fields that swing in and out ...", but said setting
> errno to 0 is fine.  The original integrity_audit_msg() function would
> call the new function with errno set to 0.


If the original integrity_audit_msg() always calls the new function with 
errno set to 0, there would be audit messages where "res" field is set 
to "0" (fail) because "result" was non-zero, but errno set to "0" 
(success). Wouldn't this be confusing?


In PATCH 1/2 I've made changes to make the "result" parameter to 
integrity_audit_msg() consistent - i.e., it is always an error code (0 
for success and a negative value for error). Would that address your 
concerns?
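
For comparison, the single-parameter behaviour of the patch under discussion
can be sketched as below. The format string is the one from the patch;
format_res_errno() and its buffer-based signature are hypothetical, and the
sketch assumes "result" is 0 on success and a negative error code on failure.

```c
#include <stddef.h>
#include <stdio.h>

/* Mirrors the format string in the patch: "res" is the logical negation of
 * result, and "errno" is result itself. Hypothetical helper for illustration;
 * the kernel code writes to an audit buffer instead. */
static int format_res_errno(char *buf, size_t len, int result)
{
	return snprintf(buf, len, "res=%d errno=%d", !result, result);
}
```

A result of -12 yields "res=0 errno=-12" and a result of 0 yields
"res=1 errno=0", which is the pairing shown in the sample audit messages.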


thanks,
 -lakshmi






Re: [PATCH] audit: optionally print warning after waiting to enqueue record

2020-06-18 Thread Paul Moore
On Thu, Jun 18, 2020 at 10:36 AM Steve Grubb  wrote:
> On Thursday, June 18, 2020 9:46:54 AM EDT Paul Moore wrote:
> > On Thu, Jun 18, 2020 at 9:39 AM Steve Grubb  wrote:
> > > The kernel cannot grow the backlog unbounded. If you do nothing, the
> > > backlog is 64 - which is too small to really use. Otherwise, you set the
> > > backlog to a finite number with the -b option.
> >
> > If one were to set the backlog limit to 0, it is effectively disabled
> > allowing the backlog to grow without any restrictions placed on it by
> > the audit subsystem.
>
> Then I'd say you asked for it. The cure is setting a number.

I wasn't commenting on whether it was wise or not; that is going to depend
on the goals of the admin.  I just wanted to correct some bad
information you provided so those reading the mailing list were not
ill-informed.

-- 
paul moore
www.paul-moore.com




Re: [PATCH] audit: optionally print warning after waiting to enqueue record

2020-06-18 Thread Steve Grubb
On Thursday, June 18, 2020 9:46:54 AM EDT Paul Moore wrote:
> On Thu, Jun 18, 2020 at 9:39 AM Steve Grubb  wrote:
> > The kernel cannot grow the backlog unbounded. If you do nothing, the
> > backlog is 64 - which is too small to really use. Otherwise, you set the
> > backlog to a finite number with the -b option.
> 
> If one were to set the backlog limit to 0, it is effectively disabled
> allowing the backlog to grow without any restrictions placed on it by
> the audit subsystem.

Then I'd say you asked for it. The cure is setting a number. But regardless, 
we could use some metrics on the backlog and visibility into the time it 
takes to dequeue. That might signal problems with plugins or overly aggressive 
rules.

-Steve






Re: [PATCH] audit: optionally print warning after waiting to enqueue record

2020-06-18 Thread Paul Moore
On Thu, Jun 18, 2020 at 9:39 AM Steve Grubb  wrote:
> The kernel cannot grow the backlog unbounded. If you do nothing, the backlog
> is 64 - which is too small to really use. Otherwise, you set the backlog to a
> finite number with the -b option.

If one were to set the backlog limit to 0, it is effectively disabled
allowing the backlog to grow without any restrictions placed on it by
the audit subsystem.
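
A minimal sketch of why a limit of 0 behaves this way, assuming the check has
the shape "a limit is set AND the queue is longer than the limit".
backlog_over_limit() is a hypothetical stand-in written for illustration; it
is not copied from kernel/audit.c.

```c
#include <stdbool.h>

/* A limit of 0 makes the first operand false, so the queue is never treated
 * as over the limit, regardless of its length. */
static bool backlog_over_limit(unsigned int limit, unsigned int queue_len)
{
	return limit != 0 && queue_len > limit;
}
```

Under this assumption, backlog_over_limit(0, n) is false for every n, so
neither waiting nor dropping is ever triggered by the backlog length.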

-- 
paul moore
www.paul-moore.com




Re: [PATCH] audit: optionally print warning after waiting to enqueue record

2020-06-18 Thread Steve Grubb
On Wednesday, June 17, 2020 6:54:16 PM EDT Max Englander wrote:
> On Wed, Jun 17, 2020 at 02:47:19PM -0400, Paul Moore wrote:
> > > On Tue, Jun 16, 2020 at 12:58 AM Max Englander wrote:
> > > In environments where security is prioritized, users may set
> > > --backlog_wait_time to a high value in order to reduce the likelihood
> > > that any audit event is lost, even though doing so may result in
> > > unpredictable performance if the kernel schedules a timeout when the
> > > backlog limit is exceeded. For these users, the next best thing to
> > > predictable performance is the ability to quickly detect and react to
> > > > degraded performance. This patch proposes to aid the detection of kernel
> > > > audit subsystem pauses through the following changes:
> > > 
> > > Add a variable named audit_backlog_warn_time. Enforce the value of this
> > > variable to be no less than zero, and no more than the value of
> > > audit_backlog_wait_time.
> > > 
> > > If audit_backlog_warn_time is greater than zero and if the total time
> > > spent waiting to enqueue an audit record is greater than or equal to
> > > audit_backlog_warn_time, then print a warning with the total time
> > > spent waiting.
> > > 
> > > An example configuration:
> > > auditctl --backlog_warn_time 50
> > > 
> > > An example warning message:
> > > audit: sleep_time=52 >= audit_backlog_warn_time=50
> > > 
> > > Tested on Ubuntu 18.04.04 using complementary changes to the audit
> > > userspace: https://github.com/linux-audit/audit-userspace/pull/131.
> > > 
> > > Signed-off-by: Max Englander 
> > > ---
> > > 
> > >  include/uapi/linux/audit.h |  7 ++-
> > >  kernel/audit.c | 35 +++
> > >  2 files changed, 41 insertions(+), 1 deletion(-)
> > 
> > > If an admin is prioritizing security, aka don't lose any audit
> > records, and there is a concern over variable system latency due to an
> > audit queue backlog, why not simply disable the backlog limit?
> 
> That’s good in some cases, but in other cases unbounded growth of the
> backlog could result in memory issues. If the kernel runs out of memory
> it would drop the audit event or possibly have other problems. It could
> also consume memory in a way that starves user workloads or causes
> them to be killed by the OOMKiller.

The kernel cannot grow the backlog unbounded. If you do nothing, the backlog 
is 64 - which is too small to really use. Otherwise, you set the backlog to a 
finite number with the -b option.

> To refine my motivating use case a bit, if a Kubernetes admin wants to
> prioritize security, and also avoid unbounded growth of the audit
> backlog, they may set -b and --backlog_wait_time in a way that limits
> kernel memory usage and reduces the likelihood that any audit event is
> lost. Occasional performance degradation may be acceptable to the admin,
> but they would like a way to be alerted to prolonged kernel pauses, so
> that they can investigate and take corrective action (increase backlog,
> increase server capacity, move some workloads to other servers, etc.).
> 
> To state it another way: the kernel currently can be configured to print a
> message when the backlog limit is exceeded and it must discard the audit
> event. This is a useful message for admins, which they can address with
> corrective action. I think a message similar to the one proposed by this
> patch would be equally useful when the backlog limit is exceeded and the
> kernel is configured to wait for the backlog to drain. Admins could
> address that message in the same way, but without the cost of lost audit
> events.

If backlog wait time is exceeded, that could be a useful warning if that does 
not exist. I don't know how often that could happen...and of course without a 
warning we don't know if it happens or why it happens.

I also wish we had metrics on the backlog such as max used. That might help 
admins tune the size of the backlog.
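
One way such a "max used" metric could be kept is a high-water mark updated
on every enqueue. The sketch below is purely hypothetical: struct
backlog_stats and both helpers are invented for illustration and do not
correspond to anything that exists in the kernel or in auditd.

```c
/* Hypothetical bookkeeping, not an existing kernel structure. */
struct backlog_stats {
	unsigned int cur;       /* records currently queued */
	unsigned int max_used;  /* largest value cur has reached */
};

/* Count one more queued record and raise the high-water mark if needed. */
static void backlog_enqueue(struct backlog_stats *s)
{
	if (++s->cur > s->max_used)
		s->max_used = s->cur;
}

/* Count one record leaving the queue; max_used is deliberately untouched. */
static void backlog_dequeue(struct backlog_stats *s)
{
	if (s->cur > 0)
		s->cur--;
}
```

An admin could then compare max_used with the configured backlog limit to
judge how much headroom the current setting leaves.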

-Steve


