Re: [RFC V2] IMA Log Snapshotting Design Proposal

2024-01-08 Thread Paul Moore
On Mon, Jan 8, 2024 at 6:48 AM Mimi Zohar  wrote:
> On Sun, 2024-01-07 at 21:58 -0500, Paul Moore wrote:
> > On Sun, Jan 7, 2024 at 7:59 AM Mimi Zohar  wrote:
> > > On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> > > > On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar  wrote:
> > > > > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > > > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar  
> > > > > > wrote:
> > > > > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar 
> > > > > > > >  wrote:
> > > > > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> > > >
> > > > ...
> > > >
> > > > > > > > > Before defining a new critical-data record, we need to decide
> > > > > > > > > whether it is really necessary or if it is redundant.  If we
> > > > > > > > > define a new "critical-data" record, can it be defined such
> > > > > > > > > that it doesn't require pausing extending the measurement list?
> > > > > > > > > For example, a new simple visual critical-data record could
> > > > > > > > > contain the number of records (e.g.
> > > > > > > > > /ima/runtime_measurements_count) up to that point.
> > > > > > > >
> > > > > > > > What if the snapshot_aggregate was a hash of the measurement log
> > > > > > > > starting with either the boot_aggregate or the latest
> > > > > > > > snapshot_aggregate and ending on the record before the new
> > > > > > > > snapshot_aggregate?  The performance impact at snapshot time
> > > > > > > > should be minimal as the hash can be incrementally updated as new
> > > > > > > > records are added to the measurement list.  While the hash
> > > > > > > > wouldn't capture the TPM state, it would allow some crude
> > > > > > > > verification when reassembling the log.  If one could bear the
> > > > > > > > cost of a TPM signing operation, the log digest could be signed
> > > > > > > > by the TPM.
> > > > > > >
> > > > > > > Other critical data is calculated, before calling
> > > > > > > ima_measure_critical_data(), which adds the record to the
> > > > > > > measurement list and extends the TPM PCR.
> > > > > > >
> > > > > > > Signing the hash shouldn't be an issue if it behaves like other
> > > > > > > critical data.
> > > > > > >
> > > > > > > In addition to the hash, consider including other information in
> > > > > > > the new critical data record (e.g. total number of measurement
> > > > > > > records, the number of measurements included in the hash, the
> > > > > > > number of times the measurement list was trimmed, etc).
> > > > > >
> > > > > > It would be nice if you could provide an explicit list of what you
> > > > > > would want hashed into a snapshot_aggregate record; the above is
> > > > > > close, but it is still a little hand-wavy.  I'm just trying to
> > > > > > reduce the back-n-forth :)
> > > > >
> > > > > What is being defined here is the first IMA critical-data record,
> > > > > which really requires some thought.
> > > >
> > > > My thinking has always been that taking a hash of the current
> > > > measurement log up to the snapshot point would be a nice
> > > > snapshot_aggregate measurement, but I'm not heavily invested in that.
> > > > To me it is more important that we find something we can all agree on,
> > > > perhaps reluctantly, so we can move forward with a solution.
> > > >
> > > > > For ease of review, this new critical-data record should be a
> > > > > separate patch set from trimming the measurement list.
> > > >
> > > > I see the two as linked, but if you prefer them as separate then so be
> > > > it.  Once again, the important part is to move forward with a
> > > > solution, I'm not overly bothered if it arrives in multiple pieces
> > > > instead of one.
> > >
> > > Trimming the IMA measurement list could be used in conjunction with the
> > > new IMA critical data record or independently.  Both options should be
> > > supported.
> > >
> > > 1. trim N number of records from the head of the in kernel IMA
> > >    measurement list
> > > 2. intermittently include the new IMA critical data record based on some
> > >    trigger
> > > 3. trim the measurement list up to the (first/last/Nth) IMA critical
> > >    data record
> > >
> > > Since the two features could be used independently of each other, there
> > > is no reason to upstream them as a single patch set.  It just makes it
> > > harder to review.
> >
> > I don't see much point in recording a snapshot aggregate if you aren't
> > doing a snapshot, but it's not harmful in any way, so sure, go for it.
> > Like I said earlier, as long as the functionality is there, I don't
> > think anyone cares too much how it gets into the kernel (although
> > Tushar and Sush should comment from the perspective).

Re: [RFC V2] IMA Log Snapshotting Design Proposal

2024-01-08 Thread Mimi Zohar
On Sun, 2024-01-07 at 21:58 -0500, Paul Moore wrote:
> On Sun, Jan 7, 2024 at 7:59 AM Mimi Zohar  wrote:
> > On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> > > On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar  wrote:
> > > > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar  
> > > > > wrote:
> > > > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar  
> > > > > > > wrote:
> > > > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> > >
> > > ...
> > >
> > > > > > > > Before defining a new critical-data record, we need to decide
> > > > > > > > whether it is really necessary or if it is redundant.  If we
> > > > > > > > define a new "critical-data" record, can it be defined such that
> > > > > > > > it doesn't require pausing extending the measurement list?  For
> > > > > > > > example, a new simple visual critical-data record could contain
> > > > > > > > the number of records (e.g. /ima/runtime_measurements_count) up
> > > > > > > > to that point.
> > > > > > >
> > > > > > > What if the snapshot_aggregate was a hash of the measurement log
> > > > > > > starting with either the boot_aggregate or the latest
> > > > > > > snapshot_aggregate and ending on the record before the new
> > > > > > > snapshot_aggregate?  The performance impact at snapshot time
> > > > > > > should be minimal as the hash can be incrementally updated as new
> > > > > > > records are added to the measurement list.  While the hash
> > > > > > > wouldn't capture the TPM state, it would allow some crude
> > > > > > > verification when reassembling the log.  If one could bear the
> > > > > > > cost of a TPM signing operation, the log digest could be signed
> > > > > > > by the TPM.
> > > > > >
> > > > > > Other critical data is calculated, before calling
> > > > > > ima_measure_critical_data(), which adds the record to the
> > > > > > measurement list and extends the TPM PCR.
> > > > > >
> > > > > > Signing the hash shouldn't be an issue if it behaves like other
> > > > > > critical data.
> > > > > >
> > > > > > In addition to the hash, consider including other information in the
> > > > > > new critical data record (e.g. total number of measurement records,
> > > > > > the number of measurements included in the hash, the number of times
> > > > > > the measurement list was trimmed, etc).
> > > > >
> > > > > It would be nice if you could provide an explicit list of what you
> > > > > would want hashed into a snapshot_aggregate record; the above is
> > > > > close, but it is still a little hand-wavy.  I'm just trying to reduce
> > > > > the back-n-forth :)
> > > >
> > > > What is being defined here is the first IMA critical-data record, which
> > > > really requires some thought.
> > >
> > > My thinking has always been that taking a hash of the current
> > > measurement log up to the snapshot point would be a nice
> > > snapshot_aggregate measurement, but I'm not heavily invested in that.
> > > To me it is more important that we find something we can all agree on,
> > > perhaps reluctantly, so we can move forward with a solution.
> > >
> > > > For ease of review, this new critical-data record should be a separate
> > > > patch set from trimming the measurement list.
> > >
> > > I see the two as linked, but if you prefer them as separate then so be
> > > it.  Once again, the important part is to move forward with a
> > > solution, I'm not overly bothered if it arrives in multiple pieces
> > > instead of one.
> >
> > Trimming the IMA measurement list could be used in conjunction with the
> > new IMA critical data record or independently.  Both options should be
> > supported.
> >
> > 1. trim N number of records from the head of the in kernel IMA
> >    measurement list
> > 2. intermittently include the new IMA critical data record based on some
> >    trigger
> > 3. trim the measurement list up to the (first/last/Nth) IMA critical data
> >    record
> >
> > Since the two features could be used independently of each other, there
> > is no reason to upstream them as a single patch set.  It just makes it
> > harder to review.
> 
> I don't see much point in recording a snapshot aggregate if you aren't
> doing a snapshot, but it's not harmful in any way, so sure, go for it.
> Like I said earlier, as long as the functionality is there, I don't
> think anyone cares too much how it gets into the kernel (although
> Tushar and Sush should comment from the perspective).

Paul, there are two features: 
- trimming the measurement list
- defining and including an IMA critical data record

The original design doc combined these two features, making them an "atomic"
operation, and referred to it as a snapshot.  At the time the term "snapshot"
was an appropriate term 

Re: [RFC V2] IMA Log Snapshotting Design Proposal

2024-01-07 Thread Paul Moore
On Sun, Jan 7, 2024 at 7:59 AM Mimi Zohar  wrote:
> On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> > On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar  wrote:
> > > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar  wrote:
> > > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar  
> > > > > > wrote:
> > > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> >
> > ...
> >
> > > > > > > Before defining a new critical-data record, we need to decide
> > > > > > > whether it is really necessary or if it is redundant.  If we
> > > > > > > define a new "critical-data" record, can it be defined such that
> > > > > > > it doesn't require pausing extending the measurement list?  For
> > > > > > > example, a new simple visual critical-data record could contain
> > > > > > > the number of records (e.g. /ima/runtime_measurements_count) up
> > > > > > > to that point.
> > > > > >
> > > > > > What if the snapshot_aggregate was a hash of the measurement log
> > > > > > starting with either the boot_aggregate or the latest
> > > > > > snapshot_aggregate and ending on the record before the new
> > > > > > snapshot_aggregate?  The performance impact at snapshot time should
> > > > > > be minimal as the hash can be incrementally updated as new records
> > > > > > are added to the measurement list.  While the hash wouldn't capture
> > > > > > the TPM state, it would allow some crude verification when
> > > > > > reassembling the log.  If one could bear the cost of a TPM signing
> > > > > > operation, the log digest could be signed by the TPM.
> > > > >
> > > > > Other critical data is calculated, before calling
> > > > > ima_measure_critical_data(), which adds the record to the measurement
> > > > > list and extends the TPM PCR.
> > > > >
> > > > > Signing the hash shouldn't be an issue if it behaves like other
> > > > > critical data.
> > > > >
> > > > > In addition to the hash, consider including other information in the
> > > > > new critical data record (e.g. total number of measurement records,
> > > > > the number of measurements included in the hash, the number of times
> > > > > the measurement list was trimmed, etc).
> > > >
> > > > It would be nice if you could provide an explicit list of what you
> > > > would want hashed into a snapshot_aggregate record; the above is
> > > > close, but it is still a little hand-wavy.  I'm just trying to reduce
> > > > the back-n-forth :)
> > >
> > > What is being defined here is the first IMA critical-data record, which
> > > really requires some thought.
> >
> > My thinking has always been that taking a hash of the current
> > measurement log up to the snapshot point would be a nice
> > snapshot_aggregate measurement, but I'm not heavily invested in that.
> > To me it is more important that we find something we can all agree on,
> > perhaps reluctantly, so we can move forward with a solution.
> >
> > > For ease of review, this new critical-data record should be a separate
> > > patch set from trimming the measurement list.
> >
> > I see the two as linked, but if you prefer them as separate then so be
> > it.  Once again, the important part is to move forward with a
> > solution, I'm not overly bothered if it arrives in multiple pieces
> > instead of one.
>
> Trimming the IMA measurement list could be used in conjunction with the
> new IMA critical data record or independently.  Both options should be
> supported.
>
> 1. trim N number of records from the head of the in kernel IMA measurement
>    list
> 2. intermittently include the new IMA critical data record based on some
>    trigger
> 3. trim the measurement list up to the (first/last/Nth) IMA critical data
>    record
>
> Since the two features could be used independently of each other, there is no
> reason to upstream them as a single patch set.  It just makes it harder to
> review.

I don't see much point in recording a snapshot aggregate if you aren't
doing a snapshot, but it's not harmful in any way, so sure, go for it.
Like I said earlier, as long as the functionality is there, I don't
think anyone cares too much how it gets into the kernel (although
Tushar and Sush should comment from the perspective).

> > > As I'm sure you're aware, SELinux defines two critical-data records.
> > > From security/selinux/ima.c:
> > >
> > > ima_measure_critical_data("selinux", "selinux-state",
> > >   state_str, strlen(state_str), false,
> > >   NULL, 0);
> > >
> > > ima_measure_critical_data("selinux", "selinux-policy-hash",
> > >   policy, policy_len, true,
> > >   NULL, 0);
> >
> > Yep, but there is far more to this than SELinux.
>
> Only if you conflate the two features.

If that is a clever retort, you'll need 

Re: [RFC V2] IMA Log Snapshotting Design Proposal

2024-01-07 Thread Mimi Zohar
On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar  wrote:
> > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar  wrote:
> > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar  
> > > > > wrote:
> > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> 
> ...
> 
> > > > > > Before defining a new critical-data record, we need to decide
> > > > > > whether it is really necessary or if it is redundant.  If we define
> > > > > > a new "critical-data" record, can it be defined such that it
> > > > > > doesn't require pausing extending the measurement list?  For
> > > > > > example, a new simple visual critical-data record could contain the
> > > > > > number of records (e.g. /ima/runtime_measurements_count) up to that
> > > > > > point.
> > > > >
> > > > > What if the snapshot_aggregate was a hash of the measurement log
> > > > > starting with either the boot_aggregate or the latest
> > > > > snapshot_aggregate and ending on the record before the new
> > > > > snapshot_aggregate?  The performance impact at snapshot time should be
> > > > > minimal as the hash can be incrementally updated as new records are
> > > > > added to the measurement list.  While the hash wouldn't capture the
> > > > > TPM state, it would allow some crude verification when reassembling
> > > > > the log.  If one could bear the cost of a TPM signing operation, the
> > > > > log digest could be signed by the TPM.
> > > >
> > > > Other critical data is calculated, before calling
> > > > ima_measure_critical_data(), which adds the record to the measurement
> > > > list and extends the TPM PCR.
> > > >
> > > > Signing the hash shouldn't be an issue if it behaves like other
> > > > critical data.
> > > >
> > > > In addition to the hash, consider including other information in the
> > > > new critical data record (e.g. total number of measurement records, the
> > > > number of measurements included in the hash, the number of times the
> > > > measurement list was trimmed, etc).
> > >
> > > It would be nice if you could provide an explicit list of what you
> > > would want hashed into a snapshot_aggregate record; the above is
> > > close, but it is still a little hand-wavy.  I'm just trying to reduce
> > > the back-n-forth :)
> >
> > What is being defined here is the first IMA critical-data record, which
> > really requires some thought.
> 
> My thinking has always been that taking a hash of the current
> measurement log up to the snapshot point would be a nice
> snapshot_aggregate measurement, but I'm not heavily invested in that.
> To me it is more important that we find something we can all agree on,
> perhaps reluctantly, so we can move forward with a solution.
> 
> > For ease of review, this new critical-data record should be a separate
> > patch set from trimming the measurement list.
> 
> I see the two as linked, but if you prefer them as separate then so be
> it.  Once again, the important part is to move forward with a
> solution, I'm not overly bothered if it arrives in multiple pieces
> instead of one.

Trimming the IMA measurement list could be used in conjunction with the new IMA
critical data record or independently.  Both options should be supported.

1. trim N number of records from the head of the in kernel IMA measurement list
2. intermittently include the new IMA critical data record based on some trigger
3. trim the measurement list up to the (first/last/Nth) IMA critical data record

Since the two features could be used independently of each other, there is no
reason to upstream them as a single patch set.  It just makes it harder to
review.
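The three list operations above can be sketched in illustrative userspace Python. This is not kernel code: the kernel's measurement list is a linked list of binary records, modeled here as dicts, and every name below is hypothetical.

```python
# Toy model of the three proposed operations on the IMA measurement list.
from collections import deque

CRITICAL_DATA = "ima-critical-data"  # hypothetical record type name

def trim_head(log, n):
    """Option 1: drop the first n records from the head of the list."""
    for _ in range(min(n, len(log))):
        log.popleft()

def add_critical_data_record(log, count):
    """Option 2: intermittently append a critical-data marker record."""
    log.append({"type": CRITICAL_DATA, "count": count})

def trim_to_nth_critical(log, n):
    """Option 3: drop everything before the nth critical-data record."""
    seen = 0
    while log:
        if log[0]["type"] == CRITICAL_DATA:
            seen += 1
            if seen == n:
                return
        log.popleft()

log = deque({"type": "ima-ng", "idx": i} for i in range(5))
trim_head(log, 2)                      # three ordinary records remain
add_critical_data_record(log, len(log))
log.extend({"type": "ima-ng", "idx": i} for i in range(5, 8))
trim_to_nth_critical(log, 1)           # marker record becomes the new head
```

The sketch only illustrates how the two features compose: option 3 is option 1 with a critical-data record as the stopping point, which is why the features can land independently.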

> 
> > As I'm sure you're aware, SElinux defines two critical-data records.
> > From security/selinux/ima.c:
> >
> > ima_measure_critical_data("selinux", "selinux-state",
> >   state_str, strlen(state_str), false,
> >   NULL, 0);
> >
> > ima_measure_critical_data("selinux", "selinux-policy-hash",
> >   policy, policy_len, true,
> >   NULL, 0);
> 
> Yep, but there is far more to this than SELinux.

Only if you conflate the two features. 

Mimi




Re: [RFC V2] IMA Log Snapshotting Design Proposal

2024-01-06 Thread Paul Moore
On Wed, Dec 20, 2023 at 5:14 PM Ken Goldman  wrote:
>
> I'm still struggling with the "new root of trust" concept.
>
> Something - a user space agent, a third party, etc. - has to
> retain the entire log from event 0, because a new verifier
> needs all measurements.

[NOTE: a gentle reminder to please refrain from top-posting on Linux
kernel mailing lists, it is generally frowned upon and makes it
difficult to manage long running threads]

This is one of the reasons I have pushed to manage the snapshot, both
the trigger and the handling of the trimmed data, outside of the
kernel.  Setting aside the obvious limitations of kernel I/O, handling
the snapshot in userspace provides for a much richer set of options
when it comes to managing the snapshot and the
verification/attestation of the system.

> Therefore, the snapshot aggregate seems redundant.  It has to
> be verified to match the snapshotted events.

I can see a perspective where the snapshot_aggregate is theoretically
redundant, but I can also see at least one practical perspective where
a snapshot_aggregate could be used to simplify a remote attestation
with a sufficiently stateful attestation service.

> A redundancy is an attack surface.

Now that is an overly broad generalization; if we are going that route,
*everything* is an attack surface (and this is arguably true regardless,
although a bit of an extreme statement).

> A badly written verifier
> might not do that verification, and this permits snapshotted
> events to be forged. No aggregate means the verifier can't
> make a mistake.

I would ask that you read your own comment again.  A poorly written
verifier is subject to any number of pitfalls and vulnerabilities,
regardless of a snapshot aggregate.  As a reminder, the snapshotting
mechanism has always been proposed as an opt-in mechanism, if one has
not implemented a proper snapshot-aware attestation mechanism then
they can simply refrain from taking a snapshot and reject all
attestation attempts using a snapshot.

> On 11/22/2023 9:22 AM, Paul Moore wrote:
> > I believe the intent is to only pause the measurements while the
> > snapshot_aggregate is generated, not for the duration of the entire
> > snapshot process.  The purpose of the snapshot_aggregate is to
> > establish a new root of trust, similar to the boot_aggregate, to help
> > improve attestation performance.

-- 
paul-moore.com



Re: [RFC V2] IMA Log Snapshotting Design Proposal

2024-01-06 Thread Paul Moore
On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar  wrote:
> On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar  wrote:
> > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar  wrote:
> > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:

...

> > > > > Before defining a new critical-data record, we need to decide whether
> > > > > it is really necessary or if it is redundant.  If we define a new
> > > > > "critical-data" record, can it be defined such that it doesn't require
> > > > > pausing extending the measurement list?  For example, a new simple
> > > > > visual critical-data record could contain the number of records (e.g.
> > > > > /ima/runtime_measurements_count) up to that point.
> > > >
> > > > What if the snapshot_aggregate was a hash of the measurement log
> > > > starting with either the boot_aggregate or the latest
> > > > snapshot_aggregate and ending on the record before the new
> > > > snapshot_aggregate?  The performance impact at snapshot time should be
> > > > minimal as the hash can be incrementally updated as new records are
> > > > added to the measurement list.  While the hash wouldn't capture the
> > > > TPM state, it would allow some crude verification when reassembling
> > > > the log.  If one could bear the cost of a TPM signing operation, the
> > > > log digest could be signed by the TPM.
> > >
> > > Other critical data is calculated, before calling
> > > ima_measure_critical_data(), which adds the record to the measurement
> > > list and extends the TPM PCR.
> > >
> > > Signing the hash shouldn't be an issue if it behaves like other
> > > critical data.
> > >
> > > In addition to the hash, consider including other information in the
> > > new critical data record (e.g. total number of measurement records, the
> > > number of measurements included in the hash, the number of times the
> > > measurement list was trimmed, etc).
> >
> > It would be nice if you could provide an explicit list of what you
> > would want hashed into a snapshot_aggregate record; the above is
> > close, but it is still a little hand-wavy.  I'm just trying to reduce
> > the back-n-forth :)
>
> What is being defined here is the first IMA critical-data record, which
> really requires some thought.

My thinking has always been that taking a hash of the current
measurement log up to the snapshot point would be a nice
snapshot_aggregate measurement, but I'm not heavily invested in that.
To me it is more important that we find something we can all agree on,
perhaps reluctantly, so we can move forward with a solution.

> For ease of review, this new critical-data record should be a separate
> patch set from trimming the measurement list.

I see the two as linked, but if you prefer them as separate then so be
it.  Once again, the important part is to move forward with a
solution, I'm not overly bothered if it arrives in multiple pieces
instead of one.

> As I'm sure you're aware, SELinux defines two critical-data records.
> From security/selinux/ima.c:
>
> ima_measure_critical_data("selinux", "selinux-state",
>   state_str, strlen(state_str), false,
>   NULL, 0);
>
> ima_measure_critical_data("selinux", "selinux-policy-hash",
>   policy, policy_len, true,
>   NULL, 0);

Yep, but there is far more to this than SELinux.

-- 
paul-moore.com
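The snapshot_aggregate scheme quoted in this thread, a hash starting at the boot_aggregate or the latest snapshot_aggregate and ending on the record before the new one, updated incrementally as records arrive, can be sketched as follows. This is a Python illustration using SHA-256; the class and the record encoding are invented, not kernel code.

```python
import hashlib

class SnapshotAggregator:
    """Running digest over measurement records since the last snapshot."""

    def __init__(self):
        self._h = hashlib.sha256()  # covers records since the last snapshot

    def record_added(self, record: bytes):
        """Fold each new measurement record into the digest incrementally."""
        self._h.update(record)

    def take_snapshot(self) -> bytes:
        """Emit the aggregate over [previous snapshot, here) and restart."""
        digest = self._h.digest()
        self._h = hashlib.sha256()  # the next span starts after this record
        return digest

agg = SnapshotAggregator()
records = [b"boot_aggregate", b"record-1", b"record-2"]
for r in records:
    agg.record_added(r)
first = agg.take_snapshot()

# The incremental digest matches a one-shot hash over the same span,
# which is why the cost at snapshot time stays minimal.
oneshot = hashlib.sha256(b"".join(records)).digest()
```

Because the state is carried between records, taking the snapshot itself is a single digest finalization rather than a re-walk of the whole list.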



Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-12-20 Thread Ken Goldman

I'm still struggling with the "new root of trust" concept.

Something - a user space agent, a third party, etc. - has to
retain the entire log from event 0, because a new verifier
needs all measurements.

Therefore, the snapshot aggregate seems redundant.  It has to
be verified to match the snapshotted events.

A redundancy is an attack surface.  A badly written verifier
might not do that verification, and this permits snapshotted
events to be forged. No aggregate means the verifier can't
make a mistake.

On 11/22/2023 9:22 AM, Paul Moore wrote:

I believe the intent is to only pause the measurements while the
snapshot_aggregate is generated, not for the duration of the entire
snapshot process.  The purpose of the snapshot_aggregate is to
establish a new root of trust, similar to the boot_aggregate, to help
improve attestation performance.




Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-28 Thread Mimi Zohar
On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar  wrote:
> > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar  wrote:
> > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> 
> ...
> 
> > > If we are going to have a record count, I imagine it would also be
> > > helpful to maintain a securityfs file with the total size (in bytes)
> > > of the in-memory measurement log.  In fact, I suspect this will
> > > probably be more useful for those who wish to manage the size of the
> > > measurement log.
> >
> > A running number of bytes needed for carrying the measurement list
> > across kexec already exists.  This value would be affected when the
> > measurement list is trimmed.
> 
> There we go, it should be trivial to export that information via securityfs.
> 
> > > > Defining other IMA securityfs files like
> > > > how many times the measurement list has been trimmed might be
> > > > beneficial as well.
> > >
> > > I have no objection to that.  Would a total record count, i.e. a value
> > > that doesn't reset on a snapshot event, be more useful here?
> >
> > /ima/runtime_measurements_count already exports the total
> > number of measurement records.
> 
> I guess the question is would you want 'runtime_measurements_count' to
> reflect the current/trimmed log size or would you want it to reflect
> the measurements since the initial cold boot?  Presumably we would
> want to add another securityfs file to handle the case not covered by
> 'runtime_measurements_count'.

Right.  /ima/runtime_measurements_count is defined as the
total number of measurements since boot.  When the measurement list is
carried across kexec, it is the number of measurements since cold boot.

A new securityfs file should be defined for the current number of in
kernel memory records.  Unless the measurement list has been trimmed,
this should be the same as the runtime_measurements_count.
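The relationship between the two counters described above can be modeled as a toy Python sketch. The attribute names are hypothetical, not actual securityfs file semantics.

```python
# Model of the existing runtime_measurements_count (monotonic since boot)
# versus the proposed new value for records currently held in kernel memory.
class MeasurementList:
    def __init__(self):
        self.runtime_measurements_count = 0  # never decreases
        self.in_memory_records = 0           # drops when the list is trimmed

    def add_measurement(self):
        self.runtime_measurements_count += 1
        self.in_memory_records += 1

    def trim(self, n):
        self.in_memory_records -= min(n, self.in_memory_records)

ml = MeasurementList()
for _ in range(10):
    ml.add_measurement()
# Until a trim happens, the two values agree...
equal_before_trim = ml.runtime_measurements_count == ml.in_memory_records
ml.trim(4)
# ...afterwards only the in-memory count shrinks.
```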

> 
> > > > Before defining a new critical-data record, we need to decide whether
> > > > it is really necessary or if it is redundant.  If we define a new
> > > > "critical-data" record, can it be defined such that it doesn't require
> > > > pausing extending the measurement list?  For example, a new simple
> > > > visual critical-data record could contain the number of records (e.g.
> > > > /ima/runtime_measurements_count) up to that point.
> > >
> > > What if the snapshot_aggregate was a hash of the measurement log
> > > starting with either the boot_aggregate or the latest
> > > snapshot_aggregate and ending on the record before the new
> > > snapshot_aggregate?  The performance impact at snapshot time should be
> > > minimal as the hash can be incrementally updated as new records are
> > > added to the measurement list.  While the hash wouldn't capture the
> > > TPM state, it would allow some crude verification when reassembling
> > > the log.  If one could bear the cost of a TPM signing operation, the
> > > log digest could be signed by the TPM.
> >
> > Other critical data is calculated, before calling
> > ima_measure_critical_data(), which adds the record to the measurement
> > list and extends the TPM PCR.
> >
> > Signing the hash shouldn't be an issue if it behaves like other
> > critical data.
> >
> > In addition to the hash, consider including other information in the
> > new critical data record (e.g. total number of measurement records, the
> > number of measurements included in the hash, the number of times the
> > measurement list was trimmed, etc).
> 
> It would be nice if you could provide an explicit list of what you
> would want hashed into a snapshot_aggregate record; the above is
> close, but it is still a little hand-wavy.  I'm just trying to reduce
> the back-n-forth :)

What is being defined here is the first IMA critical-data record, which
really requires some thought.  For ease of review, this new critical-data
record should be a separate patch set from trimming the measurement list.

As I'm sure you're aware, SELinux defines two critical-data records.
From security/selinux/ima.c:

ima_measure_critical_data("selinux", "selinux-state",
  state_str, strlen(state_str), false,
  NULL, 0);

ima_measure_critical_data("selinux", "selinux-policy-hash",
  policy, policy_len, true,
  NULL, 0);
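Combining the suggestions earlier in this message, a hypothetical snapshot_aggregate critical-data record might carry the span hash plus the listed counters. The field names and "key=value" layout below are invented for illustration; in the kernel, such a buffer would be passed to ima_measure_critical_data() like the SELinux records above.

```python
import hashlib

def build_snapshot_record(span_hash: bytes, total_records: int,
                          hashed_records: int, trim_count: int) -> bytes:
    """Serialize the hypothetical snapshot_aggregate record contents."""
    fields = {
        "hash": span_hash.hex(),           # digest over the trimmed span
        "total_records": total_records,    # runtime_measurements_count
        "hashed_records": hashed_records,  # records covered by the digest
        "trim_count": trim_count,          # times the list has been trimmed
    }
    return ";".join(f"{k}={v}" for k, v in fields.items()).encode()

buf = build_snapshot_record(hashlib.sha256(b"span").digest(), 1024, 256, 3)
```

Like the SELinux records, the buffer is computed first and then measured, so nothing here requires pausing the measurement list beyond the record insertion itself.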

-- 
thanks,

Mimi




Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-28 Thread Paul Moore
On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar  wrote:
> On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar  wrote:
> > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:

...

> > If we are going to have a record count, I imagine it would also be
> > helpful to maintain a securityfs file with the total size (in bytes)
> > of the in-memory measurement log.  In fact, I suspect this will
> > probably be more useful for those who wish to manage the size of the
> > measurement log.
>
> A running number of bytes needed for carrying the measurement list
> across kexec already exists.  This value would be affected when the
> measurement list is trimmed.

There we go, it should be trivial to export that information via securityfs.

> > > Defining other IMA securityfs files like
> > > how many times the measurement list has been trimmed might be
> > > beneficial as well.
> >
> > I have no objection to that.  Would a total record count, i.e. a value
> > that doesn't reset on a snapshot event, be more useful here?
>
> /ima/runtime_measurements_count already exports the total
> number of measurement records.

I guess the question is would you want 'runtime_measurements_count' to
reflect the current/trimmed log size or would you want it to reflect
the measurements since the initial cold boot?  Presumably we would
want to add another securityfs file to handle the case not covered by
'runtime_measurements_count'.

> > > Before defining a new critical-data record, we need to decide whether
> > > it is really necessary or if it is redundant.  If we define a new
> > > "critical-data" record, can it be defined such that it doesn't require
> > > pausing extending the measurement list?  For example, a new simple
> > > visual critical-data record could contain the number of records (e.g.
> > > /ima/runtime_measurements_count) up to that point.
> >
> > What if the snapshot_aggregate was a hash of the measurement log
> > starting with either the boot_aggregate or the latest
> > snapshot_aggregate and ending on the record before the new
> > snapshot_aggregate?  The performance impact at snapshot time should be
> > minimal as the hash can be incrementally updated as new records are
> > added to the measurement list.  While the hash wouldn't capture the
> > TPM state, it would allow some crude verification when reassembling
> > the log.  If one could bear the cost of a TPM signing operation, the
> > log digest could be signed by the TPM.
>
> Other critical data is calculated, before calling
> ima_measure_critical_data(), which adds the record to the measurement
> list and extends the TPM PCR.
>
> Signing the hash shouldn't be an issue if it behaves like other
> critical data.
>
> In addition to the hash, consider including other information in the
> new critical data record (e.g. total number of measurement records, the
> number of measurements included in the hash, the number of times the
> measurement list was trimmed, etc).

It would be nice if you could provide an explicit list of what you
would want hashed into a snapshot_aggregate record; the above is
close, but it is still a little hand-wavy.  I'm just trying to reduce
the back-n-forth :)

-- 
paul-moore.com



Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-28 Thread Mimi Zohar
On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar  wrote:
> > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> 
> ...
> 
> > > Okay, we are starting to get closer, but I'm still missing the part
> > > where you say "if you do X, Y, and Z, I'll accept and merge the
> > > solution."  Can you be more explicit about what approach(es) you would
> > > be willing to accept upstream?
> >
> > Included with what is wanted/needed is an explanation as to my concerns
> > with the existing proposal.
> >
> > First we need to differentiate between kernel and userspace
> > requirements.  (The "snapshotting" design proposal intermixes them.)
> >
> > From the kernel perspective, the Log Snapshotting Design proposal "B.1
> > Goals" is very nice, but once the measurement list can be trimmed it is
> > really irrelevant.  Userspace can do whatever it wants with the
> > measurement list records.  So instead of paying lip service to what
> > should be done, just call it as it is - trimming the measurement list.
> 
> Fair enough.  I personally think it is nice to have a brief discussion
> of how userspace might use a kernel feature, but if you prefer to drop
> that part of the design doc I doubt anyone will object very strongly.
> 
> > From the kernel perspective there needs to be a method of trimming N
> > number of records from the head of the measurement list.  In addition
> > to the existing securityfs "runtime measurement list",  defining a new
> > securityfs file containing the current count of in memory measurement
> > records would be beneficial.
> 
> I imagine that should be trivial to implement and I can't imagine
> there being any objection to that.
> 
> If we are going to have a record count, I imagine it would also be
> helpful to maintain a securityfs file with the total size (in bytes)
> of the in-memory measurement log.  In fact, I suspect this will
> probably be more useful for those who wish to manage the size of the
> measurement log.

A running number of bytes needed for carrying the measurement list
across kexec already exists.  This value would be affected when the
measurement list is trimmed.

...

> 
> > Defining other IMA securityfs files like
> > how many times the measurement list has been trimmed might be
> > beneficial as well.
> 
> I have no objection to that.  Would a total record count, i.e. a value
> that doesn't reset on a snapshot event, be more useful here?

/ima/runtime_measurements_count already exports the total
number of measurement records.

> 
> > Of course properly document the integrity
> > implications and repercussions of the new Kconfig that allows trimming
> > the measurement list.
> 
> Of course.
> 
> > Defining a simple "trim" marker measurement record would be a visual
> > indication that the measurement list has been trimmed.  I might even
> > have compared it to the "boot_aggregate".  However, the proposed marker
> > based on TPM PCRs requires pausing extending the measurement list.
> 
> ...
> 
> > Before defining a new critical-data record, we need to decide whether
> > it is really necessary or if it is redundant.  If we define a new
> > "critical-data" record, can it be defined such that it doesn't require
> > pausing extending the measurement list?  For example, a new simple
> > visual critical-data record could contain the number of records (e.g.
> > /ima/runtime_measurements_count) up to that point.
> 
> What if the snapshot_aggregate was a hash of the measurement log
> starting with either the boot_aggregate or the latest
> snapshot_aggregate and ending on the record before the new
> snapshot_aggregate?  The performance impact at snapshot time should be
> minimal as the hash can be incrementally updated as new records are
> added to the measurement list.  While the hash wouldn't capture the
> TPM state, it would allow some crude verification when reassembling
> the log.  If one could bear the cost of a TPM signing operation, the
> log digest could be signed by the TPM.

Other critical data is calculated, before calling
ima_measure_critical_data(), which adds the record to the measurement
list and extends the TPM PCR.

Signing the hash shouldn't be an issue if it behaves like other
critical data.

In addition to the hash, consider including other information in the
new critical data record (e.g. total number of measurement records, the
number of measurements included in the hash, the number of times the
measurement list was trimmed, etc). 

> 
> > The new critical-data record and trimming the measurement list should
> > be disjoint features.  If the first record after trimming the
> > measurement list should be the critical-data record, then trim the
> > measurement list up to that point.
> 
> I disagree about the snapshot_aggregate record being disjoint from the
> measurement log, but I suspect Tushar and Sush are willing to forgo
> the snapshot_aggregate if that is a blocker from your perspective.

> Once again, the main goal is the ability to manage the size of the
> measurement log [...]

Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-27 Thread Paul Moore
On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar  wrote:
> On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:

...

> > Okay, we are starting to get closer, but I'm still missing the part
> > where you say "if you do X, Y, and Z, I'll accept and merge the
> > solution."  Can you be more explicit about what approach(es) you would
> > be willing to accept upstream?
>
> Included with what is wanted/needed is an explanation as to my concerns
> with the existing proposal.
>
> First we need to differentiate between kernel and userspace
> requirements.  (The "snapshotting" design proposal intermixes them.)
>
> From the kernel perspective, the Log Snapshotting Design proposal "B.1
> Goals" is very nice, but once the measurement list can be trimmed it is
> really irrelevant.  Userspace can do whatever it wants with the
> measurement list records.  So instead of paying lip service to what
> should be done, just call it as it is - trimming the measurement list.

Fair enough.  I personally think it is nice to have a brief discussion
of how userspace might use a kernel feature, but if you prefer to drop
that part of the design doc I doubt anyone will object very strongly.

> ---
> | B.1 Goals   |
> ---
> To address the issues described in the section above, we propose
> enhancements to the IMA subsystem to achieve the following goals:
>
>   a. Reduce memory pressure on the Kernel caused by larger in-memory
>  IMA logs.
>
>   b. Preserve the system's ability to get remotely attested using the
>  IMA log, even after implementing the enhancements to reduce memory
>  pressure caused by the IMA log. IMA's Integrity guarantees should
>  be maintained.
>
>   c. Provide mechanisms from Kernel side to the remote attestation
>  service to make service-side processing more efficient.

That looks fine to me.

> From the kernel perspective there needs to be a method of trimming N
> number of records from the head of the measurement list.  In addition
> to the existing securityfs "runtime measurement list",  defining a new
> securityfs file containing the current count of in memory measurement
> records would be beneficial.

I imagine that should be trivial to implement and I can't imagine
there being any objection to that.

If we are going to have a record count, I imagine it would also be
helpful to maintain a securityfs file with the total size (in bytes)
of the in-memory measurement log.  In fact, I suspect this will
probably be more useful for those who wish to manage the size of the
measurement log.

> Defining other IMA securityfs files like
> how many times the measurement list has been trimmed might be
> beneficial as well.

I have no objection to that.  Would a total record count, i.e. a value
that doesn't reset on a snapshot event, be more useful here?

> Of course properly document the integrity
> implications and repercussions of the new Kconfig that allows trimming
> the measurement list.

Of course.

> Defining a simple "trim" marker measurement record would be a visual
> indication that the measurement list has been trimmed.  I might even
> have compared it to the "boot_aggregate".  However, the proposed marker
> based on TPM PCRs requires pausing extending the measurement list.

...

> Before defining a new critical-data record, we need to decide whether
> it is really necessary or if it is redundant.  If we define a new
> "critical-data" record, can it be defined such that it doesn't require
> pausing extending the measurement list?  For example, a new simple
> visual critical-data record could contain the number of records (e.g.
> /ima/runtime_measurements_count) up to that point.

What if the snapshot_aggregate was a hash of the measurement log
starting with either the boot_aggregate or the latest
snapshot_aggregate and ending on the record before the new
snapshot_aggregate?  The performance impact at snapshot time should be
minimal as the hash can be incrementally updated as new records are
added to the measurement list.  While the hash wouldn't capture the
TPM state, it would allow some crude verification when reassembling
the log.  If one could bear the cost of a TPM signing operation, the
log digest could be signed by the TPM.

> The new critical-data record and trimming the measurement list should
> be disjoint features.  If the first record after trimming the
> measurement list should be the critical-data record, then trim the
> measurement list up to that point.

I disagree about the snapshot_aggregate record being disjoint from the
measurement log, but I suspect Tushar and Sush are willing to forgo
the snapshot_aggregate if that is a blocker from your perspective.
Once again, the main goal is the ability to manage the size of the
measurement log; while having a snapshot_aggregate that can be used to [...]

Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-27 Thread Mimi Zohar
On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> On Wed, Nov 22, 2023 at 8:18 AM Mimi Zohar  wrote:
> > On Tue, 2023-11-21 at 23:27 -0500, Paul Moore wrote:
> > > On Thu, Nov 16, 2023 at 5:28 PM Paul Moore  wrote:
> > > > On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar  wrote:
> > >
> > > ...
> > >
> > > > > Userspace can already export the IMA measurement list(s) via the
> > > > > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
> > > > > it wants with it.  All that is missing in the kernel is the ability to
> > > > > trim the measurement list, which doesn't seem all that complicated.
> > > >
> > > > From my perspective what has been presented is basically just trimming
> > > > the in-memory measurement log, the additional complexity (which really
> > > > doesn't look that bad IMO) is there to ensure robustness in the face
> > > > of an unreliable userspace (processes die, get killed, etc.) and to
> > > > establish a new, transitive root of trust in the newly trimmed
> > > > in-memory log.
> > > >
> > > > I suppose one could simplify things greatly by having a design where
> > > > userspace  captures the measurement log and then writes the number of
> > > > measurement records to trim from the start of the measurement log to a
> > > > sysfs file and the kernel acts on that.  You could do this with, or
> > > > without, the snapshot_aggregate entry concept; in fact that could be
> > > > something that was controlled by userspace, e.g. write the number of
> > > > lines and a flag to indicate if a snapshot_aggregate was desired to
> > > > the sysfs file.  I can't say I've thought it all the way through to
> > > > make sure there are no gotchas, but I'm guessing that is about as
> > > > simple as one can get.
> >
> > > > If there is something else you had in mind, Mimi, please share the
> > > > details.  This is a very real problem we are facing and we want to
> > > > work to get a solution upstream.
> > >
> > > Any thoughts on this Mimi?  We have a real interest in working with
> > > you to solve this problem upstream, but we need more detailed feedback
> > > than "too complicated".  If you don't like the solutions presented
> > > thus far, what type of solution would you like to see?
> >
> > Paul, the design copies the measurement list to a temporary "snapshot"
> > file, before trimming the measurement list, which according to the
> > design document locks the existing measurement list.  And further
> > pauses extending the measurement list to calculate the
> > "snapshot_aggregate".
> 
> I believe the intent is to only pause the measurements while the
> snapshot_aggregate is generated, not for the duration of the entire
> snapshot process.  The purpose of the snapshot_aggregate is to
> establish a new root of trust, similar to the boot_aggregate, to help
> improve attestation performance.
> 
> > Userspace can export the measurement list already, so why this
> > complicated design?
> 
> The current code has no provision for trimming the measurement log,
> that's the primary reason.
> 
> > As I mentioned previously and repeated yesterday, the
> > "snapshot_aggregate" is a new type of critical data and should be
> > upstreamed independently of this patch set that trims the measurement
> > list.  Trimming the measurement list could be based, as you suggested
> > on the number of records to remove, or it could be up to the next/last
> > "snapshot_aggregate" record.
> 
> Okay, we are starting to get closer, but I'm still missing the part
> where you say "if you do X, Y, and Z, I'll accept and merge the
> solution."  Can you be more explicit about what approach(es) you would
> be willing to accept upstream?

Included with what is wanted/needed is an explanation as to my concerns
with the existing proposal.

First we need to differentiate between kernel and userspace
requirements.  (The "snapshotting" design proposal intermixes them.)

From the kernel perspective, the Log Snapshotting Design proposal "B.1
Goals" is very nice, but once the measurement list can be trimmed it is
really irrelevant.  Userspace can do whatever it wants with the
measurement list records.  So instead of paying lip service to what
should be done, just call it as it is - trimming the measurement list.

---
| B.1 Goals   |
---
To address the issues described in the section above, we propose
enhancements to the IMA subsystem to achieve the following goals:

  a. Reduce memory pressure on the Kernel caused by larger in-memory
 IMA logs.

  b. Preserve the system's ability to get remotely attested using the
 IMA log, even after implementing the enhancements to reduce memory
 pressure caused by the IMA log. IMA's Integrity guarantees should
 be maintained.

  c. Provide mechanisms from Kernel side to the remote attestation
 service to make service-side processing more efficient.

Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-22 Thread Paul Moore
On Wed, Nov 22, 2023 at 8:18 AM Mimi Zohar  wrote:
> On Tue, 2023-11-21 at 23:27 -0500, Paul Moore wrote:
> > On Thu, Nov 16, 2023 at 5:28 PM Paul Moore  wrote:
> > > On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar  wrote:
> >
> > ...
> >
> > > > Userspace can already export the IMA measurement list(s) via the
> > > > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
> > > > it wants with it.  All that is missing in the kernel is the ability to
> > > > trim the measurement list, which doesn't seem all that complicated.
> > >
> > > From my perspective what has been presented is basically just trimming
> > > the in-memory measurement log, the additional complexity (which really
> > > doesn't look that bad IMO) is there to ensure robustness in the face
> > > of an unreliable userspace (processes die, get killed, etc.) and to
> > > establish a new, transitive root of trust in the newly trimmed
> > > in-memory log.
> > >
> > > I suppose one could simplify things greatly by having a design where
> > > userspace  captures the measurement log and then writes the number of
> > > measurement records to trim from the start of the measurement log to a
> > > sysfs file and the kernel acts on that.  You could do this with, or
> > > without, the snapshot_aggregate entry concept; in fact that could be
> > > something that was controlled by userspace, e.g. write the number of
> > > lines and a flag to indicate if a snapshot_aggregate was desired to
> > > the sysfs file.  I can't say I've thought it all the way through to
> > > make sure there are no gotchas, but I'm guessing that is about as
> > > simple as one can get.
>
> > > If there is something else you had in mind, Mimi, please share the
> > > details.  This is a very real problem we are facing and we want to
> > > work to get a solution upstream.
> >
> > Any thoughts on this Mimi?  We have a real interest in working with
> > you to solve this problem upstream, but we need more detailed feedback
> > than "too complicated".  If you don't like the solutions presented
> > thus far, what type of solution would you like to see?
>
> Paul, the design copies the measurement list to a temporary "snapshot"
> file, before trimming the measurement list, which according to the
> design document locks the existing measurement list.  And further
> pauses extending the measurement list to calculate the
> "snapshot_aggregate".

I believe the intent is to only pause the measurements while the
snapshot_aggregate is generated, not for the duration of the entire
snapshot process.  The purpose of the snapshot_aggregate is to
establish a new root of trust, similar to the boot_aggregate, to help
improve attestation performance.

> Userspace can export the measurement list already, so why this
> complicated design?

The current code has no provision for trimming the measurement log,
that's the primary reason.

> As I mentioned previously and repeated yesterday, the
> "snapshot_aggregate" is a new type of critical data and should be
> upstreamed independently of this patch set that trims the measurement
> list.  Trimming the measurement list could be based, as you suggested
> on the number of records to remove, or it could be up to the next/last
> "snapshot_aggregate" record.

Okay, we are starting to get closer, but I'm still missing the part
where you say "if you do X, Y, and Z, I'll accept and merge the
solution."  Can you be more explicit about what approach(es) you would
be willing to accept upstream?

-- 
paul-moore.com



Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-22 Thread Mimi Zohar
On Tue, 2023-11-21 at 23:27 -0500, Paul Moore wrote:
> On Thu, Nov 16, 2023 at 5:28 PM Paul Moore  wrote:
> > On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar  wrote:
> 
> ...
> 
> > > Userspace can already export the IMA measurement list(s) via the
> > > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
> > > it wants with it.  All that is missing in the kernel is the ability to
> > > trim the measurement list, which doesn't seem all that complicated.
> >
> > From my perspective what has been presented is basically just trimming
> > the in-memory measurement log, the additional complexity (which really
> > doesn't look that bad IMO) is there to ensure robustness in the face
> > of an unreliable userspace (processes die, get killed, etc.) and to
> > establish a new, transitive root of trust in the newly trimmed
> > in-memory log.
> >
> > I suppose one could simplify things greatly by having a design where
> > userspace  captures the measurement log and then writes the number of
> > measurement records to trim from the start of the measurement log to a
> > sysfs file and the kernel acts on that.  You could do this with, or
> > without, the snapshot_aggregate entry concept; in fact that could be
> > something that was controlled by userspace, e.g. write the number of
> > lines and a flag to indicate if a snapshot_aggregate was desired to
> > the sysfs file.  I can't say I've thought it all the way through to
> > make sure there are no gotchas, but I'm guessing that is about as
> > simple as one can get.

> > If there is something else you had in mind, Mimi, please share the
> > details.  This is a very real problem we are facing and we want to
> > work to get a solution upstream.
> 
> Any thoughts on this Mimi?  We have a real interest in working with
> you to solve this problem upstream, but we need more detailed feedback
> than "too complicated".  If you don't like the solutions presented
> thus far, what type of solution would you like to see?

Paul, the design copies the measurement list to a temporary "snapshot"
file, before trimming the measurement list, which according to the
design document locks the existing measurement list.  And further
pauses extending the measurement list to calculate the
"snapshot_aggregate".

Userspace can export the measurement list already, so why this
complicated design?

As I mentioned previously and repeated yesterday, the
"snapshot_aggregate" is a new type of critical data and should be
upstreamed independently of this patch set that trims the measurement
list.  Trimming the measurement list could be based, as you suggested
on the number of records to remove, or it could be up to the next/last
"snapshot_aggregate" record.

-- 
thanks,

Mimi




Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-21 Thread Paul Moore
On Thu, Nov 16, 2023 at 5:28 PM Paul Moore  wrote:
> On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar  wrote:

...

> > Userspace can already export the IMA measurement list(s) via the
> > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
> > it wants with it.  All that is missing in the kernel is the ability to
> > trim the measurement list, which doesn't seem all that complicated.
>
> From my perspective what has been presented is basically just trimming
> the in-memory measurement log, the additional complexity (which really
> doesn't look that bad IMO) is there to ensure robustness in the face
> of an unreliable userspace (processes die, get killed, etc.) and to
> establish a new, transitive root of trust in the newly trimmed
> in-memory log.
>
> I suppose one could simplify things greatly by having a design where
> userspace  captures the measurement log and then writes the number of
> measurement records to trim from the start of the measurement log to a
> sysfs file and the kernel acts on that.  You could do this with, or
> without, the snapshot_aggregate entry concept; in fact that could be
> something that was controlled by userspace, e.g. write the number of
> lines and a flag to indicate if a snapshot_aggregate was desired to
> the sysfs file.  I can't say I've thought it all the way through to
> make sure there are no gotchas, but I'm guessing that is about as
> simple as one can get.
>
> If there is something else you had in mind, Mimi, please share the
> details.  This is a very real problem we are facing and we want to
> work to get a solution upstream.

Any thoughts on this Mimi?  We have a real interest in working with
you to solve this problem upstream, but we need more detailed feedback
than "too complicated".  If you don't like the solutions presented
thus far, what type of solution would you like to see?

-- 
paul-moore.com



Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-21 Thread Mimi Zohar
On Tue, 2023-11-21 at 17:01 -0800, Tushar Sugandhi wrote:
> Hi Mimi,
> To address your concern about pausing the measurements -
> We are not proposing to pause the measurements for the entire duration
> of UM <--> Kernel interaction while taking a snapshot.
> 
> We are simply proposing to pause the measurements when we get the TPM
> PCR quotes to add them to "snapshot_aggregate". (which should be a very
> small time window). IMA already has this mechanism when two separate
> modules try to add entry to IMA log - by using
> mutex_lock(&ima_extend_list_mutex); in ima_add_template_entry.
> 
> 
> We plan to use this existing locking functionality.
> Hope this addresses your concern about pausing extending the measurement
> list.

Each TPM PCR read is a separate TPM command.  Have you done any
performance analysis to see how long it actually takes to calculate the
"snapshot_aggregate" with a physical TPM?

The "snapshot_aggregate" is a new critical-data and should be
upstreamed independently of this patch set.

-- 
thanks,

Mimi




Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-21 Thread Tushar Sugandhi




On 11/16/23 14:28, Paul Moore wrote:

On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar  wrote:

On Thu, 2023-10-19 at 11:49 -0700, Tushar Sugandhi wrote:

[...]

---
| C.1 Solution Summary|
---
To achieve the goals described in the section above, we propose the
following changes to the IMA subsystem.

  a. The IMA log from Kernel memory will be offloaded to some
 persistent storage disk to keep the system running reliably
 without facing memory pressure.
 More details, alternate approaches considered etc. are present
 in section "D.3 Choices for Storing Snapshots" below.

  b. The IMA log will be divided into multiple chunks (snapshots).
 Each snapshot would be a delta between the two instances when
 the log was offloaded from memory to the persistent storage
 disk.

  c. Some UM process (like a remote-attestation-client) will be
 responsible for writing the IMA log snapshot to the disk.

  d. The same UM process would be responsible for triggering the IMA
 log snapshot.

  e. There will be a well-known location for storing the IMA log
 snapshots on the disk.  It will be non-trivial for UM processes
 to change that location after booting into the Kernel.

  f. A new event, "snapshot_aggregate", will be computed and measured
 in the IMA log as part of this feature.  It should help the
 remote-attestation client/service to benefit from the IMA log
 snapshot feature.
 The "snapshot_aggregate" event is described in more details in
 section "D.1 Snapshot Aggregate Event" below.

  g. If the existing remote-attestation client/services do not change
 to benefit from this feature or do not trigger the snapshot,
 the Kernel will continue to have its current functionality of
 maintaining an in-memory full IMA log.

Additionally, the remote-attestation client/services need to be updated
to benefit from the IMA log snapshot feature.  These proposed changes

are described in section "D.4 Remote-Attestation Client/Service Side
Changes" below, but their implementation is out of scope for this
proposal.


As previously said on v1,
This design seems overly complex and requires synchronization between the
"snapshot" record and exporting the records from the measurement list. [...]

Concerns:
- Pausing extending the measurement list.

Nothing has changed in terms of the complexity or in terms of pausing
the measurement list.  Pausing the measurement list is a non-starter.


The measurement list would only need to be paused for the amount of
time it would require to generate the snapshot_aggregate entry, which
should be minimal and only occurs when a privileged userspace requests
a snapshot operation.  The snapshot remains opt-in functionality, and
even then there is the possibility that the kernel could reject the
snapshot request if generating the snapshot_aggregate entry was deemed
too costly (as determined by the kernel) at that point in time.


Thanks Paul for responding and sharing your thoughts.


Hi Mimi,
To address your concern about pausing the measurements -
We are not proposing to pause the measurements for the entire duration
of UM <--> Kernel interaction while taking a snapshot.

We are simply proposing to pause the measurements when we get the TPM
PCR quotes to add them to "snapshot_aggregate". (which should be a very
small time window). IMA already has this mechanism when two separate
modules try to add entry to IMA log - by using
mutex_lock(&ima_extend_list_mutex); in ima_add_template_entry.


We plan to use this existing locking functionality.
Hope this addresses your concern about pausing extending the measurement
list.

~Tushar


Userspace can already export the IMA measurement list(s) via the
securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
it wants with it.  All that is missing in the kernel is the ability to
trim the measurement list, which doesn't seem all that complicated.



From my perspective what has been presented is basically just trimming

the in-memory measurement log, the additional complexity (which really
doesn't look that bad IMO) is there to ensure robustness in the face
of an unreliable userspace (processes die, get killed, etc.) and to
establish a new, transitive root of trust in the newly trimmed
in-memory log.

I suppose one could simplify things greatly by having a design where
userspace  captures the measurement log and then writes the number of
measurement records to trim from the start of the measurement log to a
sysfs file and the kernel acts on that.  You could do this with, or
without, the snapshot_aggregate entry concept; in fact that could be
something that was controlled by userspace, e.g. write the number of
lines and a flag to indicate if a snapshot_aggregate was desired to
the sysfs file. [...]

Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-20 Thread Tushar Sugandhi




On 11/16/23 14:07, Paul Moore wrote:

On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger  wrote:

On 11/14/23 13:36, Sush Shringarputale wrote:

On 11/13/2023 10:59 AM, Stefan Berger wrote:

On 10/19/23 14:49, Tushar Sugandhi wrote:

===
| Introduction |
===
This document provides a detailed overview of the proposed Kernel
feature IMA log snapshotting.  It describes the motivation behind the
proposal, the problem to be solved, a detailed solution design with
examples, and describes the changes to be made in the clients/services
which are part of remote-attestation system.  This is the 2nd version
of the proposal.  The first version is present here[1].

Table of Contents:
--
A. Motivation and Background
B. Goals and Non-Goals
  B.1 Goals
  B.2 Non-Goals
C. Proposed Solution
  C.1 Solution Summary
  C.2 High-level Work-flow
D. Detailed Design
  D.1 Snapshot Aggregate Event
  D.2 Snapshot Triggering Mechanism
  D.3 Choosing A Persistent Storage Location For Snapshots
  D.4 Remote-Attestation Client/Service-side Changes
  D.4.a Client-side Changes
  D.4.b Service-side Changes
E. Example Walk-through
F. Other Design Considerations
G. References



Userspace applications will have to know
a) where are the shard files?

We describe the file storage location choices in section D.3, but user
applications will have to query the well-known location described there.

b) how do I read the shard files while locking out the producer of the
shard files?

IMO, this will require a well known config file and a locking method
(flock) so that user space applications can work together in this new
environment. The lock could be defined in the config file or just be
the config file itself.

The flock is a good idea for co-ordination between UM clients. While
the Kernel cannot enforce any access in this way, any UM process that
is planning on triggering the snapshot mechanism should follow that
protocol.  We will ensure we document that as the best-practices in
the patch series.
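To make the flock-based coordination concrete, here is a minimal userspace sketch. It is not part of the proposal itself; the lock file path is a hypothetical stand-in for whatever well-known location the config file would name. Any process that triggers a snapshot, and any reader walking the shard files, would wrap its work in this helper so the two never interleave.

```python
import fcntl
import os

# Hypothetical well-known lock file; the real path would come from the
# agreed-upon config location discussed in this thread.
LOCK_PATH = "/tmp/ima_snapshot.lock"

def with_snapshot_lock(action):
    """Run `action` while holding an exclusive flock on the lock file.

    flock() is advisory: it only coordinates processes that opt in to
    taking the lock, which matches the "kernel cannot enforce this"
    caveat above.
    """
    fd = os.open(LOCK_PATH, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # blocks until the lock is free
        return action()
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

A snapshot trigger would call `with_snapshot_lock(do_snapshot)`, and a log reader `with_snapshot_lock(read_shards)`.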


It's more than 'best practices'. You need a well-known config file with
well-known config options in it.

All clients that were previously just trying to read new bytes from the
IMA log cannot do this anymore in the presence of a log shard producer
but have to also learn that a new log shard has been produced so they
need to figure out the new position in the log where to read from. So
maybe a counter in a config file should indicate to the log readers that
a new log has been produced -- otherwise they would have to monitor all
the log shard files or the log shard file's size.


If a counter is needed, I would suggest placing it somewhere other
than the config file so that we can enforce limited write access to
the config file.


Agreed. The counter shouldn't be part of a config file.

IMA log already provides a trustworthy, tamper-resilient mechanism
to store such data.

The current design already provides the mechanism to store
the counter as part of the snapshot_aggregate event.

See section "D.1 Snapshot Aggregate Event" in the proposal for
reference.

Snapshot_Counter   := "Snapshot_Attempt_Count=<count>"


"snapshot_aggregate" becomes the first event recorded in the
in-memory IMA log, after the past entries are purged to
a shard file.  Along with the other benefits, the "snapshot_aggregate"
event also provides info to UM clients about how many snapshots are
taken so far.


See section "C.2 High-level Work-flow" in the proposal for more
info.

  Step #f
  -
 (In-memory IMA log)
   .--.
   | "snapshot_aggregate" |
   | Event #E4|
   | Event #E5|
   '--'

~Tushar

Regardless, I imagine there are a few ways one could synchronize
various userspace applications such that they see a consistent view of
the decomposed log state, and the good news is that the approach
described here is opt-in from a userspace perspective.  If the
userspace does not fully support IMA log snapshotting then it never
needs to trigger it and the system behaves as it does today; on the
other hand, if the userspace has been updated it can make use of the
new functionality to better manage the size of the IMA measurement
log.





Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-17 Thread Sush Shringarputale




On 11/16/2023 2:56 PM, Paul Moore wrote:

On Thu, Nov 16, 2023 at 5:41 PM Stefan Berger  wrote:

On 11/16/23 17:07, Paul Moore wrote:

On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger  wrote:

On 11/14/23 13:36, Sush Shringarputale wrote:

On 11/13/2023 10:59 AM, Stefan Berger wrote:

On 10/19/23 14:49, Tushar Sugandhi wrote:

===
| Introduction |
===
This document provides a detailed overview of the proposed Kernel
feature IMA log snapshotting.  It describes the motivation behind the
proposal, the problem to be solved, a detailed solution design with
examples, and describes the changes to be made in the clients/services
which are part of remote-attestation system.  This is the 2nd version
of the proposal.  The first version is present here[1].

Table of Contents:
--
A. Motivation and Background
B. Goals and Non-Goals
   B.1 Goals
   B.2 Non-Goals
C. Proposed Solution
   C.1 Solution Summary
   C.2 High-level Work-flow
D. Detailed Design
   D.1 Snapshot Aggregate Event
   D.2 Snapshot Triggering Mechanism
   D.3 Choosing A Persistent Storage Location For Snapshots
   D.4 Remote-Attestation Client/Service-side Changes
   D.4.a Client-side Changes
   D.4.b Service-side Changes
E. Example Walk-through
F. Other Design Considerations
G. References


Userspace applications will have to know
a) where are the shard files?

We describe the file storage location choices in section D.3, but user
applications will have to query the well-known location described there.

b) how do I read the shard files while locking out the producer of the
shard files?

IMO, this will require a well known config file and a locking method
(flock) so that user space applications can work together in this new
environment. The lock could be defined in the config file or just be
the config file itself.

The flock is a good idea for co-ordination between UM clients. While
the Kernel cannot enforce any access in this way, any UM process that
is planning on triggering the snapshot mechanism should follow that
protocol.  We will ensure we document that as the best-practices in
the patch series.

It's more than 'best practices'. You need a well-known config file with
well-known config options in it.

All clients that were previously just trying to read new bytes from the
IMA log cannot do this anymore in the presence of a log shard producer
but have to also learn that a new log shard has been produced so they
need to figure out the new position in the log where to read from. So
maybe a counter in a config file should indicate to the log readers that
a new log has been produced -- otherwise they would have to monitor all
the log shard files or the log shard file's size.

If a counter is needed, I would suggest placing it somewhere other
than the config file so that we can enforce limited write access to
the config file.

Regardless, I imagine there are a few ways one could synchronize
various userspace applications such that they see a consistent view of
the decomposed log state, and the good news is that the approach
described here is opt-in from a userspace perspective.  If the

A FUSE filesystem that stitches together the log shards from one or
multiple files + IMA log file(s) could make this approach transparent
for as long as log shards are not thrown away. Presumably it (or root)
could bind-mount its files over the two IMA log files.


userspace does not fully support IMA log snapshotting then it never
needs to trigger it and the system behaves as it does today; on the

I don't think individual applications should trigger it; instead, some
dedicated background process running on a machine would do that every n
log entries or so and possibly offer the FUSE filesystem at the same
time. In either case, once any application triggers it, all either have
to know how to deal with the shards or FUSE would make it completely
transparent.

FUSE would be a reasonable user space co-ordination implementation.  A
privileged process would trigger the snapshot generation and provide the
mountpoint to read the full IMA log backed by shards as needed by relying
parties.
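A full FUSE implementation is beyond the scope of this thread, but the stitching logic at its core is simple. The sketch below (file names hypothetical) yields the shards in order followed by the live securityfs export; a FUSE layer would expose this concatenation as a single read-only file so existing log parsers keep working unchanged.

```python
from pathlib import Path
from typing import Iterable, Iterator

def stitched_log(shards: Iterable[Path], live_log: Path) -> Iterator[bytes]:
    """Yield the full measurement log: every persisted shard in order,
    then the current in-memory log as exported via securityfs."""
    for shard in shards:
        yield shard.read_bytes()
    yield live_log.read_bytes()
```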

Whether it is a privileged daemon or some other agent that triggers the
snapshot, it shouldn't impact the Kernel-side implementation.

- Sush

Yes, performing a snapshot is a privileged operation which I expect
would be done and managed by a dedicated daemon running on the system.






Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-16 Thread Paul Moore
On Thu, Nov 16, 2023 at 5:41 PM Stefan Berger  wrote:
> On 11/16/23 17:07, Paul Moore wrote:
> > On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger  wrote:
> >> On 11/14/23 13:36, Sush Shringarputale wrote:
> >>> On 11/13/2023 10:59 AM, Stefan Berger wrote:
>  On 10/19/23 14:49, Tushar Sugandhi wrote:
> > ===
> > | Introduction |
> > ===
> > This document provides a detailed overview of the proposed Kernel
> > feature IMA log snapshotting.  It describes the motivation behind the
> > proposal, the problem to be solved, a detailed solution design with
> > examples, and describes the changes to be made in the clients/services
> > which are part of remote-attestation system.  This is the 2nd version
> > of the proposal.  The first version is present here[1].
> >
> > Table of Contents:
> > --
> > A. Motivation and Background
> > B. Goals and Non-Goals
> >   B.1 Goals
> >   B.2 Non-Goals
> > C. Proposed Solution
> >   C.1 Solution Summary
> >   C.2 High-level Work-flow
> > D. Detailed Design
> >   D.1 Snapshot Aggregate Event
> >   D.2 Snapshot Triggering Mechanism
> >   D.3 Choosing A Persistent Storage Location For Snapshots
> >   D.4 Remote-Attestation Client/Service-side Changes
> >   D.4.a Client-side Changes
> >   D.4.b Service-side Changes
> > E. Example Walk-through
> > F. Other Design Considerations
> > G. References
> >
> 
>  Userspace applications will have to know
>  a) where are the shard files?
> >>> We describe the file storage location choices in section D.3, but user
> >>> applications will have to query the well-known location described there.
>  b) how do I read the shard files while locking out the producer of the
>  shard files?
> 
>  IMO, this will require a well known config file and a locking method
>  (flock) so that user space applications can work together in this new
>  environment. The lock could be defined in the config file or just be
>  the config file itself.
> >>> The flock is a good idea for co-ordination between UM clients. While
> >>> the Kernel cannot enforce any access in this way, any UM process that
> >>> is planning on triggering the snapshot mechanism should follow that
> >>> protocol.  We will ensure we document that as the best-practices in
> >>> the patch series.
> >>
> >> It's more than 'best practices'. You need a well-known config file with
> >> well-known config options in it.
> >>
> >> All clients that were previously just trying to read new bytes from the
> >> IMA log cannot do this anymore in the presence of a log shard producer
> >> but have to also learn that a new log shard has been produced so they
> >> need to figure out the new position in the log where to read from. So
> >> maybe a counter in a config file should indicate to the log readers that
> >> a new log has been produced -- otherwise they would have to monitor all
> >> the log shard files or the log shard file's size.
> >
> > If a counter is needed, I would suggest placing it somewhere other
> > than the config file so that we can enforce limited write access to
> > the config file.
> >
> > Regardless, I imagine there are a few ways one could synchronize
> > various userspace applications such that they see a consistent view of
> > the decomposed log state, and the good news is that the approach
> > described here is opt-in from a userspace perspective.  If the
>
> A FUSE filesystem that stitches together the log shards from one or
> multiple files + IMA log file(s) could make this approach transparent
> for as long as log shards are not thrown away. Presumably it (or root)
> could bind-mount its files over the two IMA log files.
>
> > userspace does not fully support IMA log snapshotting then it never
> > needs to trigger it and the system behaves as it does today; on the
>
> I don't think individual applications should trigger it , instead some
> dedicated background process running on a machine would do that every n
> log entries or so and possibly offer the FUSE filesystem at the same
> time. In either case, once any application triggers it, all either have
> to know how to deal with the shards or FUSE would make it completely
> transparent.

Yes, performing a snapshot is a privileged operation which I expect
would be done and managed by a dedicated daemon running on the system.

-- 
paul-moore.com



Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-16 Thread Stefan Berger




On 11/16/23 17:07, Paul Moore wrote:

On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger  wrote:

On 11/14/23 13:36, Sush Shringarputale wrote:

On 11/13/2023 10:59 AM, Stefan Berger wrote:

On 10/19/23 14:49, Tushar Sugandhi wrote:

===
| Introduction |
===
This document provides a detailed overview of the proposed Kernel
feature IMA log snapshotting.  It describes the motivation behind the
proposal, the problem to be solved, a detailed solution design with
examples, and describes the changes to be made in the clients/services
which are part of remote-attestation system.  This is the 2nd version
of the proposal.  The first version is present here[1].

Table of Contents:
--
A. Motivation and Background
B. Goals and Non-Goals
  B.1 Goals
  B.2 Non-Goals
C. Proposed Solution
  C.1 Solution Summary
  C.2 High-level Work-flow
D. Detailed Design
  D.1 Snapshot Aggregate Event
  D.2 Snapshot Triggering Mechanism
  D.3 Choosing A Persistent Storage Location For Snapshots
  D.4 Remote-Attestation Client/Service-side Changes
  D.4.a Client-side Changes
  D.4.b Service-side Changes
E. Example Walk-through
F. Other Design Considerations
G. References



Userspace applications will have to know
a) where are the shard files?

We describe the file storage location choices in section D.3, but user
applications will have to query the well-known location described there.

b) how do I read the shard files while locking out the producer of the
shard files?

IMO, this will require a well known config file and a locking method
(flock) so that user space applications can work together in this new
environment. The lock could be defined in the config file or just be
the config file itself.

The flock is a good idea for co-ordination between UM clients. While
the Kernel cannot enforce any access in this way, any UM process that
is planning on triggering the snapshot mechanism should follow that
protocol.  We will ensure we document that as the best-practices in
the patch series.


It's more than 'best practices'. You need a well-known config file with
well-known config options in it.

All clients that were previously just trying to read new bytes from the
IMA log cannot do this anymore in the presence of a log shard producer
but have to also learn that a new log shard has been produced so they
need to figure out the new position in the log where to read from. So
maybe a counter in a config file should indicate to the log readers that
a new log has been produced -- otherwise they would have to monitor all
the log shard files or the log shard file's size.


If a counter is needed, I would suggest placing it somewhere other
than the config file so that we can enforce limited write access to
the config file.

Regardless, I imagine there are a few ways one could synchronize
various userspace applications such that they see a consistent view of
the decomposed log state, and the good news is that the approach
described here is opt-in from a userspace perspective.  If the


A FUSE filesystem that stitches together the log shards from one or 
multiple files + IMA log file(s) could make this approach transparent 
for as long as log shards are not thrown away. Presumably it (or root) 
could bind-mount its files over the two IMA log files.



userspace does not fully support IMA log snapshotting then it never
needs to trigger it and the system behaves as it does today; on the


I don't think individual applications should trigger it; instead, some
dedicated background process running on a machine would do that every n 
log entries or so and possibly offer the FUSE filesystem at the same 
time. In either case, once any application triggers it, all either have 
to know how to deal with the shards or FUSE would make it completely 
transparent.



other hand, if the userspace has been updated it can make use of the
new functionality to better manage the size of the IMA measurement
log.





Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-16 Thread Paul Moore
On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar  wrote:
> On Thu, 2023-10-19 at 11:49 -0700, Tushar Sugandhi wrote:
>
> [...]
> > ---
> > | C.1 Solution Summary|
> > ---
> > To achieve the goals described in the section above, we propose the
> > following changes to the IMA subsystem.
> >
> >  a. The IMA log from Kernel memory will be offloaded to some
> > persistent storage disk to keep the system running reliably
> > without facing memory pressure.
> > More details, alternate approaches considered etc. are present
> > in section "D.3 Choices for Storing Snapshots" below.
> >
> >  b. The IMA log will be divided into multiple chunks (snapshots).
> > Each snapshot would be a delta between the two instances when
> > the log was offloaded from memory to the persistent storage
> > disk.
> >
> >  c. Some UM process (like a remote-attestation-client) will be
> > responsible for writing the IMA log snapshot to the disk.
> >
> >  d. The same UM process would be responsible for triggering the IMA
> > log snapshot.
> >
> >  e. There will be a well-known location for storing the IMA log
> > snapshots on the disk.  It will be non-trivial for UM processes
> > to change that location after booting into the Kernel.
> >
> >  f. A new event, "snapshot_aggregate", will be computed and measured
> > in the IMA log as part of this feature.  It should help the
> > remote-attestation client/service to benefit from the IMA log
> > snapshot feature.
> > The "snapshot_aggregate" event is described in more details in
> > section "D.1 Snapshot Aggregate Event" below.
> >
> >  g. If the existing remote-attestation client/services do not change
> > to benefit from this feature or do not trigger the snapshot,
> > the Kernel will continue to have its current functionality of
> > maintaining an in-memory full IMA log.
> >
> > Additionally, the remote-attestation client/services need to be updated
> > to benefit from the IMA log snapshot feature.  These proposed changes
> >
> > are described in section "D.4 Remote-Attestation Client/Service Side
> > Changes" below, but their implementation is out of scope for this
> > proposal.
>
> As previously said on v1,
>This design seems overly complex and requires synchronization between the
>"snapshot" record and exporting the records from the measurement list. 
> [...]
>
>Concerns:
>- Pausing extending the measurement list.
>
> Nothing has changed in terms of the complexity or in terms of pausing
> the measurement list.   Pausing the measurement list is a non starter.

The measurement list would only need to be paused for the amount of
time it would require to generate the snapshot_aggregate entry, which
should be minimal and only occurs when a privileged userspace requests
a snapshot operation.  The snapshot remains opt-in functionality, and
even then there is the possibility that the kernel could reject the
snapshot request if generating the snapshot_aggregate entry was deemed
too costly (as determined by the kernel) at that point in time.
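The cost argument can be made concrete: if the running aggregate is maintained incrementally as records are appended (an approach floated earlier in this thread), the only work done while the list is paused is finalizing a digest. This is a userspace model of that idea, not kernel code, with SHA-256 standing in for whatever hash the final design would use.

```python
import hashlib

class RunningAggregate:
    """Model of an incrementally maintained snapshot aggregate.

    Each appended record updates the running hash immediately, so a
    snapshot request only has to finalize the digest -- no walk over
    the whole measurement list under the pause.
    """
    def __init__(self) -> None:
        self._h = hashlib.sha256()

    def append(self, record: bytes) -> None:
        self._h.update(record)

    def finalize(self) -> str:
        digest = self._h.hexdigest()
        self._h = hashlib.sha256()   # restart for the next snapshot window
        return digest
```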

> Userspace can already export the IMA measurement list(s) via the
> securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
> it wants with it.  All that is missing in the kernel is the ability to
> trim the measurement list, which doesn't seem all that complicated.

From my perspective what has been presented is basically just trimming
the in-memory measurement log, the additional complexity (which really
doesn't look that bad IMO) is there to ensure robustness in the face
of an unreliable userspace (processes die, get killed, etc.) and to
establish a new, transitive root of trust in the newly trimmed
in-memory log.

I suppose one could simplify things greatly by having a design where
userspace captures the measurement log and then writes the number of
measurement records to trim from the start of the measurement log to a
sysfs file and the kernel acts on that.  You could do this with, or
without, the snapshot_aggregate entry concept; in fact that could be
something that was controlled by userspace, e.g. write the number of
lines and a flag to indicate if a snapshot_aggregate was desired to
the sysfs file.  I can't say I've thought it all the way through to
make sure there are no gotchas, but I'm guessing that is about as
simple as one can get.
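The simplified design above reduces the kernel's job to one operation: drop the first N records, optionally replacing them with a snapshot_aggregate entry. This Python sketch models that operation (the sysfs interface, record format, and hash choice are all placeholders, not the actual proposal):

```python
import hashlib

def trim_log(log: list[bytes], n: int, want_aggregate: bool) -> list[bytes]:
    """Model the proposed trim interface: drop the first `n` records,
    optionally prepending a snapshot_aggregate record covering them.

    Userspace is assumed to have already exported the trimmed records
    via {ascii,binary}_runtime_measurements before asking for the trim.
    """
    if n > len(log):
        raise ValueError("cannot trim more records than exist")
    trimmed, rest = log[:n], log[n:]
    if want_aggregate:
        digest = hashlib.sha256(b"".join(trimmed)).hexdigest()
        rest = [b"snapshot_aggregate " + digest.encode()] + rest
    return rest
```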

If there is something else you had in mind, Mimi, please share the
details.  This is a very real problem we are facing and we want to
work to get a solution upstream.

-- 
paul-moore.com



Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-16 Thread Paul Moore
On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger  wrote:
> On 11/14/23 13:36, Sush Shringarputale wrote:
> > On 11/13/2023 10:59 AM, Stefan Berger wrote:
> >> On 10/19/23 14:49, Tushar Sugandhi wrote:
> >>> ===
> >>> | Introduction |
> >>> ===
> >>> This document provides a detailed overview of the proposed Kernel
> >>> feature IMA log snapshotting.  It describes the motivation behind the
> >>> proposal, the problem to be solved, a detailed solution design with
> >>> examples, and describes the changes to be made in the clients/services
> >>> which are part of remote-attestation system.  This is the 2nd version
> >>> of the proposal.  The first version is present here[1].
> >>>
> >>> Table of Contents:
> >>> --
> >>> A. Motivation and Background
> >>> B. Goals and Non-Goals
> >>>  B.1 Goals
> >>>  B.2 Non-Goals
> >>> C. Proposed Solution
> >>>  C.1 Solution Summary
> >>>  C.2 High-level Work-flow
> >>> D. Detailed Design
> >>>  D.1 Snapshot Aggregate Event
> >>>  D.2 Snapshot Triggering Mechanism
> >>>  D.3 Choosing A Persistent Storage Location For Snapshots
> >>>  D.4 Remote-Attestation Client/Service-side Changes
> >>>  D.4.a Client-side Changes
> >>>  D.4.b Service-side Changes
> >>> E. Example Walk-through
> >>> F. Other Design Considerations
> >>> G. References
> >>>
> >>
> >> Userspace applications will have to know
> >> a) where are the shard files?
> > We describe the file storage location choices in section D.3, but user
> > applications will have to query the well-known location described there.
> >> b) how do I read the shard files while locking out the producer of the
> >> shard files?
> >>
> >> IMO, this will require a well known config file and a locking method
> >> (flock) so that user space applications can work together in this new
> >> environment. The lock could be defined in the config file or just be
> >> the config file itself.
> > The flock is a good idea for co-ordination between UM clients. While
> > the Kernel cannot enforce any access in this way, any UM process that
> > is planning on triggering the snapshot mechanism should follow that
> > protocol.  We will ensure we document that as the best-practices in
> > the patch series.
>
> It's more than 'best practices'. You need a well-known config file with
> well-known config options in it.
>
> All clients that were previously just trying to read new bytes from the
> IMA log cannot do this anymore in the presence of a log shard producer
> but have to also learn that a new log shard has been produced so they
> need to figure out the new position in the log where to read from. So
> maybe a counter in a config file should indicate to the log readers that
> a new log has been produced -- otherwise they would have to monitor all
> the log shard files or the log shard file's size.

If a counter is needed, I would suggest placing it somewhere other
than the config file so that we can enforce limited write access to
the config file.

Regardless, I imagine there are a few ways one could synchronize
various userspace applications such that they see a consistent view of
the decomposed log state, and the good news is that the approach
described here is opt-in from a userspace perspective.  If the
userspace does not fully support IMA log snapshotting then it never
needs to trigger it and the system behaves as it does today; on the
other hand, if the userspace has been updated it can make use of the
new functionality to better manage the size of the IMA measurement
log.

-- 
paul-moore.com



Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-14 Thread Stefan Berger




On 11/14/23 13:36, Sush Shringarputale wrote:



On 11/13/2023 10:59 AM, Stefan Berger wrote:



On 10/19/23 14:49, Tushar Sugandhi wrote:

===
| Introduction |
===
This document provides a detailed overview of the proposed Kernel
feature IMA log snapshotting.  It describes the motivation behind the
proposal, the problem to be solved, a detailed solution design with
examples, and describes the changes to be made in the clients/services
which are part of remote-attestation system.  This is the 2nd version
of the proposal.  The first version is present here[1].

Table of Contents:
--
A. Motivation and Background
B. Goals and Non-Goals
 B.1 Goals
 B.2 Non-Goals
C. Proposed Solution
 C.1 Solution Summary
 C.2 High-level Work-flow
D. Detailed Design
 D.1 Snapshot Aggregate Event
 D.2 Snapshot Triggering Mechanism
 D.3 Choosing A Persistent Storage Location For Snapshots
 D.4 Remote-Attestation Client/Service-side Changes
 D.4.a Client-side Changes
 D.4.b Service-side Changes
E. Example Walk-through
F. Other Design Considerations
G. References



Userspace applications will have to know
a) where are the shard files?

We describe the file storage location choices in section D.3, but user
applications will have to query the well-known location described there.
b) how do I read the shard files while locking out the producer of the 
shard files?


IMO, this will require a well known config file and a locking method 
(flock) so that user space applications can work together in this new 
environment. The lock could be defined in the config file or just be 
the config file itself.

The flock is a good idea for co-ordination between UM clients. While
the Kernel cannot enforce any access in this way, any UM process that
is planning on triggering the snapshot mechanism should follow that
protocol.  We will ensure we document that as the best-practices in
the patch series.


It's more than 'best practices'. You need a well-known config file with 
well-known config options in it.


All clients that were previously just trying to read new bytes from the 
IMA log cannot do this anymore in the presence of a log shard producer 
but have to also learn that a new log shard has been produced so they 
need to figure out the new position in the log where to read from. So 
maybe a counter in a config file should indicate to the log readers that 
a new log has been produced -- otherwise they would have to monitor all 
the log shard files or the log shard file's size.


Iff the log-shard producer were configured to discard leading parts of 
the log then that should also be noted in a config file so clients, that 
need to see the beginning of the log, can refuse early on to work on a 
machine that either is configured this way or where the discarding has 
already happened.


  Stefan


- Sush




Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-14 Thread Sush Shringarputale




On 11/13/2023 10:59 AM, Stefan Berger wrote:



On 10/19/23 14:49, Tushar Sugandhi wrote:

===
| Introduction |
===
This document provides a detailed overview of the proposed Kernel
feature IMA log snapshotting.  It describes the motivation behind the
proposal, the problem to be solved, a detailed solution design with
examples, and describes the changes to be made in the clients/services
which are part of remote-attestation system.  This is the 2nd version
of the proposal.  The first version is present here[1].

Table of Contents:
--
A. Motivation and Background
B. Goals and Non-Goals
 B.1 Goals
 B.2 Non-Goals
C. Proposed Solution
 C.1 Solution Summary
 C.2 High-level Work-flow
D. Detailed Design
 D.1 Snapshot Aggregate Event
 D.2 Snapshot Triggering Mechanism
 D.3 Choosing A Persistent Storage Location For Snapshots
 D.4 Remote-Attestation Client/Service-side Changes
 D.4.a Client-side Changes
 D.4.b Service-side Changes
E. Example Walk-through
F. Other Design Considerations
G. References



Userspace applications will have to know
a) where are the shard files?

We describe the file storage location choices in section D.3, but user
applications will have to query the well-known location described there.
b) how do I read the shard files while locking out the producer of the 
shard files?


IMO, this will require a well known config file and a locking method 
(flock) so that user space applications can work together in this new 
environment. The lock could be defined in the config file or just be 
the config file itself.

The flock is a good idea for co-ordination between UM clients. While
the Kernel cannot enforce any access in this way, any UM process that
is planning on triggering the snapshot mechanism should follow that
protocol.  We will ensure we document that as the best-practices in
the patch series.
- Sush



Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-13 Thread Stefan Berger




On 10/19/23 14:49, Tushar Sugandhi wrote:

===
| Introduction |
===
This document provides a detailed overview of the proposed Kernel
feature IMA log snapshotting.  It describes the motivation behind the
proposal, the problem to be solved, a detailed solution design with
examples, and describes the changes to be made in the clients/services
which are part of remote-attestation system.  This is the 2nd version
of the proposal.  The first version is present here[1].

Table of Contents:
--
A. Motivation and Background
B. Goals and Non-Goals
     B.1 Goals
     B.2 Non-Goals
C. Proposed Solution
     C.1 Solution Summary
     C.2 High-level Work-flow
D. Detailed Design
     D.1 Snapshot Aggregate Event
     D.2 Snapshot Triggering Mechanism
     D.3 Choosing A Persistent Storage Location For Snapshots
     D.4 Remote-Attestation Client/Service-side Changes
     D.4.a Client-side Changes
     D.4.b Service-side Changes
E. Example Walk-through
F. Other Design Considerations
G. References



Userspace applications will have to know
a) where are the shard files?
b) how do I read the shard files while locking out the producer of the 
shard files?


IMO, this will require a well known config file and a locking method 
(flock) so that user space applications can work together in this new 
environment. The lock could be defined in the config file or just be the 
config file itself.





Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-11-13 Thread Sush Shringarputale




On 10/31/2023 11:37 AM, Ken Goldman wrote:

On 10/19/2023 2:49 PM, Tushar Sugandhi wrote:

   f. A new event, "snapshot_aggregate", will be computed and measured
    in the IMA log as part of this feature.  It should help the
    remote-attestation client/service to benefit from the IMA log
    snapshot feature.
    The "snapshot_aggregate" event is described in more details in
    section "D.1 Snapshot Aggregate Event" below.


What is the use case for the snapshot aggregate?  My thinking is:

1. The platform must retain the entire measurement list.  Early
measurements can never be discarded because a new quote verifier
must receive the entire log starting at the first measurement.

In this case, isn't the snapshot aggregate redundant?

Not quite. The snapshot aggregate still has a purpose, which is to stitch
together the snapshots on the disk and the in-memory segment of the IMA
log. The specific details are in the RFC Section D.1, quoted here:

The "snapshot_aggregate" marker provides the following benefits:

a. It facilitates the IMA log to be divided into multiple chunks and
provides mechanism to verify the integrity of the system using only the
latest chunks during remote attestation.

b. It provides tangible evidence from the Kernel to the attestation
client that IMA log snapshotting has been enabled and that at least one
snapshot exists on the system.

c. It helps both the Kernel and UM attestation client define clear
boundaries between multiple snapshots.

d. In the event of multiple snapshots, the last measured
"snapshot_aggregate" marker, which is present in the current segment of
the IMA log, has sufficient information to verify the integrity of the
IMA log segment as well as the previous snapshots using the PCR quotes.

e. In the event of multiple snapshots, say N, if the remote-attestation
service has already processed the last N-1 snapshots, it can efficiently
parse through them by just processing "snapshot_aggregate" events to
compute the PCR quotes needed to verify the events in the last snapshot.
This should drastically improve the IMA log processing efficiency of
the service.
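Points d. and e. above rely on each "snapshot_aggregate" summarizing the segment(s) before it. The RFC does not fix the exact construction, so the following is only an illustrative sketch of one plausible incremental scheme: a running SHA-256 seeded with the previous aggregate (or the boot_aggregate for the first segment) and folded over each record's digest:

```python
import hashlib

def chain_aggregate(prev_aggregate: bytes, records: list) -> bytes:
    """Fold a segment's records into a running aggregate, starting from
    the previous snapshot_aggregate (or the boot_aggregate for the first
    segment). Illustrative only; the RFC leaves the construction open."""
    h = hashlib.sha256(prev_aggregate)
    for rec in records:
        h.update(hashlib.sha256(rec).digest())
    return h.digest()
```

A verifier that has already validated snapshots 1..N-1 could then recompute only the final chain link and compare it against the "snapshot_aggregate" event in the current in-memory segment.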



2. There is a disadvantage to redundant data.  The verifier must
support this new event type. It receives this event and must validate
the aggregate against the snapshotted events. This is an attack
surface. An attacker can send an aggregate and snapshotted
measurements that do not match to exploit a flaw in the verifier.

I disagree with this.  Redundancy is a moot point because
"snapshot_aggregate" is required for the points mentioned above.
- Sush



Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-10-31 Thread Mimi Zohar
On Thu, 2023-10-19 at 11:49 -0700, Tushar Sugandhi wrote:

[...]
> ---
> | C.1 Solution Summary|
> ---
> To achieve the goals described in the section above, we propose the
> following changes to the IMA subsystem.
> 
>  a. The IMA log from Kernel memory will be offloaded to some
> persistent storage disk to keep the system running reliably
> without facing memory pressure.
> More details, alternate approaches considered etc. are present
> in section "D.3 Choices for Storing Snapshots" below.
> 
>  b. The IMA log will be divided into multiple chunks (snapshots).
> Each snapshot would be a delta between the two instances when
> the log was offloaded from memory to the persistent storage
> disk.
> 
>  c. Some UM process (like a remote-attestation-client) will be
> responsible for writing the IMA log snapshot to the disk.
> 
>  d. The same UM process would be responsible for triggering the IMA
> log snapshot.
> 
>  e. There will be a well-known location for storing the IMA log
> snapshots on the disk.  It will be non-trivial for UM processes
> to change that location after booting into the Kernel.
> 
>  f. A new event, "snapshot_aggregate", will be computed and measured
> in the IMA log as part of this feature.  It should help the
> remote-attestation client/service to benefit from the IMA log
> snapshot feature.
> The "snapshot_aggregate" event is described in more detail in
> section "D.1 Snapshot Aggregate Event" below.
> 
>  g. If the existing remote-attestation client/services do not change
> to benefit from this feature or do not trigger the snapshot,
> the Kernel will continue to have its current functionality of
> maintaining an in-memory full IMA log.
> 
> Additionally, the remote-attestation client/services need to be updated
> to benefit from the IMA log snapshot feature.  These proposed changes
> are described in section "D.4 Remote-Attestation Client/Service Side
> Changes" below, but their implementation is out of scope for this
> proposal.
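The offload-and-chunk flow in steps a. through e. of the quoted summary can be sketched from the UM process's side as follows (the snapshot directory, shard naming, and on-disk format are all hypothetical; the kernel-side trimming is out of scope here):

```python
from pathlib import Path

# Hypothetical well-known snapshot location (step e.).
SNAPSHOT_DIR = Path("/var/lib/ima/snapshots")

def take_snapshot(in_memory_log, last_offloaded, snapshot_dir=SNAPSHOT_DIR):
    """Offload the delta since the last snapshot to disk (steps a.-d.).
    Returns the new high-water mark. Trimming the in-memory log itself
    would be done by the kernel; this models only the UM process."""
    delta = in_memory_log[last_offloaded:]          # step b: the delta chunk
    if not delta:
        return last_offloaded                       # nothing new to offload
    shard = snapshot_dir / f"snapshot_{last_offloaded}_{len(in_memory_log)}.bin"
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    shard.write_bytes(b"".join(delta))              # step c: UM process writes it
    return len(in_memory_log)
```

Each shard is a delta between two offload points, matching step b.'s definition of a snapshot.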

As previously said on v1,
   This design seems overly complex and requires synchronization between the
   "snapshot" record and exporting the records from the measurement list. [...] 

   Concerns:
   - Pausing extending the measurement list.

Nothing has changed in terms of the complexity or in terms of pausing
the measurement list.  Pausing the measurement list is a non-starter.

Userspace can already export the IMA measurement list(s) via the
securityfs {ascii,binary}_runtime_measurements file(s) and do whatever
it wants with it.  All that is missing in the kernel is the ability to
trim the measurement list, which doesn't seem all that complicated.
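For reference, the securityfs export Mimi mentions can already be consumed from userspace today; a minimal sketch of parsing the ascii_runtime_measurements file for records in the common ima-ng template (assuming the default securityfs mount point, and reading the file requires root):

```python
from typing import NamedTuple

class ImaRecord(NamedTuple):
    pcr: int            # PCR the record was extended into (typically 10)
    template_hash: str  # hash of the template data
    template: str       # template name, e.g. "ima-ng"
    file_hash: str      # file data hash, e.g. "sha256:<hex>"
    path: str           # measured file path

def parse_ascii_measurements(text: str):
    """Parse ascii_runtime_measurements lines in the ima-ng layout:
    'PCR template-hash template-name filedata-hash path'."""
    records = []
    for line in text.splitlines():
        fields = line.split(maxsplit=4)
        if len(fields) == 5:
            pcr, thash, tmpl, fhash, path = fields
            records.append(ImaRecord(int(pcr), thash, tmpl, fhash, path))
    return records
```

Typical usage would be `parse_ascii_measurements(Path("/sys/kernel/security/ima/ascii_runtime_measurements").read_text())`; a trimming mechanism as Mimi describes would only shorten what this file returns, not change its format.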

Mimi



Re: [RFC V2] IMA Log Snapshotting Design Proposal

2023-10-31 Thread Ken Goldman

On 10/19/2023 2:49 PM, Tushar Sugandhi wrote:

   f. A new event, "snapshot_aggregate", will be computed and measured
    in the IMA log as part of this feature.  It should help the
    remote-attestation client/service to benefit from the IMA log
    snapshot feature.
    The "snapshot_aggregate" event is described in more detail in
    section "D.1 Snapshot Aggregate Event" below.


What is the use case for the snapshot aggregate?  My thinking is:

1. The platform must retain the entire measurement list.  Early
measurements can never be discarded because a new quote verifier
must receive the entire log starting at the first measurement.

In this case, isn't the snapshot aggregate redundant?

2. There is a disadvantage to redundant data.  The verifier must support
this new event type. It receives this event and must validate the
aggregate against the snapshotted events. This is an attack surface.
An attacker can send an aggregate and snapshotted measurements that do
not match to exploit a flaw in the verifier.