Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Mon, Jan 8, 2024 at 6:48 AM Mimi Zohar wrote:
> On Sun, 2024-01-07 at 21:58 -0500, Paul Moore wrote:
> > On Sun, Jan 7, 2024 at 7:59 AM Mimi Zohar wrote:
> > > On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> > > > On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar wrote:
> > > > > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > > > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar wrote:
> > > > > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar wrote:
> > > > > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> > > > > > > > ...
> > > > > > > > > Before defining a new critical-data record, we need to decide whether it is really necessary or if it is redundant. If we define a new "critical-data" record, can it be defined such that it doesn't require pausing extending the measurement list? For example, a new simple visual critical-data record could contain the number of records (e.g. <securityfs>/ima/runtime_measurements_count) up to that point.
> > > > > > > > What if the snapshot_aggregate was a hash of the measurement log starting with either the boot_aggregate or the latest snapshot_aggregate and ending on the record before the new snapshot_aggregate? The performance impact at snapshot time should be minimal as the hash can be incrementally updated as new records are added to the measurement list. While the hash wouldn't capture the TPM state, it would allow some crude verification when reassembling the log. If one could bear the cost of a TPM signing operation, the log digest could be signed by the TPM.
> > > > > > > Other critical data is calculated, before calling ima_measure_critical_data(), which adds the record to the measurement list and extends the TPM PCR.
> > > > > > > Signing the hash shouldn't be an issue if it behaves like other critical data.
> > > > > > > In addition to the hash, consider including other information in the new critical data record (e.g. total number of measurement records, the number of measurements included in the hash, the number of times the measurement list was trimmed, etc).
> > > > > > It would be nice if you could provide an explicit list of what you would want hashed into a snapshot_aggregate record; the above is close, but it is still a little hand-wavy. I'm just trying to reduce the back-n-forth :)
> > > > > What is being defined here is the first IMA critical-data record, which really requires some thought.
> > > > My thinking has always been that taking a hash of the current measurement log up to the snapshot point would be a nice snapshot_aggregate measurement, but I'm not heavily invested in that. To me it is more important that we find something we can all agree on, perhaps reluctantly, so we can move forward with a solution.
> > > > > For ease of review, this new critical-data record should be a separate patch set from trimming the measurement list.
> > > > I see the two as linked, but if you prefer them as separate then so be it. Once again, the important part is to move forward with a solution, I'm not overly bothered if it arrives in multiple pieces instead of one.
> > > Trimming the IMA measurement list could be used in conjunction with the new IMA critical data record or independently. Both options should be supported.
> > > 1. trim N number of records from the head of the in kernel IMA measurement list
> > > 2. intermittently include the new IMA critical data record based on some trigger
> > > 3. trim the measurement list up to the (first/last/Nth) IMA critical data record
> > > Since the two features could be used independently of each other, there is no reason to upstream them as a single patch set. It just makes it harder to review.
> > I don't see much point in recording a snapshot aggregate if you aren't doing a snapshot, but it's not harmful in any way, so sure, go for it. Like I said earlier, as long as the functionality is there, I don't think anyone cares too much how it gets into the kernel (although Tushar and Sush should comment from the perspective).
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Sun, 2024-01-07 at 21:58 -0500, Paul Moore wrote:
> On Sun, Jan 7, 2024 at 7:59 AM Mimi Zohar wrote:
> > On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> > > On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar wrote:
> > > > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar wrote:
> > > > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar wrote:
> > > > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> > > > > > ...
> > > > > > > > Before defining a new critical-data record, we need to decide whether it is really necessary or if it is redundant. If we define a new "critical-data" record, can it be defined such that it doesn't require pausing extending the measurement list? For example, a new simple visual critical-data record could contain the number of records (e.g. <securityfs>/ima/runtime_measurements_count) up to that point.
> > > > > > > What if the snapshot_aggregate was a hash of the measurement log starting with either the boot_aggregate or the latest snapshot_aggregate and ending on the record before the new snapshot_aggregate? The performance impact at snapshot time should be minimal as the hash can be incrementally updated as new records are added to the measurement list. While the hash wouldn't capture the TPM state, it would allow some crude verification when reassembling the log. If one could bear the cost of a TPM signing operation, the log digest could be signed by the TPM.
> > > > > > Other critical data is calculated, before calling ima_measure_critical_data(), which adds the record to the measurement list and extends the TPM PCR.
> > > > > > Signing the hash shouldn't be an issue if it behaves like other critical data.
> > > > > > In addition to the hash, consider including other information in the new critical data record (e.g. total number of measurement records, the number of measurements included in the hash, the number of times the measurement list was trimmed, etc).
> > > > > It would be nice if you could provide an explicit list of what you would want hashed into a snapshot_aggregate record; the above is close, but it is still a little hand-wavy. I'm just trying to reduce the back-n-forth :)
> > > > What is being defined here is the first IMA critical-data record, which really requires some thought.
> > > My thinking has always been that taking a hash of the current measurement log up to the snapshot point would be a nice snapshot_aggregate measurement, but I'm not heavily invested in that. To me it is more important that we find something we can all agree on, perhaps reluctantly, so we can move forward with a solution.
> > > > For ease of review, this new critical-data record should be a separate patch set from trimming the measurement list.
> > > I see the two as linked, but if you prefer them as separate then so be it. Once again, the important part is to move forward with a solution, I'm not overly bothered if it arrives in multiple pieces instead of one.
> > Trimming the IMA measurement list could be used in conjunction with the new IMA critical data record or independently. Both options should be supported.
> > 1. trim N number of records from the head of the in kernel IMA measurement list
> > 2. intermittently include the new IMA critical data record based on some trigger
> > 3. trim the measurement list up to the (first/last/Nth) IMA critical data record
> > Since the two features could be used independently of each other, there is no reason to upstream them as a single patch set. It just makes it harder to review.
> I don't see much point in recording a snapshot aggregate if you aren't doing a snapshot, but it's not harmful in any way, so sure, go for it. Like I said earlier, as long as the functionality is there, I don't think anyone cares too much how it gets into the kernel (although Tushar and Sush should comment from the perspective).

Paul, there are two features:
- trimming the measurement list
- defining and including an IMA critical data record

The original design doc combined these two features making them an "atomic" operation and referred to it as a snapshot. At the time the term "snapshot" was an appropriate term
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Sun, Jan 7, 2024 at 7:59 AM Mimi Zohar wrote:
> On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> > On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar wrote:
> > > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar wrote:
> > > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar wrote:
> > > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> > > > ...
> > > > > > > Before defining a new critical-data record, we need to decide whether it is really necessary or if it is redundant. If we define a new "critical-data" record, can it be defined such that it doesn't require pausing extending the measurement list? For example, a new simple visual critical-data record could contain the number of records (e.g. <securityfs>/ima/runtime_measurements_count) up to that point.
> > > > > > What if the snapshot_aggregate was a hash of the measurement log starting with either the boot_aggregate or the latest snapshot_aggregate and ending on the record before the new snapshot_aggregate? The performance impact at snapshot time should be minimal as the hash can be incrementally updated as new records are added to the measurement list. While the hash wouldn't capture the TPM state, it would allow some crude verification when reassembling the log. If one could bear the cost of a TPM signing operation, the log digest could be signed by the TPM.
> > > > > Other critical data is calculated, before calling ima_measure_critical_data(), which adds the record to the measurement list and extends the TPM PCR.
> > > > > Signing the hash shouldn't be an issue if it behaves like other critical data.
> > > > > In addition to the hash, consider including other information in the new critical data record (e.g. total number of measurement records, the number of measurements included in the hash, the number of times the measurement list was trimmed, etc).
> > > > It would be nice if you could provide an explicit list of what you would want hashed into a snapshot_aggregate record; the above is close, but it is still a little hand-wavy. I'm just trying to reduce the back-n-forth :)
> > > What is being defined here is the first IMA critical-data record, which really requires some thought.
> > My thinking has always been that taking a hash of the current measurement log up to the snapshot point would be a nice snapshot_aggregate measurement, but I'm not heavily invested in that. To me it is more important that we find something we can all agree on, perhaps reluctantly, so we can move forward with a solution.
> > > For ease of review, this new critical-data record should be a separate patch set from trimming the measurement list.
> > I see the two as linked, but if you prefer them as separate then so be it. Once again, the important part is to move forward with a solution, I'm not overly bothered if it arrives in multiple pieces instead of one.
> Trimming the IMA measurement list could be used in conjunction with the new IMA critical data record or independently. Both options should be supported.
> 1. trim N number of records from the head of the in kernel IMA measurement list
> 2. intermittently include the new IMA critical data record based on some trigger
> 3. trim the measurement list up to the (first/last/Nth) IMA critical data record
> Since the two features could be used independently of each other, there is no reason to upstream them as a single patch set. It just makes it harder to review.

I don't see much point in recording a snapshot aggregate if you aren't doing a snapshot, but it's not harmful in any way, so sure, go for it. Like I said earlier, as long as the functionality is there, I don't think anyone cares too much how it gets into the kernel (although Tushar and Sush should comment from the perspective).

> > > As I'm sure you're aware, SELinux defines two critical-data records. From security/selinux/ima.c:
> > >
> > >     ima_measure_critical_data("selinux", "selinux-state",
> > >                               state_str, strlen(state_str), false,
> > >                               NULL, 0);
> > >
> > >     ima_measure_critical_data("selinux", "selinux-policy-hash",
> > >                               policy, policy_len, true,
> > >                               NULL, 0);
> > Yep, but there is far more to this than SELinux.
> Only if you conflate the two features.

If that is a clever retort, you'll need
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Sat, 2024-01-06 at 18:27 -0500, Paul Moore wrote:
> On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar wrote:
> > On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar wrote:
> > > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar wrote:
> > > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> > ...
> > > > > > Before defining a new critical-data record, we need to decide whether it is really necessary or if it is redundant. If we define a new "critical-data" record, can it be defined such that it doesn't require pausing extending the measurement list? For example, a new simple visual critical-data record could contain the number of records (e.g. <securityfs>/ima/runtime_measurements_count) up to that point.
> > > > > What if the snapshot_aggregate was a hash of the measurement log starting with either the boot_aggregate or the latest snapshot_aggregate and ending on the record before the new snapshot_aggregate? The performance impact at snapshot time should be minimal as the hash can be incrementally updated as new records are added to the measurement list. While the hash wouldn't capture the TPM state, it would allow some crude verification when reassembling the log. If one could bear the cost of a TPM signing operation, the log digest could be signed by the TPM.
> > > > Other critical data is calculated, before calling ima_measure_critical_data(), which adds the record to the measurement list and extends the TPM PCR.
> > > > Signing the hash shouldn't be an issue if it behaves like other critical data.
> > > > In addition to the hash, consider including other information in the new critical data record (e.g. total number of measurement records, the number of measurements included in the hash, the number of times the measurement list was trimmed, etc).
> > > It would be nice if you could provide an explicit list of what you would want hashed into a snapshot_aggregate record; the above is close, but it is still a little hand-wavy. I'm just trying to reduce the back-n-forth :)
> > What is being defined here is the first IMA critical-data record, which really requires some thought.
> My thinking has always been that taking a hash of the current measurement log up to the snapshot point would be a nice snapshot_aggregate measurement, but I'm not heavily invested in that. To me it is more important that we find something we can all agree on, perhaps reluctantly, so we can move forward with a solution.
> > For ease of review, this new critical-data record should be a separate patch set from trimming the measurement list.
> I see the two as linked, but if you prefer them as separate then so be it. Once again, the important part is to move forward with a solution, I'm not overly bothered if it arrives in multiple pieces instead of one.

Trimming the IMA measurement list could be used in conjunction with the new IMA critical data record or independently. Both options should be supported.

1. trim N number of records from the head of the in kernel IMA measurement list
2. intermittently include the new IMA critical data record based on some trigger
3. trim the measurement list up to the (first/last/Nth) IMA critical data record

Since the two features could be used independently of each other, there is no reason to upstream them as a single patch set. It just makes it harder to review.

> > As I'm sure you're aware, SELinux defines two critical-data records. From security/selinux/ima.c:
> >
> >     ima_measure_critical_data("selinux", "selinux-state",
> >                               state_str, strlen(state_str), false,
> >                               NULL, 0);
> >
> >     ima_measure_critical_data("selinux", "selinux-policy-hash",
> >                               policy, policy_len, true,
> >                               NULL, 0);
> Yep, but there is far more to this than SELinux.

Only if you conflate the two features.

Mimi
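The three trim options Mimi lists can be modeled in userspace terms. A minimal illustrative sketch, assuming a simple list-of-records model; the `Record` type, field names, and keep-the-critical-record convention are hypothetical, not the kernel's actual data structures or semantics:

```python
from dataclasses import dataclass

@dataclass
class Record:
    name: str
    critical_data: bool = False  # True for an IMA critical-data record

def trim_head(log, n):
    """Option 1: trim N records from the head of the measurement list."""
    return log[n:]

def trim_to_critical(log, which="first"):
    """Option 3: trim up to the first/last critical-data record.
    Here the critical-data record is kept as the new head of the list;
    whether the kernel would keep or drop it is an open design point."""
    idxs = [i for i, r in enumerate(log) if r.critical_data]
    if not idxs:
        return log  # nothing to anchor the trim on
    i = idxs[0] if which == "first" else idxs[-1]
    return log[i:]

log = [Record("boot_aggregate"), Record("a"), Record("crit", True), Record("b")]
assert [r.name for r in trim_head(log, 2)] == ["crit", "b"]
assert [r.name for r in trim_to_critical(log)] == ["crit", "b"]
```

Option 2 is orthogonal: it only decides *when* a critical-data record is appended, so it needs no trim logic at all.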
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Wed, Dec 20, 2023 at 5:14 PM Ken Goldman wrote:
>
> I'm still struggling with the "new root of trust" concept.
>
> Something - a user space agent, a third party, etc. - has to retain the entire log from event 0, because a new verifier needs all measurements.

[NOTE: a gentle reminder to please refrain from top-posting on Linux kernel mailing lists, it is generally frowned upon and makes it difficult to manage long running threads]

This is one of the reasons I have pushed to manage the snapshot, both the trigger and the handling of the trimmed data, outside of the kernel. Setting aside the obvious limitations of kernel I/O, handling the snapshot in userspace provides for a much richer set of options when it comes to managing the snapshot and the verification/attestation of the system.

> Therefore, the snapshot aggregate seems redundant. It has to be verified to match the snapshotted events.

I can see a perspective where the snapshot_aggregate is theoretically redundant, but I can also see at least one practical perspective where a snapshot_aggregate could be used to simplify a remote attestation with a sufficiently stateful attestation service.

> A redundancy is an attack surface.

Now that is an overly broad generalization, if we are going that route, *everything* is an attack surface (and this arguably true regardless, although a bit of an extreme statement).

> A badly written verifier might not do that verification, and this permits snapshotted events to be forged. No aggregate means the verifier can't make a mistake.

I would ask that you read your own comment again. A poorly written verifier is subject to any number of pitfalls and vulnerabilities, regardless of a snapshot aggregate. As a reminder, the snapshotting mechanism has always been proposed as an opt-in mechanism, if one has not implemented a proper snapshot-aware attestation mechanism then they can simply refrain from taking a snapshot and reject all attestation attempts using a snapshot.

> On 11/22/2023 9:22 AM, Paul Moore wrote:
> > I believe the intent is to only pause the measurements while the snapshot_aggregate is generated, not for the duration of the entire snapshot process. The purpose of the snapshot_aggregate is to establish a new root of trust, similar to the boot_aggregate, to help improve attestation performance.

--
paul-moore.com
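The "crude verification when reassembling the log" that a snapshot_aggregate enables could look like the following verifier-side sketch. The record encoding and the choice of SHA-256 are assumptions for illustration; the proposal does not fix either:

```python
import hashlib

def log_digest(records):
    """Hash a span of measurement-list records in order. The assumed span
    runs from the boot_aggregate (or the previous snapshot_aggregate) up to
    the record just before the new snapshot_aggregate."""
    h = hashlib.sha256()
    for rec in records:  # rec is the raw bytes of one record
        h.update(rec)
    return h.hexdigest()

# A verifier reassembling trimmed (archived) records checks them against
# the snapshot_aggregate value recorded in the measurement list.
archived = [b"boot_aggregate", b"record-1", b"record-2"]
snapshot_aggregate = log_digest(archived)  # value measured at snapshot time

assert log_digest(archived) == snapshot_aggregate        # reassembly matches
assert log_digest(archived[:-1]) != snapshot_aggregate   # tampering detected
```

This is exactly the check Ken warns a badly written verifier might skip; the sketch shows it is a single digest comparison once the archived records are in hand.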
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Tue, Nov 28, 2023 at 9:07 PM Mimi Zohar wrote:
> On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> > On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar wrote:
> > > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar wrote:
> > > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
...
> > > > > Before defining a new critical-data record, we need to decide whether it is really necessary or if it is redundant. If we define a new "critical-data" record, can it be defined such that it doesn't require pausing extending the measurement list? For example, a new simple visual critical-data record could contain the number of records (e.g. <securityfs>/ima/runtime_measurements_count) up to that point.
> > > > What if the snapshot_aggregate was a hash of the measurement log starting with either the boot_aggregate or the latest snapshot_aggregate and ending on the record before the new snapshot_aggregate? The performance impact at snapshot time should be minimal as the hash can be incrementally updated as new records are added to the measurement list. While the hash wouldn't capture the TPM state, it would allow some crude verification when reassembling the log. If one could bear the cost of a TPM signing operation, the log digest could be signed by the TPM.
> > > Other critical data is calculated, before calling ima_measure_critical_data(), which adds the record to the measurement list and extends the TPM PCR.
> > > Signing the hash shouldn't be an issue if it behaves like other critical data.
> > > In addition to the hash, consider including other information in the new critical data record (e.g. total number of measurement records, the number of measurements included in the hash, the number of times the measurement list was trimmed, etc).
> > It would be nice if you could provide an explicit list of what you would want hashed into a snapshot_aggregate record; the above is close, but it is still a little hand-wavy. I'm just trying to reduce the back-n-forth :)
> What is being defined here is the first IMA critical-data record, which really requires some thought.

My thinking has always been that taking a hash of the current measurement log up to the snapshot point would be a nice snapshot_aggregate measurement, but I'm not heavily invested in that. To me it is more important that we find something we can all agree on, perhaps reluctantly, so we can move forward with a solution.

> For ease of review, this new critical-data record should be a separate patch set from trimming the measurement list.

I see the two as linked, but if you prefer them as separate then so be it. Once again, the important part is to move forward with a solution, I'm not overly bothered if it arrives in multiple pieces instead of one.

> As I'm sure you're aware, SELinux defines two critical-data records. From security/selinux/ima.c:
>
>     ima_measure_critical_data("selinux", "selinux-state",
>                               state_str, strlen(state_str), false,
>                               NULL, 0);
>
>     ima_measure_critical_data("selinux", "selinux-policy-hash",
>                               policy, policy_len, true,
>                               NULL, 0);

Yep, but there is far more to this than SELinux.

--
paul-moore.com
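Paul's point that "the hash can be incrementally updated as new records are added" is what keeps the snapshot-time cost at O(1). A sketch of that rolling aggregate, assuming SHA-256 and raw record bytes (both illustrative choices, not settled in the proposal):

```python
import hashlib

class RollingAggregate:
    """Running hash over measurement-list records. The log keeper feeds each
    record in as it is appended, so producing a snapshot_aggregate requires
    no re-walk of the list at snapshot time."""
    def __init__(self):
        self._h = hashlib.sha256()
    def append(self, record_bytes):
        self._h.update(record_bytes)  # O(record size), done once per record
    def snapshot(self):
        digest = self._h.hexdigest()
        self._h = hashlib.sha256()  # next aggregate covers only later records
        return digest

# Incremental updates yield the same digest as hashing the span in one pass.
agg = RollingAggregate()
for rec in (b"boot_aggregate", b"r1", b"r2"):
    agg.append(rec)
one_shot = hashlib.sha256(b"boot_aggregate" + b"r1" + b"r2").hexdigest()
assert agg.snapshot() == one_shot
```

Resetting the state in `snapshot()` mirrors the proposal's chaining: each aggregate covers the records since the boot_aggregate or the previous snapshot_aggregate.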
Re: [RFC V2] IMA Log Snapshotting Design Proposal
I'm still struggling with the "new root of trust" concept.

Something - a user space agent, a third party, etc. - has to retain the entire log from event 0, because a new verifier needs all measurements.

Therefore, the snapshot aggregate seems redundant. It has to be verified to match the snapshotted events.

A redundancy is an attack surface. A badly written verifier might not do that verification, and this permits snapshotted events to be forged. No aggregate means the verifier can't make a mistake.

On 11/22/2023 9:22 AM, Paul Moore wrote:
> I believe the intent is to only pause the measurements while the snapshot_aggregate is generated, not for the duration of the entire snapshot process. The purpose of the snapshot_aggregate is to establish a new root of trust, similar to the boot_aggregate, to help improve attestation performance.
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Tue, 2023-11-28 at 20:06 -0500, Paul Moore wrote:
> On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar wrote:
> > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote:
> > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar wrote:
> > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote:
> > ...
> > > If we are going to have a record count, I imagine it would also be helpful to maintain a securityfs file with the total size (in bytes) of the in-memory measurement log. In fact, I suspect this will probably be more useful for those who wish to manage the size of the measurement log.
> > A running number of bytes needed for carrying the measurement list across kexec already exists. This value would be affected when the measurement list is trimmed.
> There we go, it should be trivial to export that information via securityfs.
> > > > Defining other IMA securityfs files like how many times the measurement list has been trimmed might be beneficial as well.
> > > I have no objection to that. Would a total record count, i.e. a value that doesn't reset on a snapshot event, be more useful here?
> > <securityfs>/ima/runtime_measurements_count already exports the total number of measurement records.
> I guess the question is would you want 'runtime_measurements_count' to reflect the current/trimmed log size or would you want it to reflect the measurements since the initial cold boot? Presumably we would want to add another securityfs file to handle the case not covered by 'runtime_measurements_count'.

Right. <securityfs>/ima/runtime_measurements_count is defined as the total number of measurements since boot. When the measurement list is carried across kexec, it is the number of measurements since cold boot. A new securityfs file should be defined for the current number of in kernel memory records. Unless the measurement list has been trimmed, this should be the same as the runtime_measurements_count.

> > > > Before defining a new critical-data record, we need to decide whether it is really necessary or if it is redundant. If we define a new "critical-data" record, can it be defined such that it doesn't require pausing extending the measurement list? For example, a new simple visual critical-data record could contain the number of records (e.g. <securityfs>/ima/runtime_measurements_count) up to that point.
> > > What if the snapshot_aggregate was a hash of the measurement log starting with either the boot_aggregate or the latest snapshot_aggregate and ending on the record before the new snapshot_aggregate? The performance impact at snapshot time should be minimal as the hash can be incrementally updated as new records are added to the measurement list. While the hash wouldn't capture the TPM state, it would allow some crude verification when reassembling the log. If one could bear the cost of a TPM signing operation, the log digest could be signed by the TPM.
> > Other critical data is calculated, before calling ima_measure_critical_data(), which adds the record to the measurement list and extends the TPM PCR.
> > Signing the hash shouldn't be an issue if it behaves like other critical data.
> > In addition to the hash, consider including other information in the new critical data record (e.g. total number of measurement records, the number of measurements included in the hash, the number of times the measurement list was trimmed, etc).
> It would be nice if you could provide an explicit list of what you would want hashed into a snapshot_aggregate record; the above is close, but it is still a little hand-wavy. I'm just trying to reduce the back-n-forth :)

What is being defined here is the first IMA critical-data record, which really requires some thought. For ease of review, this new critical-data record should be a separate patch set from trimming the measurement list.

As I'm sure you're aware, SELinux defines two critical-data records. From security/selinux/ima.c:

    ima_measure_critical_data("selinux", "selinux-state",
                              state_str, strlen(state_str), false,
                              NULL, 0);

    ima_measure_critical_data("selinux", "selinux-policy-hash",
                              policy, policy_len, true,
                              NULL, 0);

--
thanks,

Mimi
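Mimi's suggested contents for the new critical-data record (the hash plus bookkeeping counters) can be made concrete with a sketch of the payload that would be handed to a measurement call. The JSON encoding and every field name here are hypothetical, chosen only to show the fields together; the kernel would define its own binary format:

```python
import hashlib
import json

def snapshot_aggregate_payload(log_digest, total_records,
                               records_in_hash, trim_count):
    """Hypothetical payload for the proposed IMA critical-data record,
    bundling the log hash with the extra fields Mimi suggests."""
    return json.dumps({
        "log_digest": log_digest,            # hash over the covered span
        "total_records": total_records,      # all measurement records so far
        "records_in_hash": records_in_hash,  # records covered by log_digest
        "trim_count": trim_count,            # times the list was trimmed
    }, sort_keys=True).encode()

payload = snapshot_aggregate_payload(
    hashlib.sha256(b"example-log-span").hexdigest(),
    total_records=1024, records_in_hash=512, trim_count=3)

decoded = json.loads(payload)
assert decoded["records_in_hash"] == 512 and decoded["trim_count"] == 3
```

Like the SELinux records quoted above, such a payload would be measured as critical data (and so extended into a PCR) before any trimming takes place; the counters let a verifier sanity-check the reassembled list against what the kernel claimed at measurement time.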
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Tue, Nov 28, 2023 at 7:09 AM Mimi Zohar wrote: > On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote: > > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar wrote: > > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote: ... > > If we are going to have a record count, I imagine it would also be > > helpful to maintain a securityfs file with the total size (in bytes) > > of the in-memory measurement log. In fact, I suspect this will > > probably be more useful for those who wish to manage the size of the > > measurement log. > > A running number of bytes needed for carrying the measurement list > across kexec already exists. This value would be affected when the > measurement list is trimmed. There we go, it should be trivial to export that information via securityfs. > > > Defining other IMA securityfs files like > > > how many times the measurement list has been trimmed might be > > > beneficial as well. > > > > I have no objection to that. Would a total record count, i.e. a value > > that doesn't reset on a snapshot event, be more useful here? > > /ima/runtime_measurements_count already exports the total > number of measurement records. I guess the question is would you want 'runtime_measurements_count' to reflect the current/trimmed log size or would you want it to reflect the measurements since the initial cold boot? Presumably we would want to add another securityfs file to handle the case not covered by 'runtime_measurements_count'. > > > Before defining a new critical-data record, we need to decide whether > > > it is really necessary or if it is redundant. If we define a new > > > "critical-data" record, can it be defined such that it doesn't require > > > pausing extending the measurement list? For example, a new simple > > > visual critical-data record could contain the number of records (e.g. > > > /ima/runtime_measurements_count) up to that point. 
> > > > What if the snapshot_aggregate was a hash of the measurement log > > starting with either the boot_aggregate or the latest > > snapshot_aggregate and ending on the record before the new > > snapshot_aggregate? The performance impact at snapshot time should be > > minimal as the hash can be incrementally updated as new records are > > added to the measurement list. While the hash wouldn't capture the > > TPM state, it would allow some crude verification when reassembling > > the log. If one could bear the cost of a TPM signing operation, the > > log digest could be signed by the TPM. > > Other critical data is calculated, before calling > ima_measure_critical_data(), which adds the record to the measurement > list and extends the TPM PCR. > > Signing the hash shouldn't be an issue if it behaves like other > critical data. > > In addition to the hash, consider including other information in the > new critical data record (e.g. total number of measurement records, the > number of measurements included in the hash, the number of times the > measurement list was trimmed, etc). It would be nice if you could provide an explicit list of what you would want hashed into a snapshot_aggregate record; the above is close, but it is still a little hand-wavy. I'm just trying to reduce the back-n-forth :) -- paul-moore.com
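The incremental log-digest idea discussed above could be sketched roughly as follows. This is a minimal Python illustration, assuming SHA-256 and treating each measurement record as an opaque byte string; the chaining scheme and record encoding here are hypothetical stand-ins, not IMA's actual template format:

```python
import hashlib

def fold_record(aggregate, record):
    """Fold one measurement record into the running snapshot digest.

    `aggregate` starts as the boot_aggregate digest (or the previous
    snapshot_aggregate) and is chained forward, so the digest can be
    updated incrementally as records are appended to the list.
    """
    return hashlib.sha256(aggregate + record).digest()

# Start the chain from a stand-in boot_aggregate.
aggregate = hashlib.sha256(b"boot_aggregate").digest()
for record in [b"record-1", b"record-2", b"record-3"]:
    aggregate = fold_record(aggregate, record)

# At snapshot time, `aggregate` would become the snapshot_aggregate
# value; the next epoch chains from it instead of the boot_aggregate.
print(aggregate.hex())
```

Under this scheme, a verifier replaying the trimmed-away records from a shard file can recompute the same chain and compare it against the snapshot_aggregate entry, giving the "crude verification when reassembling the log" mentioned above.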
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Mon, 2023-11-27 at 17:16 -0500, Paul Moore wrote: > On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar wrote: > > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote: > > ... > > > > Okay, we are starting to get closer, but I'm still missing the part > > > where you say "if you do X, Y, and Z, I'll accept and merge the > > > solution." Can you be more explicit about what approach(es) you would > > > be willing to accept upstream? > > > > Included with what is wanted/needed is an explanation as to my concerns > > with the existing proposal. > > > > First we need to differentiate between kernel and userspace > > requirements. (The "snapshotting" design proposal intermixes them.) > > > > From the kernel perspective, the Log Snapshotting Design proposal "B.1 > > Goals" is very nice, but once the measurement list can be trimmed it is > > really irrelevant. Userspace can do whatever it wants with the > > measurement list records. So instead of paying lip service to what > > should be done, just call it as it is - trimming the measurement list. > > Fair enough. I personally think it is nice to have a brief discussion > of how userspace might use a kernel feature, but if you prefer to drop > that part of the design doc I doubt anyone will object very strongly. > > > From the kernel perspective there needs to be a method of trimming N > > number of records from the head of the measurement list. In addition > > to the existing securityfs "runtime measurement list", defining a new > > securityfs file containing the current count of in memory measurement > > records would be beneficial. > > I imagine that should be trivial to implement and I can't imagine > there being any objection to that. > > If we are going to have a record count, I imagine it would also be > helpful to maintain a securityfs file with the total size (in bytes) > of the in-memory measurement log. 
In fact, I suspect this will > probably be more useful for those who wish to manage the size of the > measurement log. A running number of bytes needed for carrying the measurement list across kexec already exists. This value would be affected when the measurement list is trimmed. ... > > > Defining other IMA securityfs files like > > how many times the measurement list has been trimmed might be > > beneficial as well. > > I have no objection to that. Would a total record count, i.e. a value > that doesn't reset on a snapshot event, be more useful here? /ima/runtime_measurements_count already exports the total number of measurement records. > > > Of course properly document the integrity > > implications and repercussions of the new Kconfig that allows trimming > > the measurement list. > > Of course. > > > Defining a simple "trim" marker measurement record would be a visual > > indication that the measurement list has been trimmed. I might even > > have compared it to the "boot_aggregate". However, the proposed marker > > based on TPM PCRs requires pausing extending the measurement list. > > ... > > > Before defining a new critical-data record, we need to decide whether > > it is really necessary or if it is redundant. If we define a new > > "critical-data" record, can it be defined such that it doesn't require > > pausing extending the measurement list? For example, a new simple > > visual critical-data record could contain the number of records (e.g. > > /ima/runtime_measurements_count) up to that point. > > What if the snapshot_aggregate was a hash of the measurement log > starting with either the boot_aggregate or the latest > snapshot_aggregate and ending on the record before the new > snapshot_aggregate? The performance impact at snapshot time should be > minimal as the hash can be incrementally updated as new records are > added to the measurement list. 
While the hash wouldn't capture the > TPM state, it would allow some crude verification when reassembling > the log. If one could bear the cost of a TPM signing operation, the > log digest could be signed by the TPM. Other critical data is calculated, before calling ima_measure_critical_data(), which adds the record to the measurement list and extends the TPM PCR. Signing the hash shouldn't be an issue if it behaves like other critical data. In addition to the hash, consider including other information in the new critical data record (e.g. total number of measurement records, the number of measurements included in the hash, the number of times the measurement list was trimmed, etc). > > > The new critical-data record and trimming the measurement list should > > be disjoint features. If the first record after trimming the > > measurement list should be the critical-data record, then trim the > > measurement list up to that point. > > I disagree about the snapshot_aggregate record being disjoint from the > measurement log, but I suspect Tushar and Sush are willing to forgo > the snapshot_aggregate if that is a blocker from your perspective. > Once again, the main goal is the ability to manage the size of the measurement log.
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Mon, Nov 27, 2023 at 12:08 PM Mimi Zohar wrote: > On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote: ... > > Okay, we are starting to get closer, but I'm still missing the part > > where you say "if you do X, Y, and Z, I'll accept and merge the > > solution." Can you be more explicit about what approach(es) you would > > be willing to accept upstream? > > Included with what is wanted/needed is an explanation as to my concerns > with the existing proposal. > > First we need to differentiate between kernel and userspace > requirements. (The "snapshotting" design proposal intermixes them.) > > From the kernel perspective, the Log Snapshotting Design proposal "B.1 > Goals" is very nice, but once the measurement list can be trimmed it is > really irrelevant. Userspace can do whatever it wants with the > measurement list records. So instead of paying lip service to what > should be done, just call it as it is - trimming the measurement list. Fair enough. I personally think it is nice to have a brief discussion of how userspace might use a kernel feature, but if you prefer to drop that part of the design doc I doubt anyone will object very strongly. > --- > | B.1 Goals | > --- > To address the issues described in the section above, we propose > enhancements to the IMA subsystem to achieve the following goals: > > a. Reduce memory pressure on the Kernel caused by larger in-memory > IMA logs. > > b. Preserve the system's ability to get remotely attested using the > IMA log, even after implementing the enhancements to reduce memory > pressure caused by the IMA log. IMA's Integrity guarantees should > be maintained. > > c. Provide mechanisms from Kernel side to the remote attestation > service to make service-side processing more efficient. That looks fine to me. > From the kernel perspective there needs to be a method of trimming N > number of records from the head of the measurement list. 
In addition > to the existing securityfs "runtime measurement list", defining a new > securityfs file containing the current count of in memory measurement > records would be beneficial. I imagine that should be trivial to implement and I can't imagine there being any objection to that. If we are going to have a record count, I imagine it would also be helpful to maintain a securityfs file with the total size (in bytes) of the in-memory measurement log. In fact, I suspect this will probably be more useful for those who wish to manage the size of the measurement log. > Defining other IMA securityfs files like > how many times the measurement list has been trimmed might be > beneficial as well. I have no objection to that. Would a total record count, i.e. a value that doesn't reset on a snapshot event, be more useful here? > Of course properly document the integrity > implications and repercussions of the new Kconfig that allows trimming > the measurement list. Of course. > Defining a simple "trim" marker measurement record would be a visual > indication that the measurement list has been trimmed. I might even > have compared it to the "boot_aggregate". However, the proposed marker > based on TPM PCRs requires pausing extending the measurement list. ... > Before defining a new critical-data record, we need to decide whether > it is really necessary or if it is redundant. If we define a new > "critical-data" record, can it be defined such that it doesn't require > pausing extending the measurement list? For example, a new simple > visual critical-data record could contain the number of records (e.g. > /ima/runtime_measurements_count) up to that point. What if the snapshot_aggregate was a hash of the measurement log starting with either the boot_aggregate or the latest snapshot_aggregate and ending on the record before the new snapshot_aggregate? 
The performance impact at snapshot time should be minimal as the hash can be incrementally updated as new records are added to the measurement list. While the hash wouldn't capture the TPM state, it would allow some crude verification when reassembling the log. If one could bear the cost of a TPM signing operation, the log digest could be signed by the TPM. > The new critical-data record and trimming the measurement list should > be disjoint features. If the first record after trimming the > measurement list should be the critical-data record, then trim the > measurement list up to that point. I disagree about the snapshot_aggregate record being disjoint from the measurement log, but I suspect Tushar and Sush are willing to forgo the snapshot_aggregate if that is a blocker from your perspective. Once again, the main goal is the ability to manage the size of the measurement log; while having a snapshot_aggregate that can be used to
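The trim interface being discussed above (userspace captures the log, then writes a record count to a securityfs/sysfs file and the kernel drops that many records from the head of the list) can be modeled with a small in-memory sketch. The class and counter names below are illustrative stand-ins, not the actual IMA data structures or file names:

```python
class MeasurementLog:
    """Toy model of the in-memory measurement list and the proposed
    securityfs counters (all names here are hypothetical)."""

    def __init__(self):
        self.records = []        # current in-memory measurement list
        self.total_count = 0     # like runtime_measurements_count: never resets
        self.trim_events = 0     # how many times the list was trimmed

    def extend(self, record):
        self.records.append(record)
        self.total_count += 1

    def trim(self, n):
        # Userspace has exported the log and asks the kernel to drop
        # n records from the head of the list.
        n = min(n, len(self.records))
        del self.records[:n]
        self.trim_events += 1
        return n

log = MeasurementLog()
for i in range(5):
    log.extend(f"record-{i}".encode())
log.trim(3)
print(len(log.records), log.total_count, log.trim_events)  # 2 5 1
```

This also illustrates the open question in the thread: `total_count` keeps reporting measurements since cold boot, so a second counter (or file) would be needed for the current, trimmed list length.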
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Wed, 2023-11-22 at 09:22 -0500, Paul Moore wrote: > On Wed, Nov 22, 2023 at 8:18 AM Mimi Zohar wrote: > > On Tue, 2023-11-21 at 23:27 -0500, Paul Moore wrote: > > > On Thu, Nov 16, 2023 at 5:28 PM Paul Moore wrote: > > > > On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar wrote: > > > > > > ... > > > > > > > > Userspace can already export the IMA measurement list(s) via the > > > > > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever > > > > > it wants with it. All that is missing in the kernel is the ability to > > > > > trim the measurement list, which doesn't seem all that complicated. > > > > > > > > From my perspective what has been presented is basically just trimming > > > > the in-memory measurement log, the additional complexity (which really > > > > doesn't look that bad IMO) is there to ensure robustness in the face > > > > of an unreliable userspace (processes die, get killed, etc.) and to > > > > establish a new, transitive root of trust in the newly trimmed > > > > in-memory log. > > > > > > > > I suppose one could simplify things greatly by having a design where > > > > userspace captures the measurement log and then writes the number of > > > > measurement records to trim from the start of the measurement log to a > > > > sysfs file and the kernel acts on that. You could do this with, or > > > > without, the snapshot_aggregate entry concept; in fact that could be > > > > something that was controlled by userspace, e.g. write the number of > > > > lines and a flag to indicate if a snapshot_aggregate was desired to > > > > the sysfs file. I can't say I've thought it all the way through to > > > > make sure there are no gotchas, but I'm guessing that is about as > > > > simple as one can get. > > > > > > If there is something else you had in mind, Mimi, please share the > > > > details. This is a very real problem we are facing and we want to > > > > work to get a solution upstream. > > > > > > Any thoughts on this Mimi? 
We have a real interest in working with > > > you to solve this problem upstream, but we need more detailed feedback > > > than "too complicated". If you don't like the solutions presented > > > thus far, what type of solution would you like to see? > > > > Paul, the design copies the measurement list to a temporary "snapshot" > > file, before trimming the measurement list, which according to the > > design document locks the existing measurement list. And further > > pauses extending the measurement list to calculate the > > "snapshot_aggregate". > > I believe the intent is to only pause the measurements while the > snapshot_aggregate is generated, not for the duration of the entire > snapshot process. The purpose of the snapshot_aggregate is to > establish a new root of trust, similar to the boot_aggregate, to help > improve attestation performance. > > > Userspace can export the measurement list already, so why this > > complicated design? > > The current code has no provision for trimming the measurement log, > that's the primary reason. > > > As I mentioned previously and repeated yesterday, the > > "snapshot_aggregate" is a new type of critical data and should be > > upstreamed independently of this patch set that trims the measurement > > list. Trimming the measurement list could be based, as you suggested > > on the number of records to remove, or it could be up to the next/last > > "snapshot_aggregate" record. > > Okay, we are starting to get closer, but I'm still missing the part > where you say "if you do X, Y, and Z, I'll accept and merge the > solution." Can you be more explicit about what approach(es) you would > be willing to accept upstream? Included with what is wanted/needed is an explanation as to my concerns with the existing proposal. First we need to differentiate between kernel and userspace requirements. (The "snapshotting" design proposal intermixes them.) 
From the kernel perspective, the Log Snapshotting Design proposal "B.1 Goals" is very nice, but once the measurement list can be trimmed it is really irrelevant. Userspace can do whatever it wants with the measurement list records. So instead of paying lip service to what should be done, just call it as it is - trimming the measurement list. --- | B.1 Goals | --- To address the issues described in the section above, we propose enhancements to the IMA subsystem to achieve the following goals: a. Reduce memory pressure on the Kernel caused by larger in-memory IMA logs. b. Preserve the system's ability to get remotely attested using the IMA log, even after implementing the enhancements to reduce memory pressure caused by the IMA log. IMA's Integrity guarantees should be maintained. c. Provide mechanisms from Kernel side to the remote attestation service to make service-side processing more efficient.
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Wed, Nov 22, 2023 at 8:18 AM Mimi Zohar wrote: > On Tue, 2023-11-21 at 23:27 -0500, Paul Moore wrote: > > On Thu, Nov 16, 2023 at 5:28 PM Paul Moore wrote: > > > On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar wrote: > > > > ... > > > > > > Userspace can already export the IMA measurement list(s) via the > > > > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever > > > > it wants with it. All that is missing in the kernel is the ability to > > > > trim the measurement list, which doesn't seem all that complicated. > > > > > > From my perspective what has been presented is basically just trimming > > > the in-memory measurement log, the additional complexity (which really > > > doesn't look that bad IMO) is there to ensure robustness in the face > > > of an unreliable userspace (processes die, get killed, etc.) and to > > > establish a new, transitive root of trust in the newly trimmed > > > in-memory log. > > > > > > I suppose one could simplify things greatly by having a design where > > > userspace captures the measurement log and then writes the number of > > > measurement records to trim from the start of the measurement log to a > > > sysfs file and the kernel acts on that. You could do this with, or > > > without, the snapshot_aggregate entry concept; in fact that could be > > > something that was controlled by userspace, e.g. write the number of > > > lines and a flag to indicate if a snapshot_aggregate was desired to > > > the sysfs file. I can't say I've thought it all the way through to > > > make sure there are no gotchas, but I'm guessing that is about as > > > simple as one can get. > > > > If there is something else you had in mind, Mimi, please share the > > > details. This is a very real problem we are facing and we want to > > > work to get a solution upstream. > > > > Any thoughts on this Mimi? 
We have a real interest in working with > > you to solve this problem upstream, but we need more detailed feedback > > than "too complicated". If you don't like the solutions presented > > thus far, what type of solution would you like to see? > > Paul, the design copies the measurement list to a temporary "snapshot" > file, before trimming the measurement list, which according to the > design document locks the existing measurement list. And further > pauses extending the measurement list to calculate the > "snapshot_aggregate". I believe the intent is to only pause the measurements while the snapshot_aggregate is generated, not for the duration of the entire snapshot process. The purpose of the snapshot_aggregate is to establish a new root of trust, similar to the boot_aggregate, to help improve attestation performance. > Userspace can export the measurement list already, so why this > complicated design? The current code has no provision for trimming the measurement log, that's the primary reason. > As I mentioned previously and repeated yesterday, the > "snapshot_aggregate" is a new type of critical data and should be > upstreamed independently of this patch set that trims the measurement > list. Trimming the measurement list could be based, as you suggested > on the number of records to remove, or it could be up to the next/last > "snapshot_aggregate" record. Okay, we are starting to get closer, but I'm still missing the part where you say "if you do X, Y, and Z, I'll accept and merge the solution." Can you be more explicit about what approach(es) you would be willing to accept upstream? -- paul-moore.com
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Tue, 2023-11-21 at 23:27 -0500, Paul Moore wrote: > On Thu, Nov 16, 2023 at 5:28 PM Paul Moore wrote: > > On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar wrote: > > ... > > > > Userspace can already export the IMA measurement list(s) via the > > > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever > > > it wants with it. All that is missing in the kernel is the ability to > > > trim the measurement list, which doesn't seem all that complicated. > > > > From my perspective what has been presented is basically just trimming > > the in-memory measurement log, the additional complexity (which really > > doesn't look that bad IMO) is there to ensure robustness in the face > > of an unreliable userspace (processes die, get killed, etc.) and to > > establish a new, transitive root of trust in the newly trimmed > > in-memory log. > > > > I suppose one could simplify things greatly by having a design where > > userspace captures the measurement log and then writes the number of > > measurement records to trim from the start of the measurement log to a > > sysfs file and the kernel acts on that. You could do this with, or > > without, the snapshot_aggregate entry concept; in fact that could be > > something that was controlled by userspace, e.g. write the number of > > lines and a flag to indicate if a snapshot_aggregate was desired to > > the sysfs file. I can't say I've thought it all the way through to > > make sure there are no gotchas, but I'm guessing that is about as > > simple as one can get. > > If there is something else you had in mind, Mimi, please share the > > details. This is a very real problem we are facing and we want to > > work to get a solution upstream. > > Any thoughts on this Mimi? We have a real interest in working with > you to solve this problem upstream, but we need more detailed feedback > than "too complicated". If you don't like the solutions presented > thus far, what type of solution would you like to see? 
Paul, the design copies the measurement list to a temporary "snapshot" file, before trimming the measurement list, which according to the design document locks the existing measurement list. And further pauses extending the measurement list to calculate the "snapshot_aggregate". Userspace can export the measurement list already, so why this complicated design? As I mentioned previously and repeated yesterday, the "snapshot_aggregate" is a new type of critical data and should be upstreamed independently of this patch set that trims the measurement list. Trimming the measurement list could be based, as you suggested on the number of records to remove, or it could be up to the next/last "snapshot_aggregate" record. -- thanks, Mimi
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Thu, Nov 16, 2023 at 5:28 PM Paul Moore wrote: > On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar wrote: ... > > Userspace can already export the IMA measurement list(s) via the > > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever > > it wants with it. All that is missing in the kernel is the ability to > > trim the measurement list, which doesn't seem all that complicated. > > From my perspective what has been presented is basically just trimming > the in-memory measurement log, the additional complexity (which really > doesn't look that bad IMO) is there to ensure robustness in the face > of an unreliable userspace (processes die, get killed, etc.) and to > establish a new, transitive root of trust in the newly trimmed > in-memory log. > > I suppose one could simplify things greatly by having a design where > userspace captures the measurement log and then writes the number of > measurement records to trim from the start of the measurement log to a > sysfs file and the kernel acts on that. You could do this with, or > without, the snapshot_aggregate entry concept; in fact that could be > something that was controlled by userspace, e.g. write the number of > lines and a flag to indicate if a snapshot_aggregate was desired to > the sysfs file. I can't say I've thought it all the way through to > make sure there are no gotchas, but I'm guessing that is about as > simple as one can get. > > If there is something else you had in mind, Mimi, please share the > details. This is a very real problem we are facing and we want to > work to get a solution upstream. Any thoughts on this Mimi? We have a real interest in working with you to solve this problem upstream, but we need more detailed feedback than "too complicated". If you don't like the solutions presented thus far, what type of solution would you like to see? -- paul-moore.com
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Tue, 2023-11-21 at 17:01 -0800, Tushar Sugandhi wrote: > Hi Mimi, > To address your concern about pausing the measurements - > We are not proposing to pause the measurements for the entire duration > of UM <--> Kernel interaction while taking a snapshot. > > We are simply proposing to pause the measurements when we get the TPM > PCR quotes to add them to "snapshot_aggregate". (which should be a very > small time window). IMA already has this mechanism when two separate > modules try to add entry to IMA log - by using > mutex_lock(&ima_extend_list_mutex); in ima_add_template_entry. > > > We plan to use this existing locking functionality. > Hope this addresses your concern about pausing extending the measurement > list. Each TPM PCR read is a separate TPM command. Have you done any performance analysis to see how long it actually takes to calculate the "snapshot_aggregate" with a physical TPM? The "snapshot_aggregate" is a new critical-data record and should be upstreamed independently of this patch set. -- thanks, Mimi
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On 11/16/23 14:28, Paul Moore wrote: On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar wrote: On Thu, 2023-10-19 at 11:49 -0700, Tushar Sugandhi wrote: [...] --- | C.1 Solution Summary | --- To achieve the goals described in the section above, we propose the following changes to the IMA subsystem. a. The IMA log from Kernel memory will be offloaded to some persistent storage disk to keep the system running reliably without facing memory pressure. More details, alternate approaches considered etc. are present in section "D.3 Choices for Storing Snapshots" below. b. The IMA log will be divided into multiple chunks (snapshots). Each snapshot would be a delta between the two instances when the log was offloaded from memory to the persistent storage disk. c. Some UM process (like a remote-attestation-client) will be responsible for writing the IMA log snapshot to the disk. d. The same UM process would be responsible for triggering the IMA log snapshot. e. There will be a well-known location for storing the IMA log snapshots on the disk. It will be non-trivial for UM processes to change that location after booting into the Kernel. f. A new event, "snapshot_aggregate", will be computed and measured in the IMA log as part of this feature. It should help the remote-attestation client/service to benefit from the IMA log snapshot feature. The "snapshot_aggregate" event is described in more detail in section "D.1 Snapshot Aggregate Event" below. g. If the existing remote-attestation client/services do not change to benefit from this feature or do not trigger the snapshot, the Kernel will continue to have its current functionality of maintaining an in-memory full IMA log. Additionally, the remote-attestation client/services need to be updated to benefit from the IMA log snapshot feature. These proposed changes are described in section "D.4 Remote-Attestation Client/Service Side Changes" below, but their implementation is out of scope for this proposal. 
As previously said on v1, This design seems overly complex and requires synchronization between the "snapshot" record and exporting the records from the measurement list. [...] Concerns: - Pausing extending the measurement list. Nothing has changed in terms of the complexity or in terms of pausing the measurement list. Pausing the measurement list is a non starter. The measurement list would only need to be paused for the amount of time it would require to generate the snapshot_aggregate entry, which should be minimal and only occurs when a privileged userspace requests a snapshot operation. The snapshot remains opt-in functionality, and even then there is the possibility that the kernel could reject the snapshot request if generating the snapshot_aggregate entry was deemed too costly (as determined by the kernel) at that point in time. Thanks Paul for responding and sharing your thoughts. Hi Mimi, To address your concern about pausing the measurements - We are not proposing to pause the measurements for the entire duration of UM <--> Kernel interaction while taking a snapshot. We are simply proposing to pause the measurements when we get the TPM PCR quotes to add them to "snapshot_aggregate". (which should be a very small time window). IMA already has this mechanism when two separate modules try to add entry to IMA log - by using mutex_lock(&ima_extend_list_mutex); in ima_add_template_entry. We plan to use this existing locking functionality. Hope this addresses your concern about pausing extending the measurement list. ~Tushar Userspace can already export the IMA measurement list(s) via the securityfs {ascii,binary}_runtime_measurements file(s) and do whatever it wants with it. All that is missing in the kernel is the ability to trim the measurement list, which doesn't seem all that complicated. 
From my perspective what has been presented is basically just trimming the in-memory measurement log, the additional complexity (which really doesn't look that bad IMO) is there to ensure robustness in the face of an unreliable userspace (processes die, get killed, etc.) and to establish a new, transitive root of trust in the newly trimmed in-memory log. I suppose one could simplify things greatly by having a design where userspace captures the measurement log and then writes the number of measurement records to trim from the start of the measurement log to a sysfs file and the kernel acts on that. You could do this with, or without, the snapshot_aggregate entry concept; in fact that could be something that was controlled by userspace, e.g. write the number of lines and a flag to indicate if a snapshot_aggregate was desired to the sysfs file.
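The "pause only while generating the snapshot_aggregate" behavior described in this thread amounts to computing the digest under the same lock the extend path already holds (the messages above mention ima_extend_list_mutex). A rough userspace-style analogue in Python, with hypothetical names standing in for the kernel structures:

```python
import hashlib
import threading

extend_lock = threading.Lock()   # analogue of ima_extend_list_mutex
measurement_list = []

def extend(record):
    # Normal measurement path: append to the list under the lock.
    with extend_lock:
        measurement_list.append(record)

def snapshot_aggregate():
    # Hold the same lock so the digest covers a consistent view of the
    # list; measurements are paused only for this short window, not for
    # the whole snapshot operation.
    with extend_lock:
        digest = hashlib.sha256()
        for record in measurement_list:
            digest.update(record)
        return digest.digest()

extend(b"record-1")
extend(b"record-2")
print(snapshot_aggregate().hex())
```

The point of the sketch is the locking scope: concurrent callers of extend() simply block for the duration of the digest computation, which mirrors the "very small time window" argument made above.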
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On 11/16/23 14:07, Paul Moore wrote: On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger wrote: On 11/14/23 13:36, Sush Shringarputale wrote: On 11/13/2023 10:59 AM, Stefan Berger wrote: On 10/19/23 14:49, Tushar Sugandhi wrote: === | Introduction | === This document provides a detailed overview of the proposed Kernel feature IMA log snapshotting. It describes the motivation behind the proposal, the problem to be solved, a detailed solution design with examples, and describes the changes to be made in the clients/services which are part of remote-attestation system. This is the 2nd version of the proposal. The first version is present here[1]. Table of Contents: -- A. Motivation and Background B. Goals and Non-Goals B.1 Goals B.2 Non-Goals C. Proposed Solution C.1 Solution Summary C.2 High-level Work-flow D. Detailed Design D.1 Snapshot Aggregate Event D.2 Snapshot Triggering Mechanism D.3 Choosing A Persistent Storage Location For Snapshots D.4 Remote-Attestation Client/Service-side Changes D.4.a Client-side Changes D.4.b Service-side Changes E. Example Walk-through F. Other Design Considerations G. References Userspace applications will have to know a) where are the shard files? We describe the file storage location choices in section D.3, but user applications will have to query the well-known location described there. b) how do I read the shard files while locking out the producer of the shard files? IMO, this will require a well known config file and a locking method (flock) so that user space applications can work together in this new environment. The lock could be defined in the config file or just be the config file itself. The flock is a good idea for co-ordination between UM clients. While the Kernel cannot enforce any access in this way, any UM process that is planning on triggering the snapshot mechanism should follow that protocol. We will ensure we document that as the best-practices in the patch series. It's more than 'best practices'. 
You need a well-known config file with well-known config options in it. All clients that were previously just trying to read new bytes from the IMA log cannot do this anymore in the presence of a log shard producer but have to also learn that a new log shard has been produced so they need to figure out the new position in the log where to read from. So maybe a counter in a config file should indicate to the log readers that a new log has been produced -- otherwise they would have to monitor all the log shard files or the log shard file's size. If a counter is needed, I would suggest placing it somewhere other than the config file so that we can enforce limited write access to the config file. Agreed. The counter shouldn't be part of a config file. IMA log already provides a trustworthy, tamper-resilient mechanism to store such data. The current design already provides the mechanism to store the counter as part of the snapshot_aggregate event. See section "D.1 Snapshot Aggregate Event" in the proposal for reference. Snapshot_Counter := "Snapshot_Attempt_Count=" "snapshot_aggregate" becomes the first event recorded in the in-memory IMA log, after the past entries are purged to a shard file. Along with the other benefits, the "snapshot_aggregate" event also provides info to UM clients about how many snapshots are taken so far. See section "C.2 High-level Work-flow" in the proposal for more info. Step #f - (In-memory IMA log) .--. | "snapshot_aggregate" | | Event #E4| | Event #E5| '--' ~Tushar Regardless, I imagine there are a few ways one could synchronize various userspace applications such that they see a consistent view of the decomposed log state, and the good news is that the approach described here is opt-in from a userspace perspective. 
If the userspace does not fully support IMA log snapshotting then it never needs to trigger it and the system behaves as it does today; on the other hand, if the userspace has been updated it can make use of the new functionality to better manage the size of the IMA measurement log.
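The flock-based coordination discussed above can be sketched in a few lines of Python. Everything here is illustrative: the lock-file and shard-directory paths, and the function name, are hypothetical stand-ins for whatever well-known locations the patch series ultimately documents.

```python
import fcntl
import os

def read_shards_locked(lock_file, shard_dir):
    """Read every shard file while holding a shared flock.

    A snapshot producer would take LOCK_EX on the same lock file before
    writing a new shard, so readers holding LOCK_SH never observe a
    half-written shard set.  (Hypothetical protocol, per the discussion.)"""
    with open(lock_file, "a+") as lock:
        fcntl.flock(lock, fcntl.LOCK_SH)   # shared: many readers at once
        try:
            shards = {}
            for name in sorted(os.listdir(shard_dir)):
                with open(os.path.join(shard_dir, name), "rb") as f:
                    shards[name] = f.read()
            return shards
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

As the thread notes, the kernel cannot enforce this; it only works if every shard producer and consumer agrees to the same lock file.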
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On 11/16/2023 2:56 PM, Paul Moore wrote: On Thu, Nov 16, 2023 at 5:41 PM Stefan Berger wrote: On 11/16/23 17:07, Paul Moore wrote: On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger wrote: On 11/14/23 13:36, Sush Shringarputale wrote: On 11/13/2023 10:59 AM, Stefan Berger wrote: On 10/19/23 14:49, Tushar Sugandhi wrote: === | Introduction | === This document provides a detailed overview of the proposed Kernel feature IMA log snapshotting. It describes the motivation behind the proposal, the problem to be solved, a detailed solution design with examples, and describes the changes to be made in the clients/services which are part of remote-attestation system. This is the 2nd version of the proposal. The first version is present here[1]. Table of Contents: -- A. Motivation and Background B. Goals and Non-Goals B.1 Goals B.2 Non-Goals C. Proposed Solution C.1 Solution Summary C.2 High-level Work-flow D. Detailed Design D.1 Snapshot Aggregate Event D.2 Snapshot Triggering Mechanism D.3 Choosing A Persistent Storage Location For Snapshots D.4 Remote-Attestation Client/Service-side Changes D.4.a Client-side Changes D.4.b Service-side Changes E. Example Walk-through F. Other Design Considerations G. References Userspace applications will have to know a) where are the shard files? We describe the file storage location choices in section D.3, but user applications will have to query the well-known location described there. b) how do I read the shard files while locking out the producer of the shard files? IMO, this will require a well known config file and a locking method (flock) so that user space applications can work together in this new environment. The lock could be defined in the config file or just be the config file itself. The flock is a good idea for co-ordination between UM clients. While the Kernel cannot enforce any access in this way, any UM process that is planning on triggering the snapshot mechanism should follow that protocol. 
We will ensure we document that as the best-practices in the patch series.

It's more than 'best practices'. You need a well-known config file with well-known config options in it. All clients that were previously just trying to read new bytes from the IMA log cannot do this anymore in the presence of a log shard producer but have to also learn that a new log shard has been produced so they need to figure out the new position in the log where to read from. So maybe a counter in a config file should indicate to the log readers that a new log has been produced -- otherwise they would have to monitor all the log shard files or the log shard file's size.

If a counter is needed, I would suggest placing it somewhere other than the config file so that we can enforce limited write access to the config file. Regardless, I imagine there are a few ways one could synchronize various userspace applications such that they see a consistent view of the decomposed log state, and the good news is that the approach described here is opt-in from a userspace perspective. If the userspace does not fully support IMA log snapshotting then it never needs to trigger it and the system behaves as it does today; on the other hand, if the userspace has been updated it can make use of the new functionality to better manage the size of the IMA measurement log.

A FUSE filesystem that stitches together the log shards from one or multiple files + IMA log file(s) could make this approach transparent for as long as log shards are not thrown away. Presumably it (or root) could bind-mount its files over the two IMA log files. I don't think individual applications should trigger it; instead some dedicated background process running on a machine would do that every n log entries or so and possibly offer the FUSE filesystem at the same time. In either case, once any application triggers it, all either have to know how to deal with the shards or FUSE would make it completely transparent.

FUSE would be a reasonable user space co-ordination implementation.
A privileged process would trigger the snapshot generation and provide the mountpoint to read the full IMA log backed by shards as needed by relying parties. Whether it is a privileged daemon or some other agent that triggers the snapshot, it shouldn't impact the Kernel-side implementation.

- Sush

Yes, performing a snapshot is a privileged operation which I expect would be done and managed by a dedicated daemon running on the system.
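The stitching such a daemon's FUSE view would serve is, at its core, just ordered concatenation: every shard in creation order, followed by the live in-memory segment. The sketch below only shows that core; the function name, the assumption that shard file names sort in creation order, and the paths are all hypothetical, and a real FUSE daemon would serve this stream lazily rather than building it in memory.

```python
import os

def stitched_log(shard_dir, live_log_path):
    """Concatenate all log shards (sorted by name, assuming names sort
    in creation order) with the live IMA log segment.  A FUSE daemon
    could serve exactly this byte stream at a path bind-mounted over the
    securityfs measurement files, keeping legacy readers working."""
    parts = []
    for name in sorted(os.listdir(shard_dir)):
        with open(os.path.join(shard_dir, name), "rb") as f:
            parts.append(f.read())
    with open(live_log_path, "rb") as f:
        parts.append(f.read())
    return b"".join(parts)
```

As noted in the thread, this transparency only holds for as long as no shard has been discarded.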
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Thu, Nov 16, 2023 at 5:41 PM Stefan Berger wrote: > On 11/16/23 17:07, Paul Moore wrote: > > On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger wrote: > >> On 11/14/23 13:36, Sush Shringarputale wrote: > >>> On 11/13/2023 10:59 AM, Stefan Berger wrote: > On 10/19/23 14:49, Tushar Sugandhi wrote: > > === > > | Introduction | > > === > > This document provides a detailed overview of the proposed Kernel > > feature IMA log snapshotting. It describes the motivation behind the > > proposal, the problem to be solved, a detailed solution design with > > examples, and describes the changes to be made in the clients/services > > which are part of remote-attestation system. This is the 2nd version > > of the proposal. The first version is present here[1]. > > > > Table of Contents: > > -- > > A. Motivation and Background > > B. Goals and Non-Goals > > B.1 Goals > > B.2 Non-Goals > > C. Proposed Solution > > C.1 Solution Summary > > C.2 High-level Work-flow > > D. Detailed Design > > D.1 Snapshot Aggregate Event > > D.2 Snapshot Triggering Mechanism > > D.3 Choosing A Persistent Storage Location For Snapshots > > D.4 Remote-Attestation Client/Service-side Changes > > D.4.a Client-side Changes > > D.4.b Service-side Changes > > E. Example Walk-through > > F. Other Design Considerations > > G. References > > > > Userspace applications will have to know > a) where are the shard files? > >>> We describe the file storage location choices in section D.3, but user > >>> applications will have to query the well-known location described there. > b) how do I read the shard files while locking out the producer of the > shard files? > > IMO, this will require a well known config file and a locking method > (flock) so that user space applications can work together in this new > environment. The lock could be defined in the config file or just be > the config file itself. > >>> The flock is a good idea for co-ordination between UM clients. 
While > >>> the Kernel cannot enforce any access in this way, any UM process that > >>> is planning on triggering the snapshot mechanism should follow that > >>> protocol. We will ensure we document that as the best-practices in > >>> the patch series. > >> > >> It's more than 'best practices'. You need a well-known config file with > >> well-known config options in it. > >> > >> All clients that were previously just trying to read new bytes from the > >> IMA log cannot do this anymore in the presence of a log shard producer > >> but have to also learn that a new log shard has been produced so they > >> need to figure out the new position in the log where to read from. So > >> maybe a counter in a config file should indicate to the log readers that > >> a new log has been produced -- otherwise they would have to monitor all > >> the log shard files or the log shard file's size. > > > > If a counter is needed, I would suggest placing it somewhere other > > than the config file so that we can enforce limited write access to > > the config file. > > > > Regardless, I imagine there are a few ways one could synchronize > > various userspace applications such that they see a consistent view of > > the decomposed log state, and the good news is that the approach > > described here is opt-in from a userspace perspective. If the > > A FUSE filesystem that stitches together the log shards from one or > multiple files + IMA log file(s) could make this approach transparent > for as long as log shards are not thrown away. Presumably it (or root) > could bind-mount its files over the two IMA log files. > > > userspace does not fully support IMA log snapshotting then it never > > needs to trigger it and the system behaves as it does today; on the > > I don't think individual applications should trigger it , instead some > dedicated background process running on a machine would do that every n > log entries or so and possibly offer the FUSE filesystem at the same > time. 
> In either case, once any application triggers it, all either have
> to know how to deal with the shards or FUSE would make it completely
> transparent.

Yes, performing a snapshot is a privileged operation which I expect would be done and managed by a dedicated daemon running on the system.

-- paul-moore.com
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On 11/16/23 17:07, Paul Moore wrote: On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger wrote: On 11/14/23 13:36, Sush Shringarputale wrote: On 11/13/2023 10:59 AM, Stefan Berger wrote: On 10/19/23 14:49, Tushar Sugandhi wrote: === | Introduction | === This document provides a detailed overview of the proposed Kernel feature IMA log snapshotting. It describes the motivation behind the proposal, the problem to be solved, a detailed solution design with examples, and describes the changes to be made in the clients/services which are part of remote-attestation system. This is the 2nd version of the proposal. The first version is present here[1]. Table of Contents: -- A. Motivation and Background B. Goals and Non-Goals B.1 Goals B.2 Non-Goals C. Proposed Solution C.1 Solution Summary C.2 High-level Work-flow D. Detailed Design D.1 Snapshot Aggregate Event D.2 Snapshot Triggering Mechanism D.3 Choosing A Persistent Storage Location For Snapshots D.4 Remote-Attestation Client/Service-side Changes D.4.a Client-side Changes D.4.b Service-side Changes E. Example Walk-through F. Other Design Considerations G. References Userspace applications will have to know a) where are the shard files? We describe the file storage location choices in section D.3, but user applications will have to query the well-known location described there. b) how do I read the shard files while locking out the producer of the shard files? IMO, this will require a well known config file and a locking method (flock) so that user space applications can work together in this new environment. The lock could be defined in the config file or just be the config file itself. The flock is a good idea for co-ordination between UM clients. While the Kernel cannot enforce any access in this way, any UM process that is planning on triggering the snapshot mechanism should follow that protocol. We will ensure we document that as the best-practices in the patch series. It's more than 'best practices'. 
You need a well-known config file with well-known config options in it. All clients that were previously just trying to read new bytes from the IMA log cannot do this anymore in the presence of a log shard producer but have to also learn that a new log shard has been produced so they need to figure out the new position in the log where to read from. So maybe a counter in a config file should indicate to the log readers that a new log has been produced -- otherwise they would have to monitor all the log shard files or the log shard file's size.

If a counter is needed, I would suggest placing it somewhere other than the config file so that we can enforce limited write access to the config file. Regardless, I imagine there are a few ways one could synchronize various userspace applications such that they see a consistent view of the decomposed log state, and the good news is that the approach described here is opt-in from a userspace perspective. If the userspace does not fully support IMA log snapshotting then it never needs to trigger it and the system behaves as it does today; on the other hand, if the userspace has been updated it can make use of the new functionality to better manage the size of the IMA measurement log.

A FUSE filesystem that stitches together the log shards from one or multiple files + IMA log file(s) could make this approach transparent for as long as log shards are not thrown away. Presumably it (or root) could bind-mount its files over the two IMA log files. I don't think individual applications should trigger it; instead some dedicated background process running on a machine would do that every n log entries or so and possibly offer the FUSE filesystem at the same time. In either case, once any application triggers it, all either have to know how to deal with the shards or FUSE would make it completely transparent.
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Tue, Oct 31, 2023 at 3:15 PM Mimi Zohar wrote: > On Thu, 2023-10-19 at 11:49 -0700, Tushar Sugandhi wrote: > > [...] > > --- > > | C.1 Solution Summary| > > --- > > To achieve the goals described in the section above, we propose the > > following changes to the IMA subsystem. > > > > a. The IMA log from Kernel memory will be offloaded to some > > persistent storage disk to keep the system running reliably > > without facing memory pressure. > > More details, alternate approaches considered etc. are present > > in section "D.3 Choices for Storing Snapshots" below. > > > > b. The IMA log will be divided into multiple chunks (snapshots). > > Each snapshot would be a delta between the two instances when > > the log was offloaded from memory to the persistent storage > > disk. > > > > c. Some UM process (like a remote-attestation-client) will be > > responsible for writing the IMA log snapshot to the disk. > > > > d. The same UM process would be responsible for triggering the IMA > > log snapshot. > > > > e. There will be a well-known location for storing the IMA log > > snapshots on the disk. It will be non-trivial for UM processes > > to change that location after booting into the Kernel. > > > > f. A new event, "snapshot_aggregate", will be computed and measured > > in the IMA log as part of this feature. It should help the > > remote-attestation client/service to benefit from the IMA log > > snapshot feature. > > The "snapshot_aggregate" event is described in more details in > > section "D.1 Snapshot Aggregate Event" below. > > > > g. If the existing remote-attestation client/services do not change > > to benefit from this feature or do not trigger the snapshot, > > the Kernel will continue to have its current functionality of > > maintaining an in-memory full IMA log. > > > > Additionally, the remote-attestation client/services need to be updated > > to benefit from the IMA log snapshot feature. 
These proposed changes > > are described in section "D.4 Remote-Attestation Client/Service Side > > Changes" below, but their implementation is out of scope for this > > proposal. > > As previously said on v1, >This design seems overly complex and requires synchronization between the >"snapshot" record and exporting the records from the measurement list. > [...] > >Concerns: >- Pausing extending the measurement list. > > Nothing has changed in terms of the complexity or in terms of pausing > the measurement list. > Pausing the measurement list is a non-starter. The measurement list would only need to be paused for the amount of time it would require to generate the snapshot_aggregate entry, which should be minimal and only occurs when a privileged userspace requests a snapshot operation. The snapshot remains opt-in functionality, and even then there is the possibility that the kernel could reject the snapshot request if generating the snapshot_aggregate entry was deemed too costly (as determined by the kernel) at that point in time. > Userspace can already export the IMA measurement list(s) via the > securityfs {ascii,binary}_runtime_measurements file(s) and do whatever > it wants with it. All that is missing in the kernel is the ability to > trim the measurement list, which doesn't seem all that complicated. From my perspective what has been presented is basically just trimming the in-memory measurement log, the additional complexity (which really doesn't look that bad IMO) is there to ensure robustness in the face of an unreliable userspace (processes die, get killed, etc.) and to establish a new, transitive root of trust in the newly trimmed in-memory log. I suppose one could simplify things greatly by having a design where userspace captures the measurement log and then writes the number of measurement records to trim from the start of the measurement log to a sysfs file and the kernel acts on that. 
You could do this with, or without, the snapshot_aggregate entry concept; in fact that could be something that was controlled by userspace, e.g. write the number of lines and a flag to indicate if a snapshot_aggregate was desired to the sysfs file. I can't say I've thought it all the way through to make sure there are no gotchas, but I'm guessing that is about as simple as one can get. If there is something else you had in mind, Mimi, please share the details. This is a very real problem we are facing and we want to work to get a solution upstream. -- paul-moore.com
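The simplified "capture, then tell the kernel how much to trim" flow sketched above can be made concrete in a few lines. To be clear about what is hypothetical here: no `trim_log` securityfs file exists in any kernel today, and the "<count> <flag>" write format is only the interface idea floated in this message; the function name is mine.

```python
def snapshot_and_trim(log_path, trim_path, dest_path, want_aggregate=True):
    """Capture the current measurement list to dest_path, then tell the
    kernel how many leading records to trim.  The second field is the
    flag asking the kernel to also emit a snapshot_aggregate entry,
    mirroring the "<count> <flag>" idea from the discussion.
    trim_path stands in for a securityfs file that does not exist yet;
    log_path would be ascii_runtime_measurements on a real system."""
    with open(log_path, "r") as src:
        records = src.readlines()
    with open(dest_path, "w") as dst:
        dst.writelines(records)
    with open(trim_path, "w") as trim:
        trim.write("%d %d\n" % (len(records), 1 if want_aggregate else 0))
    return len(records)
```

One record per line holds for the ascii log; a binary capture would count records by parsing the binary format instead.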
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Tue, Nov 14, 2023 at 1:58 PM Stefan Berger wrote: > On 11/14/23 13:36, Sush Shringarputale wrote: > > On 11/13/2023 10:59 AM, Stefan Berger wrote: > >> On 10/19/23 14:49, Tushar Sugandhi wrote: > >>> === > >>> | Introduction | > >>> === > >>> This document provides a detailed overview of the proposed Kernel > >>> feature IMA log snapshotting. It describes the motivation behind the > >>> proposal, the problem to be solved, a detailed solution design with > >>> examples, and describes the changes to be made in the clients/services > >>> which are part of remote-attestation system. This is the 2nd version > >>> of the proposal. The first version is present here[1]. > >>> > >>> Table of Contents: > >>> -- > >>> A. Motivation and Background > >>> B. Goals and Non-Goals > >>> B.1 Goals > >>> B.2 Non-Goals > >>> C. Proposed Solution > >>> C.1 Solution Summary > >>> C.2 High-level Work-flow > >>> D. Detailed Design > >>> D.1 Snapshot Aggregate Event > >>> D.2 Snapshot Triggering Mechanism > >>> D.3 Choosing A Persistent Storage Location For Snapshots > >>> D.4 Remote-Attestation Client/Service-side Changes > >>> D.4.a Client-side Changes > >>> D.4.b Service-side Changes > >>> E. Example Walk-through > >>> F. Other Design Considerations > >>> G. References > >>> > >> > >> Userspace applications will have to know > >> a) where are the shard files? > > We describe the file storage location choices in section D.3, but user > > applications will have to query the well-known location described there. > >> b) how do I read the shard files while locking out the producer of the > >> shard files? > >> > >> IMO, this will require a well known config file and a locking method > >> (flock) so that user space applications can work together in this new > >> environment. The lock could be defined in the config file or just be > >> the config file itself. > > The flock is a good idea for co-ordination between UM clients. 
While > > the Kernel cannot enforce any access in this way, any UM process that > > is planning on triggering the snapshot mechanism should follow that > > protocol. We will ensure we document that as the best-practices in > > the patch series. > > It's more than 'best practices'. You need a well-known config file with > well-known config options in it. > > All clients that were previously just trying to read new bytes from the > IMA log cannot do this anymore in the presence of a log shard producer > but have to also learn that a new log shard has been produced so they > need to figure out the new position in the log where to read from. So > maybe a counter in a config file should indicate to the log readers that > a new log has been produced -- otherwise they would have to monitor all > the log shard files or the log shard file's size. If a counter is needed, I would suggest placing it somewhere other than the config file so that we can enforce limited write access to the config file. Regardless, I imagine there are a few ways one could synchronize various userspace applications such that they see a consistent view of the decomposed log state, and the good news is that the approach described here is opt-in from a userspace perspective. If the userspace does not fully support IMA log snapshotting then it never needs to trigger it and the system behaves as it does today; on the other hand, if the userspace has been updated it can make use of the new functionality to better manage the size of the IMA measurement log. -- paul-moore.com
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On 11/14/23 13:36, Sush Shringarputale wrote: On 11/13/2023 10:59 AM, Stefan Berger wrote: On 10/19/23 14:49, Tushar Sugandhi wrote: === | Introduction | === This document provides a detailed overview of the proposed Kernel feature IMA log snapshotting. It describes the motivation behind the proposal, the problem to be solved, a detailed solution design with examples, and describes the changes to be made in the clients/services which are part of remote-attestation system. This is the 2nd version of the proposal. The first version is present here[1]. Table of Contents: -- A. Motivation and Background B. Goals and Non-Goals B.1 Goals B.2 Non-Goals C. Proposed Solution C.1 Solution Summary C.2 High-level Work-flow D. Detailed Design D.1 Snapshot Aggregate Event D.2 Snapshot Triggering Mechanism D.3 Choosing A Persistent Storage Location For Snapshots D.4 Remote-Attestation Client/Service-side Changes D.4.a Client-side Changes D.4.b Service-side Changes E. Example Walk-through F. Other Design Considerations G. References Userspace applications will have to know a) where are the shard files? We describe the file storage location choices in section D.3, but user applications will have to query the well-known location described there. b) how do I read the shard files while locking out the producer of the shard files? IMO, this will require a well known config file and a locking method (flock) so that user space applications can work together in this new environment. The lock could be defined in the config file or just be the config file itself. The flock is a good idea for co-ordination between UM clients. While the Kernel cannot enforce any access in this way, any UM process that is planning on triggering the snapshot mechanism should follow that protocol. We will ensure we document that as the best-practices in the patch series. It's more than 'best practices'. You need a well-known config file with well-known config options in it. 
All clients that were previously just trying to read new bytes from the IMA log cannot do this anymore in the presence of a log shard producer but have to also learn that a new log shard has been produced so they need to figure out the new position in the log where to read from. So maybe a counter in a config file should indicate to the log readers that a new log has been produced -- otherwise they would have to monitor all the log shard files or the log shard file's size. Iff the log-shard producer were configured to discard leading parts of the log then that should also be noted in a config file so clients that need to see the beginning of the log can refuse early on to work on a machine that either is configured this way or where the discarding has already happened.

Stefan

- Sush
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On 11/13/2023 10:59 AM, Stefan Berger wrote: On 10/19/23 14:49, Tushar Sugandhi wrote: === | Introduction | === This document provides a detailed overview of the proposed Kernel feature IMA log snapshotting. It describes the motivation behind the proposal, the problem to be solved, a detailed solution design with examples, and describes the changes to be made in the clients/services which are part of remote-attestation system. This is the 2nd version of the proposal. The first version is present here[1]. Table of Contents: -- A. Motivation and Background B. Goals and Non-Goals B.1 Goals B.2 Non-Goals C. Proposed Solution C.1 Solution Summary C.2 High-level Work-flow D. Detailed Design D.1 Snapshot Aggregate Event D.2 Snapshot Triggering Mechanism D.3 Choosing A Persistent Storage Location For Snapshots D.4 Remote-Attestation Client/Service-side Changes D.4.a Client-side Changes D.4.b Service-side Changes E. Example Walk-through F. Other Design Considerations G. References Userspace applications will have to know a) where are the shard files? We describe the file storage location choices in section D.3, but user applications will have to query the well-known location described there. b) how do I read the shard files while locking out the producer of the shard files? IMO, this will require a well known config file and a locking method (flock) so that user space applications can work together in this new environment. The lock could be defined in the config file or just be the config file itself. The flock is a good idea for co-ordination between UM clients. While the Kernel cannot enforce any access in this way, any UM process that is planning on triggering the snapshot mechanism should follow that protocol. We will ensure we document that as the best-practices in the patch series. - Sush
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On 10/19/23 14:49, Tushar Sugandhi wrote: === | Introduction | === This document provides a detailed overview of the proposed Kernel feature IMA log snapshotting. It describes the motivation behind the proposal, the problem to be solved, a detailed solution design with examples, and describes the changes to be made in the clients/services which are part of remote-attestation system. This is the 2nd version of the proposal. The first version is present here[1]. Table of Contents: -- A. Motivation and Background B. Goals and Non-Goals B.1 Goals B.2 Non-Goals C. Proposed Solution C.1 Solution Summary C.2 High-level Work-flow D. Detailed Design D.1 Snapshot Aggregate Event D.2 Snapshot Triggering Mechanism D.3 Choosing A Persistent Storage Location For Snapshots D.4 Remote-Attestation Client/Service-side Changes D.4.a Client-side Changes D.4.b Service-side Changes E. Example Walk-through F. Other Design Considerations G. References Userspace applications will have to know a) where are the shard files? b) how do I read the shard files while locking out the producer of the shard files? IMO, this will require a well known config file and a locking method (flock) so that user space applications can work together in this new environment. The lock could be defined in the config file or just be the config file itself.
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On 10/31/2023 11:37 AM, Ken Goldman wrote: On 10/19/2023 2:49 PM, Tushar Sugandhi wrote: f. A new event, "snapshot_aggregate", will be computed and measured in the IMA log as part of this feature. It should help the remote-attestation client/service to benefit from the IMA log snapshot feature. The "snapshot_aggregate" event is described in more details in section "D.1 Snapshot Aggregate Event" below.

What is the use case for the snapshot aggregate? My thinking is:

1. The platform must retain the entire measurement list. Early measurements can never be discarded because a new quote verifier must receive the entire log starting at the first measurement. In this case, isn't the snapshot aggregate redundant?

Not quite. The snapshot aggregate still has a purpose, which is to stitch together the snapshots on the disk and the in-memory segment of the IMA log. The specific details are in the RFC Section D.1, quoted here:

The "snapshot_aggregate" marker provides the following benefits:

a. It facilitates the IMA log to be divided into multiple chunks and provides a mechanism to verify the integrity of the system using only the latest chunks during remote attestation.

b. It provides tangible evidence from the Kernel to the attestation client that IMA log snapshotting has been enabled and at least one snapshot exists on the system.

c. It helps both the Kernel and the UM attestation client define clear boundaries between multiple snapshots.

d. In the event of multiple snapshots, the last measured "snapshot_aggregate" marker, which is present in the current segment of the IMA log, has sufficient information to verify the integrity of the IMA log segment as well as the previous snapshots using the PCR quotes.

e. In the event of multiple snapshots, say N, if the remote-attestation service has already processed the last N-1 snapshots, it can efficiently parse through them by just processing "snapshot_aggregate" events to compute the PCR quotes needed to verify the events in the last snapshot. This should drastically improve the IMA log processing efficiency of the service.

2. There is a disadvantage to redundant data. The verifier must support this new event type. It receives this event and must validate the aggregate against the snapshot-ed events. This is an attack surface. The attacker can send an aggregate and snapshot-ed measurements that do not match to exploit a flaw in the verifier.

I disagree with this. Redundancy is a moot point because "snapshot_aggregate" is required for the points mentioned above.

- Sush
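Points (d) and (e) can be illustrated with a toy aggregate chain. The construction below (SHA-256 over a shard's records, seeded with the previous shard's aggregate) is only one plausible definition raised in the thread; the RFC does not fix the exact hash construction, and the function names are mine.

```python
import hashlib

def shard_aggregate(prev_aggregate, records):
    """Hash this shard's records, chained from the previous shard's
    aggregate (the boot_aggregate would seed the first link)."""
    h = hashlib.sha256(prev_aggregate)
    for rec in records:
        h.update(rec)
    return h.digest()

def verify_chain(boot_aggregate, shards, claimed):
    """Walk the chain shard by shard, checking each claimed aggregate.
    A verifier that already trusts the first N-1 aggregates only needs
    to recompute the final link -- the efficiency win point (e) describes."""
    prev = boot_aggregate
    for records, want in zip(shards, claimed):
        prev = shard_aggregate(prev, records)
        if prev != want:
            return False
    return True
```

Because each aggregate commits to everything before it, a verifier can treat the last trusted aggregate as a checkpoint instead of re-reading every prior shard.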
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On Thu, 2023-10-19 at 11:49 -0700, Tushar Sugandhi wrote: [...] > --- > | C.1 Solution Summary| > --- > To achieve the goals described in the section above, we propose the > following changes to the IMA subsystem. > > a. The IMA log from Kernel memory will be offloaded to some > persistent storage disk to keep the system running reliably > without facing memory pressure. > More details, alternate approaches considered etc. are present > in section "D.3 Choices for Storing Snapshots" below. > > b. The IMA log will be divided into multiple chunks (snapshots). > Each snapshot would be a delta between the two instances when > the log was offloaded from memory to the persistent storage > disk. > > c. Some UM process (like a remote-attestation-client) will be > responsible for writing the IMA log snapshot to the disk. > > d. The same UM process would be responsible for triggering the IMA > log snapshot. > > e. There will be a well-known location for storing the IMA log > snapshots on the disk. It will be non-trivial for UM processes > to change that location after booting into the Kernel. > > f. A new event, "snapshot_aggregate", will be computed and measured > in the IMA log as part of this feature. It should help the > remote-attestation client/service to benefit from the IMA log > snapshot feature. > The "snapshot_aggregate" event is described in more details in > section "D.1 Snapshot Aggregate Event" below. > > g. If the existing remote-attestation client/services do not change > to benefit from this feature or do not trigger the snapshot, > the Kernel will continue to have its current functionality of > maintaining an in-memory full IMA log. > > Additionally, the remote-attestation client/services need to be updated > to benefit from the IMA log snapshot feature. These proposed changes > > are described in section "D.4 Remote-Attestation Client/Service Side > Changes" below, but their implementation is out of scope for this > proposal. 
As previously said on v1:

> This design seems overly complex and requires synchronization between
> the "snapshot" record and exporting the records from the measurement list.
> [...]
> Concerns:
> - Pausing extending the measurement list.

Nothing has changed in terms of the complexity or in terms of pausing the measurement list. Pausing the measurement list is a non-starter.

Userspace can already export the IMA measurement list(s) via the securityfs {ascii,binary}_runtime_measurements file(s) and do whatever it wants with it. All that is missing in the kernel is the ability to trim the measurement list, which doesn't seem all that complicated.

Mimi
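The export Mimi refers to is already consumable today. As a sketch (not a definitive parser), the common ima-ng-style record layout in a binary_runtime_measurements dump can be walked like this; I am assuming a little-endian host, and note that the legacy "ima" template uses a different template-data layout, so a real tool must handle more cases.

```python
import struct

def parse_measurements(blob):
    """Walk ima-ng-style records from a binary_runtime_measurements
    dump: u32 PCR, 20-byte SHA-1 template digest, u32 template-name
    length, name, u32 template-data length, data (u32s in host byte
    order; little-endian assumed here)."""
    off, records = 0, []
    while off < len(blob):
        pcr, = struct.unpack_from("<I", blob, off); off += 4
        digest = blob[off:off + 20]; off += 20
        nlen, = struct.unpack_from("<I", blob, off); off += 4
        name = blob[off:off + nlen].decode(); off += nlen
        dlen, = struct.unpack_from("<I", blob, off); off += 4
        data = blob[off:off + dlen]; off += dlen
        records.append((pcr, digest.hex(), name, data))
    return records
```

Counting the parsed records is exactly the input a trimming interface would need from userspace.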
Re: [RFC V2] IMA Log Snapshotting Design Proposal
On 10/19/2023 2:49 PM, Tushar Sugandhi wrote: f. A new event, "snapshot_aggregate", will be computed and measured in the IMA log as part of this feature. It should help the remote-attestation client/service to benefit from the IMA log snapshot feature. The "snapshot_aggregate" event is described in more details in section "D.1 Snapshot Aggregate Event" below. What is the use case for the snapshot aggregate? My thinking is: 1. The platform must retain the entire measurement list. Early measurements can never be discarded because a new quote verifier must receive the entire log starting at the first measurement. In this case, isn't the snapshot aggregate redundant? 2. There is a disadvantage to redundant data. The verifier must support this new event type. It receives this event and must validate the aggregate against the snapshot-ed events. This is an attack surface. The attacker can send an aggregate and snapshot-ed measurements that do not match to exploit a flaw in the verifier.