Re: [External] : Re: [OTC VOTE PROPOSAL] Don't merge PR#14759 (blinding=yes and similar properties)

SHANE LONTIS Thu, 08 Apr 2021 16:52:50 -0700

Nicola we already have a interface that sets blinding to on or off..


> On 9 Apr 2021, at 9:48 am, Nicola Tuveri <nic....@gmail.com> wrote:
> 
> Thanks for the feedback Tim! 
> I see your points and I value the discussion. 
> 
> Reading your message I realized that probably I got carried away in some of 
> the details in the lengthy email, and lost focus on part of the message I 
> wanted to convey.
> 
> Setting aside the matter of secure by default designs, 
> I don't oppose having a property system to let users select whatever 
> implementation from a menu of available implementations.
> But when these are boolean properties as proposed in the PR, they should be 
> objective properties with a universally clear definition that is immutable 
> over time and clear cut between yes/no. 
> 
> This is not the case for blinding=yes because it is not a boolean property. 
> In most cases a blinding implementation does "some" blinding at some level, 
> but does not blind every single computation. 
> Moreover the "amount" of blinding that qualifies an implementation as 
> performing the minimum required blinding changes over time, over threat 
> models, different standardizing bodies/different subjective considerations: 
> it's not an objective property of an algorithm implementation.
> blinding=no has better semantics because you could define it as "this 
> implementation intentionally avoids any kind of blinding", and that would be 
> an objective, universal, and immutable definition.
> Same applies for consttime=yes vs consttime=no.
> Then I went on blabbing about the downsides of the =no variants of this, 
> because they are cases that don't admit a well defined =yes counterpart. I 
> included that to justify why I don't believe that flipping the flag logic 
> here would be enough to make the PR acceptable (IMO). 
> 
> I would not oppose this PR if it was defining a property string to mark 
> another objective quality of the implementation: e.g. 
> supports_cofactor_ecdh=yes vs =no, or compressed_format=yes, batching=yes, 
> multiprime=yes, padding=this_or_that_padding_standard as I don't argue about 
> having the provider=whatever that qualifies the origin of an implementation, 
> or labvalidated=cert_program_xyz, ISOxxxxx_score=3.14 and similar things.
> Those properties have clear definitions and could be perfectly useful for 
> apps and end-users to choose among a rich menu of alternatives. No objection 
> there.
> 
> You make another point with which I tend to disagree strongly: some 
> actionable information must be better than no information at all. 
> 
> I agree with it when it is stated like that, but I believe that is not the 
> case if the "some information" has a high risk of being actually 
> "disinformation". 
> In such cases the extra information can actually do more harm than good, e.g. 
> false sense of security, misled choices, risks of ossification, technical 
> debt, maintenance burden, etc.
> 
> What leads me to qualify blinding=yes as a property string with high risks of 
> quickly degrading into disinformation? The fact that it has ill-defined 
> semantics as I verbosely tried to argument.
> 
> The bits about the scope of well-definedness (apologies I fail to phrase this 
> in a better way) are both to support the above argument and to start a 
> discussion on how to establish criteria that would allow coordination among 
> the project and our community (incouding different provider authors and the 
> end-users) so that there is process to create well-defined shared property 
> strings for those qualities that cannot be reduced to definitions given by 
> some existing standardization body.
> In such a propquery program, I am of the opinion that "applies 
> countermeasures to prevent some side-channel attacks" should be deemed 
> unacceptable because of the reasons stated above, and rejected. 
> 
> 
> 
> Thanks again for you feedback, I hope this time I did a better job of 
> separating the concerns I have about the semantics of the proposed property 
> string and on the need to establish some guidance to have some degree of 
> coherence in the definition of property strings.
> I see it already as a problem that in this and the previous email a lot of my 
> arguments had to be prefixed with "in my opinion" because we lack a framework 
> to guide our own choices.
> 
> 
> Nicola
> 
> 
> P.S. aside from the main topic, I understand provider authors are free to 
> define and use properties as they see fit, and we are not limiting that 
> freedom in my proposal. But we need a framework to guide our decisions in 
> soundly designing and defining our own properties, and also a system to allow 
> coordinated properties when desirable (which is relevant here because 
> blinding=yes is an attempt at establishing a shared property definition for 
> end-users and 3rd party providers, given that its only purpose is to let 
> end-users decide among some of our implementations or some yet to come 3rd 
> party implementation). 
> 
> On Fri, Apr 9, 2021, 01:50 Tim Hudson <t...@cryptsoft.com 
> <mailto:t...@cryptsoft.com>> wrote:
> Nicola, you are (in my view) conflating multiple items.
> 
> For the provider and property design approach it is rather simple.
> - Algorithm implementations can vary.
> - Selection between algorithm implementations when multiple providers are 
> available is performed by properties
> 
> Algorithm implementations should declare whatever set of properties they feel 
> is appropriate for their implementation.
> Applications (and in this context most likely directly by end-user 
> configuration) should be able to select which properties are considered most 
> important for their context.
> That decision capability must be left to the end user as only the end user 
> knows the security context in which they are operating - we don't know that 
> ourselves.
> 
> The vast majority of your lengthy email below is actually focused on one 
> issue - what should the default behaviour be for selection of implementations 
> - and your viewpoint that we should not mark different properties at all that 
> might impact security. I don't think that position is supportable - in that 
> you are basically arguing that we should never declare anything about 
> properties of implementations and should never select between different 
> implementations except at a "provider level" approach. Your approach is that 
> all implementations should be considered equal and that pretty much defies 
> logic in my view.
> 
> Different implementations have different characteristics. Even looking at 
> something like constant time - not all of our implementations are constant 
> time.
> 
> Your statement that the approach of declaring properties "promotes insecure 
> by default" is simply flawed logic - and following the same logic I could 
> state that your approach "promotes ignorance by default" as it effectively 
> states that users shouldn't know the properties of the implementation in a 
> manner that allows selection to be performed.
> 
> Not all implementations are the same and different implementations can 
> implement different mitigations. Properties allow us to declare those 
> mitigations and allow users to make different decisions on the basis of the 
> properties that providers declare for algorithms. Having that information 
> available has to be a better thing than having nothing available - as with 
> nothing available then no selection between alternatives is possible.
> 
> Separately arguing about what the default set of properties should be (i.e. 
> what mitigations should we configure as required by default) would make sense 
> to do so - but arguing that the information for making such decisions 
> shouldn't be present simply makes no sense.
> 
> Tim.
> 
> 
> On Fri, Apr 9, 2021 at 3:02 AM Nicola Tuveri <nic....@gmail.com 
> <mailto:nic....@gmail.com>> wrote:
> Background
> ==========
> 
> [PR#14759](https://github.com/openssl/openssl/pull/14759 
> <https://urldefense.com/v3/__https://github.com/openssl/openssl/pull/14759__;!!GqivPVa7Brio!PiGg3zyLqwziPdnGYV3o9fY0UuVI8YpAii4YGV0a0aFmPrP9Ltzf_YobVMeQe80UgA$>)
>  (Set
> blinding=yes property on some algorithm implementations) is a fix for
> [Issue#14654](https://github.com/openssl/openssl/issues/14654 
> <https://urldefense.com/v3/__https://github.com/openssl/openssl/issues/14654__;!!GqivPVa7Brio!PiGg3zyLqwziPdnGYV3o9fY0UuVI8YpAii4YGV0a0aFmPrP9Ltzf_YobVMeNBLf63Q$>)
>  which
> itself is a spin-off of
> [Issue#14616](https://github.com/openssl/openssl/issues/14616 
> <https://urldefense.com/v3/__https://github.com/openssl/openssl/issues/14616__;!!GqivPVa7Brio!PiGg3zyLqwziPdnGYV3o9fY0UuVI8YpAii4YGV0a0aFmPrP9Ltzf_YobVMf53in-dQ$>).
> 
> The original issue is about _Deprecated low level key API's that have no
> replacements_, among which the following received special attention and
> were issued a dedicated issue during an OTC meeting:
> 
> ~~~c
> // RSA functions on RSA_FLAG_*
> void RSA_clear_flags(RSA *r, int flags);
> int RSA_test_flags(const RSA *r, int flags);
> void RSA_set_flags(RSA *r, int flags);
> 
> // RSA_FLAG_* about blinding
> #define RSA_FLAG_BLINDING
> #define RSA_FLAG_NO_BLINDING
> 
> // RSA functions directly on blinding
> int RSA_blinding_on(RSA *rsa, BN_CTX *ctx);
> void RSA_blinding_off(RSA *rsa);
> BN_BLINDING *RSA_setup_blinding(RSA *rsa, BN_CTX *ctx);
> ~~~
> 
> The decision the sprung Issue#14616 and PR#14759 was to use the
> propquery mechanism to let providers advertise algorithms as
> `blinding=yes` to select secure implementations if there are insecure
> ones present as well.
> 
> Similarly it was discussed the potential `consttime=yes` property that
> would work in a similar way: if applied properly for our current
> implementations that are not fully constant time it would allow a
> user/sysadmin/developer to prefer a third party implementation for the
> same algorithm with better security guarantees.
> In some contexts the consttime implementation might be seriously
> penalizing in terms of performance and in the contexts where const time
> is not required this would allow to select accordingly.
> 
> Definition for the blinding property
> ------------------------------------
> 
> The current definition of the `blinding` property applies to
> provider-native algorithm implementations for the `asym_cipher` and
> `signature` operations:
> 
> ```pod
> =head2 Properties
> 
> The following property is set by some of the OpenSSL signature
> algorithms.
> 
> =over 4
> 
> =item "blinding"
> 
> This boolean property is set to "yes" if the implementation performs
> blinding to prevent some side-channel attacks.
> ```
> 
> Rationale
> =========
> 
> Property queries are our decision making process for implementation
> selection, and has been part of the design for 3.0 since the Brisbane
> design meetings: I am not opposing the use of the property query
> mechanism to select algorithms here, but the semantics of the properties
> we decide to adopt (and thus endorse and encourage also for use by 3rd
> parties).
> In particular I see the following issues with choices like
> `blinding=yes` or `consttime=yes`.
> 
> Design promotes insecure by default
> -----------------------------------
> 
> This design is a slippery slope into patterns that go against the
> "secure by default" best practices.
> Users/administrators/developers should not have to opt-in for the safer
> implementation, but the other way around: the opt-in should be for the
> less safe but more performant implementations after evaluating the
> consequences of such a choice in specific contexts.
> We shouldn't have users having to query for `consttime=yes` algorithms
> but rather for `consttime=no` explicitly in specific conditions.
> So if this was the only issue with PR#14759 my recommendation would be
> to rather flag the non-blinding implementations as such rather than the
> other way around as is currently done.
> 
> The scenario in which 3rd party providers offer insecure algorithms not
> flagged as such with `consttime=no` or `blinding=no` IMO would then fall
> under the "know your providers" umbrella: if you are installing provider
> X you should be well aware that you are trusting X's authors "to do the
> right thing".
> 
> The project history showed us how the risk for insecure-by-default
> designs and its consequences are not just an hypothetical matter of
> debate: for example, `BN_FLG_CONSTTIME` has been with us since version
> 0.9.8f (and even before that under the `BN_FLG_EXP_CONSTTIME` name),
> well over 14 years, in which the decision of having the flag off by
> default and requiring to manually enabling it when desired has been the
> cause of many vulnerabilities and fixes, that are still a concern today,
> as we fixed yet another instance of forgetting to set it just some weeks
> ago (PR#13889).
> 
> In this case, though, I expect that just flipping the default is not
> enough to accept this PR: the problem of going with the negative
> versions of these flags is that such design can't be made future proof
> as we wish to do: we can't know in advance all possible properties that
> should be declared as `=no` by an algorithm released today as part of
> our providers when new relevant properties might be defined in the
> future.
> E.g., what if after 3.0 is released a ECCKiila provider was released and
> opted to tag its implementations as `formally_verified`?
> They could add `formally_verified=yes` but then they would fall into the
> "insecure by default" pattern: users would need to opt-in for the safer
> version instead of getting it by default, but there is no other option
> given that our and others' providers' offerings were not preemptively
> flagged with `formally_verified=no`.
> 
> Scope of propquery semantics
> ----------------------------
> 
> This follows directly from the previous example. The design of the
> propquery mechanism seems (IMO) to be limited to provide
> __well-defined__ semantics for the properties only within the scope of a
> single provider (and a single provider version at that).
> 
> If a given provider offered at runtime for the same algorithm two or
> more versions, one flagged as `consttime=no` and one with
> consttime guarantees, then it could provide documentation to its
> userbase on how to setup default propqueries in the configuration file
> at installation time or to application developers to embed them in their
> code depending on the use case, to select the `consttime=no` offering
> when such a thing is perceived as desirable for whatever (ill-guided :P)
> reason.
> 
> But doing this across providers does not really work, because other
> provider authors might have not flagged their insecure implementations
> with `consttime=no`, or because there might be very different opinions
> on what qualifies as "fully consttime" (that evolve over time with
> different threat models and attacker capabilities):
> provX's `consttime=yes` might be "more consttime" than provY's
> `consttime=yes`.
> For example provX is similar to our default provider, and
> offers an EC algorithm built on top of a generic BIGNUM module so its
> generic implementation supports any valid EC curve named or custom and
> orgX made sure that the ECDH/ECDSA and ECmath layers of the
> implementation follow state-of-the-art practices regarding
> secret-dependant code execution or data access so they flag their
> EC implementation as `consttime=yes`, even though the underlying BIGNUM
> module has limited consttime guarantees, so in certain circumstances a
> dynamic `realloc` or some other seemingly minor thing could leak a few
> secret bits to a motivated attacker.
> provY is similar to our default provider, and provides an EC algorithm,
> but instead of having a generic codepath they only support a subset of
> named curves (e.g. only P256 and P521) each with a dedicated
> implementation that removes the need for a generic BIGNUM layer and
> instead embeds its specific consttime field arithmetic in each curve
> implementation (e.g. they provide our `ecp_nistz256.c` and
> `ecp_nistp521.c` implementations only). provY's ECDH/ECDSA, ECmath and
> FFmath layers all follow best consttime practices so they flag as
> `consttime=yes` their EC offering.
> Now `constime=yes` is not well-defined anymore across providers, making
> it quite useless for users and providers to use `consttime=yes` or
> `consttime=no` to pick among offerings from different providers.
> 
> The same would be true within the same organization and provider, across
> provider versions. Let's say tomorrow we were to merge a PR that made
> all of BIGNUM resistant against all currently known timing side-channel
> techniques. Coincidentally we decided to mark all our pubkey providers
> as `consttime=yes` after verifying that all the layers above BIGNUM are
> also timing-resistant to current techniques.
> Fast-forward a year and a new CPU design flaw is discovered, giving a
> remote attacker a side-channel with unprecedented resolution and
> SNR: our previous `consttime=yes` semantic would not stand the
> test of time and the disruption of semantics could have ripple-effects
> within our providers and to the stack of applications that embedded
> `consttime=yes` as a selection mechanism to pick among offerings from
> various provider versions with now ill-defined qualifiers.
> 
> If we can agree, even just on some level or some specific cases like
> these, that the definitions of properties in propqueries have their
> scope limited to the single provider version, what is the point of
> exposing `blinding=*` to our users, given that at runtime we don't offer
> alternative implementations for the same algorithm with different
> properties?
> (I mean, yes we have alternative choices at runtime in some cases, but
> not at the algorithm level in the provider API, they are buried well
> within each algorithm implementation that offers such choices)
> 
> Too vague/coarse semantics
> --------------------------
> 
> Elaborating further from the examples above, there is also a conceptual
> issue at the base of the OTC discussion that led to PR#14759, namely
> that "blinding is a property of an algorithm: either an implementation
> does blinding or it doesn't".
> 
> But this is not entirely true: blinding is applied to individual
> operations at various levels.
> RSA is a special case that has the luxury of resting its security on
> basically a single mathematical operation (modular exponentiation) over
> a single mathematical layer of abstraction (finite field math provided
> by BIGNUM): so it is enough to apply blinding to the modular
> exponentiation to say "this algorithm implementations does blinding".
> That is not true anymore already for DSA signing (that relies on the
> same single finite field math abstraction provided by BIGNUM, but
> already rests on two operation, exponentiation and modular inversion)
> and one more layer deep with ECDH/ECDSA where you have two operations
> (scalar multiplication and modular inversion), of which the first
> is composed of math operations coming from the EC arithmetic layer of
> abstraction, and each of those operation themselves are built as the
> composition of many operations from the same FF arithmetic layer from
> BIGNUM.
> The number of levels of abstraction involved in an implementation keeps
> rising steeply when we move into the realm of upcoming post-quantum
> algorithms.
> When you have multiple operations from multiple layers, applying binding
> efficiently is non-trivial, one maybe blinds only the EC scalarmul but
> not the FF modinv, because the first takes the bulk of the computation
> time and common consensus at some point in time is that there is little
> point in blinding the modinv if its SNR is deemed too low to be
> exploitable. But what happens when this does not hold anymore? What is
> the meaning of `blinding=yes` for an EC implementation that applied
> blinding only to scalarmul and not to modinv because that was the best
> practice at the time it was written?
> What if blinding is applied to the scalarmul but the FF operation to
> perform the blinding already leaks a small amount because BIGNUM?
> A small leakage might not be considered a problem with 2048+ RSA
> keys (shared opinion might be that 1 or 3 bits of knowledge wouldn't
> give the attacker a noticeable boost compared to other tools in their
> arsenal) but the same few bits have been proven sufficient to break even
> P384 or P521 keys (which are above the security level of RSA 2048) with
> limited effort and costs: what is the meaning of `blinding=yes` in the
> case where the blinding implementation itself reveals information on the
> secret it should protect?
> 
> What if the blinding operation itself is not leaking a single bit by
> itself, but it magnifies some effect from the underlying level of
> abstraction (e.g., blinding scalarmul could make the nonce bigger than
> the prime order subgroup size `n`, and the specific scalarmul
> implementation must guarantee that its scalar input is less than `n` to
> compute correctly, so it has a special branch to be robust and handle
> bigger scalars coming from blinding, creating a magnifying lens for an
> attacker when the branch is taken, as that says something on the
> probability of the high order bits of the scalar being set)?
> This might seem as having nothing to do with the semantics of
> `blinding=yes`, but it does for providers like ours where we apply
> blinding at top abstraction layer (and in the case of certain paths also
> in strategic points in the second layer) but then have a number of
> different sub-algorithm implementations that behave very differently:
> depending on the curve name (and encoding format when parsing a key) you
> might end up in the generic prime/binary curves implementation, or in
> one of the specific ones like nistz256 or nistp521.
> 
> Applying `blinding=yes` like this:
> 
> ```c
> {"ECDSA","provider=default,blinding=yes",ossl_ecdsa_signature_functions}
> ```
> 
> is already making the semantics of the `blinding` property ill-defined,
> and affecting its effectiveness/purpose as a decision mechanism to be
> exposed to users: we have different codepaths under the same umbrella
> flagged as applying blinding, but even within it that blinding has very
> different shades of "yes".
> 
> As @t8m said "Security is not a boolean value" and my point here is that
> neither is `blinding` or `consttime` and similar: they come in shades
> (even the `formally_verified` one: maybe the EC layer is formally
> verified by tools, but the underlying FF module might come with
> different guarantees, etc.) and the threshold by which SOTA deems the
> quality to become "yes*" or "no*" changes over time (e.g., it was fine
> to only blind scalarmul yesterday, so implementation X was considered as
> a blinding implementation, but today also modinv blinding is required so
> the same implementation X is now deemed non-blinding).
> 
> Alternative
> -----------
> 
> Support for the specific `RSA_FLAG_BLINDING`&co functionality could
> instead just be dropped.
> My understanding is that its usefulness is already limited to legacy
> compatibility with engines which anyway would not play nicely with the
> new provider/core architecture.
> 
> More in general, as a mechanism to select secure implementations if
> there are insecure ones present as well, the problem can already be
> solved applying the "know your providers" mantra in combination with the
> use of the `provider=default`, `provider=provX` properties to pick
> across the offerings of multiple providers.
> 
> This wouldn't cover the use case (is this even supported/tested right
> now?) of a single provider registering the same algorithm twice with
> different alternative implementations and properties: in this case
> though it would be possible to use a property definition local to the
> specific provider version, and its documentation would inform the users
> of how to use propquery to select what and the related conditions and
> caveats. Luckily this last case does not affect our implementations, as
> we don't offer competing algorithms at the provider level.
> 
> Although I wouldn't propose to do it, if we really really wanted to
> retain the functionality of `RSA_FLAG_BLINDING` and have
> algorithms exposing knobs with the capability of turning on/off (or
> 1000/500/0) whatever degree of blinding they support, we could achieve
> this with existing ctrls/params as we do for many other tunable knobs.
> Using ctrls/params over doing this on a propquery level has the
> advantage that the definition/semantics can be scoped with more
> granularity than offered at the provider level.
> As a user/sysadmin, if I have done the evaluation of the specific
> context that could lead me to responsibly decide to pick no-blinding,
> no-consttime, no-secure or similar hazardous choices, I would expect to
> be anyway in a situation where I have so much control over my
> system, software and environment that patching the code to pass a param
> or call a ctrl function deeply inside my application shouldn't be a
> blocker at all.
> 
> Summary
> -------
> 
> I am opposed to merging PR#14759 for several reasons:
> 
> - it lends itself to insecure-by-default patterns, repeating mistakes
>   from the past
> - the validity scope for similar properties is somewhat limited to a
>   single provider version, so its effectiveness against its purpose of
>   being a selection mechanism for users to pick among competing
>   algorithms offered by different providers is limited (and has the
>   potential to hunt us back in the future)
> - its definition and semantics are vague, its applicability as an
>   algorithm property too coarse to be well-defined even inside our own
>   providers
> - `secure`, `blinding`, `consttime`, etc. are not boolean properties of
>   algorithms (even if I really wish they could be)
> - it's one more vague concept for everyone to understand, and one more
>   item with not-too-clear logic for the project to handle and maintain
>   for the foreseeable future
> - we have already established alternatives in the form of
>   per-implementation params or ctrls to tune some security parameters
>   that can alter the execution flow of our implementations
> 
> Also, from a more generic usability perspective and thinking of
> supporting those in our Community developing 3rd party providers, I
> think we should also
> 
> - [in the vote] take a position on what are good/bad ideas when devising
>   provider-specific properties, and have some official guidelines in the
>   documentation about it by the time final 3.0.0 is release, and
> - [not in the vote, could be a separate discussion] document what
>   mechanisms we plan to put in place to coordinate efforts
>   which will allow for some carefully evaluated properties to have
>   well-defined semantics (with versioning?) across providers/vendors
>   (some form of registry of properties, with well-defined criteria for
>   maintaining it?)
> 
> Proposed vote text
> ==================
> 
>     Do not merge PR#14759, prevent declaring properties similar to
>     `blinding=yes` or `consttime=yes` in our implementations and
>     discourage 3rd parties from adopting similar designs.
> 
> 
> 
> 
> Please provide feedback on the vote text, I would like to open the vote
> on Monday and have some public debate about its points via the list
> rather then compress it on the OTC meeting which already has enough
> items in agenda!
> 
> Thanks,
> 
> Nicola

Re: [External] : Re: [OTC VOTE PROPOSAL] Don't merge PR#14759 (blinding=yes and similar properties)

Reply via email to