On 10/26/21 21:43, Stephen Frost wrote:
Greetings,

* Yura Sokolov (y.soko...@postgrespro.ru) wrote:
... >>
Integrity could be based on a simple non-cryptographic checksum, and it could
be checked after decryption. It would be impossible to intentionally change
an encrypted page in a way that passes the checksum after decryption.

No, it wouldn't be impossible when we're talking about non-cryptographic
checksums.  That is, in fact, why you'd call them that.  If it were
impossible (or at least utterly impractical) then you'd be able to claim
that it's cryptographic-level integrity validation.


Yeah, our checksums are probabilistic protection against rare and random bitflips caused by hardware, not against an attacker in the crypto sense.

To explain why it's not enough, consider that our checksum is a uint16, i.e. there are only 64k possible values. In other words, you can try flipping bits in the encrypted page, and after generating more than 64k variants you're guaranteed to have at least one collision. Yes, it's harder to get a collision with the existing checksum, and encryption methods that diffuse bits better make it harder to get a valid page after decryption, but it's simply not the same thing as crypto integrity.
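To put a rough number on this, here is a minimal sketch that counts how many blind modifications of an encrypted page it takes before one decrypts to a page whose 16-bit checksum happens to match. Everything in it is an assumption made for illustration: `checksum16` is a truncated CRC32 rather than PostgreSQL's page checksum, `toy_decrypt` only models "tampered ciphertext decrypts to pseudorandom garbage" (as a wide-block cipher would) and is not a real cipher, and the page is shrunk to 512 bytes with the checksum in its first two bytes to keep the run short.

```python
import hashlib
import zlib

PAGE_SIZE = 512      # assumption: small page so the demo runs quickly
KEY = b"demo-key"    # hypothetical key, illustration only


def checksum16(body: bytes) -> int:
    """Stand-in 16-bit checksum (NOT PostgreSQL's algorithm)."""
    return zlib.crc32(body) & 0xFFFF


def toy_decrypt(ciphertext: bytes) -> bytes:
    """Model of decrypting a tampered page: the result is pseudorandom
    garbage that depends on the key and on every ciphertext byte."""
    return hashlib.shake_256(KEY + ciphertext).digest(PAGE_SIZE)


def page_is_valid(page: bytes) -> bool:
    """The first two bytes hold the checksum of the rest of the page."""
    stored = int.from_bytes(page[:2], "little")
    return stored == checksum16(page[2:])


def forge(ciphertext: bytes, max_tries: int = 2_000_000):
    """Overwrite the last four ciphertext bytes with a counter until the
    decrypted page passes the checksum; return (tampered page, tries)."""
    for tries in range(1, max_tries + 1):
        tampered = ciphertext[:-4] + tries.to_bytes(4, "little")
        if page_is_valid(toy_decrypt(tampered)):
            return tampered, tries
    return None, max_tries


if __name__ == "__main__":
    forged, tries = forge(bytes(PAGE_SIZE))
    print(f"garbage page accepted after {tries} tampering attempts")
```

Each attempt passes with probability about 1/65536, so the expected number of attempts is on the order of 64k. A real attacker cannot run the decryption locally, but can simply keep writing modified pages; a cryptographic authentication tag of 128 bits makes the same approach infeasible.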

Let's not try inventing something custom, there's been enough crypto failures due to smart custom stuff in the past already.

BTW I'm not sure what the existing patches do, but I wonder if we should calculate the checksum before or after encryption. I'd say it should be after encryption, because checksums were meant as a protection against issues at the storage level, so the checksum should be on what's written to storage, and it'd also allow offline verification of checksums etc. (Of course, that'd make the whole idea of relying on our checksums even more futile.)
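A small sketch of the "checksum after encryption" ordering, to show why it allows offline verification without the key. The layout is assumed for illustration only: a 2-byte checksum stored in the clear in front of the encrypted body, a truncated CRC32 standing in for the real page checksum, and a toy XOR keystream standing in for the cipher. None of this reflects what the existing patches do.

```python
import hashlib
import zlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (XOR with a SHAKE-256 keystream). Not secure;
    it only stands in for 'some encryption step'."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def write_page(key: bytes, body: bytes) -> bytes:
    """Encrypt first, then checksum what will actually hit storage."""
    ciphertext = keystream_xor(key, body)
    csum = zlib.crc32(ciphertext) & 0xFFFF
    return csum.to_bytes(2, "little") + ciphertext


def verify_offline(stored: bytes) -> bool:
    """Check the stored page without the key, as an offline tool would."""
    csum = int.from_bytes(stored[:2], "little")
    return csum == (zlib.crc32(stored[2:]) & 0xFFFF)


if __name__ == "__main__":
    page = write_page(b"secret", b"tuple data " * 8)
    print(verify_offline(page))                           # intact page
    damaged = page[:-1] + bytes([page[-1] ^ 0x01])        # one flipped bit
    print(verify_offline(damaged))                        # storage-level damage
```

With the opposite ordering (checksum computed on the plaintext and then encrypted along with it), `verify_offline` would need the key to decrypt the page before it could check anything.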

Note: Maybe there are reasons why the checksum needs to be calculated before encryption, not sure.

Currently we have a 16-bit checksum, which is very small. But having a larger
checksum is orthogonal to (i.e. not bound to) having encryption.

Sure, but that would also require a page-format change.  We've pointed
out the downsides of that and what it would prevent in terms of
use-cases.  That's still something that might happen but it would be a
different effort from this.


... and if such a page format change ends up happening, it'd be fairly easy to just add some extra crypto data into the page header and not rely on the data checksums at all.

In fact, Adiantum is easily made close to an SIV construction:
- just leave the last 8/16 bytes zero. If they are still zero after
   decryption, then the integrity check passed.
That is because SIV and Adiantum are very similar in structure:
- SIV:
-- hash
-- then stream cipher
- Adiantum:
-- hash (except last 16bytes)
-- then encrypt last 16bytes with hash,
-- then stream cipher
-- then hash.
If the last N (N>16) bytes are nonce + zero bytes, then "hash, then encrypt
last 16 bytes with hash" becomes equivalent to just "hash", and Adiantum
becomes logically equivalent to SIV.
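The "leave the last bytes zero and check them after decryption" idea can be sketched as follows. This is not Adiantum and not SIV: `toy_wide_encrypt` is a hypothetical 4-round Feistel network built on SHAKE-256, used only because it has the wide-block property that every output bit depends on every input bit. The nonce from the proposal is left out, and all names are made up for the illustration. It shows the mechanism being proposed, not a vetted design.

```python
import hashlib

TAG_LEN = 16   # trailing zero bytes that act as the redundancy check
ROUNDS = 4


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round_fn(key: bytes, i: int, data: bytes, out_len: int) -> bytes:
    """Keyed round function: SHAKE-256 over key, round number and data."""
    return hashlib.shake_256(key + bytes([i]) + data).digest(out_len)


def toy_wide_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy wide-block cipher (balanced Feistel); block length must be even."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for i in range(ROUNDS):
        left, right = right, _xor(left, _round_fn(key, i, right, half))
    return left + right


def toy_wide_decrypt(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_wide_encrypt: undo the rounds in reverse order."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for i in reversed(range(ROUNDS)):
        left, right = _xor(right, _round_fn(key, i, left, half)), left
    return left + right


def seal(key: bytes, payload: bytes) -> bytes:
    """Append TAG_LEN zero bytes, then encrypt the whole block."""
    if (len(payload) + TAG_LEN) % 2:
        raise ValueError("payload length must keep the block length even")
    return toy_wide_encrypt(key, payload + bytes(TAG_LEN))


def open_checked(key: bytes, ciphertext: bytes):
    """Decrypt and accept only if the trailing bytes are still zero."""
    plain = toy_wide_decrypt(key, ciphertext)
    if plain[-TAG_LEN:] != bytes(TAG_LEN):
        return None          # integrity check failed
    return plain[:-TAG_LEN]
```

Flipping any ciphertext bit scrambles the whole decrypted block, so the trailing bytes come out zero only with negligible probability. The cost is that those bytes have to be reserved inside the page, which is again a page-format question.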

While I appreciate your interest in this, I don't think it makes sense
for us to try and implement something of our own; we're not
cryptographers.  Best is to look at published guidance and what other
projects have had success doing, and that's what this thread has been
about.


Yeah, I personally don't see much difference between XTS and Adiantum.

There are a bunch of benefits, but the main reason why Google developed it seems to be performance on low-end ARM machines (i.e. phones). That is nice, but it's probably not hugely important: very few people run Pg on such machines, especially in a performance-sensitive context.

It's true Adiantum is probably more resilient to IV reuse etc. but it's not like XTS is suddenly obsolete, and it certainly doesn't solve the integrity issue etc.

- like XTS, it doesn't need to change the plain-text format and doesn't need
   an additional nonce/auth code.

Sure, in which case it's something that could potentially be added later
as another option in the future.  I don't think we'll always have just
one encryption method and it's good to generally think about what it
might look like to have others but I don't think it makes sense to try
and get everything in all at once.

And among the others Adiantum looks best: it is fast even without hardware
acceleration, it provides whole-block encryption (i.e. every bit depends
on every bit) and it isn't bound to the plain-text format.

And it could still be added later as another option if folks really want
it to be.  I've outlined why it makes sense to go with XTS first but I
don't mean that to imply that we'll only ever have that.  Indeed, once
we've actually got something, adding other methods will almost certainly
be simpler.  Trying to do everything from the start will make this very
difficult to accomplish though.


Yeah.

So maybe the best thing is simply to roll with both - design the whole feature in a way that allows selecting the encryption scheme, with two options. That's generally a good engineering practice, as it ensures things are not coupled too much. And it's not like the encryption methods are expected to be super difficult.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

