Maybe worth taking a look at TDE in HDFS:

https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html

A complete solution requires several Hadoop services. I suspect that would
scare the Kafka community a bit, but maybe it's unreasonable to expect
Kafka brokers to do all we've mentioned.

Of particular note, it seems TDE uses multiple layers of keys to avoid
re-encrypting data when keys are rotated, if I understand correctly.
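
For anyone who hasn't run into that pattern before, here is a minimal sketch
of the general envelope idea in Java (this is not TDE's actual API, the names
below are made up): the data is encrypted with a data key (DEK), the DEK is
wrapped with a master key (KEK), and rotating the KEK only means re-wrapping
the small DEK rather than re-encrypting the bulk data.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.nio.charset.StandardCharsets;

public class EnvelopeEncryptionSketch {

    static SecretKey newAesKey() throws Exception {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(256);
        return gen.generateKey();
    }

    // Demo helper only: real code should use AES/GCM with proper IVs.
    static byte[] aes(int mode, SecretKey key, byte[] input) throws Exception {
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(mode, key);
        return cipher.doFinal(input);
    }

    public static void main(String[] args) throws Exception {
        SecretKey kek = newAesKey(); // master key, held by the key manager
        SecretKey dek = newAesKey(); // data key, used for the bulk data

        // Bulk data is encrypted once with the DEK.
        byte[] cipherText = aes(Cipher.ENCRYPT_MODE, dek,
                "some record data".getBytes(StandardCharsets.UTF_8));

        // Only the (small) wrapped DEK is stored alongside the data.
        byte[] wrappedDek = aes(Cipher.ENCRYPT_MODE, kek, dek.getEncoded());

        // KEK rotation: unwrap with the old KEK, re-wrap with the new one.
        // cipherText stays exactly as it is on disk.
        SecretKey newKek = newAesKey();
        byte[] dekBytes = aes(Cipher.DECRYPT_MODE, kek, wrappedDek);
        byte[] rewrappedDek = aes(Cipher.ENCRYPT_MODE, newKek, dekBytes);
    }
}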

Ryanne

On Sat, May 16, 2020, 9:04 AM Adam Bellemare <adam.bellem...@gmail.com>
wrote:

> Hi Sönke
>
> I've been giving it more thought over the last few days, and looking into
> other systems as well, and I think that I've derailed your proposal a bit
> by suggesting that at-rest encryption may be sufficient. I believe that
> many of us are lacking the context of the sorts of discussions you have had
> with stakeholders concerned about encryption. Anyway, a very brief
> summary of my thoughts:
>
> 1) We should look to do encryption at-rest, but it should be outside the
> scope of this KIP. (Is disk encryption as provided by the OS or cloud
> provider sufficient?)
>
> 2) For end-to-end encryption, the part that concerns me is the various
> roles that the broker may play in this plan. For instance, in Phase 2:
>
> > This phase will concentrate on server-side configuration of encryption.
> Topic settings will be added that allow the specification of encryption
> settings that consumers and producers should use. Producers and Consumers
> will be enabled to fetch these settings and use them for encryption without
> the end-user having to configure anything in addition.
>
> > Brokers will be extended with pluggable Key Managers that will allow for
> automatic key rotation later on. A basic, keystore based implementation
> will be created.
> Again, I am not a security expert, but it seems to me that if we want
> end-to-end encryption on par with the sort of encryption we see in our
> RelationalDB cousins, it would require that the broker (which could be
> hosted remotely, with a potentially malicious admin) have no knowledge of
> any of the keys, nor be responsible for any sort of key rotation. I believe
> all of this would have to be handled by the clients themselves
> (though please correct me if I am misinterpreting this), and that to reduce
> attack surface possibilities, we should handle the encryption + decryption
> keys in a manner similar to how we handle TLS keys (client must supply
> their own).
>
> Ryanne does point out that automatic key-rotation of end-to-end encrypted
> data would be an incredibly useful feature to have. However, I am not sure
> how to square this against what is done with relational databases, as it
> seems they require the client to perform any updates or changes to
> the encryption keys and data, and wash their hands of that duty entirely
> (which makes sense - keep the database out of it, reduce the attack
> surface). End-to-end, by definition, requires that the broker be unable to
> decrypt any of the data, and having it responsible for rolling keys, while
> seemingly useful, does deftly throw end-to-end out the window.
>
> Final Q:
> Would it be reasonable to create a new optional service in the Kafka
> project that is strictly responsible for these sorts of encryption matters?
> Something like Confluent's schema registry, but as a mechanism for
> coordinating key rotations with clients, encryption key registrations per
> topic, etc.? The KeyManager would plug in here; it could use Kafka as the
> storage layer for the keys (as we do with schemas, but encrypted themselves,
> of course), or the whole thing could be just a thin layer over a full-blown
> remote KeyManager that simply coordinates the producers, consumers, and
> keys required for the data per topic. This independent service would give
> organizations the ability to host it locally for security purposes, while
> farming out the brokers to perhaps less trustworthy sources?
>
> Adam
>
>
>
>
>
>
> On Sun, May 10, 2020 at 7:52 PM Adam Bellemare <adam.bellem...@gmail.com>
> wrote:
>
> > @Ryanne
> > > Seems that could still get us per-topic keys (vs encrypting the entire
> > > volume), which would be my main requirement.
> >
> > Agreed, I think that per-topic separation of keys would be very valuable
> > for multi-tenancy.
> >
> >
> > My 2 cents is that if encryption at rest is sufficient to satisfy GDPR +
> > other similar data protection measures, then we should aim to do that
> > first. The demand is real and privacy laws won't likely be loosening any
> > time soon. That being said, I am not sufficiently familiar with the myriad
> > of data laws. I will look into it some more though, as I am now curious.
> >
> >
> > On Sat, May 9, 2020 at 6:12 PM Maulin Vasavada <
> maulin.vasav...@gmail.com>
> > wrote:
> >
> >> Hi Sönke
> >>
> >> Thanks for bringing this up for discussion. There are a lot of considerations
> >> even if we assume we have end-to-end encryption done. For example, depending
> >> upon a company's setup there could be restrictions on how/which encryption
> >> keys are shared. An environment could have multiple security and network
> >> boundaries beyond which keys are not allowed to be shared. That will mean
> >> that consumers may not be able to decrypt the messages at all if the data
> >> is moved from one zone to another. If we have mirroring, are mirror-makers
> >> supposed to decrypt and encrypt again, or would they remain the pretty much
> >> bytes-in, bytes-out paradigm that they are today? Also, having a polyglot
> >> Kafka client base will force you to support encryption/decryption libraries
> >> that work for all the languages, and that may not work depending upon the
> >> scope of the team owning the Kafka infrastructure.
> >>
> >> Combining disk encryption with TLS+ACLs could be enough instead of
> having
> >> end-to-end message level encryption. What is your opinion on that?
> >>
> >> We have experimented with end-to-end encryption with custom
> >> serializers/deserializers and I felt that was good enough because
> >> the other challenges I mentioned before may not be easy to address with a
> >> generic solution.
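> >>
> >> For illustration, a rough sketch of the shape that approach takes (the class
> >> below is made up for this mail, not taken from any actual library; a real
> >> implementation would use an authenticated cipher mode and proper key
> >> management):
> >>
> >> import java.util.Map;
> >> import javax.crypto.Cipher;
> >> import javax.crypto.spec.SecretKeySpec;
> >> import org.apache.kafka.common.serialization.Serializer;
> >>
> >> // Wraps an existing serializer and encrypts whatever it produces, so the
> >> // broker only ever sees ciphertext.
> >> public class EncryptingSerializer<T> implements Serializer<T> {
> >>
> >>     private final Serializer<T> inner;
> >>     private final SecretKeySpec key;
> >>
> >>     public EncryptingSerializer(Serializer<T> inner, byte[] keyBytes) {
> >>         this.inner = inner;
> >>         this.key = new SecretKeySpec(keyBytes, "AES");
> >>     }
> >>
> >>     @Override
> >>     public void configure(Map<String, ?> configs, boolean isKey) {
> >>         inner.configure(configs, isKey);
> >>     }
> >>
> >>     @Override
> >>     public byte[] serialize(String topic, T data) {
> >>         try {
> >>             byte[] plain = inner.serialize(topic, data);
> >>             // Demo only: real code should use AES/GCM with a per-record IV.
> >>             Cipher cipher = Cipher.getInstance("AES");
> >>             cipher.init(Cipher.ENCRYPT_MODE, key);
> >>             return cipher.doFinal(plain);
> >>         } catch (Exception e) {
> >>             throw new RuntimeException("encryption failed", e);
> >>         }
> >>     }
> >>
> >>     @Override
> >>     public void close() {
> >>         inner.close();
> >>     }
> >> }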
> >>
> >> Thanks
> >> Maulin
> >>
> >>
> >> On Sat, May 9, 2020 at 2:05 PM Ryanne Dolan <ryannedo...@gmail.com>
> >> wrote:
> >>
> >> > Adam, I agree, seems reasonable to limit the broker's responsibility
> to
> >> > encrypting only data at rest. I guess whole segment files could be
> >> > encrypted with the same key, and rotating keys would just involve
> >> > re-encrypting entire segments. Maybe a key rotation would involve
> >> closing
> >> > all affected segments and kicking off a background task to re-encrypt
> >> them.
> >> > Certainly that would not impede ingestion of new records, and it seems
> >> > consumers could use the old segments until they are replaced with the
> >> newly
> >> > encrypted ones.
> >> >
> >> > Seems that could still get us per-topic keys (vs encrypting the entire
> >> > volume), which would be my main requirement.
> >> >
> >> > Not really "end-to-end", but combined with TLS or something, seems
> >> > reasonable.
> >> >
> >> > Ryanne
> >> >
> >> > On Sat, May 9, 2020, 11:00 AM Adam Bellemare <
> adam.bellem...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi All
> >> > >
> >> > > I typed up a number of replies which I have below, but I have one
> >> major
> >> > > overriding question: Is there a reason we aren't implementing
> >> > > encryption-at-rest almost exactly the same way that most relational
> >> > > databases do? ie:
> >> > > https://wiki.postgresql.org/wiki/Transparent_Data_Encryption
> >> > >
> >> > > I ask this because it seems like we're going to end up with
> something
> >> > > similar to what they did in terms of requirements, plus...
> >> > >
> >> > > "For the *past 16 months*, there has been discussion about whether
> and
> >> > how
> >> > > to implement Transparent Data Encryption (tde) in Postgres. Many
> other
> >> > > relational databases support tde, and *some security standards
> >> require*
> >> > it.
> >> > > However, it is also debatable how much security value tde provides.
> >> > > The tde *400-email
> >> > > thread* became difficult for people to follow..."
> >> > > What still isn't clear to me is the scope that we're trying to cover
> >> > here.
> >> > > Encryption at rest suggests that we need to have the data encrypted
> on
> >> > the
> >> > > brokers, and *only* on the brokers, since they're the durable units
> of
> >> > > storage. Any encryption over the wire should be covered by TLS.  I
> >> think
> >> > > that our goals for this should be (from
> >> > >
> >> >
> >>
> https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#Threat_models
> >> > > )
> >> > >
> >> > > > TDE protects data from theft when file system access controls are
> >> > > > compromised:
> >> > > >
> >> > > >    - Malicious user steals storage devices and reads database
> files
> >> > > >    directly.
> >> > > >    - Malicious backup operator takes backup.
> >> > > >    - Protecting data at rest (persistent data)
> >> > > >
> >> > > > This does not protect from users who can read system memory, e.g.,
> >> > shared
> >> > > > buffers, which root users can do.
> >> > > >
> >> > >
> >> > > I am not a security expert nor am I an expert on relational
> databases.
> >> > > However, I can't identify any reason why the approach outlined by
> >> > > PostgreSQL, which is very similar to MySQL/InnoDB and IBM (from my
> >> > > understanding) wouldn't work for data-at-rest encryption. In
> addition,
> >> > we'd
> >> > > get the added benefit of being consistent with other solutions,
> which
> >> is
> >> > an
> >> > > easier sell when discussing security with management (Kafka? Oh
> yeah,
> >> > their
> >> > > encryption solution is just like the one we already have in place
> for
> >> our
> >> > > Postgres solutions), and may let us avoid reinventing a good part of
> >> the
> >> > > wheel.
> >> > >
> >> > >
> >> > > ------------------
> >> > >
> >> > > @Ryanne
> >> > > One more complicating factor, regarding joins - the foreign key
> joiner
> >> > > requires access to the value to extract the foreign key - if it's
> >> > > encrypted, the FKJ would need to decrypt it to apply the value
> >> extractor.
> >> > >
> >> > > @Sönke re (1)
> >> > > > When people hear that this is not part of Apache Kafka itself, but that
> >> > > > they would need to develop something themselves, that more often than not
> >> > > > is the end of that discussion. Using something that is not "stock" is quite
> >> > > > often simply not an option.
> >> > >
> >> > > > I strongly feel that this is a needed feature in Kafka and that
> >> there
> >> > is
> >> > > a
> >> > > > large number of people out there that would want to use it - but I
> >> may
> >> > > very
> >> > > > well be mistaken, responses to this thread have not exactly been
> >> > > plentiful
> >> > > > this last year and a half..
> >> > >
> >> > > I agree with you on the default vs. non-default points made. We must
> >> all
> >> > > note that this mailing list is *not* representative of the typical
> >> users
> >> > of
> >> > > Kafka, and that many organizations are predominantly looking to use
> >> > > out-of-the-box solutions. This will only become more common as
> hosted
> >> > Kafka
> >> > > solutions (think AWS hosted Kafka) gain more traction. I think the
> >> goal
> >> > of
> >> > > this KIP to provide that out-of-the-box experience is extremely
> >> > important,
> >> > > especially for all the reasons noted so far (GDPR, privacy,
> >> financials,
> >> > > interest by many parties but no default solution).
> >> > >
> >> > > re: (4)
> >> > > >> Regarding plaintext data in RocksDB instances, I am a bit torn to
> >> be
> >> > > >> honest. On the one hand, I feel like this scenario is not
> something
> >> > that
> >> > > we
> >> > > >> can fully control.
> >> > >
> >> > > I agree with this in principle. I think that our responsibility to
> >> > encrypt
> >> > > data at rest ends the moment that data leaves the broker. That being
> >> > said,
> >> > > it isn't an unreasonable concern. I am going to think more about this and see
> if
> >> I
> >> > can
> >> > > come up with something.
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Fri, May 8, 2020 at 5:05 AM Sönke Liebau
> >> > > <soenke.lie...@opencore.com.invalid> wrote:
> >> > >
> >> > > > Hey everybody,
> >> > > >
> >> > > > thanks a lot for reading and giving feedback!! I'll try and answer
> >> all
> >> > > > points that I found going through the thread in this mail, but if
> I
> >> > miss
> >> > > > something please feel free to let me know! I've added a running
> >> number
> >> > to
> >> > > > the discussed topics for ease of reference down the road.
> >> > > >
> >> > > > I'll go through the KIP and update it with everything that I have
> >> > written
> >> > > > below after sending this mail.
> >> > > >
> >> > > > @Tom:
> >> > > > (1) If I understand your concerns correctly you feel that this
> >> > > > functionality would have a hard time getting approved into Apache
> >> Kafka
> >> > > > because it can be achieved with custom Serializers in the same way
> >> and
> >> > > that
> >> > > > we should maybe develop this outside of Apache Kafka at first.
> >> > > > I feel like it is precisely the fact that this is not part of core
> >> > Apache
> >> > > > Kafka that makes people think twice about doing end-to-end
> >> encryption.
> >> > I
> >> > > > may be working in a market (Germany) that is a bit special when
> >> > compared
> >> > > to
> >> > > > the rest of the world where encryption and things like that are
> >> > > concerned,
> >> > > > but I've personally sat in multiple meetings where this feature
> was
> >> > > > discussed. It is not necessarily the end-to-end encryption itself,
> >> but
> >> > > the
> >> > > > at-rest encryption that you get with it.
> >> > > > When people hear that this is not part of Apache Kafka itself, but that
> >> > > > they would need to develop something themselves, that more often than not
> >> > > > is the end of that discussion. Using something that is not "stock" is quite
> >> > > > often simply not an option.
> >> > > > Even if they decide to go forward with it, they'll find Hendrik's
> >> blog
> >> > > post
> >> > > > from 4 years ago on this, probably the Whitepapers from Confluent
> >> and
> >> > > > Lenses and maybe a few implementations on github - all of which
> just
> >> > > serve
> >> > > > to further muddy the waters. Not because any of these resources
> are
> >> bad
> >> > > or
> >> > > > wrong, but just because information and implementations are spread
> >> out
> >> > > over
> >> > > > a lot of different places. Developing this outside of Apache Kafka
> >> > would
> >> > > > simply serve to add one more item to this list that would not
> really
> >> > > matter
> >> > > > I'm afraid.
> >> > > >
> >> > > > I strongly feel that this is a needed feature in Kafka and that
> >> there
> >> > is
> >> > > a
> >> > > > large number of people out there that would want to use it - but I
> >> may
> >> > > very
> >> > > > well be mistaken, responses to this thread have not exactly been
> >> > > plentiful
> >> > > > this last year and a half..
> >> > > >
> >> > > > @Mike:
> >> > > > (2) Regarding the encryption of headers, my current idea is to
> keep
> >> > this
> >> > > > configurable. I have seen customers use headers for stuff like
> >> account
> >> > > > numbers which under the GDPR are considered to be personal data
> that
> >> > > should
> >> > > > be encrypted wherever possible. So in some instances it might be
> >> useful
> >> > > to
> >> > > > encrypt header fields as well.
> >> > > > My current PoC implementation allows specifying a Regex for
> headers
> >> > that
> >> > > > should be encrypted, which would allow having encrypted and
> >> unencrypted
> >> > > > headers in the same record to hopefully suit most use cases.
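> >> > > >
> >> > > > To illustrate (the class and names below are made up for this mail, not
> >> > > > copied from the PoC, and the cipher call is just a placeholder): only
> >> > > > headers whose key matches the configured pattern would be encrypted.
> >> > > >
> >> > > > import java.util.ArrayList;
> >> > > > import java.util.List;
> >> > > > import java.util.regex.Pattern;
> >> > > > import org.apache.kafka.common.header.Header;
> >> > > > import org.apache.kafka.common.header.Headers;
> >> > > >
> >> > > > public class HeaderEncryptor {
> >> > > >
> >> > > >     private final Pattern encryptedHeaders;
> >> > > >
> >> > > >     public HeaderEncryptor(String regex) {   // e.g. "account.*|.*-pii"
> >> > > >         this.encryptedHeaders = Pattern.compile(regex);
> >> > > >     }
> >> > > >
> >> > > >     public void encryptMatching(Headers headers) {
> >> > > >         // Collect matches first so we don't modify the headers while
> >> > > >         // iterating over them.
> >> > > >         List<Header> matching = new ArrayList<>();
> >> > > >         for (Header header : headers) {
> >> > > >             if (encryptedHeaders.matcher(header.key()).matches()) {
> >> > > >                 matching.add(header);
> >> > > >             }
> >> > > >         }
> >> > > >         for (Header header : matching) {
> >> > > >             headers.remove(header.key());
> >> > > >             headers.add(header.key(), encrypt(header.value()));
> >> > > >         }
> >> > > >     }
> >> > > >
> >> > > >     private byte[] encrypt(byte[] plain) {
> >> > > >         // Placeholder - the real implementation would use the topic's
> >> > > >         // configured cipher and key.
> >> > > >         return plain;
> >> > > >     }
> >> > > > }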
> >> > > >
> >> > > > (3) Also, my plan is to not change the message format, but to
> >> > > > "encrypt-in-place" and add a header field with the necessary
> >> > information
> >> > > > for decryption, which would then be removed by the decrypting
> >> consumer.
> >> > > > There may be some out-of-date intentions still in the KIP, I'll go
> >> > > through
> >> > > > it and update.
> >> > > >
> >> > > > @Ryanne:
> >> > > > First off, I fully agree that we should avoid painting ourselves
> >> into a
> >> > > > corner with an early client-only implementation. I scaled down
> this
> >> KIP
> >> > > > from earlier attempts that included things like key rollover and
> >> > > > broker-side implementations because I could not get any feedback
> >> from
> >> > the
> >> > > > community on those for a long time and felt that maybe there was
> no
> >> > > > appetite for the full-blown solution. So I decided to try with a
> >> more
> >> > > > limited scope. I am very happy to discuss/go for the fully
> featured
> >> > > version
> >> > > > again :)
> >> > > >
> >> > > > (4) Regarding plaintext data in RocksDB instances, I am a bit torn
> >> to
> >> > be
> >> > > > honest. On the one hand, I feel like this scenario is not
> something
> >> > that
> >> > > we
> >> > > > can fully control. Kafka Streams in this case is a client that
> takes
> >> > data
> >> > > > from Kafka, decrypts it and then puts it somewhere in plaintext.
> To
> >> me
> >> > > this
> >> > > > scenario differs only slightly from for example someone writing a
> >> > backup
> >> > > > job that reads a topic and writes it to a textfile - not much we
> >> can do
> >> > > > about it.
> >> > > > That being said, Kafka Streams is part of Apache Kafka, so does
> >> merit
> >> > > > special consideration. I'll have to dig into how StateStores are
> >> used a
> >> > > bit
> >> > > > (I am not the world's largest expert - or any kind of expert on
> >> that) to
> >> > > try
> >> > > > and come up with an idea.
> >> > > >
> >> > > >
> >> > > > (5) On key encryption and hashing, this is definitely an issue
> that
> >> we
> >> > > need
> >> > > > a solution for. I currently have key encryption configurable in my
> >> > > > implementation. When encryption is enabled, an option would of
> >> course
> >> > be
> >> > > to
> >> > > > hash the original key and store the key data together with the
> >> value in
> >> > > an
> >> > > > encrypted form. Any salt added to the key before hashing could be
> >> > > encrypted
> >> > > > along with the data. This would allow all key-based functionality
> >> like
> >> > > > compaction, joins etc. to keep working without having to know the
> >> > > cleartext
> >> > > > key.
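> >> > > >
> >> > > > Roughly what I have in mind, as a sketch (the names are made up and the
> >> > > > cipher call is a placeholder): the record key written to Kafka is a salted
> >> > > > hash of the original key, and the original key and salt travel encrypted
> >> > > > inside the value, so the broker can compact on a stable key without ever
> >> > > > seeing the cleartext.
> >> > > >
> >> > > > import java.security.MessageDigest;
> >> > > >
> >> > > > public class HashedKeySketch {
> >> > > >
> >> > > >     // Stable "outer" key: hash(salt || original key). The salt itself is
> >> > > >     // only stored in encrypted form, so the broker never sees it.
> >> > > >     static byte[] outerKey(byte[] salt, byte[] originalKey) throws Exception {
> >> > > >         MessageDigest digest = MessageDigest.getInstance("SHA-256");
> >> > > >         digest.update(salt);
> >> > > >         digest.update(originalKey);
> >> > > >         return digest.digest();
> >> > > >     }
> >> > > >
> >> > > >     // "Outer" value: encrypt(salt || original key || original value).
> >> > > >     // In practice the parts would be length-prefixed so the consumer can
> >> > > >     // split them apart again after decrypting.
> >> > > >     static byte[] outerValue(byte[] salt, byte[] originalKey, byte[] originalValue) {
> >> > > >         return encrypt(concat(salt, originalKey, originalValue));
> >> > > >     }
> >> > > >
> >> > > >     static byte[] concat(byte[]... parts) {
> >> > > >         int length = 0;
> >> > > >         for (byte[] part : parts) length += part.length;
> >> > > >         byte[] out = new byte[length];
> >> > > >         int pos = 0;
> >> > > >         for (byte[] part : parts) {
> >> > > >             System.arraycopy(part, 0, out, pos, part.length);
> >> > > >             pos += part.length;
> >> > > >         }
> >> > > >         return out;
> >> > > >     }
> >> > > >
> >> > > >     static byte[] encrypt(byte[] plain) {
> >> > > >         return plain; // placeholder for the topic's actual data cipher
> >> > > >     }
> >> > > > }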
> >> > > >
> >> > > > I've also considered deterministic encryption which would keep the
> >> > > > encrypted key the same, but I am fairly certain that we will want
> to
> >> > > allow
> >> > > > regular key rotation (more on this in next paragraph) without
> >> > > re-encrypting
> >> > > > older data and that would then change the encrypted key and break
> >> all
> >> > > these
> >> > > > things.
> >> > > > Regarding re-encrypting existing keys when a crypto key is
> >> > compromised, I
> >> > > > think we need to be very careful with this if we do it in-place on
> >> the
> >> > > > broker. If we add functionality along the lines of compaction, which
> >> > > > reads, re-encrypts and rewrites segment files, we have to make sure that
> >> > > > producers choose partitions based on the cleartext key, otherwise all
> >> > > > records starting from the key change may go to a different partition of
> >> > > > the topic.
> >> > > >
> >> > > > (6) Key rollover would be a cool feature to have. I was up until
> now
> >> > only
> >> > > > thinking about supporting regular key rollover functionality that
> >> would
> >> > > > change keys for all records going forward tbh - mostly for
> >> complexity
> >> > > > reasons - I think there was actually a sentence in the original
> KIP
> >> to
> >> > > this
> >> > > > regard. But if you and others feel this is needed then I am happy
> to
> >> > > > discuss this.
> >> > > > If we implement this on the broker we could use topic compaction
> for
> >> > > > inspiration, read all segment files and check records one by one,
> if
> >> > the
> >> > > > key used for that record has been "retired/compromised/..."
> >> re-encrypt
> >> > > with
> >> > > > new key and write a new segment file. Lots of things to consider
> >> around
> >> > > > this regarding performance, how to trigger etc. but in principle
> >> this
> >> > > could
> >> > > > work I think.
> >> > > > One issue I can see with this: if we use envelope encryption for the keys
> >> > > > to address the rogue-admin issue, so that the broker doesn't have access
> >> > > > to the actual key encrypting the data, then this would make that operation
> >> > > > impossible.
> >> > > >
> >> > > >
> >> > > >
> >> > > > I hope I got to all items that were raised, but may very well have
> >> > > > overlooked something, please let me know if I did - and of course
> >> your
> >> > > > thoughts on what I wrote!
> >> > > >
> >> > > > I'll update the KIP today as well.
> >> > > >
> >> > > > Best regards,
> >> > > > Sönke
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Thu, 7 May 2020 at 19:54, Ryanne Dolan <ryannedo...@gmail.com>
> >> > wrote:
> >> > > >
> >> > > > > Tom, good point, I've done exactly that -- hashing record keys
> --
> >> but
> >> > > > it's
> >> > > > > unclear to me what should happen when the hash key must be
> >> rotated.
> >> > In
> >> > > my
> >> > > > > case the (external) solution involved rainbow tables, versioned
> >> keys,
> >> > > and
> >> > > > > custom materializers that were aware of older keys for each
> >> record.
> >> > > > >
> >> > > > > In particular I had a pipeline that would re-key records and
> >> > re-ingest
> >> > > > > them, while opportunistically overwriting records materialized
> >> with
> >> > the
> >> > > > old
> >> > > > > key.
> >> > > > >
> >> > > > > For a native solution I think maybe we'd need to carry around
> any
> >> old
> >> > > > > versions of each record key, perhaps as metadata. Then brokers
> and
> >> > > > > materializers can compact records based on _any_ overlapping
> key,
> >> > > maybe?
> >> > > > > Not sure.
> >> > > > >
> >> > > > > Ryanne
> >> > > > >
> >> > > > > On Thu, May 7, 2020, 12:05 PM Tom Bentley <tbent...@redhat.com>
> >> > wrote:
> >> > > > >
> >> > > > > > Hi Ryanne,
> >> > > > > >
> >> > > > > > You raise some good points there.
> >> > > > > >
> >> > > > > > Similarly, if the whole record is encrypted, it becomes
> >> impossible
> >> > to
> >> > > > do
> >> > > > > > > joins, group bys etc, which just need the record key and
> maybe
> >> > > don't
> >> > > > > have
> >> > > > > > > access to the encryption key. Maybe only record _values_
> >> should
> >> > be
> >> > > > > > > encrypted, and maybe Kafka Streams could defer decryption
> >> until
> >> > the
> >> > > > > > actual
> >> > > > > > > value is inspected. That way joins etc are possible without
> >> the
> >> > > > > > encryption
> >> > > > > > > key, and RocksDB would not need to decrypt values before
> >> > > > materializing
> >> > > > > to
> >> > > > > > > disk.
> >> > > > > > >
> >> > > > > >
> >> > > > > > It's getting a bit late here, so maybe I overlooked something, but
> >> > > > > > wouldn't the natural thing to do be to make the "encrypted" key a hash
> >> > > > > > of the original key, and let the encrypted value be the ciphertext
> >> > > > > > of the (original key, original value) pair? A scheme like this would
> >> > > > > > preserve equality of the key (strictly speaking there's a chance of
> >> > > > > > collision, of course). I guess this could also be a solution for the
> >> > > > > > compacted topic issue Sönke mentioned.
> >> > > > > >
> >> > > > > > Cheers,
> >> > > > > >
> >> > > > > > Tom
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > On Thu, May 7, 2020 at 5:17 PM Ryanne Dolan <
> >> ryannedo...@gmail.com
> >> > >
> >> > > > > wrote:
> >> > > > > >
> >> > > > > > > Thanks Sönke, this is an area in which Kafka is really,
> really
> >> > far
> >> > > > > > behind.
> >> > > > > > >
> >> > > > > > > I've built secure systems around Kafka as laid out in the
> KIP.
> >> > One
> >> > > > > issue
> >> > > > > > > that is not addressed in the KIP is re-encryption of records
> >> > after
> >> > > a
> >> > > > > key
> >> > > > > > > rotation. When a key is compromised, it's important that any
> >> data
> >> > > > > > encrypted
> >> > > > > > > using that key is immediately destroyed or re-encrypted
> with a
> >> > new
> >> > > > key.
> >> > > > > > > Ideally first-class support for end-to-end encryption in
> Kafka
> >> > > would
> >> > > > > make
> >> > > > > > > this possible natively, or else I'm not sure what the point
> >> would
> >> > > be.
> >> > > > > It
> >> > > > > > > seems to me that the brokers would need to be involved in
> this
> >> > > > process,
> >> > > > > > so
> >> > > > > > > perhaps a client-first approach will be painting ourselves
> >> into a
> >> > > > > corner.
> >> > > > > > > Not sure.
> >> > > > > > >
> >> > > > > > > Another issue is whether materialized tables, e.g. in Kafka
> >> > > Streams,
> >> > > > > > would
> >> > > > > > > see unencrypted or encrypted records. If we implemented the
> >> KIP
> >> > as
> >> > > > > > written,
> >> > > > > > > it would still result in a bunch of plain text data in
> RocksDB
> >> > > > > > everywhere.
> >> > > > > > > Again, I'm not sure what the point would be. Perhaps using
> >> custom
> >> > > > > serdes
> >> > > > > > > would actually be a more holistic approach, since Kafka
> >> Streams
> >> > etc
> >> > > > > could
> >> > > > > > > leverage these as well.
> >> > > > > > >
> >> > > > > > > Similarly, if the whole record is encrypted, it becomes
> >> > impossible
> >> > > to
> >> > > > > do
> >> > > > > > > joins, group bys etc, which just need the record key and
> maybe
> >> > > don't
> >> > > > > have
> >> > > > > > > access to the encryption key. Maybe only record _values_
> >> should
> >> > be
> >> > > > > > > encrypted, and maybe Kafka Streams could defer decryption
> >> until
> >> > the
> >> > > > > > actual
> >> > > > > > > value is inspected. That way joins etc are possible without
> >> the
> >> > > > > > encryption
> >> > > > > > > key, and RocksDB would not need to decrypt values before
> >> > > > materializing
> >> > > > > to
> >> > > > > > > disk.
> >> > > > > > >
> >> > > > > > > This is why I've implemented encryption on a per-field
> basis,
> >> not
> >> > > at
> >> > > > > the
> >> > > > > > > record level, when addressing kafka security in the past.
> And
> >> > I've
> >> > > > had
> >> > > > > to
> >> > > > > > > build external pipelines that purge, re-encrypt, and
> re-ingest
> >> > > > records
> >> > > > > > when
> >> > > > > > > keys are compromised.
> >> > > > > > >
> >> > > > > > > This KIP might be a step in the right direction, not sure.
> But
> >> > I'm
> >> > > > > > hesitant
> >> > > > > > > to support the idea of end-to-end encryption without a plan
> to
> >> > > > address
> >> > > > > > the
> >> > > > > > > myriad other problems.
> >> > > > > > >
> >> > > > > > > That said, we need this badly and I hope something shakes
> out.
> >> > > > > > >
> >> > > > > > > Ryanne
> >> > > > > > >
> >> > > > > > > On Tue, Apr 28, 2020, 6:26 PM Sönke Liebau
> >> > > > > > > <soenke.lie...@opencore.com.invalid> wrote:
> >> > > > > > >
> >> > > > > > > > All,
> >> > > > > > > >
> >> > > > > > > > I've asked for comments on this KIP in the past, but
> since I
> >> > > didn't
> >> > > > > > > really
> >> > > > > > > > get any feedback I've decided to reduce the initial scope
> of
> >> > the
> >> > > > KIP
> >> > > > > a
> >> > > > > > > bit
> >> > > > > > > > and try again.
> >> > > > > > > >
> >> > > > > > > > I have reworked the KIP to provide a limited, but useful
> set
> >> of
> >> > > > > features
> >> > > > > > > for
> >> > > > > > > > this initial KIP and laid out a very rough roadmap of what
> >> I'd
> >> > > > > envision
> >> > > > > > > > this looking like in a final version.
> >> > > > > > > >
> >> > > > > > > > I am aware that the KIP is currently light on
> implementation
> >> > > > details,
> >> > > > > > but
> >> > > > > > > > would like to get some feedback on the general approach
> >> before
> >> > > > fully
> >> > > > > > > > speccing everything.
> >> > > > > > > >
> >> > > > > > > > The KIP can be found at
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+end-to-end+data+encryption+functionality+to+Apache+Kafka
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > I would very much appreciate any feedback!
> >> > > > > > > >
> >> > > > > > > > Best regards,
> >> > > > > > > > Sönke
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Sönke Liebau
> >> > > > Partner
> >> > > > Tel. +49 179 7940878
> >> > > > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel -
> >> Germany
> >> > > >
> >> > >
> >> >
> >>
> >
>
