[jira] [Reopened] (KAFKA-15743) KRaft support in ReplicationQuotasTest

2024-04-24 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano reopened KAFKA-15743:
---

The {{registerBroker}} function in this test file needs to use the directory 
UUIDs assigned to the broker, not random UUIDs. If the directory UUIDs do 
not match, then when the broker reregisters as part of startup, any topics with 
replicas on the broker that were created with the random UUIDs will actually 
be offline. [~soarez] can comment more here.
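
To make the failure mode concrete, here is a minimal sketch in Java. It is a simplified model and not Kafka's controller code: the class `DirectoryModel` and its methods are hypothetical, and the rule that a replica is online only while its directory ID is among those the broker last registered is an assumption that paraphrases the comment above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical, simplified model of directory-aware replica placement.
// This is not Kafka code; it only mirrors the behavior described above.
final class DirectoryModel {
    // Directory IDs the broker reported in its most recent registration.
    private final Set<UUID> registeredDirs = new HashSet<>();
    // Replica name -> directory ID recorded when the replica was created.
    private final Map<String, UUID> replicaDirs = new HashMap<>();

    // A (re)registration replaces the previously known directory IDs.
    void register(Set<UUID> dirs) {
        registeredDirs.clear();
        registeredDirs.addAll(dirs);
    }

    // Record a replica against a directory ID.
    void createReplica(String replica, UUID dir) {
        replicaDirs.put(replica, dir);
    }

    // Assumed rule: a replica is online only if its directory is still registered.
    boolean isOnline(String replica) {
        UUID dir = replicaDirs.get(replica);
        return dir != null && registeredDirs.contains(dir);
    }

    public static void main(String[] args) {
        UUID realDir = UUID.randomUUID();   // stands in for the ID the broker actually owns
        UUID randomDir = UUID.randomUUID(); // what a test helper might invent

        DirectoryModel model = new DirectoryModel();
        model.register(Set.of(randomDir));         // test registers with a random ID
        model.createReplica("topic-0", randomDir); // topic is created against that ID
        System.out.println("before reregistration: " + model.isOnline("topic-0"));

        model.register(Set.of(realDir));           // broker startup reregisters with its real ID
        System.out.println("after reregistration: " + model.isOnline("topic-0"));
    }
}
```

In this model the replica is reported online before the reregistration and offline after it, which is the mismatch the test helper needs to avoid.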

> KRaft support in ReplicationQuotasTest
> --
>
> Key: KAFKA-15743
> URL: https://issues.apache.org/jira/browse/KAFKA-15743
> Project: Kafka
> Issue Type: Task
> Components: core
> Reporter: Sameer Tejani
> Assignee: Dmitry Werner
> Priority: Minor
> Labels: kraft, kraft-test, newbie
> Fix For: 3.8.0
>
>
> The following tests in ReplicationQuotasTest in 
> core/src/test/scala/unit/kafka/server/ReplicationQuotasTest.scala need to be 
> updated to support KRaft
> 59 : def shouldBootstrapTwoBrokersWithLeaderThrottle(): Unit = {
> 64 : def shouldBootstrapTwoBrokersWithFollowerThrottle(): Unit = {
> 171 : def shouldThrottleOldSegments(): Unit = {
> Scanned 240 lines. Found 0 KRaft tests out of 3 tests



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] 3.7.0 RC4

2024-02-20 Thread Proven Provenzano
Hi,

I've downloaded the release, built it from source, and validated that JBOD
works with KRaft and that migrating a cluster with JBOD from ZK to KRaft works.

+1 (non-binding) from me.

--Proven

On Tue, Feb 20, 2024 at 2:13 PM Justine Olshan 
wrote:

> Hey folks,
>
> I've done the following to validate the release:
>
> -- validated the keys for all the artifacts
> -- built from source and started a ZK cluster -- ran a few workloads on it.
> -- ran a 2.12 KRaft cluster and ran a few workloads on it
>
> I see there is a lot of ongoing discussion about the upgrade notes. +1
> (binding) from me given Mickael is voting +1 as well.
>
> Justine
>
> On Tue, Feb 20, 2024 at 6:18 AM Divij Vaidya 
> wrote:
>
> > > I am a bit unclear on the precise process regarding what parts of this
> > > get merged at what time, and whether the release first needs to be done
> > > or not.
> >
> > The order is as follows:
> >
> > 1. Release approved as part of this vote. After this we follow the
> > steps from here:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Release+Process#ReleaseProcess-Afterthevotepasses
> >
> > 2. Upload artifacts to maven etc. These artifacts do not have RC suffix in
> > them. You need a PMC member to mark these artifacts as "production" in
> > apache svn.
> > 3. Update website changes (docs, blog etc.). This is where your PRs
> > on kafka-site repo get merged.
> > 4. Send a release announcement by email.
> >
> > --
> > Divij Vaidya
> >
> >
> >
> > On Tue, Feb 20, 2024 at 3:02 PM Stanislav Kozlovski
> >  wrote:
> >
> > > Thanks for testing the release! And thanks for the review on the
> > > documentation. Good catch on the license too.
> > >
> > > I have addressed the comments in the blog PR, and opened a few other PRs
> > > to the website in relation to the release.
> > >
> > > - 37: Add download section for the latest 3.7 release
> > > 
> > > - 37: Update default docs to point to the 3.7.0 release docs
> > > 
> > > - 3.7: Add blog post for Kafka 3.7
> > > 
> > > - MINOR: Update stale upgrade_3_6_0 header links in documentation
> > > 
> > > - 37: Add upgrade notes for the 3.7.0 release
> > > 
> > >
> > > I am a bit unclear on the precise process regarding what parts of this
> > > get merged at what time, and whether the release first needs to be done
> > > or not.
> > >
> > > Best,
> > > Stanislav
> > >
> > > On Mon, Feb 19, 2024 at 8:34 PM Divij Vaidya 
> > > wrote:
> > >
> > > > Great. In that case we can fix the license issue retrospectively. I have
> > > > created a JIRA for it https://issues.apache.org/jira/browse/KAFKA-16278
> > > > and also updated the release process (which redirects to
> > > > https://issues.apache.org/jira/browse/KAFKA-12622) to check for the
> > > > correct license in both the kafka binaries.
> > > >
> > > > I am +1 (binding) assuming Mickael's concerns about update notes to 3.7
> > > > are addressed before release.
> > > >
> > > > --
> > > > Divij Vaidya
> > > >
> > > >
> > > >
> > > > On Mon, Feb 19, 2024 at 6:08 PM Mickael Maison <
> > mickael.mai...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I agree with Josep, I don't think it's worth making a new RC just for
> > > > > this.
> > > > >
> > > > > Thanks Stanislav for sharing the test results. The last thing holding
> > > > > me from casting my vote is the missing upgrade notes for 3.7.0.
> > > > >
> > > > > Thanks,
> > > > > Mickael
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Feb 19, 2024 at 4:28 PM Josep Prat
> > > > > wrote:
> > > > > >
> > > > > > I think I remember finding a similar problem (NOTICE_binary) and it
> > > > > > didn't qualify for an extra RC
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > On Mon, Feb 19, 2024 at 3:44 PM Divij Vaidya <
> > > divijvaidy...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I have performed the following checks. The only thing I would like to
> > > > > > > call out is the missing licenses before providing a vote. How do we want
> > > > > > > to proceed on this? What have we done in the past? (Creating a new RC is
> > > > > > > overkill IMO for this license issue).
> > > > > > >
> > > > > > > ## License check
> > > > > > >
> > > > > > > Test: Validate license of dependencies for both 2.12 & 2.13 binary.
> > > > > > > Result: Missing license for some scala* libraries specifically for 2.12.
> > > > > > > Seems like we have been missing these licenses for quite some version now.
> > > > > > >
> > > > > > > ```
> > > > > > > for f in $(ls libs | grep -v "^kafka\|connect\|trogdor"); do
> if !
> > > > grep
> > > > > -q
> > > > > > > ${f%.*} LICENSE; then echo "${

Re: [VOTE] 3.7.0 RC2

2024-01-17 Thread Proven Provenzano
We have another blocking issue for the RC:
https://issues.apache.org/jira/browse/KAFKA-16157. This bug is similar to
https://issues.apache.org/jira/browse/KAFKA-14616. The new issue, however,
can lead to the new topic having partitions that a producer cannot write to.

--Proven

On Tue, Jan 16, 2024 at 12:04 PM Proven Provenzano 
wrote:

>
> I have a PR https://github.com/apache/kafka/pull/15197 for
> https://issues.apache.org/jira/browse/KAFKA-16131 that is building now.
> --Proven
>
> On Mon, Jan 15, 2024 at 5:03 AM Jakub Scholz  wrote:
>
>> > Hi Jakub,
>> >
>> > Thanks for trying the RC. I think what you found is a blocker bug because it
>> > will generate huge amount of logspam. I guess we didn't find it in junit tests
>> > since logspam doesn't fail the automated tests. But certainly it's not suitable
>> > for production. Did you file a JIRA yet?
>>
>> Hi Colin,
>>
>> I opened https://issues.apache.org/jira/browse/KAFKA-16131.
>>
>> Thanks & Regards
>> Jakub
>>
>> On Mon, Jan 15, 2024 at 8:57 AM Colin McCabe  wrote:
>>
>> > Hi Stanislav,
>> >
>> > Thanks for making the first RC. The fact that it's titled RC2 is messing
>> > with my mind a bit. I hope this doesn't make people think that we're
>> > farther along than we are, heh.
>> >
>> > On Sun, Jan 14, 2024, at 13:54, Jakub Scholz wrote:
>> > > > Nice catch! It does seem like we should have gated this behind the
>> > > > metadata version as KIP-858 implies. Is the cluster configured with
>> > > > multiple log dirs? What is the impact of the error messages?
>> > >
>> > > I did not observe any obvious impact. I was able to send and receive
>> > > messages as normal. But to be honest, I have no idea what else
>> > > this might impact, so I did not try anything special.
>> > >
>> > > I think everyone upgrading an existing KRaft cluster will go through this
>> > > stage (running Kafka 3.7 with an older metadata version for at least a
>> > > while). So even if it is just a logged exception without any other impact I
>> > > wonder if it might scare users from upgrading. But I leave it to others to
>> > > decide if this is a blocker or not.
>> > >
>> >
>> > Hi Jakub,
>> >
>> > Thanks for trying the RC. I think what you found is a blocker bug because
>> > it will generate a huge amount of logspam. I guess we didn't find it in junit
>> > tests since logspam doesn't fail the automated tests. But certainly it's
>> > not suitable for production. Did you file a JIRA yet?
>> >
>> > > On Sun, Jan 14, 2024 at 10:17 PM Stanislav Kozlovski
>> > >  wrote:
>> > >
>> > >> Hey Luke,
>> > >>
>> > >> This is an interesting problem. Given the fact that the KIP for having a
>> > >> 3.8 release passed, I think it tips the scale towards not calling this a
>> > >> blocker and expecting it to be solved in 3.7.1.
>> > >>
>> > >> It is unfortunate that it would not seem safe to migrate to KRaft in 3.7.0
>> > >> (given the inability to rollback safely), but if that's true - the same
>> > >> case would apply for 3.6.0. So in any case users would be expected to use a
>> > >> patch release for this.
>> >
>> > Hi Luke,
>> >
>> > Thanks for testing rollback. I think this is a case where the
>> > documentation is wrong. The intention was for the steps to basically be:
>> >
>> > 1. roll all the brokers into zk mode, but with migration enabled
>> > 2. take down the kraft quorum
>> > 3. rmr /controller, allowing a hybrid broker to take over.
>> > 4. roll all the brokers into zk mode without migration enabled (if desired)
>> >
>> > With these steps, there isn't really unavailability since a ZK controller
>> > can be elected quickly after the kraft quorum is gone.
>> >
>> > >> Further, since we will have a 3.8 release - it is
>> > >> likely we will ultimately recommend users upgrade from that version given
>> > >> its aim is to have strategic KRaft feature parity with ZK.
>> > >> That being said, I am not 100% on this. Let

Re: [VOTE] 3.7.0 RC2

2024-01-16 Thread Proven Provenzano
I have a PR https://github.com/apache/kafka/pull/15197 for
https://issues.apache.org/jira/browse/KAFKA-16131 that is building now.
--Proven

On Mon, Jan 15, 2024 at 5:03 AM Jakub Scholz  wrote:

> > Hi Jakub,
> >
> > Thanks for trying the RC. I think what you found is a blocker bug because it
> > will generate huge amount of logspam. I guess we didn't find it in junit tests
> > since logspam doesn't fail the automated tests. But certainly it's not suitable
> > for production. Did you file a JIRA yet?
>
> Hi Colin,
>
> I opened https://issues.apache.org/jira/browse/KAFKA-16131.
>
> Thanks & Regards
> Jakub
>
> On Mon, Jan 15, 2024 at 8:57 AM Colin McCabe  wrote:
>
> > Hi Stanislav,
> >
> > Thanks for making the first RC. The fact that it's titled RC2 is messing
> > with my mind a bit. I hope this doesn't make people think that we're
> > farther along than we are, heh.
> >
> > On Sun, Jan 14, 2024, at 13:54, Jakub Scholz wrote:
> > > > Nice catch! It does seem like we should have gated this behind the
> > > > metadata version as KIP-858 implies. Is the cluster configured with
> > > > multiple log dirs? What is the impact of the error messages?
> > >
> > > I did not observe any obvious impact. I was able to send and receive
> > > messages as normal. But to be honest, I have no idea what else
> > > this might impact, so I did not try anything special.
> > >
> > > I think everyone upgrading an existing KRaft cluster will go through this
> > > stage (running Kafka 3.7 with an older metadata version for at least a
> > > while). So even if it is just a logged exception without any other impact I
> > > wonder if it might scare users from upgrading. But I leave it to others to
> > > decide if this is a blocker or not.
> > >
> >
> > Hi Jakub,
> >
> > Thanks for trying the RC. I think what you found is a blocker bug because
> > it will generate a huge amount of logspam. I guess we didn't find it in junit
> > tests since logspam doesn't fail the automated tests. But certainly it's
> > not suitable for production. Did you file a JIRA yet?
> >
> > > On Sun, Jan 14, 2024 at 10:17 PM Stanislav Kozlovski
> > >  wrote:
> > >
> > >> Hey Luke,
> > >>
> > >> This is an interesting problem. Given the fact that the KIP for having a
> > >> 3.8 release passed, I think it tips the scale towards not calling this a
> > >> blocker and expecting it to be solved in 3.7.1.
> > >>
> > >> It is unfortunate that it would not seem safe to migrate to KRaft in 3.7.0
> > >> (given the inability to rollback safely), but if that's true - the same
> > >> case would apply for 3.6.0. So in any case users would be expected to use a
> > >> patch release for this.
> >
> > Hi Luke,
> >
> > Thanks for testing rollback. I think this is a case where the
> > documentation is wrong. The intention was for the steps to basically be:
> >
> > 1. roll all the brokers into zk mode, but with migration enabled
> > 2. take down the kraft quorum
> > 3. rmr /controller, allowing a hybrid broker to take over.
> > 4. roll all the brokers into zk mode without migration enabled (if desired)
> >
> > With these steps, there isn't really unavailability since a ZK controller
> > can be elected quickly after the kraft quorum is gone.
> >
> > >> Further, since we will have a 3.8 release - it is
> > >> likely we will ultimately recommend users upgrade from that version given
> > >> its aim is to have strategic KRaft feature parity with ZK.
> > >> That being said, I am not 100% on this. Let me know whether you think this
> > >> should block the release, Luke. I am also tagging Colin and David to weigh
> > >> in with their opinions, as they worked on the migration logic.
> >
> > The rollback docs are new in 3.7 so the fact that they're wrong is a clear
> > blocker, I think. But easy to fix, I believe. I will create a PR.
> >
> > best,
> > Colin
> >
> > >>
> > >> Hey Kirk and Chris,
> > >>
> > >> Unless I'm missing something - KAFKA-16029 is simply a bad log due to
> > >> improper closing. And the PR description implies this has been present
> > >> since 3.5. While annoying, I don't see a strong reason for this to block
> > >> the release.
> > >>
> > >> Hey Jakub,
> > >>
> > >> Nice catch! It does seem like we should have gated this behind the metadata
> > >> version as KIP-858 implies. Is the cluster configured with multiple log
> > >> dirs? What is the impact of the error messages?
> > >>
> > >> Tagging Igor (the author of the KIP) to weigh in.
> > >>
> > >> Best,
> > >> Stanislav
> > >>
> > >> On Sat, Jan 13, 2024 at 7:22 PM Jakub Scholz  wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > I was trying RC2 and ran into the following issue: when I run a
> > >> > 3.7.0-RC2 KRaft cluster with the metadata version set to 3.6-IV2,
> > >> > I seem to be getting repeated errors like this in the controller
> > >> > logs:
> > >> >
> > >> > 2024-01-13 16:58:01,197 INFO [QuorumController id=0]
>

Re: [DISCUSS] KIP-1014: Managing Unstable Metadata Versions in Apache Kafka

2024-01-11 Thread Proven Provenzano
We have two approaches here for how we update unstable metadata versions.

   1. The update will only increase MVs of unstable features to a value
   greater than the new stable feature. The idea is that a specific unstable
   MV may support some set of features and in the future that set is always a
   strict subset of the current set. The issue is that moving a feature to
   make way for a stable feature with a higher MV will leave holes.
   2. We are free to reorder the MV for any unstable feature. This removes
   the hole issue, but does make the unstable MVs more muddled. There isn't
   the same binary state for a MV where a feature is available or there is a
   hole.
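
The two reordering approaches can be compared with a small sketch in Java. Everything in it is hypothetical: metadata versions are modeled as plain integers, the class and method names are invented for illustration, an empty string stands for a hole, and the "repacking" shown for approach 2 is only one possible way of reordering freely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: unstable metadata versions are integers mapped to feature names.
// An empty string marks a "hole" (an MV that no longer gates any feature).
final class UnstableMvReorder {

    // Approach 1: the promoted feature keeps its MV. Every unstable feature below it
    // moves to a newly allocated MV above all existing ones, leaving a hole behind.
    static TreeMap<Integer, String> promoteLeavingHoles(TreeMap<Integer, String> unstable, int promotedMv) {
        TreeMap<Integer, String> result = new TreeMap<>(unstable);
        int nextFreeMv = unstable.lastKey() + 1;
        // headMap(promotedMv) holds the unstable MVs strictly below the promoted one.
        for (Map.Entry<Integer, String> e : unstable.headMap(promotedMv).entrySet()) {
            if (e.getValue().isEmpty()) {
                continue; // already a hole
            }
            result.put(nextFreeMv++, e.getValue()); // never reuse an existing MV
            result.put(e.getKey(), "");             // the old MV becomes a hole
        }
        return result;
    }

    // Approach 2: MVs are reassigned freely. The promoted feature takes the lowest
    // unstable MV and the remaining features are packed after it without holes.
    static TreeMap<Integer, String> promoteByRepacking(TreeMap<Integer, String> unstable, int promotedMv) {
        List<String> order = new ArrayList<>();
        order.add(unstable.get(promotedMv));
        for (Map.Entry<Integer, String> e : unstable.entrySet()) {
            if (e.getKey() != promotedMv && !e.getValue().isEmpty()) {
                order.add(e.getValue());
            }
        }
        TreeMap<Integer, String> result = new TreeMap<>();
        int mv = unstable.firstKey();
        for (String feature : order) {
            result.put(mv++, feature);
        }
        return result;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> unstable = new TreeMap<>();
        unstable.put(11, "featureA");
        unstable.put(12, "featureB");
        // featureB becomes production ready before featureA.
        System.out.println("approach 1: " + promoteLeavingHoles(unstable, 12));
        System.out.println("approach 2: " + promoteByRepacking(unstable, 12));
    }
}
```

With approach 1, MV 11 ends up empty and featureA moves to a fresh MV 13, so a cluster pinned to MV 11 loses functionality but never gains any. With the repacking variant of approach 2 there is no hole, but MV 11 and MV 12 both change meaning.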


We also have two ends of the spectrum as to when we update the stable MV.

   1. We update at release points which reduces the amount of churn of the
   unstable MVs and makes a stronger correlation between accepted features and
   stable MVs for a release but means less testing on trunk as a stable MV.
   2. We update when the developers of a feature think it is done. This
   leads to features being available for more testing in trunk but forces the
   next release to include it as stable.


I'd like more feedback from others on these two dimensions.
--Proven



On Wed, Jan 10, 2024 at 12:16 PM Justine Olshan
 wrote:

> Hmm it seems like Colin and Proven are disagreeing on whether we can swap
> unstable metadata versions.
>
> > When we reorder, we are always allocating a new MV and we are never
> > reusing an existing MV even if it was also unstable.
>
> > Given that this is true, there's no reason to have special rules about
> > what we can and can't do with unstable MVs. We can do anything
>
> I don't have a strong preference either way, but I think we should agree on
> one approach.
> The benefit of reordering and reusing is that we can release features that
> are ready earlier and we have more flexibility. With the approach where we
> always create a new MV, I am concerned with having many "empty" MVs. This
> would encourage waiting until the release before we decide an incomplete
> feature is not ready and moving its MV into the future. (The
> abandoning comment I made earlier -- that is consistent with Proven's
> approach)
>
> I think the only potential issue with reordering is that it could be a bit
> confusing and *potentially* prone to errors. Note I say potentially because
> I think it depends on folks' understanding of this new unstable metadata
> version concept. I echo Federico's comments about making sure the risks are
> highlighted.
>
> Thanks,
>
> Justine
>
> On Wed, Jan 10, 2024 at 1:16 AM Federico Valeri 
> wrote:
>
> > Hi folks,
> >
> > > If you use an unstable MV, you probably won't be able to upgrade your
> > > software. Because whenever something changes, you'll probably get
> > > serialization exceptions being thrown inside the controller. Fatal ones.
> >
> > Thanks for this clarification. I think this concrete risk should be
> > highlighted in the KIP and in the "unstable.metadata.versions.enable"
> > documentation.
> >
> > In the test plan, should we also have one system test checking that
> > "features with a stable MV will never have that MV changed"?
> >
> > On Wed, Jan 10, 2024 at 8:16 AM Colin McCabe  wrote:
> > >
> > > On Tue, Jan 9, 2024, at 18:56, Proven Provenzano wrote:
> > > > Hi folks,
> > > >
> > > > Thank you for the questions.
> > > >
> > > > Let me clarify about reorder first. The reorder of unstable metadata
> > > > versions should be infrequent.
> > >
> > > Why does it need to be infrequent? We should be able to reorder unstable
> > > metadata versions as often as we like. There are no guarantees about
> > > unstable MVs.
> > >
> > > > The time you reorder is when a feature that
> > > > requires a higher metadata version to enable becomes "production ready" and
> > > > the features with unstable metadata versions less than the new stable one
> > > > are moved to metadata versions greater than the new stable feature. When we
> > > > reorder, we are always allocating a new MV and we are never reusing an
> > > > existing MV even if it was also unstable. This way a developer upgrading
> > > > their environment with a specific unstable MV might see existing
> > > > functionality stop working but they won't see new MV dependent
> > > > functionality magically appear. The feature set for a given unstable MV
> > > > version can o

Re: [DISCUSS] KIP-1014: Managing Unstable Metadata Versions in Apache Kafka

2024-01-11 Thread Proven Provenzano
Hi Federico,

Thank you for the suggestions. I've added them to the KIP.

--Proven

On Wed, Jan 10, 2024 at 4:16 AM Federico Valeri 
wrote:

> Hi folks,
>
> > If you use an unstable MV, you probably won't be able to upgrade your
> > software. Because whenever something changes, you'll probably get
> > serialization exceptions being thrown inside the controller. Fatal ones.
>
> Thanks for this clarification. I think this concrete risk should be
> highlighted in the KIP and in the "unstable.metadata.versions.enable"
> documentation.
>
> In the test plan, should we also have one system test checking that
> "features with a stable MV will never have that MV changed"?
>
> On Wed, Jan 10, 2024 at 8:16 AM Colin McCabe  wrote:
> >
> > On Tue, Jan 9, 2024, at 18:56, Proven Provenzano wrote:
> > > Hi folks,
> > >
> > > Thank you for the questions.
> > >
> > > Let me clarify about reorder first. The reorder of unstable metadata
> > > versions should be infrequent.
> >
> > Why does it need to be infrequent? We should be able to reorder unstable
> > metadata versions as often as we like. There are no guarantees about
> > unstable MVs.
> >
> > > The time you reorder is when a feature that
> > > requires a higher metadata version to enable becomes "production ready" and
> > > the features with unstable metadata versions less than the new stable one
> > > are moved to metadata versions greater than the new stable feature. When we
> > > reorder, we are always allocating a new MV and we are never reusing an
> > > existing MV even if it was also unstable. This way a developer upgrading
> > > their environment with a specific unstable MV might see existing
> > > functionality stop working but they won't see new MV dependent
> > > functionality magically appear. The feature set for a given unstable MV
> > > version can only decrease with reordering.
> >
> > If you use an unstable MV, you probably won't be able to upgrade your
> > software. Because whenever something changes, you'll probably get
> > serialization exceptions being thrown inside the controller. Fatal ones.
> >
> > Given that this is true, there's no reason to have special rules about
> > what we can and can't do with unstable MVs. We can do anything.
> >
> > >
> > > How do we define "production ready" and when should we bump
> > > LATEST_PRODUCTION? I would like to define it to be the point where the
> > > feature is code complete with tests and the KIP for it is approved. However
> > > even with this definition if the feature later develops a major issue it
> > > could still block future features until the issue is fixed which is what we
> > > are trying to avoid here. We could be much more formal about this and let
> > > the release manager for a release define what is stable for a given release
> > > and then do the bump just after the branch is created on the branch. When
> > > an RC candidate is accepted, the bump would be backported. I would like to
> > > hear other ideas here.
> > >
> >
> > Yeah, it's an interesting question. Overall, I think developers should
> > define when a feature is production ready.
> >
> > The question to ask is, "are you ready to take this feature to
> > production in your workplace?" I think most developers do have a sense of
> > this. Obviously bugs and mistakes can happen, but I think this standard
> > would avoid most of the issues that we're trying to avoid by having
> > unstable MVs in the first place.
> >
> > ELR is a good example. Nobody would have said that it was production
> > ready in 3.7 ... hence it belonged (and still belongs) in an unstable MV,
> > until that changes (hopefully soon :) )
> >
> > best,
> > Colin
> >
> > > --Proven
> > >
> > > On Tue, Jan 9, 2024 at 3:26 PM Colin McCabe 
> wrote:
> > >
> > >> Hi Justine,
> > >>
> > >> Yes, this is an important point to clarify. Proven can comment more, but
> > >> my understanding is that we can do anything to unstable metadata versions.
> > >> Reorder them, delete them, change them in any other way. There are no
> > >> stability guarantees. If the current text is unclear let's add more
> > >> examples of what we can do (which is anything) :)
> > >>
> > >> best,
> > >> Colin
> > >>
> > >>
> > >&

Re: [DISCUSS] KIP-1014: Managing Unstable Metadata Versions in Apache Kafka

2024-01-09 Thread Proven Provenzano
 KIP.
> >>
> >> I'm not sure what you mean by abandoning a version. We never abandon a
> >> version once it's stable.
> >>
> >> Unstable versions can change. I wouldn't describe this as "abandonment",
> >> just the MV changing prior to release.
> >>
> >> In a similar way, the contents of the 3.7 branch will change up until
> >> 3.7.0 is released. Once it gets released, it's never unreleased. We just
> >> move on to 3.7.1. Same thing here.
> >>
> >> best,
> >> Colin
> >>
> >> >
> >> > Thanks,
> >> >
> >> > Justine
> >> >
> >> > On Mon, Jan 8, 2024 at 12:44 PM Colin McCabe 
> wrote:
> >> >
> >> >> Hi Proven,
> >> >>
> >> >> Thanks for the KIP. I think there is a need for this capability, for those
> >> >> of us who deploy from trunk (or branches derived from trunk).
> >> >>
> >> >> With regard to "unstable.metadata.versions.enable": is this going to be a
> >> >> documented configuration, or an internal one? I am guessing we want it to
> >> >> be documented, so that users can use it. If we do, we should probably also
> >> >> very prominently warn that THIS WILL BREAK UPGRADES FOR YOUR CLUSTER. That
> >> >> includes logging an ERROR message on startup, etc.
> >> >>
> >> >> It would be good to document if a release can go out that contains "future
> >> >> MVs" that are unstable. Like can we make a 3.8 release that contains
> >> >> IBP_4_0_IV0 in MetadataVersion.java, as an unstable future MV? Personally I
> >> >> think the answer should be "yes," but with the usual caveats. When the
> >> >> actual 4.0 comes out, the unstable 4.0 MV that shipped in 3.8 probably
> >> >> won't work, and you won't be able to upgrade. (It was unstable, we told you
> >> >> not to use it.)
> >> >>
> >> >> best,
> >> >> Colin
> >> >>
> >> >>
> >> >> On Fri, Jan 5, 2024, at 07:32, Proven Provenzano wrote:
> >> >> > Hey folks,
> >> >> >
> >> >> > I am starting a discussion thread for managing unstable metadata
> >> >> > versions in Apache Kafka.
> >> >> >
> >> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1014%3A+Managing+Unstable+Metadata+Versions+in+Apache+Kafka
> >> >> >
> >> >> > This KIP is actually already implemented in 3.7 with PR
> >> >> > https://github.com/apache/kafka/pull/14860.
> >> >> > I have created this KIP to explain the motivation and how managing Metadata
> >> >> > Versions is expected to work.
> >> >> > Comments are greatly appreciated as this process can always be improved.
> >> >> >
> >> >> > --
> >> >> > --Proven
> >> >>
> >>
>


[jira] [Resolved] (KAFKA-15922) Add MetadataVersion for JBOD

2024-01-05 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-15922.
---
Resolution: Fixed

> Add MetadataVersion for JBOD
> 
>
> Key: KAFKA-15922
> URL: https://issues.apache.org/jira/browse/KAFKA-15922
> Project: Kafka
> Issue Type: Improvement
> Reporter: Colin McCabe
> Assignee: Proven Provenzano
> Priority: Major
>






[DISCUSS] KIP-1014: Managing Unstable Metadata Versions in Apache Kafka

2024-01-05 Thread Proven Provenzano
Hey folks,

I am starting a discussion thread for managing unstable metadata versions
in Apache Kafka.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1014%3A+Managing+Unstable+Metadata+Versions+in+Apache+Kafka

This KIP is actually already implemented in 3.7 with PR
https://github.com/apache/kafka/pull/14860.
I have created this KIP to explain the motivation and how managing Metadata
Versions is expected to work.
Comments are greatly appreciated as this process can always be improved.

-- 
--Proven


Re: Apache Kafka 3.7.0 Release

2024-01-05 Thread Proven Provenzano
We have potentially one more blocker,
https://issues.apache.org/jira/browse/KAFKA-16082, which might cause a data
loss scenario with JBOD in KRaft.
Initial analysis suggested this is a problem; further review suggests it
isn't, but we are continuing to dig into the issue to make sure.
We would appreciate feedback on the bug from anyone who is familiar with this
code.

--Proven


[jira] [Created] (KAFKA-16082) JBOD: Possible dataloss when moving leader partition

2024-01-04 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-16082:
-

Summary: JBOD: Possible dataloss when moving leader partition
Key: KAFKA-16082
URL: https://issues.apache.org/jira/browse/KAFKA-16082
Project: Kafka
Issue Type: Bug
Components: jbod
Affects Versions: 3.7.0
Reporter: Proven Provenzano
Assignee: Gaurav Narula
Fix For: 3.7.0


There is a possible data loss scenario

when using JBOD,

when moving the partition leader log from one directory to another on the same 
broker,

after the destination log has caught up to the source log and after the 
broker has sent an update to the partition assignment:

if the broker accepts and commits a new record for the partition, and then the 
broker restarts and the original partition leader log is lost,

then the destination log would not contain the new record.
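
The sequence above can be walked through with a short Java simulation. This is a hypothetical sketch, not Kafka code: logs are modeled as in-memory lists, the class and method names are invented, and the key assumption (labeled in the code) is that a record committed after the assignment update lands only in the source log.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the sequence described above; not Kafka code.
final class JbodMoveScenario {

    // Returns the records that are still readable after the restart.
    static List<String> recordsAfterRestart() {
        List<String> sourceLog = new ArrayList<>(List.of("r1", "r2"));
        List<String> destinationLog = new ArrayList<>();

        // 1. The destination log catches up to the source log.
        destinationLog.addAll(sourceLog);

        // 2. The broker sends the partition assignment update naming the
        //    destination directory. (No local state changes in this model.)

        // 3. Assumption: a record accepted and committed at this point is
        //    appended to the source log only.
        sourceLog.add("r3");

        // 4. The broker restarts and the source directory is lost, so only the
        //    destination log survives.
        sourceLog.clear();
        return destinationLog;
    }

    public static void main(String[] args) {
        System.out.println("records after restart: " + recordsAfterRestart());
    }
}
```

Under these assumptions the surviving log holds `r1` and `r2` but not the committed `r3`, which is the loss the ticket is concerned about.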





[jira] [Resolved] (KAFKA-15904) Downgrade tests are failing with directory.id 

2023-12-27 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-15904.
---
Resolution: Fixed

This was merged into trunk a month ago, long before the 3.7 branch was cut. I 
just forgot to close the ticket.

> Downgrade tests are failing with directory.id 
> --
>
> Key: KAFKA-15904
> URL: https://issues.apache.org/jira/browse/KAFKA-15904
> Project: Kafka
> Issue Type: Bug
> Reporter: Manikumar
> Assignee: Proven Provenzano
> Priority: Major
> Fix For: 3.7.0
>
>
> {{kafkatest.tests.core.downgrade_test.TestDowngrade}} tests are failing after 
> [https://github.com/apache/kafka/pull/14628]. 
> We have added {{directory.id}} to {{meta.properties}}. This means 
> {{meta.properties}} will be different for different log directories.
> Cluster downgrades will fail with the error below if we have multiple log 
> directories. This looks like a blocker, or it requires additional downgrade 
> steps from AK 3.7.
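
As a small illustration of the per-directory difference, the Java sketch below parses two sample property files and compares them. The file contents are made up (the cluster, node, and directory IDs are placeholders), the class and its helpers are hypothetical, and the sketch only shows that the files stop being identical once each carries its own `directory.id`; it does not reproduce the downgrade check itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for comparing two meta.properties-style files.
final class MetaPropertiesDiff {

    // Builds illustrative file contents; all values are placeholders.
    static String sample(String directoryId) {
        return "version=1\n"
            + "cluster.id=example-cluster\n"
            + "node.id=1\n"
            + "directory.id=" + directoryId + "\n";
    }

    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Compares two files, optionally ignoring the per-directory ID.
    static boolean identical(String first, String second, boolean ignoreDirectoryId) {
        Properties a = parse(first);
        Properties b = parse(second);
        if (ignoreDirectoryId) {
            a.remove("directory.id");
            b.remove("directory.id");
        }
        return a.equals(b);
    }

    public static void main(String[] args) {
        String dir1 = sample("placeholder-dir-a");
        String dir2 = sample("placeholder-dir-b");
        System.out.println("identical: " + identical(dir1, dir2, false));
        System.out.println("identical ignoring directory.id: " + identical(dir1, dir2, true));
    }
}
```

Any tooling that expects every log directory to carry the same file would see a mismatch in the first comparison, while the second shows the files differ only in that one key.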





[jira] [Resolved] (KAFKA-15893) Bump MetadataVersion for directory assignments

2023-12-14 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-15893.
---
Fix Version/s: 3.7.0
   Resolution: Fixed

This is finally merged into 3.7: https://github.com/apache/kafka/pull/14984

> Bump MetadataVersion for directory assignments
> --
>
> Key: KAFKA-15893
> URL: https://issues.apache.org/jira/browse/KAFKA-15893
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Igor Soarez
> Assignee: Proven Provenzano
> Priority: Major
> Fix For: 3.7.0
>
>
> MetadataVersion.LATEST_PRODUCTION needs to be set to IBP_3_7_IV2 or higher to 
> enable JBOD support.
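
The version gate described here can be sketched as an ordered enum comparison in Java. The enum below is a trimmed stand-in and not the real `MetadataVersion` class; `SketchMetadataVersion`, `supportsJbod`, and `LatestProductionCheck` are hypothetical names, and the threshold is taken directly from the ticket text.

```java
// Stand-in enum, trimmed to a few versions in release order.
// This is not the real MetadataVersion class.
enum SketchMetadataVersion {
    IBP_3_6_IV2, IBP_3_7_IV0, IBP_3_7_IV1, IBP_3_7_IV2;

    // Enum constants compare by declaration order.
    boolean isAtLeast(SketchMetadataVersion other) {
        return compareTo(other) >= 0;
    }

    // Assumption taken from the ticket text: JBOD needs IBP_3_7_IV2 or higher.
    boolean supportsJbod() {
        return isAtLeast(IBP_3_7_IV2);
    }
}

final class LatestProductionCheck {
    public static void main(String[] args) {
        for (SketchMetadataVersion v : SketchMetadataVersion.values()) {
            System.out.println(v + " supports JBOD: " + v.supportsJbod());
        }
    }
}
```

In this sketch, a latest production version below `IBP_3_7_IV2` would leave JBOD gated off, which is why the ticket asks for the bump.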





[jira] [Resolved] (KAFKA-15633) Bug: Generated Persistent Directory IDs are overwritten on startup.

2023-11-16 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-15633.
---
Resolution: Fixed

> Bug: Generated Persistent Directory IDs are overwritten on startup.
> ---
>
> Key: KAFKA-15633
> URL: https://issues.apache.org/jira/browse/KAFKA-15633
> Project: Kafka
>  Issue Type: Sub-task
>  Components: jbod
>Affects Versions: 3.7.0
>    Reporter: Proven Provenzano
>    Assignee: Proven Provenzano
>Priority: Major
> Fix For: 3.7.0
>
>
> The code to generate the persistent directory IDs and add them to the 
> meta.properties file works, but later in the startup process the file is 
> overwritten with the original data.





[jira] [Created] (KAFKA-15633) Bug: Generated Persistent Directory IDs are overwritten on startup.

2023-10-18 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-15633:
-

 Summary: Bug: Generated Persistent Directory IDs are overwritten 
on startup.
 Key: KAFKA-15633
 URL: https://issues.apache.org/jira/browse/KAFKA-15633
 Project: Kafka
  Issue Type: Sub-task
  Components: jbod
Affects Versions: 3.7.0
Reporter: Proven Provenzano
Assignee: Proven Provenzano
 Fix For: 3.7.0


The code to generate the persistent directory IDs and add them to the 
meta.properties file works, but later in the startup process the file is 
overwritten with the original data.
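
The failure mode described above (IDs generated and written, then lost when a later step rewrites the file from stale data) can be illustrated with a small sketch. This is not the Kafka fix itself, only an illustration of the read-merge-write pattern that avoids the overwrite. The helper names are hypothetical, and `uuid.uuid4()` is a stand-in for whatever ID encoding Kafka actually uses.

```python
import uuid
from pathlib import Path


def read_props(path: Path) -> dict:
    """Parse a simple key=value properties file, skipping comments and blanks."""
    props = {}
    if path.exists():
        for line in path.read_text().splitlines():
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                props[key.strip()] = value.strip()
    return props


def write_props(path: Path, props: dict) -> None:
    """Write the properties back, one key=value pair per line."""
    path.write_text("".join(f"{k}={v}\n" for k, v in sorted(props.items())))


def ensure_directory_id(log_dir: Path) -> str:
    """Generate directory.id once and keep it if it is already present."""
    path = log_dir / "meta.properties"
    props = read_props(path)
    if "directory.id" not in props:
        # Placeholder ID format (assumption); Kafka has its own Uuid encoding.
        props["directory.id"] = str(uuid.uuid4())
        write_props(path, props)
    return props["directory.id"]


def rewrite_preserving_ids(log_dir: Path, updates: dict) -> None:
    """Later startup step: merge updates into what is on disk rather than
    overwriting the file from an in-memory copy read before the ID existed."""
    path = log_dir / "meta.properties"
    props = read_props(path)
    props.update(updates)
    write_props(path, props)
```

The point of the sketch is only that any later writer re-reads the file first, so a previously generated `directory.id` survives.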





Re: [VOTE] 3.6.0 RC2

2023-10-02 Thread Proven Provenzano
Hi,

To verify release 3.6.0 RC2, I did the following:

   - Downloaded the source, built and ran the tests.
   - Validated SCRAM with KRaft including creating credentials with
   kafka-storage.
   - Validated Delegation Tokens with KRaft

+1 (non-binding)

--Proven



On Mon, Oct 2, 2023 at 8:37 AM Divij Vaidya  wrote:

> + 1 (non-binding)
>
> Verifications:
> 1. I ran a produce-consume workload with plaintext auth, JDK17, zstd
> compression using an open messaging benchmark and found 3.6 to be better
> than or equal to 3.5.1 across all dimensions. Notably, 3.6 had consistently
> 6-7% lower CPU utilization, lesser spikes on P99 produce latencies and
> overall lower P99.8 latencies.
>
> 2. I have verified that detached signature is correct using
> https://www.apache.org/info/verification.html and the release manager
> public keys are available at
> https://keys.openpgp.org/search?q=F65DC3423D4CD7B9
>
> 3. I have verified that all metrics emitted in 3.5.1 (with Zk) are also
> being emitted in 3.6.0 (with Zk).
>
> Problems (but not blockers):
> 1. Metrics added in
>
> https://github.com/apache/kafka/commit/2f71708955b293658cec3b27e9a5588d39c38d7e
> aren't available in the documentation (cc: Justine). I don't consider this
> as a release blocker but we should add it as a fast follow-up.
>
> 2. Metric added in
>
> https://github.com/apache/kafka/commit/a900794ace4dcf1f9dadee27fbd8b63979532a18
> isn't available in documentation (cc: David). I don't consider this as a
> release blocker but we should add it as a fast follow-up.
>
> --
> Divij Vaidya
>
>
>
> On Mon, Oct 2, 2023 at 9:50 AM Federico Valeri 
> wrote:
>
> > Hi Satish, I did the following to verify the release:
> >
> > - Built from source with Java 17 and Scala 2.13
> > - Ran all unit and integration tests
> > - Spot checked documentation
> > - Ran custom client applications using staging artifacts on a 3-nodes
> > cluster
> > - Tested tiered storage with one of the available RSM implementations
> >
> > +1 (non binding)
> >
> > Thanks
> > Fede
> >
> > On Mon, Oct 2, 2023 at 8:50 AM Luke Chen  wrote:
> > >
> > > Hi Satish,
> > >
> > > I verified with:
> > > 1. Ran quick start in KRaft for scala 2.12 artifact
> > > 2. Making sure the checksum are correct
> > > 3. Browsing release notes, documents, javadocs, protocols.
> > > 4. Verified the tiered storage feature works well.
> > >
> > > +1 (binding).
> > >
> > > Thanks.
> > > Luke
> > >
> > >
> > >
> > > On Mon, Oct 2, 2023 at 5:23 AM Jakub Scholz  wrote:
> > >
> > > > +1 (non-binding). I used the Scala 2.13 binaries and the staged Maven
> > > > artifacts and run my tests. Everything seems to work fine for me.
> > > >
> > > > Thanks
> > > > Jakub
> > > >
> > > > On Fri, Sep 29, 2023 at 8:17 PM Satish Duggana <
> > satish.dugg...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello Kafka users, developers and client-developers,
> > > > >
> > > > > This is the third candidate for the release of Apache Kafka 3.6.0.
> > > > > Some of the major features include:
> > > > >
> > > > > * KIP-405 : Kafka Tiered Storage
> > > > > * KIP-868 : KRaft Metadata Transactions
> > > > > * KIP-875: First-class offsets support in Kafka Connect
> > > > > * KIP-898: Modernize Connect plugin discovery
> > > > > * KIP-938: Add more metrics for measuring KRaft performance
> > > > > * KIP-902: Upgrade Zookeeper to 3.8.1
> > > > > * KIP-917: Additional custom metadata for remote log segment
> > > > >
> > > > > Release notes for the 3.6.0 release:
> > > > >
> https://home.apache.org/~satishd/kafka-3.6.0-rc2/RELEASE_NOTES.html
> > > > >
> > > > > *** Please download, test and vote by Tuesday, October 3, 12pm PT
> > > > >
> > > > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > > > https://kafka.apache.org/KEYS
> > > > >
> > > > > * Release artifacts to be voted upon (source and binary):
> > > > > https://home.apache.org/~satishd/kafka-3.6.0-rc2/
> > > > >
> > > > > * Maven artifacts to be voted upon:
> > > > >
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > > > >
> > > > > * Javadoc:
> > > > > https://home.apache.org/~satishd/kafka-3.6.0-rc2/javadoc/
> > > > >
> > > > > * Tag to be voted upon (off 3.6 branch) is the 3.6.0-rc2 tag:
> > > > > https://github.com/apache/kafka/releases/tag/3.6.0-rc2
> > > > >
> > > > > * Documentation:
> > > > > https://kafka.apache.org/36/documentation.html
> > > > >
> > > > > * Protocol:
> > > > > https://kafka.apache.org/36/protocol.html
> > > > >
> > > > > * Successful Jenkins builds for the 3.6 branch:
> > > > > There are a few runs of unit/integration tests. You can see the
> > latest
> > > > > at https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.6/. We
> > will
> > > > > continue running a few more iterations.
> > > > > System tests:
> > > > > We will send an update once we have the results.
> > > > >
> > > > > Thanks,
> > > > > Satish.
> > > > >
> > > >
> >
>


[jira] [Reopened] (KAFKA-15411) DelegationTokenEndToEndAuthorizationWithOwnerTest is Flaky

2023-09-06 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano reopened KAFKA-15411:
---

> DelegationTokenEndToEndAuthorizationWithOwnerTest is Flaky 
> ---
>
> Key: KAFKA-15411
> URL: https://issues.apache.org/jira/browse/KAFKA-15411
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft
>    Reporter: Proven Provenzano
>    Assignee: Proven Provenzano
>Priority: Major
>  Labels: flaky-test
> Fix For: 3.7.0
>
>
> DelegationTokenEndToEndAuthorizationWithOwnerTest has become flaky since the 
> merge of delegation token support for KRaft.





[jira] [Resolved] (KAFKA-15411) DelegationTokenEndToEndAuthorizationWithOwnerTest is Flaky

2023-09-06 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-15411.
---
Resolution: Cannot Reproduce

After trying for a week, I have yet to see the error either locally or in PR builds.
Please link a Jenkins job if you see this issue again.

> DelegationTokenEndToEndAuthorizationWithOwnerTest is Flaky 
> ---
>
> Key: KAFKA-15411
> URL: https://issues.apache.org/jira/browse/KAFKA-15411
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft
>    Reporter: Proven Provenzano
>    Assignee: Proven Provenzano
>Priority: Major
>  Labels: flaky-test
> Fix For: 3.7.0
>
>
> DelegationTokenEndToEndAuthorizationWithOwnerTest has become flaky since the 
> merge of delegation token support for KRaft.





[jira] [Resolved] (KAFKA-15422) Update documentation for Delegation Tokens in Kafka with KRaft

2023-09-06 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-15422.
---
Resolution: Fixed

> Update documentation for Delegation Tokens in Kafka with KRaft
> --
>
> Key: KAFKA-15422
> URL: https://issues.apache.org/jira/browse/KAFKA-15422
> Project: Kafka
>  Issue Type: Task
>  Components: documentation
>    Reporter: Proven Provenzano
>    Assignee: Proven Provenzano
>Priority: Major
> Fix For: 3.6.0
>
>
> Update the documentation to indicate that controllers need the same 
> configuration as brokers for Delegation Tokens to work under Kafka with KRaft





[jira] [Created] (KAFKA-15422) Update documentation for Delegation Tokens in Kafka with KRaft

2023-08-31 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-15422:
-

 Summary: Update documentation for Delegation Tokens in Kafka with 
KRaft
 Key: KAFKA-15422
 URL: https://issues.apache.org/jira/browse/KAFKA-15422
 Project: Kafka
  Issue Type: Task
  Components: documentation
Reporter: Proven Provenzano
Assignee: Proven Provenzano
 Fix For: 3.6.0


Update the documentation to indicate that controllers need the same 
configuration as brokers for Delegation Tokens to work under Kafka with KRaft





[jira] [Created] (KAFKA-15411) DelegationTokenEndToEndAuthorizationWithOwnerTest is Flaky

2023-08-28 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-15411:
-

 Summary: DelegationTokenEndToEndAuthorizationWithOwnerTest is 
Flaky 
 Key: KAFKA-15411
 URL: https://issues.apache.org/jira/browse/KAFKA-15411
 Project: Kafka
  Issue Type: Bug
  Components: kraft
Reporter: Proven Provenzano
Assignee: Proven Provenzano
 Fix For: 3.6.0


DelegationTokenEndToEndAuthorizationWithOwnerTest has become flaky since the 
merge of delegation token support for KRaft.





[jira] [Created] (KAFKA-15114) StorageTool help specifies user as parameter not name

2023-06-22 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-15114:
-

 Summary: StorageTool help specifies user as parameter not name
 Key: KAFKA-15114
 URL: https://issues.apache.org/jira/browse/KAFKA-15114
 Project: Kafka
  Issue Type: Bug
  Components: kraft
Affects Versions: 3.5.0
Reporter: Proven Provenzano
Assignee: Proven Provenzano
 Fix For: 3.5.1


The StorageTool help message currently specifies setting a {{user}} parameter when 
creating a SCRAM record for bootstrap.

The StorageTool parses and only accepts the parameter as {{name}}, so the 
help message is wrong.

{{name}} was chosen over {{user}} because the record uses {{name}} internally, 
all tests using the StorageTool pass {{name}} as the parameter, 
KafkaPrincipals are created with {{name}}, and SCRAM credentials are created 
with {{--entity-name}}.

I will change the help to specify {{name}} instead of {{user}}.


 





[jira] [Resolved] (KAFKA-15017) New ClientQuotas are not written to ZK from snapshot

2023-06-01 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-15017.
---
Resolution: Fixed

> New ClientQuotas are not written to ZK from snapshot 
> -
>
> Key: KAFKA-15017
> URL: https://issues.apache.org/jira/browse/KAFKA-15017
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft
>Affects Versions: 3.5.0
>Reporter: David Arthur
>    Assignee: Proven Provenzano
>Priority: Critical
>
> Similar issue to KAFKA-15009





[jira] [Resolved] (KAFKA-14970) Dual write mode testing for SCRAM and Quota

2023-05-30 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-14970.
---
Resolution: Fixed

Committed and merged into 3.5.

> Dual write mode testing for SCRAM and Quota
> ---
>
> Key: KAFKA-14970
> URL: https://issues.apache.org/jira/browse/KAFKA-14970
> Project: Kafka
>  Issue Type: Test
>  Components: kraft
>    Reporter: Proven Provenzano
>    Assignee: Proven Provenzano
>Priority: Blocker
>  Labels: 3.5
>
> SCRAM and Quota are stored together in ZK and we need better testing to 
> validate the dual write mode support for them.
> I will add some additional tests for this.
>  





[jira] [Resolved] (KAFKA-14084) Support SCRAM when using KRaft mode

2023-05-05 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-14084.
---
Resolution: Fixed

SCRAM support in KRaft is implemented.

Tests of SCRAM authorization have been updated to work in KRaft mode.

Migration support for SCRAM credentials from ZK to KRaft has been implemented 
and tested.

> Support SCRAM when using KRaft mode
> ---
>
> Key: KAFKA-14084
> URL: https://issues.apache.org/jira/browse/KAFKA-14084
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Colin McCabe
>    Assignee: Proven Provenzano
>Priority: Major
>  Labels: kip-500
> Fix For: 3.5.0
>
>
> Support SCRAM when using KRaft mode, as specified in KIP-631





[jira] [Created] (KAFKA-14970) Dual write mode testing for SCRAM and Quota

2023-05-05 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-14970:
-

 Summary: Dual write mode testing for SCRAM and Quota
 Key: KAFKA-14970
 URL: https://issues.apache.org/jira/browse/KAFKA-14970
 Project: Kafka
  Issue Type: Test
  Components: kraft
Reporter: Proven Provenzano
Assignee: Proven Provenzano


SCRAM and Quota are stored together in ZK and we need better testing to 
validate the dual write mode support for them.

I will add some additional tests for this.

 





[jira] [Resolved] (KAFKA-14859) Support SCRAM ZK to KRaft Migration

2023-05-05 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-14859.
---
Resolution: Fixed

Merged into trunk and 3.5

> Support SCRAM ZK to KRaft Migration
> ---
>
> Key: KAFKA-14859
> URL: https://issues.apache.org/jira/browse/KAFKA-14859
> Project: Kafka
>  Issue Type: Improvement
>    Reporter: Proven Provenzano
>    Assignee: Proven Provenzano
>Priority: Major
>
> I want to allow existing ZK installations to be able to migrate to KRaft and 
> support their existing SCRAM credentials.





[jira] [Resolved] (KAFKA-14948) Broker fails to rejoin cluster when cluster is in dual write mode

2023-04-27 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-14948.
---
Resolution: Cannot Reproduce

I updated my code to tip of master and cannot reproduce it anymore.

> Broker fails to rejoin cluster when cluster is in dual write mode
> -
>
> Key: KAFKA-14948
> URL: https://issues.apache.org/jira/browse/KAFKA-14948
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft
>Affects Versions: 3.5.0
>    Reporter: Proven Provenzano
>    Assignee: Proven Provenzano
>Priority: Major
>
> While testing migration dual write mode I came across this issue.
> Initial setup: A single ZK node and a single Broker. Create a topic with some 
> data.
> Create a single controller and initiate migrations.
> Update Broker to start migration.
> Wait until all records are migrated. Cluster should be in dual write mode at 
> this point.
> Kill and restart the broker. Sometimes the broker will not rejoin the cluster 
> and consuming records from a topic will fail to find the topic.
> Restarting the Controller will fix this issue.





[jira] [Created] (KAFKA-14948) Broker fails to rejoin cluster when cluster is in dual write mode

2023-04-27 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-14948:
-

 Summary: Broker fails to rejoin cluster when cluster is in dual 
write mode
 Key: KAFKA-14948
 URL: https://issues.apache.org/jira/browse/KAFKA-14948
 Project: Kafka
  Issue Type: Bug
  Components: kraft
Affects Versions: 3.5.0
Reporter: Proven Provenzano
Assignee: Proven Provenzano


While testing migration dual write mode I came across this issue.

Initial setup: A single ZK node and a single Broker. Create a topic with some 
data.

Create a single controller and initiate migrations.

Update Broker to start migration.

Wait until all records are migrated. Cluster should be in dual write mode at 
this point.

Kill and restart the broker. Sometimes the broker will not rejoin the cluster 
and consuming records from a topic will fail to find the topic.

Restarting the Controller will fix this issue.





[jira] [Resolved] (KAFKA-14881) Update UserScramCredentialRecord for SCRAM ZK to KRaft migration

2023-04-18 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-14881.
---
Resolution: Fixed

Merged to Trunk and 3.5.

> Update UserScramCredentialRecord for SCRAM ZK to KRaft migration
> 
>
> Key: KAFKA-14881
> URL: https://issues.apache.org/jira/browse/KAFKA-14881
> Project: Kafka
>  Issue Type: Improvement
>  Components: kraft
>Affects Versions: 3.5.0
>    Reporter: Proven Provenzano
>    Assignee: Proven Provenzano
>Priority: Major
> Fix For: 3.5.0
>
>
> I want to support ZK to KRaft migration.
> ZK stores a storedKey and a serverKey for each SCRAM credential, not the 
> saltedPassword.
> The storedKey and serverKey are cryptographic hashes derived from the 
> saltedPassword, and it is not possible to extract the saltedPassword from them.
> The serverKey and storedKey are enough for SCRAM authentication, so the 
> saltedPassword is not needed.
> I will update the UserScramCredentialRecord to store serverKey and storedKey 
> instead of saltedPassword, and I will make SCRAM supported only with 
> a bumped version of IBP_3_5 so that there are no compatibility issues.





[jira] [Created] (KAFKA-14881) Update UserScramCredentialRecord for SCRAM ZK to KRaft migration

2023-04-06 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-14881:
-

 Summary: Update UserScramCredentialRecord for SCRAM ZK to KRaft 
migration
 Key: KAFKA-14881
 URL: https://issues.apache.org/jira/browse/KAFKA-14881
 Project: Kafka
  Issue Type: Improvement
  Components: kraft
Affects Versions: 3.5.0
Reporter: Proven Provenzano
Assignee: Proven Provenzano
 Fix For: 3.5.0


I want to support ZK to KRaft migration.

ZK stores a storedKey and a serverKey for each SCRAM credential, not the 
saltedPassword.

The storedKey and serverKey are cryptographic hashes derived from the 
saltedPassword, and it is not possible to extract the saltedPassword from them.

The serverKey and storedKey are enough for SCRAM authentication, so the 
saltedPassword is not needed.

I will update the UserScramCredentialRecord to store serverKey and storedKey 
instead of saltedPassword, and I will make SCRAM supported only with a 
bumped version of IBP_3_5 so that there are no compatibility issues.
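
For readers unfamiliar with how the keys relate, here is a minimal sketch of the derivation as defined by the SCRAM specification (RFC 5802), written in Python. It is not Kafka code; the function name `scram_keys` is made up for illustration, and SASLprep normalization of the password is left out.

```python
import hashlib
import hmac


def scram_keys(password: bytes, salt: bytes, iterations: int,
               hash_name: str = "sha256"):
    """Derive (saltedPassword, storedKey, serverKey) per RFC 5802."""
    # SaltedPassword := Hi(password, salt, i), which is PBKDF2 with HMAC.
    salted_password = hashlib.pbkdf2_hmac(hash_name, password, salt, iterations)
    # ClientKey := HMAC(SaltedPassword, "Client Key")
    client_key = hmac.new(salted_password, b"Client Key", hash_name).digest()
    # StoredKey := H(ClientKey)
    stored_key = hashlib.new(hash_name, client_key).digest()
    # ServerKey := HMAC(SaltedPassword, "Server Key")
    server_key = hmac.new(salted_password, b"Server Key", hash_name).digest()
    return salted_password, stored_key, server_key
```

Both storedKey and serverKey are one-way functions of the saltedPassword, which is why a migration that starts from the ZK data can carry over the two keys but cannot reconstruct the saltedPassword.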





[jira] [Resolved] (KAFKA-14765) Support SCRAM for brokers at bootstrap

2023-04-04 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-14765.
---
Resolution: Fixed

[PR #13374|https://github.com/apache/kafka/pull/13374] includes both KIP-900 
implementation and updated integration tests.

> Support SCRAM for brokers at bootstrap
> --
>
> Key: KAFKA-14765
> URL: https://issues.apache.org/jira/browse/KAFKA-14765
> Project: Kafka
>  Issue Type: Improvement
>  Components: kraft
>    Reporter: Proven Provenzano
>    Assignee: Proven Provenzano
>Priority: Major
>  Labels: KIP-900
> Fix For: 3.5.0
>
>
> We want to add SCRAM support for brokers at bootstrap.
> We will support bootstrap as described in 
> [KIP-900|https://cwiki.apache.org/confluence/display/KAFKA/KIP-900%3A+KRaft+kafka-storage.sh+API+additions+to+support+SCRAM+for+Kafka+Brokers]
>  





[jira] [Resolved] (KAFKA-14776) Update SCRAM integration tests to run with KRaft

2023-04-04 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-14776.
---
Resolution: Fixed

[PR #13374|https://github.com/apache/kafka/pull/13374] includes both KIP-900 
implementation and updated integration tests.

> Update SCRAM integration tests to run with KRaft
> 
>
> Key: KAFKA-14776
> URL: https://issues.apache.org/jira/browse/KAFKA-14776
> Project: Kafka
>  Issue Type: Improvement
>  Components: kraft
>    Reporter: Proven Provenzano
>    Assignee: Proven Provenzano
>Priority: Major
> Fix For: 3.5.0
>
>
> I will update the SCRAM integration tests to run under both ZK and KRaft 
> quorum mode.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14861) Migration Documentation should state the need for connectivity from controller to brokers when provisioning the KRaft controller quorum

2023-03-28 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-14861:
-

 Summary: Migration Documentation should state the need for 
connectivity from controller to brokers when provisioning the KRaft controller 
quorum
 Key: KAFKA-14861
 URL: https://issues.apache.org/jira/browse/KAFKA-14861
 Project: Kafka
  Issue Type: Improvement
  Components: documentation, kraft
Affects Versions: 3.4.0
Reporter: Proven Provenzano


In normal KRaft cluster operation, when no node is configured with 
{{process.roles}} as both controller and broker, there is no need to configure 
connectivity from the controllers to the brokers. Only brokers need to connect 
to controllers.

The ZK to KRaft migration does need controller to broker connectivity during 
the migration and this should be made explicit in the documentation in the 
*Provisioning the KRaft controller quorum* section.





[jira] [Created] (KAFKA-14860) New controllers created for migration should validate connectivity to brokers before starting migration

2023-03-28 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-14860:
-

 Summary: New controllers created for migration should validate 
connectivity to brokers before starting migration
 Key: KAFKA-14860
 URL: https://issues.apache.org/jira/browse/KAFKA-14860
 Project: Kafka
  Issue Type: Bug
  Components: kraft
Reporter: Proven Provenzano
Assignee: Proven Provenzano


The process of migration starts with creating controllers and adding them to 
the Kafka cluster at which point they wait for the brokers to register in a 
state ready for migrations. Brokers are then updated in a rolling manner to 
enable migrations. When all brokers have enabled migration, then migration 
starts.

After the metadata is migrated, the controller tries to connect back to the 
brokers to signal that metadata migration is complete. If the controller cannot 
create this connection (because of authentication issues), it will retry 
indefinitely (there is no timeout). All metadata operations initiated by 
brokers that have not been contacted will fail. Rolling back from this state to 
ZK with no controllers, however, is a manual step.

I want the controllers to create the connection back to the brokers after all 
brokers have registered but before metadata migration is started. This will 
prevent metadata operations from failing if the controller cannot connect to 
any broker and allow for simple rollback of the cluster by disabling migration 
in the controllers and restarting them in a rolling manner.
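
The ordering proposed above (verify the controller-to-broker connections first, migrate metadata only afterwards) can be sketched as a simple pre-flight check. This is an illustration in Python, not the controller implementation: the function names are hypothetical, and it only tests TCP reachability. The authentication failures mentioned above would need a real authenticated handshake to detect.

```python
import socket


def unreachable_brokers(endpoints, timeout_s=3.0):
    """Return the (host, port) pairs to which no TCP connection could be opened."""
    failed = []
    for host, port in endpoints:
        try:
            # Open and immediately close a connection to the broker endpoint.
            with socket.create_connection((host, port), timeout=timeout_s):
                pass
        except OSError:
            failed.append((host, port))
    return failed


def may_start_migration(endpoints):
    """Gate: start metadata migration only if every registered broker is reachable."""
    return not unreachable_brokers(endpoints)
```

If the gate fails, nothing has been migrated yet, so the cluster can still be rolled back by disabling migration on the controllers.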





[jira] [Created] (KAFKA-14859) Support SCRAM ZK to KRaft Migration

2023-03-28 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-14859:
-

 Summary: Support SCRAM ZK to KRaft Migration
 Key: KAFKA-14859
 URL: https://issues.apache.org/jira/browse/KAFKA-14859
 Project: Kafka
  Issue Type: Improvement
Reporter: Proven Provenzano
Assignee: Proven Provenzano


I want to allow existing ZK installations to be able to migrate to KRaft and 
support their existing SCRAM credentials.





[jira] [Created] (KAFKA-14776) Update SCRAM system tests to run with KRaft

2023-03-03 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-14776:
-

 Summary: Update SCRAM system tests to run with KRaft
 Key: KAFKA-14776
 URL: https://issues.apache.org/jira/browse/KAFKA-14776
 Project: Kafka
  Issue Type: Improvement
  Components: kraft
Reporter: Proven Provenzano
Assignee: Proven Provenzano


I will update the SCRAM system tests to run under both ZK and KRaft quorum mode.





[jira] [Created] (KAFKA-14775) Support SCRAM for broker to controller authentication

2023-03-03 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-14775:
-

 Summary: Support SCRAM for broker to controller authentication
 Key: KAFKA-14775
 URL: https://issues.apache.org/jira/browse/KAFKA-14775
 Project: Kafka
  Issue Type: Improvement
  Components: kraft
Reporter: Proven Provenzano
Assignee: Proven Provenzano


We need to apply SCRAM changes to controller nodes.

We need to handle DescribeUserScramCredentialsRequest in the controller nodes.

As part of this update I will split out the code from 
{{BrokerMetadataPublisher.scala}} for applying the SCRAM changes into a separate 
{{MetadataPublisher}}, as we did with {{DynamicConfigPublisher}}.





Re: [VOTE] KIP-900: KRaft kafka-storage.sh API additions to support SCRAM for Kafka Brokers

2023-02-28 Thread Proven Provenzano
Hi all,
I am going to close the vote and start implementing.
The KIP is accepted with three binding votes from Colin, Jose, and
Manikumar.

--Proven

On Sat, Feb 25, 2023 at 1:21 AM Manikumar  wrote:

> +1 (binding)
>
> Thanks for the KIP.
>
> On Wed, Feb 22, 2023 at 3:48 AM José Armando García Sancio
>  wrote:
> >
> > LGTM Proven. Thanks for the improvements. +1 (binding)
> >
> > --
> > -José
>


[jira] [Created] (KAFKA-14765) Support SCRAM for brokers at bootstrap

2023-02-27 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-14765:
-

 Summary: Support SCRAM for brokers at bootstrap
 Key: KAFKA-14765
 URL: https://issues.apache.org/jira/browse/KAFKA-14765
 Project: Kafka
  Issue Type: Improvement
  Components: kraft
Reporter: Proven Provenzano


We want to add SCRAM support for brokers at bootstrap.

We will support bootstrap as described in 
[KIP-900|https://cwiki.apache.org/confluence/display/KAFKA/KIP-900%3A+KRaft+kafka-storage.sh+API+additions+to+support+SCRAM+for+Kafka+Brokers]

 





Re: [DISCUSS] KIP-900: KRaft kafka-storage.sh API additions to support SCRAM

2023-02-22 Thread Proven Provenzano
Hi Luke,

1. I'm going to assume that migration from ZK to KRaft will sync the info
from ZK into the metadata log.
There will be no need to bootstrap SCRAM for KRaft in the migration case. I
will need to test this.

 2. Done! I've also added a comment to each case so people understand the
change.

Thanks
--Proven



On Wed, Feb 22, 2023 at 5:46 AM Luke Chen  wrote:

> Hi Proven,
>
> Thanks for the KIP.
>
> Some questions:
> 1. During ZK migrating to KRaft, should we format the KRaft controller with
> the SCRAM info, or it'll sync from ZK?
> 2. In the KIP, you provided a link to a doc from Confluent (i.e. this
> line: See
> Configuring SCRAM
> <
> https://docs.confluent.io/platform/current/kafka/authentication_sasl/authentication_sasl_scram.html#configuring-scram
> >
> for details), could you change it into official Kafka doc link? That is
> this one: https://kafka.apache.org/documentation/#security_sasl_scram
>
> Thank you.
> Luke
>
>
> On Wed, Feb 22, 2023 at 6:17 AM José Armando García Sancio
>  wrote:
>
> > Hi Proven,
> >
> > On Tue, Feb 21, 2023 at 1:37 PM Proven Provenzano
> >  wrote:
> > >
> > > Hi Jose,
> > >
> > > 1. The SCRAM in SCRAM-SHA-256 is required as the mechanism name is
> > > SCRAM-SHA-256.
> > > I do realize there is a bit of redundancy here.
> > >
> > > 2. I'll add documentation for all the possible values. They are
> > > SCRAM-SHA-256 and SCRAM-SHA-512.
> > >
> > > 3. I'd like to keep it with a capital letter as it is a specific type
> of
> > > message we want to insert. I do agree that -A
> > > is not the correct choice so I'll suggest -S for SCRAM and that
> > leaves
> > > -A for --add-acl and -a for a generic
> > > --add-config in the future.
> > >
> >
> > It makes sense to me. Thanks for the changes.
> >
> > --
> > -José
> >
>


Re: [DISCUSS] KIP-900: KRaft kafka-storage.sh API additions to support SCRAM

2023-02-21 Thread Proven Provenzano
Hi Jose,

1. The SCRAM in SCRAM-SHA-256 is required as the mechanism name is
SCRAM-SHA-256.
I do realize there is a bit of redundancy here.

2. I'll add documentation for all the possible values. They are
SCRAM-SHA-256 and SCRAM-SHA-512.

3. I'd like to keep it with a capital letter as it is a specific type of
message we want to insert. I do agree that -A
is not the correct choice so I'll suggest -S for SCRAM and that leaves
-A for --add-acl and -a for a generic
--add-config in the future.

Thanks
--Proven

On Fri, Feb 17, 2023 at 1:32 PM José Armando García Sancio
 wrote:

> Hi Proven,
>
> Thanks for the changes to KIP-900. It looks good to me in general.
> Here are some suggestions and questions.
>
> 1. In the KIP you give the following example:
> --add-scram SCRAM-SHA-256=[user=alice,password=alice-secret]
>
> Is "SCRAM-" required as a prefix? The flag already has the suffix
> "-scram". Can the value to the flag be
> SHA-256=[user=alice,password=alice-secret]?
>
> 2. Should the CLI document all possible values for the --add-scram? Is
> SCRAM-SHA-256 the only supported algorithm?
>
> 3. Should the short version of the flag --add-scram be -s? I suspect
> that in the future we may want to add more options like --add-acl and
> --add-config.
>
> Thanks!
> --
> -José
>


[VOTE] KIP-900: KRaft kafka-storage.sh API additions to support SCRAM for Kafka Brokers

2023-02-16 Thread Proven Provenzano
Hi,

I'd like to start the vote on KIP-900: KRaft kafka-storage.sh API additions
to support SCRAM for Kafka Brokers.

The KIP is here: KIP-900


The Original discuss thread is here: KIP-900-DISCUSS


Please take a look and vote if you can.

Thank you
-- 
--Proven


Re: [DISCUSS] KIP-900: KRaft kafka-storage.sh API additions to support SCRAM

2023-02-14 Thread Proven Provenzano
The original idea was to pass in the JSON string representing the
UserScramCredentialsRecord
directly to make this simple and to require no parsing at all. Here is an
example of the JSON object:
{"name":"alice","mechanism":1,"salt":"MWx2NHBkbnc0ZndxN25vdGN4bTB5eTFrN3E=","SaltedPassword":"mT0yyUUxnlJaC99HXgRTSYlbuqa4FSGtJCJfTMvjYCE=","iterations":8192}
Note that it isn't very friendly. The mechanism is an integer value, 1 or 2,
not an enum such as SCRAM-SHA-256 or SCRAM-SHA-512. The salt and iterations
are required, and there is no password, just a SaltedPassword which the
customer would have to generate externally.
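
To show what "generate externally" would involve, here is a minimal sketch in Python, not taken from the Kafka code. It assumes the salted password follows RFC 5802, where SaltedPassword = Hi(password, salt, i), which is PBKDF2 with HMAC as the PRF. The function name, the password value, and the base64 encoding of the byte fields are assumptions for illustration; the salt is the decoded value from the JSON example above. SASLprep normalization of the password is omitted.

```python
import base64
import hashlib
import json

# Mechanism numbers as described in this thread: 1 = SCRAM-SHA-256, 2 = SCRAM-SHA-512.
MECHANISMS = {"SCRAM-SHA-256": (1, "sha256"), "SCRAM-SHA-512": (2, "sha512")}


def scram_record(name, mechanism, password, salt, iterations):
    """Build a dict shaped like the JSON example above (illustrative only)."""
    mech_id, hash_name = MECHANISMS[mechanism]
    # Hi(password, salt, i) from RFC 5802 is PBKDF2-HMAC with an output
    # length equal to the digest size of the chosen hash.
    salted = hashlib.pbkdf2_hmac(hash_name, password.encode("utf-8"), salt, iterations)
    return {
        "name": name,
        "mechanism": mech_id,
        "salt": base64.b64encode(salt).decode("ascii"),
        "SaltedPassword": base64.b64encode(salted).decode("ascii"),
        "iterations": iterations,
    }


record = scram_record("alice", "SCRAM-SHA-256", "alice-secret",
                      b"1lv4pdnw4fwq7notcxm0yy1k7q", 8192)
print(json.dumps(record))
```

The customer would have to do all of this by hand to produce the raw record, which is the usability problem.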

Moving away from the above, we will have to parse and validate arguments and
then generate the UserScramCredentialsRecord from them. The question is what
that looks like. Should it be closer to what kafka-configs uses or should it
be our own made-up JSON format? Whichever we choose, one format should be
sufficient as it will only be used in this application.

The requirements so far for argument parsing are:

   - We want to specify the mechanism as a customer-friendly enum,
   SCRAM-SHA-256 or SCRAM-SHA-512.
   - We want the salt and iterations to be optional and have a default if
   not specified.
   - We want the customer to be able to specify a password which then
   generates the salted password.
   - We want to allow the customer to specify a salted password if they so
   choose.
   - We want the user to specify a user for the credentials or to specify
   default-entity.
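
A minimal sketch of how the requirements above could be checked once the arguments are parsed, written in Python for brevity and not taken from any Kafka code. The helper name, the default iteration count, and the default salt length are assumptions. The rule that a salted password must come with its salt and iterations is also an assumption of the sketch.

```python
import os

MECHANISM_IDS = {"SCRAM-SHA-256": 1, "SCRAM-SHA-512": 2}
DEFAULT_ITERATIONS = 4096  # assumed default; the thread does not fix a value
DEFAULT_SALT_BYTES = 24    # assumed length for an auto-generated salt


def resolve_scram_args(mechanism, user=None, default_entity=False,
                       password=None, salted_password=None,
                       salt=None, iterations=None):
    """Validate the inputs and fill in defaults (hypothetical helper)."""
    if mechanism not in MECHANISM_IDS:
        raise ValueError(f"unknown mechanism: {mechanism}")
    # Exactly one of a named user or default-entity must be given.
    if (user is not None) == default_entity:
        raise ValueError("specify either a user or default-entity")
    # Exactly one of password or salted password must be given.
    if (password is None) == (salted_password is None):
        raise ValueError("specify either password or salted password")
    # Assumption of this sketch: a pre-salted password is only meaningful
    # together with the salt and iteration count that produced it.
    if salted_password is not None and (salt is None or iterations is None):
        raise ValueError("salted password requires salt and iterations")
    return {
        "mechanism": MECHANISM_IDS[mechanism],
        "user": user,
        "default_entity": default_entity,
        "password": password,
        "salted_password": salted_password,
        "salt": salt if salt is not None else os.urandom(DEFAULT_SALT_BYTES),
        "iterations": iterations if iterations is not None else DEFAULT_ITERATIONS,
    }


resolved = resolve_scram_args("SCRAM-SHA-256", user="alice", password="alice-secret")
print(resolved["mechanism"], resolved["iterations"], len(resolved["salt"]))
```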

This is on top of the arguments needed for kafka-storage. We should also
look forward to when we add additional
record types that need to be parsed and stored for bootstrap.

What I am suggesting is that we have an --add-config argument that requires
at least one key=value subargument
which indicates which record type to add. An example would be
SCRAM-SHA-256=[iterations=8192,password=alice-secret]
This indicates that the record to add is a UserScramCredentialsRecord with
mechanism 1 (SCRAM-SHA-256)
and there are some key values to add to the record. This is very much like
kafka-configs. Note that this record is still incomplete: we need to specify
a user to apply it to, and that is where the entity-type users and
entity-name alice subarguments are needed.
If during parsing of the arguments the record is incomplete, then
kafka-storage will exit with a failure.
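
A minimal sketch of parsing a value in that form, in Python and not taken from kafka-storage. The function name and the "saltedpassword" key are hypothetical, and the incompleteness check here only covers a missing password. The sketch assumes values contain no "," or "]".

```python
import re

MECHANISM_IDS = {"SCRAM-SHA-256": 1, "SCRAM-SHA-512": 2}

_ENTRY = r"SCRAM-SHA-(?:256|512)=\[[^\]]*\]"
# The whole value must be one or more comma-separated entries.
_FULL = re.compile(rf"{_ENTRY}(?:,{_ENTRY})*")
_PARTS = re.compile(r"(SCRAM-SHA-(?:256|512))=\[([^\]]*)\]")


def parse_scram_config(value):
    """Parse 'SCRAM-SHA-256=[k=v,...],SCRAM-SHA-512=[...]' into a list of dicts."""
    if not _FULL.fullmatch(value):
        raise ValueError(f"malformed SCRAM config: {value!r}")
    records = []
    for mechanism, body in _PARTS.findall(value):
        fields = {}
        for pair in filter(None, body.split(",")):
            key, sep, val = pair.partition("=")
            if not sep or not key:
                raise ValueError(f"expected key=value, got {pair!r}")
            fields[key] = val
        # An entry with neither form of password cannot produce a record.
        if "password" not in fields and "saltedpassword" not in fields:
            raise ValueError(f"incomplete record for {mechanism}")
        records.append({"mechanism": MECHANISM_IDS[mechanism], **fields})
    return records


parsed = parse_scram_config(
    "SCRAM-SHA-256=[iterations=8192,password=alice-secret],"
    "SCRAM-SHA-512=[password=alice-secret]"
)
print(parsed)
```

A caller would turn the ValueError into a non-zero exit.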

--Proven



On Mon, Feb 13, 2023 at 4:54 PM José Armando García Sancio
 wrote:

> Comments below.
>
> On Mon, Feb 13, 2023 at 11:44 AM Proven Provenzano
>  wrote:
> >
> > Hi Jose
> >
> > I want to clarify that the file parsing that Argparse4j provides is just
> a
> > mechanism for
> > taking command line args and putting them in a file. It doesn't
> > actually change what the
> > command line args are for processing the file. So I can add any
> > kafka-storage command
> > line arg into the file including say the storage UUID. I see that the
> file
> > parsing will be useful
> > in the future as we add more record types to add for the bootstrap
> process.
>
> Understood.
>
> > I'm against adding specific parsing for a list of configs vs. a separate
> > JSON file as it is adding
> > more surface area that needs testing for a feature that is used
> > infrequently. One config method
> > should be sufficient for one or more SCRAM records that a customer wants
> to
> > bootstrap with.
>
> Does this mean that the storage tool won't parse and validate SCRAM
> configuration? How will the user know that their SCRAM configuration
> is correct? Do they need to start the cluster to discover if their
> SCRAM configuration is correct?
>
> Thanks,
> --
> -José
>


Re: [DISCUSS] KIP-900: KRaft kafka-storage.sh API additions to support SCRAM

2023-02-13 Thread Proven Provenzano
Hi Jose

I want to clarify that the file parsing that Argparse4j provides is just a
mechanism for
taking command line args and putting them in a file. It doesn't
actually change what the
command line args are for processing the file. So I can add any
kafka-storage command
line arg into the file including say the storage UUID. I see that the file
parsing will be useful
in the future as we add more record types to add for the bootstrap process.
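
To illustrate the point, here is a minimal sketch using Python's argparse, which has an analogous feature (fromfile_prefix_chars) to Argparse4j's fromFilePrefix. It is not the kafka-storage parser; the program name and options are placeholders. Arguments read from the file are handled exactly like arguments on the command line.

```python
import argparse
import os
import tempfile

# "@" marks an argument as a file whose lines are read as further arguments.
parser = argparse.ArgumentParser(prog="storage-sketch", fromfile_prefix_chars="@")
parser.add_argument("--cluster-id")
parser.add_argument("--add-config", action="append", default=[])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "myfile")
    with open(path, "w") as f:
        # Python's argparse expects one argument per line in the file.
        f.write("--cluster-id\nmb0Zz1YPTUeVzpedHHPT-Q\n--add-config\nfoo=bar\n")
    # File arguments and direct arguments can be mixed freely.
    args = parser.parse_args(["@" + path, "--add-config", "baaz=quux"])

print(args.cluster_id, args.add_config)
```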

I'm against adding specific parsing for a list of configs vs. a separate
JSON file as it is adding
more surface area that needs testing for a feature that is used
infrequently. One config method
should be sufficient for one or more SCRAM records that a customer wants to
bootstrap with.

Thanks for the feedback
--Proven


On Mon, Feb 13, 2023 at 1:56 PM José Armando García Sancio
 wrote:

> Thanks for the discussion Colin and Proven.
>
> CLIs can be difficult to design when there are complicated use cases.
> Did we consider having CLI flags only for the common case? I would
> think that the common case is SCRAM for one user.
>
> For the complicated and less common cases they have to provide a
> "configuration file" that contains all of the SCRAM users and their
> passwords.
>
> The common case could be:
> --scram-user alice
> --scram-algo SHA-256
> --scram-password ...
> --scram-salt ...
> --scram-iterations ...
>
> If the user wants to provide multiple scram configurations they need
> to use --scram-config . The file would be human readable
> (maybe JSON or .properties) and contains the scram configuration for
> multiple users.
>
> --
> -José
>


Re: [DISCUSS] KIP-900: KRaft kafka-storage.sh API additions to support SCRAM

2023-02-01 Thread Proven Provenzano
Hi Colin,

I agree that the --alter flag is not useful. I just wanted to gauge how
close to the original command line we want.

I also think that positional args are not the way to go. Parsing can be
smarter than that.

Using --add-config as the lead for setting dynamic config is fine. I
think disambiguating primary arguments from the arguments of a specific
config will be simple to understand. The '--' is for primary arguments to
the kafka-storage command. Multiple subarguments are possible for a single
--add-config, none of which has a '--', like the following:

--add-config
'SCRAM-SHA-256=[iterations=8192,password=alice-secret],SCRAM-SHA-512=[password=alice-secret]'
entity-type users entity-name alice

The following are also acceptable from your example. I changed the ordering
because it doesn't matter.

--add-config entity-type brokers entity-name 0 foo=bar
--add-config default-entity entity-type broker baaz=quux

The --add-config sub argument parsing knows about the possible sub fields
entity-type, entity-name, default-entity and the key=value indicates what
the config is applied to.
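
A minimal sketch of that sub-argument parsing, in Python and not taken from kafka-storage. The function name and validation rules are assumptions. It takes the tokens following one --add-config and accepts them in any order.

```python
def parse_sub_args(tokens):
    """Split the tokens after one --add-config into entity fields and configs."""
    entity = {"entity-type": None, "entity-name": None, "default-entity": False}
    configs = []
    it = iter(tokens)
    for token in it:
        if token in ("entity-type", "entity-name"):
            # These keywords consume the next token as their value.
            try:
                entity[token] = next(it)
            except StopIteration:
                raise ValueError(f"{token} requires a value") from None
        elif token == "default-entity":
            entity["default-entity"] = True
        elif "=" in token:
            # Kept as a raw key=value string; its value may itself contain '='.
            configs.append(token)
        else:
            raise ValueError(f"unrecognized sub-argument: {token}")
    if entity["entity-type"] is None:
        raise ValueError("entity-type is required")
    if (entity["entity-name"] is not None) == entity["default-entity"]:
        raise ValueError("specify either entity-name or default-entity")
    if not configs:
        raise ValueError("at least one key=value is required")
    return entity, configs


print(parse_sub_args(["entity-type", "brokers", "entity-name", "0", "foo=bar"]))
print(parse_sub_args(["default-entity", "entity-type", "broker", "baaz=quux"]))
```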

Thanks
--Proven

On Fri, Jan 27, 2023 at 6:20 PM Colin McCabe  wrote:

> On Fri, Jan 27, 2023, at 11:25, Proven Provenzano wrote:
> > Hi Colin,
> >
> > So I think we are converging where we put the adding of records in
> > kafka-storage, allow for salt and iterations to be optional and now we
> are
> > getting to how to parse the individual arguments.
> >
> > My previous approach is to keep '--' as a delimiter for primary
> > kafka-storage arguments but to allow multiple space separated sub args
> > for
> > each 'scram-config' can contain multiple space separated key=value
> > pairs.
> > --scram-config
> > user=alice 'SCRAM-SHA-256=[iterations=8192,password=alicepass]
> > It could also be like the following with sub parsing of arguments where
> > name requires an additional arg and config requires one too. This looks
> > more like kafka-configs.
> >--scram-config entity-name alice config 'SCRAM-SHA-256=[
> > iterations=8192,password=alicepass]
> > If we really want to keep a similar parsing to kafka-configs we could
> > use '--alter'
> > as a primary argument like --config and --cluster-id and the other args
> > are
> > secondary arguments to --alter . The following now looks almost exactly
> > like kafka-configs. We would just ignore --entity-type as it must be of
> > type user.
> >   --alter --add-config
> >
> 'SCRAM-SHA-256=[iterations=8192,password=alice-secret],SCRAM-SHA-512=[password=alice-secret]'
> > --entity-type users --entity-name alice
> >
> > Thoughts?
> > --Proven
> >
>
> Hi Proven,
>
> I don't think it makes sense to have an --alter flag because we are not
> altering anything. We are creating a new bootstrap file.
>
> In general, we need to think about what else we will add to the formatting
> tool in the future. One thing I can think of is we will probably need to
> add the ability to set dynamic configurations. So we probably will have
> something like "--add-config key=value". Then, of course, the question
> becomes how to supply the entity-type and entity-name. We cannot really do
> it exactly like kafka-configs.sh does it, because that tool only accepts
> one alteration at a time, whereas we may have to do many.
>
> If we are willing to enforce ordering, we could do something like this:
> --entity-type brokers --entity-name 0 --add-config foo=bar
> --entity-type brokers --default-entity --add-config baaz=quux
>
> If we go that route, then maybe it makes sense to do SCRAM with a similar
> paradigm. So something like:
> --entity-type users --entity-name alice --add-config
> 'SCRAM-SHA-512=[password=alice-secret]'
>
> I do think this is doable with argparse4j. What do you think?
>
> best,
> Colin
>
>
> >
> >
> > On Tue, Jan 24, 2023 at 12:22 AM Colin McCabe 
> wrote:
> >
> >> On Fri, Jan 20, 2023, at 13:02, Proven Provenzano wrote:
> >> > Hi Colin,
> >> > Thanks for the response.
> >> >
> >> > I chose raw records, thinking it might be useful for future additions
> of
> >> > records that customers might want to add before the first start of the
> >> > cluster. I do see that it is at best an engineer friendly interface.
> >> >
> >> > I do think kafka-storage is the correct place to put the logic for
> adding
> >> > records to the bootstrap.checkpoint file. I think keeping the logic
> for
> >> > managing the bootstrap separate fro

Re: [DISCUSS] KIP-900: KRaft kafka-storage.sh API additions to support SCRAM

2023-01-27 Thread Proven Provenzano
Hi Colin,

So I think we are converging where we put the adding of records in
kafka-storage, allow for salt and iterations to be optional and now we are
getting to how to parse the individual arguments.

My previous approach is to keep '--' as a delimiter for primary
kafka-storage arguments but to allow multiple space-separated sub-args for
each, so each 'scram-config' can contain multiple space-separated key=value
pairs.
--scram-config
user=alice 'SCRAM-SHA-256=[iterations=8192,password=alicepass]'
It could also be like the following with sub parsing of arguments where
name requires an additional arg and config requires one too. This looks
more like kafka-configs.
   --scram-config entity-name alice config
'SCRAM-SHA-256=[iterations=8192,password=alicepass]'
If we really want to keep a similar parsing to kafka-configs we could
use '--alter'
as a primary argument like --config and --cluster-id and the other args are
secondary arguments to --alter . The following now looks almost exactly
like kafka-configs. We would just ignore --entity-type as it must be of
type user.
  --alter --add-config
'SCRAM-SHA-256=[iterations=8192,password=alice-secret],SCRAM-SHA-512=[password=alice-secret]'
--entity-type users --entity-name alice

Thoughts?
--Proven



On Tue, Jan 24, 2023 at 12:22 AM Colin McCabe  wrote:

> On Fri, Jan 20, 2023, at 13:02, Proven Provenzano wrote:
> > Hi Colin,
> > Thanks for the response.
> >
> > I chose raw records, thinking it might be useful for future additions of
> > records that customers might want to add before the first start of the
> > cluster. I do see that it is at best an engineer friendly interface.
> >
> > I do think kafka-storage is the correct place to put the logic for adding
> > records to the bootstrap.checkpoint file. I think keeping the logic for
> > managing the bootstrap separate from the logic of configuring an existing
> > cluster that is already running is a good division of functionality and I
> > think this separation will reduce the parsing logic significantly.
>
> Hi Proven,
>
> I suspect having the separation will increase the parsing logic you have
> to write rather than reducing it, since it will be more difficult to reuse
> the logic that is already in kafka-configs.sh.
>
> I do agree that having multiple commands modifying the bootstrap may be
> confusing, though. So maybe it's best to keep it all in the formatting
> tool, as you say.
>
> >
> > The API suggestion you made for kafka-storage is okay. I would prefer to
> > have one option for an entire SCRAM config including the user, such as
> the
> > following:
> >
>
> Hmm... I don't think having a single giant string for all the SCRAM
> configurations makes sense. At minimum each SCRAM user should get its own
> string. Another reason for doing it this way is that this is how
> kafka-configs.sh works, and we don't want two different incompatible ways
> of specifying SCRAM users on the command line.
>
> > I think adding the Argparse4j support for reading the arguments from a
> file
> > is a must.
>
> Yeah, agreed.
>
> best,
> Colin
>
> >
> > --Proven
> >
> >
> > On Thu, Jan 19, 2023 at 7:07 PM Colin McCabe  wrote:
> >
> >> Hi Proven,
> >>
> >> Thanks for putting this together.
> >>
> >> We always intended to have a way to bootstrap into using an all-SCRAM
> >> cluster, from scratch.
> >>
> >> I have two big comments here. First, I think we need a better interface
> >> than raw records. And second, I'm not sure that kafka-storage.sh is the
> >> right place to put this.
> >>
> >> I think raw records are going to be tough for people to use, because
> there
> >> are a lot of fields, and the values to set them to are not intuitive.
> For
> >> example, to set SHA512, the user needs to set "mechanism" equal to 2.
> That
> >> is going to be impossible to remember or figure out without looking at
> the
> >> source code. The other thing of course is that we may add more fields
> over
> >> time, including mandatory ones. So any documentation could quickly get
> out
> >> of date.
> >>
> >> I think people are going to want to specify SCRAM users here the same
> way
> >> they do when using the kafka-configs.sh tool. As a reminder, using
> >> kafka-configs.sh, they specify users like this:
> >>
> >> ./bin/kafka-configs --bootstrap-server localhost:9092 --alter \
> >>   --add-config 'SCRAM-SHA-256=[iterations=8192,password=pass]' \
> >>   --entity-type users \
> 

Re: [DISCUSS] KIP-900: KRaft kafka-storage.sh API additions to support SCRAM

2023-01-20 Thread Proven Provenzano
Hi Colin,
Thanks for the response.

I chose raw records, thinking it might be useful for future additions of
records that customers might want to add before the first start of the
cluster. I do see that it is at best an engineer friendly interface.

I do think kafka-storage is the correct place to put the logic for adding
records to the bootstrap.checkpoint file. I think keeping the logic for
managing the bootstrap separate from the logic of configuring an existing
cluster that is already running is a good division of functionality and I
think this separation will reduce the parsing logic significantly.

The API suggestion you made for kafka-storage is okay. I would prefer to
have one option for an entire SCRAM config including the user, such as the
following:

./bin/kafka-storage.sh format \
  --config [my-config-path] \
  --cluster-id mb0Zz1YPTUeVzpedHHPT-Q \
  --release-version 3.5-IV0 \
  --scram-config
user=alice 'SCRAM-SHA-256=[iterations=8192,password=alicepass]' \
  --scram-config user=bob 'SCRAM-SHA-256=[password=bobpass]'

Argparse4j supports multiple option arguments to a single option including
having an optional number of option arguments to a single option.
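
As a rough illustration of that behaviour, here is a minimal sketch using Python's argparse, the library Argparse4j is modeled on, not Argparse4j itself. The program name is a placeholder. With nargs="+" one option takes several arguments, and action="append" lets it repeat.

```python
import argparse

parser = argparse.ArgumentParser(prog="storage-sketch")
# nargs="+" collects one or more arguments per occurrence of the option;
# action="append" keeps each occurrence as its own list.
parser.add_argument("--scram-config", nargs="+", action="append", default=[])

args = parser.parse_args([
    "--scram-config", "user=alice", "SCRAM-SHA-256=[iterations=8192,password=alicepass]",
    "--scram-config", "user=bob", "SCRAM-SHA-256=[password=bobpass]",
])
for group in args.scram_config:
    print(group)
```

Each --scram-config yields one group, so a user stays paired with its SCRAM settings.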

I think adding the Argparse4j support for reading the arguments from a file
is a must.

--Proven


On Thu, Jan 19, 2023 at 7:07 PM Colin McCabe  wrote:

> Hi Proven,
>
> Thanks for putting this together.
>
> We always intended to have a way to bootstrap into using an all-SCRAM
> cluster, from scratch.
>
> I have two big comments here. First, I think we need a better interface
> than raw records. And second, I'm not sure that kafka-storage.sh is the
> right place to put this.
>
> I think raw records are going to be tough for people to use, because there
> are a lot of fields, and the values to set them to are not intuitive. For
> example, to set SHA512, the user needs to set "mechanism" equal to 2. That
> is going to be impossible to remember or figure out without looking at the
> source code. The other thing of course is that we may add more fields over
> time, including mandatory ones. So any documentation could quickly get out
> of date.
>
> I think people are going to want to specify SCRAM users here the same way
> they do when using the kafka-configs.sh tool. As a reminder, using
> kafka-configs.sh, they specify users like this:
>
> ./bin/kafka-configs --bootstrap-server localhost:9092 --alter \
>   --add-config 'SCRAM-SHA-256=[iterations=8192,password=pass]' \
>   --entity-type users \
>   --entity-name alice
>
> Of course, in this example, we're not specifying a salt. So we'd have to
> evaluate whether that's what we want for our use-case as well. On the plus
> side, specifying a salt could ensure that the bootstrap files end up
> identical on every node. On the minus side, it is another random number
> that users would need to generate and explicitly pass in.
>
> I would lean towards auto-generating the salt. I don't think the salt
> needs to be the same on all nodes. Only one controller will become active
> and write the bootstrap records to the log; no other controllers will do
> that. Brokers don't need to read the SCRAM records out of the bootstrap
> file.
>
> If we put all the functionality into kafka-storage.sh, it might look
> something like this:
>
> ./bin/kafka-storage.sh format \
>   --config [my-config-path] \
>   --cluster-id mb0Zz1YPTUeVzpedHHPT-Q \
>   --release-version 3.5-IV0 \
>   --scram-user alice \
>   --scram-config 'SCRAM-SHA-256=[iterations=8192,password=alicepass]' \
>   --scram-user bob \
>   --scram-config 'SCRAM-SHA-256=[password=bobpass]'
>
> (Here I am assuming that each --scram-user must be followed by exactly one
> --scram-config line)
>
> Perhaps it's worth considering whether it would be better to add a mode to
> kafka-configs.sh where it appends to a bootstrap file.
>
> If we do put everything into kafka-storage.sh, we should consider the
> plight of people with low limits on the maximum length of their command
> lines. One fix for these people could be allowing them to read their
> arguments from a file like this:
>
> $ ./bin/kafka-storage.sh @myfile
> $ cat myfile:
>   ./bin/kafka-storage.sh format \
> --config [my-config-path] \
>   ...
> [etc, etc.]
>
> Argparse4j supports this natively with fromFilePrefix. See
> https://argparse4j.github.io/usage.html#fromfileprefix
>
> best,
> Colin
>
>
> On Thu, Jan 19, 2023, at 11:08, Proven Provenzano wrote:
> > I have written a KIP describing the API additions needed to
> > kafka-storage
> > to store SCRAM
> > credentials at bootstrap time. Please take a look at
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-900%3A+KRaft+kafka-storage.sh+API+additions+to+support+SCRAM+for+Kafka+Brokers
> >
> > --
> > --Proven
>


[DISCUSS] KIP-900: KRaft kafka-storage.sh API additions to support SCRAM

2023-01-19 Thread Proven Provenzano
I have written a KIP describing the API additions needed to kafka-storage
to store SCRAM
credentials at bootstrap time. Please take a look at
https://cwiki.apache.org/confluence/display/KAFKA/KIP-900%3A+KRaft+kafka-storage.sh+API+additions+to+support+SCRAM+for+Kafka+Brokers

-- 
--Proven


Request for permission to contribute to Apache Kafka

2023-01-17 Thread Proven Provenzano
Hi, I'd like permission to create KIPs. I am working on SCRAM for KRaft and
would like to file a KIP to support this work. This is to add
bootstrapping SCRAM for KRaft by extending KIP-631. The basic SCRAM code
without bootstrap is available for comments in PR
https://github.com/apache/kafka/pull/13114 for KAFKA-14084.

My Wiki ID and Jira ID are both pprovenzano.

Thanks

-- 
--Proven


[jira] [Resolved] (KAFKA-14398) Update EndToEndAuthorizerTest.scala to test with ZK and KRAFT quorum servers

2022-12-05 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-14398.
---
Resolution: Resolved

> Update EndToEndAuthorizerTest.scala to test with ZK and KRAFT quorum servers
> 
>
> Key: KAFKA-14398
> URL: https://issues.apache.org/jira/browse/KAFKA-14398
> Project: Kafka
> Issue Type: Improvement
> Components: kraft, unit tests
> Reporter: Proven Provenzano
> Assignee: Proven Provenzano
> Priority: Major
> Fix For: 3.4.0
>
>
> KRAFT is a replacement for ZK for storing metadata.
> We should validate that ACLs work with KRAFT for the supported authentication
> mechanisms.
> I will update EndToEndAuthorizerTest.scala to test with ZK and KRAFT.





[jira] [Resolved] (KAFKA-14375) Remove use of "authorizer-properties" in EndToEndAuthorizationTest.scala

2022-11-17 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-14375.
---
Resolution: Fixed

> Remove use of "authorizer-properties" in EndToEndAuthorizationTest.scala
> 
>
> Key: KAFKA-14375
> URL: https://issues.apache.org/jira/browse/KAFKA-14375
> Project: Kafka
> Issue Type: Improvement
> Components: unit tests
> Reporter: Proven Provenzano
> Assignee: Proven Provenzano
> Priority: Major
>
> The use of {{authorizer-properties}} in AclCommand is deprecated and
> EndToEndAuthorizationTest.scala should be updated to not use it.
> I will instead set {{kafkaPrincipal}} as a super user and set up the brokers
> with AclAuthorizer. This will allow {{kafkaPrincipal}} to set ACLs and
> clientPrincipal to validate them as per the tests.
> This update is a precursor to updating EndToEndAuthorizationTest.scala to run
> in KRAFT mode.





[jira] [Created] (KAFKA-14398) Update EndToEndAuthorizerTest.scala to test with ZK and KRAFT quorum servers

2022-11-17 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-14398:
-

 Summary: Update EndToEndAuthorizerTest.scala to test with ZK and 
KRAFT quorum servers
 Key: KAFKA-14398
 URL: https://issues.apache.org/jira/browse/KAFKA-14398
 Project: Kafka
  Issue Type: Improvement
  Components: kraft, unit tests
Reporter: Proven Provenzano


KRAFT is a replacement for ZK for storing metadata.

We should validate that ACLs work with KRAFT for the supported authentication
mechanisms.

I will update EndToEndAuthorizerTest.scala to test with ZK and KRAFT.





[jira] [Created] (KAFKA-14375) Remove use of "authorizer-properties" in EndToEndAuthorizationTest.scala

2022-11-10 Thread Proven Provenzano (Jira)
Proven Provenzano created KAFKA-14375:
-

 Summary: Remove use of "authorizer-properties" in
EndToEndAuthorizationTest.scala
 Key: KAFKA-14375
 URL: https://issues.apache.org/jira/browse/KAFKA-14375
 Project: Kafka
  Issue Type: Improvement
  Components: unit tests
Reporter: Proven Provenzano


The use of {{authorizer-properties}} in AclCommand is deprecated and
EndToEndAuthorizationTest.scala should be updated to not use it.

I will instead set {{kafkaPrincipal}} as a super user and set up the brokers
with AclAuthorizer. This will allow {{kafkaPrincipal}} to set ACLs and
clientPrincipal to validate them as per the tests.


