[jira] [Resolved] (KAFKA-8296) Kafka Streams branch method raises type warnings

2020-07-16 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-8296.

Resolution: Duplicate

Closing this ticket as duplicate.

> Kafka Streams branch method raises type warnings
> 
>
> Key: KAFKA-8296
> URL: https://issues.apache.org/jira/browse/KAFKA-8296
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Michael Drogalis
>Assignee: Ivan Ponomarev
>Priority: Minor
>
> Because the branch method in the DSL takes varargs, using it as follows 
> raises an unchecked type warning:
> {code:java}
> KStream<String, User>[] branches = builder.<String, User>stream(inputTopic)
> .branch((key, user) -> "united states".equals(user.getCountry()),
> (key, user) -> "germany".equals(user.getCountry()),
> (key, user) -> "mexico".equals(user.getCountry()),
> (key, user) -> true);
> {code}
> The compiler warns with:
> {code:java}
> Warning:(39, 24) java: unchecked generic array creation for varargs parameter 
> of type org.apache.kafka.streams.kstream.Predicate<? super String, ? super io.confluent.developer.avro.User>[]
> {code}
> This is an unfortunate consequence of how Java's generics interact with 
> varargs: the implicit generic array creation cannot be type-checked. We 
> could possibly fix this by adding the @SafeVarargs annotation to the branch 
> method signatures.
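> As a self-contained sketch of the Java mechanism (illustrative code, not 
> Kafka's API): annotating a varargs method with @SafeVarargs suppresses the 
> unchecked warning at call sites. Note the annotation is only legal on 
> static, final, or (since Java 9) private methods, which is why applying it 
> to an interface method like KStream#branch is not entirely straightforward.
> {code:java}
> import java.util.function.Predicate;
> 
> public class SafeVarargsDemo {
>     // Without @SafeVarargs, calling this method with generic predicates
>     // triggers "unchecked generic array creation for varargs parameter".
>     @SafeVarargs
>     public static <T> int firstMatch(T value, Predicate<? super T>... predicates) {
>         for (int i = 0; i < predicates.length; i++) {
>             if (predicates[i].test(value)) {
>                 return i;  // index of the first matching predicate
>             }
>         }
>         return -1;  // no predicate matched
>     }
> 
>     public static void main(String[] args) {
>         // Compiles without an unchecked warning thanks to @SafeVarargs.
>         int branch = firstMatch("germany",
>                 s -> "united states".equals(s),
>                 s -> "germany".equals(s),
>                 s -> true);
>         System.out.println(branch);  // prints 1
>     }
> }
> {code}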



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10284) Group membership update due to static member rejoin should be persisted

2020-07-16 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10284:
---

 Summary: Group membership update due to static member rejoin 
should be persisted
 Key: KAFKA-10284
 URL: https://issues.apache.org/jira/browse/KAFKA-10284
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 2.5.0, 2.4.0, 2.3.0, 2.6.0
Reporter: Boyang Chen
Assignee: Boyang Chen
 Fix For: 2.6.1


When a known static member rejoins, we update its corresponding member.id 
without triggering a new rebalance. This avoids unnecessary rebalances for 
static membership, and it also fences requests that still use the old 
member.id.

The bug is that we don't actually persist the membership update, so if no 
rebalance is triggered afterwards, the new member.id is lost during group 
coordinator immigration, which resurrects the zombie member identity.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10283) Consolidate client-level and consumer-level assignment within ClientState

2020-07-16 Thread Guozhang Wang (Jira)
Guozhang Wang created KAFKA-10283:
-

 Summary: Consolidate client-level and consumer-level assignment 
within ClientState
 Key: KAFKA-10283
 URL: https://issues.apache.org/jira/browse/KAFKA-10283
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Guozhang Wang


In StreamsPartitionAssignor, we do a two-level assignment: one on the 
client level, and then, after that assignment is done, we further decide 
within each client how to distribute tasks among its consumers if there is 
more than one.

The {{ClientState}} class is used for book-keeping the assigned tasks; 
however, it is only used for the first level, while the second level is 
handled outside of the class and we only keep track of the results in a few 
maps for logging purposes. This leaves us with a bunch of hierarchical maps, 
e.g. some on the client level and some on the consumer level.

We would like to consolidate some of these maps into a single data structure 
to better keep track of the assignment information, and to reduce the risk 
of bugs that leave the assignment information inconsistent.
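As a rough illustration of the consolidation (all names below are 
hypothetical, not the actual ClientState API), a single nested structure can 
hold both levels, so the client-level view is always derived from one source 
of truth:

{code:java}
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only; type and method names are assumptions.
public class ConsolidatedAssignment {
    // One structure instead of parallel client-level and consumer-level maps:
    // clientId -> (consumerId -> assigned task ids)
    private final Map<String, Map<String, Set<String>>> tasks = new HashMap<>();

    public void assign(String clientId, String consumerId, Set<String> taskIds) {
        tasks.computeIfAbsent(clientId, c -> new HashMap<>()).put(consumerId, taskIds);
    }

    // The client-level assignment is computed from the consumer-level one,
    // so the two levels can never drift out of sync.
    public Set<String> tasksForClient(String clientId) {
        Set<String> all = new HashSet<>();
        tasks.getOrDefault(clientId, Collections.emptyMap()).values().forEach(all::addAll);
        return all;
    }
}
{code}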



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka-site] ableegoldman opened a new pull request #275: MINOR: update Streams docs for 2.6 and fix configs

2020-07-16 Thread GitBox


ableegoldman opened a new pull request #275:
URL: https://github.com/apache/kafka-site/pull/275


   See https://github.com/apache/kafka/pull/9027



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (KAFKA-10282) Log metrics are removed if a log is deleted and re-created quickly enough

2020-07-16 Thread Bob Barrett (Jira)
Bob Barrett created KAFKA-10282:
---

 Summary: Log metrics are removed if a log is deleted and 
re-created quickly enough
 Key: KAFKA-10282
 URL: https://issues.apache.org/jira/browse/KAFKA-10282
 Project: Kafka
  Issue Type: Bug
  Components: log
Affects Versions: 2.6.0
Reporter: Bob Barrett
Assignee: Bob Barrett
 Fix For: 2.7.0, 2.6.1


When we delete a local log, we mark it for asynchronous deletion by renaming it 
with a `.delete` extension, and then wait `LogConfig.FileDeleteDelayMs` 
milliseconds before actually deleting the files on disk. We don't remove the 
Log metrics from the metrics registry until the actual deletion takes place. If 
we recreate a log of the same topic partition (for example, if we reassign the 
partition away from the broker and quickly reassign it back), the metrics are 
registered when the new log is created, but then unregistered when the async 
deletion of the original log takes place. This leaves us with a partition that 
is not reporting any Log metrics (size, offsets, number of segments, etc).

To fix this, the LogManager should check when creating new logs to see if a log 
for the same topic partition is marked for deletion, and if so, signal to that 
log not to unregister its metrics when it is deleted.
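A hedged sketch of that check (class and method names below are hypothetical, 
not the actual LogManager/Log API):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Self-contained sketch of the proposed fix; all names are assumptions.
class LogManagerSketch {
    private final Map<String, LogSketch> logsMarkedForDeletion = new ConcurrentHashMap<>();

    LogSketch createLog(String topicPartition) {
        // If an old log for this partition is still awaiting async deletion,
        // signal it to skip metric removal; otherwise its deferred deletion
        // would unregister the metrics the new log registers below.
        LogSketch pending = logsMarkedForDeletion.get(topicPartition);
        if (pending != null) {
            pending.skipMetricRemovalOnDelete = true;
        }
        LogSketch log = new LogSketch(topicPartition);
        registerMetrics(topicPartition);  // size, offsets, segment count, ...
        return log;
    }

    void registerMetrics(String topicPartition) { /* add gauges (omitted) */ }

    static class LogSketch {
        final String topicPartition;
        volatile boolean skipMetricRemovalOnDelete = false;
        LogSketch(String topicPartition) { this.topicPartition = topicPartition; }
    }
}
{code}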



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-16 Thread Sophie Blee-Goldman
Hey Jorge,

Thanks for the reminder -- +1 (non-binding)

Cheers,
Sophie

On Thu, Jul 16, 2020 at 4:06 PM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Bumping this vote thread to check if there's any feedback.
>
> Cheers,
> Jorge.
>
> On Sat, Jul 4, 2020 at 6:20 PM John Roesler  wrote:
>
> > Thanks Jorge,
> >
> > I’m +1 (binding)
> >
> > -John
> >
> > On Fri, Jul 3, 2020, at 10:26, Jorge Esteban Quilcate Otoya wrote:
> > > Hola everyone,
> > >
> > > I'd like to start a new thread to vote for KIP-617 as there have been
> > > significant changes since the previous vote started.
> > >
> > > KIP wiki page:
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
> > >
> > > Many thanks!
> > >
> > > Jorge.
> > >
> >
>


Re: [VOTE] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-16 Thread Jorge Esteban Quilcate Otoya
Bumping this vote thread to check if there's any feedback.

Cheers,
Jorge.

On Sat, Jul 4, 2020 at 6:20 PM John Roesler  wrote:

> Thanks Jorge,
>
> I’m +1 (binding)
>
> -John
>
> On Fri, Jul 3, 2020, at 10:26, Jorge Esteban Quilcate Otoya wrote:
> > Hola everyone,
> >
> > I'd like to start a new thread to vote for KIP-617 as there have been
> > significant changes since the previous vote started.
> >
> > KIP wiki page:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
> >
> > Many thanks!
> >
> > Jorge.
> >
>


Re: [DISCUSS] KIP-634: Complementary support for headers in Kafka Streams DSL

2020-07-16 Thread Jorge Esteban Quilcate Otoya
Hi everyone,

Bumping this thread to check if there's any feedback.

Cheers,
Jorge.

On Tue, Jun 30, 2020 at 12:46 AM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Hi everyone,
>
> I would like to start the discussion for KIP-634:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-634%3A+Complementary+support+for+headers+in+Kafka+Streams+DSL
>
> Looking forward to your feedback.
>
> Thanks!
> Jorge.
>
>
>
>


Re: [VOTE] KIP-554: Add Broker-side SCRAM Config API

2020-07-16 Thread Ron Dagostino
Hi Colin.  I updated the KIP with various renames.  I've also created a
draft PR at https://github.com/apache/kafka/pull/9032 that still needs a
bunch of real implementation but nonetheless represents the renames in code.

The biggest changes are that there are now derived classes public class
UserScramCredentialUpsertion and public class
UserScramCredentialDeletion.  I don't know what the reaction to the use of
the term "upsertion" will be, but that's the best thing I could come up
with to reflect that these requests are "upserts" (update if there,
otherwise insert).  It was referred to as an "Addition" before, which I
felt was not technically correct.  If you diff the most recent two versions
of the KIP it diffs pretty cleanly and makes the changes pretty apparent.
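For reference, upsert semantics in miniature (plain Java, unrelated to the 
actual Admin API classes):

{code:java}
import java.util.HashMap;
import java.util.Map;

// put() updates the entry if the key exists, otherwise inserts it --
// the "update if there, otherwise insert" behavior described above.
Map<String, String> credentials = new HashMap<>();
credentials.put("alice", "old-credential");
credentials.put("alice", "new-credential");  // update: key already present
credentials.put("bob", "some-credential");   // insert: key was absent
{code}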

Ron


On Thu, Jul 16, 2020 at 11:38 AM Colin McCabe  wrote:

> On Thu, Jul 16, 2020, at 08:06, Ron Dagostino wrote:
> > Thanks, Colin.  The standard "about" message for ThrottleTimeMs seems
> > to be "The duration in milliseconds for which the request was throttled
> > due to a quota violation, or zero if the request did not violate any
> quota."
> > as opposed to "The time spent waiting for quota." Should we adjust to
> > match the typical definition?
> >
>
> Hi Ron,
>
> Good point.  Let's keep the "about" text consistent.  I changed it.
>
> >
> > I'm wondering if describing Scram credentials should require READ
> privilege
> > rather than ALTER on the cluster?   Altering SCRAM credentials of course
> > requires ALTER privilege, and I can see the argument for requiring ALTER
> > privilege to describe them as well, but it did catch my eye as something
> > worth questioning/confirming.
> >
>
> Also a good point.  I spoke with Rajini about this offline, and she
> pointed out that we can already see user names in ACLs if we have DESCRIBE
> on CLUSTER.  So it should be fine to have describeScramUsers require
> DESCRIBE on CLUSTER as well.
>
> >
> > I'm also now thinking that "UNKNOWN" shouldn't be listed in the
> > ScramMechanism enum.  I thought maybe it was a catch-all so we will
> always
> > be able to deserialize something regardless of what key actually appears,
> > but I just realized that SCRAM credentials and Client Quotas are mixed
> > together in the same JSON, so it will be up to the corresponding API to
> > ignore what it doesn't recognize -- i.e. if both client quotas and SCRAM
> > credentials are defined for a user, then invoking DescribeClientQuotas
> must
> > only describe the quota configs and invoking DescribeScramUsers must only
> > describe the SCRAM configs.
> >
>
> The reason to have the UNKNOWN enum is so that we can add new SCRAM
> mechanisms in the future.  If we don't have it, then we're basically saying
> we can never add new mechanisms.
>
> I agree that the decision to put SCRAM users under the same ZK path as
> client quotas makes this more complex than we'd like it to be, but all is
> not lost.  For one thing, we could always just add a new ZK path for SCRAM
> users in the future if we really need to.  With a new path we wouldn't have
> to worry about namespace collisions.  For another thing, in the
> post-KIP-500 world this won't be an issue.
>
> In the short term, a simpler solution might work here.  For example, can
> we just assume that any key that starts with "SCRAM-" is not a quota, but a
> scram user?  (Or look at some other aspect of the key).
>
> >
> >  Also, note that invoking kafka-configs.sh
> > --bootstrap-server ... --entity-type user --describe will require the
> > invocation of two separate APIs -- one to describe quotas and one to
> > describe SCRAM credentials; I don't think this is a problem, but I did
> want
> > to call it out explicitly.
> >
>
> That should be OK.
>
> >
> > Finally, there is a lack of consistency regarding how we name things.
> For
> > example, we are calling these APIs {Describe,Alter}ScramUsers and we have
> > declared the classes {Describe,Alter}ScramUserOptions, which matches up
> > fine.  We also have public class DescribeScramUsersResult, which also
> > matches.  But then we have public class AlterScramCredentialsResult,
> > interface ScramCredentialAlteration, public class
> ScramCredentialAddition,
> > and public class ScramCredentialDeletion, none of which match up in terms
> > of consistency of naming.  I wonder if we should always use
> > "UserScramCredential" everywhere since that is technically what these
> APIs
> > are about: describing/altering Users' SCRAM credentials.  So the APIs
> would
> > be {Describe,Alter}UserScramCredentials, and everything else that is
> > public that now refers inconsistently to either ScramUsers or
> > ScramCredential would instead refer to UserScramCredentials (sometimes
> > singular rather than plural if warranted).  For example: public class {
> > Describe,Alter}UserScramCredentialsResult, interface User
> > ScramCredentialAlteration, public class UserScramCredentialAddition, and
> > public class UserScramCredentialDeletion.

[jira] [Created] (KAFKA-10281) KIP-640: Add log compression analysis tool

2020-07-16 Thread Chris Beard (Jira)
Chris Beard created KAFKA-10281:
---

 Summary: KIP-640: Add log compression analysis tool
 Key: KAFKA-10281
 URL: https://issues.apache.org/jira/browse/KAFKA-10281
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Reporter: Chris Beard
Assignee: Chris Beard


Link to KIP: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-640%3A+Add+log+compression+analysis+tool]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-595: A Raft Protocol for the Metadata Quorum

2020-07-16 Thread Jun Rao
Hi, Jason,

Thanks for the updated KIP. Looks good overall. A few more comments below.

101. I still don't see a section on bootstrapping related issues. It would
be useful to document if/how the following is supported.
101.1 Currently, we support auto broker id generation. Is this supported
for bootstrap brokers?
101.2 As Colin mentioned, sometimes we may need to load the security
credentials for the broker before it can be connected to. Could you provide a
bit more detail on how this will work?
101.3 Currently, we use ZK to generate clusterId on a new cluster. With
Raft, how does every broker generate the same clusterId in a distributed
way?

200. It would be useful to document if the various special offsets (log
start offset, recovery point, HWM, etc) for the Raft log are stored in the
same existing checkpoint files or not.
200.1 Since the Raft log flushes every append, does that allow us to
recover from a recovery point within the active segment or do we still need
to scan the full segment including the recovery point? The former can be
tricky since multiple records can fall into the same disk page and a
subsequent flush may corrupt a page with previously flushed records.

201. Configurations.
201.1 How do the Raft brokers get security related configs for inter broker
communication? Is that based on the existing inter.broker.security.protocol?
201.2 We have quorum.retry.backoff.max.ms and quorum.retry.backoff.ms, but
only quorum.election.backoff.max.ms. This seems a bit inconsistent.

202. Metrics:
202.1 TotalTimeMs, InboundQueueTimeMs, HandleTimeMs, OutboundQueueTimeMs:
Are those the same as existing totalTime, requestQueueTime, localTime,
responseQueueTime? Could we reuse the existing ones with the tag
request=[request-type]?
202.2. Could you explain what InboundChannelSize and OutboundChannelSize
are?
202.3 ElectionLatencyMax/Avg: It seems that both should be windowed?

203. Quorum State: I assume that LeaderId will be kept consistently with
LeaderEpoch. For example, if a follower transitions to candidate and bumps
up LeaderEpoch, it will set leaderId to -1 and persist both in the Quorum
state file. Is that correct?

204. I was thinking about a corner case when a Raft broker is partitioned
off. This broker will then be in a continuous loop of bumping up the leader
epoch, but failing to get enough votes. When the partitioning is removed,
this broker's high leader epoch will force a leader election. I assume
other Raft brokers can immediately advance their leader epoch passing the
already bumped epoch such that leader election won't be delayed. Is that
right?

205. In a JBOD setting, could we use the existing tool to move the Raft log
from one disk to another?

206. The KIP doesn't mention the local metadata store derived from the Raft
log. Will that be covered in a separate KIP?

207. Since this is a critical component, could we add a section on the
testing plan for correctness?

208. Performance. Do we plan to do group commit (e.g. buffer pending
appends during a flush and then flush all accumulated pending records
together in the next flush) for better throughput?

209. "the leader can actually defer fsync until it knows "quorum.size - 1"
has get to a certain entry offset." Why is that "quorum.size - 1" instead
of the majority of the quorum?
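For readers unfamiliar with the term in 208, here is a minimal self-contained
sketch of the group commit pattern (illustrative Java, not Kafka code):

{code:java}
import java.util.ArrayList;
import java.util.List;

// Group commit: appends arriving while a flush is in flight are buffered,
// and the next flush covers all of them with a single fsync.
class GroupCommitLog {
    private final Object lock = new Object();
    private final List<byte[]> pending = new ArrayList<>();
    private long nextSeq = 0;      // sequence assigned to the latest append
    private long durableSeq = 0;   // highest sequence known to be on disk
    private boolean flushing = false;

    // Returns once the given record is durable.
    void append(byte[] record) throws InterruptedException {
        long mySeq;
        synchronized (lock) {
            pending.add(record);
            mySeq = ++nextSeq;
            while (durableSeq < mySeq && flushing) {
                lock.wait();       // another thread's flush is in flight
            }
            if (durableSeq >= mySeq) {
                return;            // a group flush already covered our record
            }
            flushing = true;       // we become the flush leader
        }
        List<byte[]> batch;
        long batchSeq;
        synchronized (lock) {
            batch = new ArrayList<>(pending);
            pending.clear();
            batchSeq = nextSeq;    // covers every append buffered so far
        }
        fsync(batch);              // one fsync for the whole batch
        synchronized (lock) {
            durableSeq = batchSeq;
            flushing = false;
            lock.notifyAll();      // wake appenders this flush covered
        }
    }

    private void fsync(List<byte[]> batch) {
        // write the batch and force it to disk (omitted)
    }
}
{code}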

Thanks,

Jun

On Mon, Jul 13, 2020 at 9:43 AM Jason Gustafson  wrote:

> Hi All,
>
> Just a quick update on the proposal. We have decided to move quorum
> reassignment to a separate KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-642%3A+Dynamic+quorum+reassignment
> .
> The way this ties into cluster bootstrapping is complicated, so we felt we
> needed a bit more time for validation. That leaves the core of this
> proposal as quorum-based replication. If there are no further comments, we
> will plan to start a vote later this week.
>
> Thanks,
> Jason
>
> On Wed, Jun 24, 2020 at 10:43 AM Guozhang Wang  wrote:
>
> > @Jun Rao 
> >
> > Regarding your comment about log compaction. After some deep-diving into
> > this we've decided to propose a new snapshot-based log cleaning mechanism
> > which would be used to replace the current compaction mechanism for this
> > meta log. A new KIP will be proposed specifically for this idea.
> >
> > All,
> >
> > I've updated the KIP wiki a bit updating one config "
> > election.jitter.max.ms"
> > to "election.backoff.max.ms" to make it more clear about the usage: the
> > configured value will be the upper bound of the binary exponential
> backoff
> > time after a failed election, before starting a new one.
> >
> >
> >
> > Guozhang
> >
> >
> >
> > On Fri, Jun 12, 2020 at 9:26 AM Boyang Chen 
> > wrote:
> >
> > > Thanks for the suggestions Guozhang.
> > >
> > > On Thu, Jun 11, 2020 at 2:51 PM Guozhang Wang 
> > wrote:
> > >
> > > > Hello Boyang,
> > > >
> > > > Thanks for the updated information. A few questions here:
> > > >
> > > > 1) Should the quorum-file also update to support multi-raft?
> > > >
> > > > I'm 

[jira] [Created] (KAFKA-10279) Allow dynamic update of certificates with additional SubjectAltNames

2020-07-16 Thread Rajini Sivaram (Jira)
Rajini Sivaram created KAFKA-10279:
--

 Summary: Allow dynamic update of certificates with additional 
SubjectAltNames
 Key: KAFKA-10279
 URL: https://issues.apache.org/jira/browse/KAFKA-10279
 Project: Kafka
  Issue Type: Improvement
  Components: security
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 2.7.0


At the moment, we don't allow dynamic keystore updates in brokers if the DN 
and SubjectAltNames don't match exactly. This is to ensure that existing 
clients and inter-broker communication don't break. Since adding new entries 
to SubjectAltNames will not break any authentication, we should allow that 
and just verify that the new set of SubjectAltNames is a superset of the old 
one.
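A minimal sketch of the superset check (hypothetical helper, not the actual 
broker validation code):

{code:java}
import java.util.Set;

class SanCheck {
    // Allow the dynamic keystore update only if the new certificate keeps
    // every SubjectAltName of the old one; it may add new entries.
    static boolean sanUpdateIsSafe(Set<String> oldSans, Set<String> newSans) {
        return newSans.containsAll(oldSans);
    }
}
{code}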



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Build failed in Jenkins: kafka-trunk-jdk14 #295

2020-07-16 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10174: Prefer --bootstrap-server for configs command in ducker


--
[...truncated 3.20 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED


Build failed in Jenkins: kafka-trunk-jdk11 #1647

2020-07-16 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10174: Prefer --bootstrap-server for configs command in ducker


--
[...truncated 3.20 MB...]
org.apache.kafka.streams.scala.kstream.StreamJoinedTest > Create a StreamJoined 
should create a StreamJoined with Serdes and a State Store name PASSED

org.apache.kafka.streams.scala.kstream.KTableTest > filter a KTable should 
filter records satisfying the predicate STARTED

org.apache.kafka.streams.scala.kstream.KTableTest > filter a KTable should 
filter records satisfying the predicate PASSED

org.apache.kafka.streams.scala.kstream.KTableTest > filterNot a KTable should 
filter records not satisfying the predicate STARTED

org.apache.kafka.streams.scala.kstream.KTableTest > filterNot a KTable should 
filter records not satisfying the predicate PASSED

org.apache.kafka.streams.scala.kstream.KTableTest > join 2 KTables should join 
correctly records STARTED

org.apache.kafka.streams.scala.kstream.KTableTest > join 2 KTables should join 
correctly records PASSED

org.apache.kafka.streams.scala.kstream.KTableTest > join 2 KTables with a 
Materialized should join correctly records and state store STARTED

org.apache.kafka.streams.scala.kstream.KTableTest > join 2 KTables with a 
Materialized should join correctly records and state store PASSED

org.apache.kafka.streams.scala.kstream.KTableTest > windowed KTable#suppress 
should correctly suppress results using Suppressed.untilTimeLimit STARTED

org.apache.kafka.streams.scala.StreamToTableJoinScalaIntegrationTestImplicitSerdes
 > testShouldCountClicksPerRegionWithNamedRepartitionTopic PASSED

org.apache.kafka.streams.scala.StreamToTableJoinScalaIntegrationTestImplicitSerdes
 > testShouldCountClicksPerRegionJava STARTED

org.apache.kafka.streams.scala.kstream.KTableTest > windowed KTable#suppress 
should correctly suppress results using Suppressed.untilTimeLimit PASSED

org.apache.kafka.streams.scala.kstream.KTableTest > windowed KTable#suppress 
should correctly suppress results using Suppressed.untilWindowCloses STARTED

org.apache.kafka.streams.scala.kstream.KTableTest > windowed KTable#suppress 
should correctly suppress results using Suppressed.untilWindowCloses PASSED

org.apache.kafka.streams.scala.kstream.KTableTest > session windowed 
KTable#suppress should correctly suppress results using 
Suppressed.untilWindowCloses STARTED

org.apache.kafka.streams.scala.kstream.KTableTest > session windowed 
KTable#suppress should correctly suppress results using 
Suppressed.untilWindowCloses PASSED

org.apache.kafka.streams.scala.kstream.KTableTest > non-windowed 
KTable#suppress should correctly suppress results using 
Suppressed.untilTimeLimit STARTED

org.apache.kafka.streams.scala.kstream.KTableTest > non-windowed 
KTable#suppress should correctly suppress results using 
Suppressed.untilTimeLimit PASSED

org.apache.kafka.streams.scala.StreamToTableJoinScalaIntegrationTestImplicitSerdes
 > testShouldCountClicksPerRegionJava PASSED

org.apache.kafka.streams.scala.StreamToTableJoinScalaIntegrationTestImplicitSerdes
 > testShouldCountClicksPerRegion STARTED

org.apache.kafka.streams.scala.StreamToTableJoinScalaIntegrationTestImplicitSerdes
 > testShouldCountClicksPerRegion PASSED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialized 
should create a Materialized with Serdes STARTED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialized 
should create a Materialized with Serdes PASSED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialize 
with a store name should create a Materialized with Serdes and a store name 
STARTED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialize 
with a store name should create a Materialized with Serdes and a store name 
PASSED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialize 
with a window store supplier should create a Materialized with Serdes and a 
store supplier STARTED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialize 
with a window store supplier should create a Materialized with Serdes and a 
store supplier PASSED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialize 
with a key value store supplier should create a Materialized with Serdes and a 
store supplier STARTED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialize 
with a key value store supplier should create a Materialized with Serdes and a 
store supplier PASSED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialize 
with a session store supplier should create a Materialized with Serdes and a 
store supplier STARTED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialize 
with a session store supplier should create a Materialized with Serdes and a 
store 

Re: [DISCUSS] KIP-450: Sliding Windows

2020-07-16 Thread John Roesler
Hello all, 

Thanks for the KIP, Leah!

Regarding (1): I'd go farther actually. Making Windows an abstract
class was a mistake from the beginning that led to us not being
able to fix a very confusing situation for users around retention times,
final results emitting, etc. Thus, I would not suggest extending
TimeWindows for sure, but would also not suggest extending Windows.

The very simplest thing to do is follow the example of SessionWindows,
which is just a completely self-contained class. If we don't mess with
class inheritance, we won't ever have any of the problems related to
class inheritance. This is my preferred solution.

Still, sliding windows have a lot in common with TimeWindows and other
fixed-size windows, namely that the windows are fixed in size. If we want
to preserve the current two-part windowing API in which you can window
by either "fixed" or "data driven" modes, I'd suggest we avoid increasing
the blast radius of Windows by taking the opportunity to replace it with
a proper interface and implement that interface instead.

For example:
https://github.com/apache/kafka/pull/9031

Then, SlidingWindows would just implement FixedSizeWindowDefinition

==

Regarding (2), it seems more straightforward as a user of Streams
to just have one mental model. _All_ of our aggregation operations
follow an eager emission model, in which we just emit an update whenever
an update is available. We already provided Suppression to explicitly apply
different update semantics in the case it's required. Why should we define
a snowflake operation with completely different semantics from everything
else? I.e., systems are generally easier to use when they follow a few
simple, composable rules than when they have a lot of different, specific
rules.


==

New point: (4):
It would be nice to include some examples of user code that would use the
new API, which should include:
1. using the DSL with the sliding window definition
2. accessing the stored results of a sliding window aggregation via IQ
3. defining a custom processor to access sliding windows in a store
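For item 1, such an example might look roughly like this (a hypothetical
sketch only; SlidingWindows and its factory method are placeholders, not
whatever API the KIP finally lands on):

{code:java}
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Windowed;

// Hypothetical sketch: SlidingWindows is a placeholder name.
StreamsBuilder builder = new StreamsBuilder();
KStream<String, Long> input = builder.stream("input-topic");

KTable<Windowed<String>, Long> counts = input
    .groupByKey(Grouped.with(Serdes.String(), Serdes.Long()))
    .windowedBy(SlidingWindows.withTimeDifferenceAndGrace(
        Duration.ofMinutes(5),   // max time difference between records in one window
        Duration.ofMinutes(1)))  // grace period for out-of-order records
    .count(Materialized.as("sliding-counts-store"));
{code}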

It generally helps reviewers wrap their heads around the proposal, as well
as shaking out any design issues that would otherwise only come up during
implementation/testing/review.

Thanks again for the awesome proposal!
-John


On Tue, Jul 14, 2020, at 12:31, Guozhang Wang wrote:
> Hello Leah,
> 
> Thanks for the nicely written KIP. A few thoughts:
> 
> 1) I echo the other reviewer's comments regarding the typing: why extend
> TimeWindows instead of just extending Window?
> 
> 2) I also feel that the emitting policy for this type of windowing
> aggregation may be different from the existing ones. The existing emitting
> policy is very simple: emit every time a window gets updated, including on
> out-of-order data within the grace period. This works because for
> time-windows the window close time depends strictly on the fixed window
> start time, and for session-windows, although the window open/close time is
> data-dependent, window creation is relatively infrequent compared to
> sliding-windows. For this KIP, since each new record would create a new
> sliding-window, the number of windows maintained logically could be much
> larger, and hence emitting on each update may be too aggressive.
> 
> 3) Although the KIP itself should focus on user-facing interfaces, I'd
> suggest we create a child page of KIP-450 discussing its implementation
> as well, since some of that may drive the interface design.
> E.g. personally I think having a combiner interface in addition to
> aggregator would be useful but that's based on my 2cents about the
> implementation design (I once created a child page describing it:
> https://cwiki.apache.org/confluence/display/KAFKA/Optimize+Windowed+Aggregation
> ).
> 
> 
> Guozhang
> 
> 
> 
> 
> On Tue, Jul 14, 2020 at 5:31 AM Bruno Cadonna  wrote:
> 
> > Hi Leah,
> >
> > Thank you for the KIP!
> >
> > Here is my feedback:
> >
> > 1. The KIP would benefit from some code examples that show how to use
> > sliding windows in aggregations.
> >
> > 2. The different sliding windows in Figure 1 and 2 are really hard to
> > distinguish. Could you please try to make them graphically better
> > distinguishable? You could try to draw the frames of consecutive
> > windows shifted to each other.
> >
> > 3. I agree with Matthias, that extending Windows does not
> > seem to be the best approach. What would be the result of
> > windowsFor()?
> >
> > 4. In the section "Public Interfaces" you should remove implementation
> > details like private constructors and private fields.
> >
> > 5. Do we need a new store interface or can we use WindowStore? Some
> > words about that would be informative.
> >
> > 6. @Matthias, if the subtractor is not strictly needed, I would skip it
> > for now and add it later.
> >
> > 7. I also agree that having a section that describes how to handle
> > out-of-order records would be good to understand what is 

[jira] [Resolved] (KAFKA-10174) Prefer --bootstrap-server ducktape tests using kafka_configs.sh

2020-07-16 Thread Colin McCabe (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin McCabe resolved KAFKA-10174.
--
Fix Version/s: 2.7
 Reviewer: Colin McCabe
   Resolution: Fixed

> Prefer --bootstrap-server ducktape tests using kafka_configs.sh
> ---
>
> Key: KAFKA-10174
> URL: https://issues.apache.org/jira/browse/KAFKA-10174
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
> Fix For: 2.7
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] KIP-554: Add Broker-side SCRAM Config API

2020-07-16 Thread Colin McCabe
On Thu, Jul 16, 2020, at 08:06, Ron Dagostino wrote:
> Thanks, Colin.  The standard "about" message for ThrottleTimeMs seems
> to be "The duration in milliseconds for which the request was throttled 
> due to a quota violation, or zero if the request did not violate any quota."
> as opposed to "The time spent waiting for quota." Should we adjust to 
> match the typical definition?
> 

Hi Ron,

Good point.  Let's keep the "about" text consistent.  I changed it.

>
> I'm wondering if describing Scram credentials should require READ privilege
> rather than ALTER on the cluster?   Altering SCRAM credentials of course
> requires ALTER privilege, and I can see the argument for requiring ALTER
> privilege to describe them as well, but it did catch my eye as something
> worth questioning/confirming.
> 

Also a good point.  I spoke with Rajini about this offline, and she pointed out 
that we can already see user names in ACLs if we have DESCRIBE on CLUSTER.  So 
it should be fine to have describeScramUsers require DESCRIBE on CLUSTER as 
well.

>
> I'm also now thinking that "UNKNOWN" shouldn't be listed in the
> ScramMechanism enum.  I thought maybe it was a catch-all so we will always
> be able to deserialize something regardless of what key actually appears,
> but I just realized that SCRAM credentials and Client Quotas are mixed
> together in the same JSON, so it will be up to the corresponding API to
> ignore what it doesn't recognize -- i.e. if both client quotas and SCRAM
> credentials are defined for a user, then invoking DescribeClientQuotas must
> only describe the quota configs and invoking DescribeScramUsers must only
> describe the SCRAM configs.
>

The reason to have the UNKNOWN enum is so that we can add new SCRAM mechanisms 
in the future.  If we don't have it, then we're basically saying we can never 
add new mechanisms.

I agree that the decision to put SCRAM users under the same ZK path as client 
quotas makes this more complex than we'd like it to be, but all is not lost.  
For one thing, we could always just add a new ZK path for SCRAM users in the 
future if we really need to.  With a new path we wouldn't have to worry about 
namespace collisions.  For another thing, in the post-KIP-500 world this won't 
be an issue.

In the short term, a simpler solution might work here.  For example, can we 
just assume that any key that starts with "SCRAM-" is not a quota, but a scram 
user?  (Or look at some other aspect of the key).
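A hedged sketch of that kind of key dispatch (illustrative only; the real 
parsing code and key format may differ):

{code:java}
// Illustrative sketch, not the actual broker code: dispatch on the key
// prefix when SCRAM credentials and quota configs share one config namespace.
class UserConfigDispatch {
    // SCRAM mechanism names all start with "SCRAM-", e.g. SCRAM-SHA-256.
    static boolean isScramCredentialKey(String configKey) {
        return configKey.startsWith("SCRAM-");
    }

    static void dispatch(String key, String value) {
        if (isScramCredentialKey(key)) {
            handleScramCredential(key, value);  // e.g. SCRAM-SHA-256=...
        } else {
            handleQuotaConfig(key, value);      // e.g. producer_byte_rate=1024
        }
    }

    static void handleScramCredential(String key, String value) { /* parse credential */ }

    static void handleQuotaConfig(String key, String value) { /* parse quota */ }
}
{code}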

>
>  Also, note that invoking kafka-configs.sh
> --bootstrap-server ... --entity-type user --describe will require the
> invocation of two separate APIs -- one to describe quotas and one to
> describe SCRAM credentials; I don't think this is a problem, but I did want
> to call it out explicitly.
>

That should be OK.
 
>
> Finally, there is a lack of consistency regarding how we name things.  For
> example, we are calling these APIs {Describe,Alter}ScramUsers and we have
> declared the classes {Describe,Alter}ScramUserOptions, which matches up
> fine.  We also have public class DescribeScramUsersResult, which also
> matches.  But then we have public class AlterScramCredentialsResult,
> interface ScramCredentialAlteration, public class ScramCredentialAddition,
> and public class ScramCredentialDeletion, none of which match up in terms
> of consistency of naming.  I wonder if we should always use
> "UserScramCredential" everywhere since that is technically what these API's
> are about: describing/altering Users' SCRAM credentials.  So the APis would
> be {Describe,Alter}UserScramCredentials, and everything else that is
> publiuc that now refers inconsistently to either ScramUsers or
> ScramCredential would instead refer to UserScramCredentials (sometimes
> singular rather than plural if warranted).  For example: public class {
> Describe,Alter}UserScramCredentialsResult, interface User
> ScramCredentialAlteration, public class UserScramCredentialAddition, and
> public class UserScramCredentialDeletion
> 

Yeah, there is a bit of a mismatch between "credentials" and "users."  Really, 
these APIs are about credentials, not users.  So I agree -- let's rename it.

best,
Colin

> 
> On Wed, Jul 15, 2020 at 5:53 PM Colin McCabe  wrote:
> 
> > Hi all,
> >
> > Thanks, everyone, for reviewing.
> >
> > Since we made a few changes to the RPCs in the last few days, I'm going to
> > extend the vote until Monday and close it out then if it looks good.
> >
> > best,
> > Colin
> >
> >
> > On Wed, Jul 15, 2020, at 14:47, Colin McCabe wrote:
> > > On Tue, Jul 14, 2020, at 16:23, Ron Dagostino wrote:
> > > > Thanks, Colin.
> > > >
> > > > DescribeScramUsersResponse returns a list of ScramUser instances, which
> > > > makes sense, but then each of the ScramUser instances only has a single
> > > > ScramUserMechanismInfo instance.  I believe it needs a List of those?
> > >
> > > Hi Ron,
> > >
> > > Sorry, that was a typo in the response JSON.  Fixed.
> > >
> > > >
> > > > Also, 

Re: [VOTE] KIP-554: Add Broker-side SCRAM Config API

2020-07-16 Thread Ron Dagostino
Thanks, Colin.  The standard "about" message for ThrottleTimeMs seems
to be "The
duration in milliseconds for which the request was throttled due to a quota
violation, or zero if the request did not violate any quota." as
opposed to "The
time spent waiting for quota." Should we adjust to match the typical
definition?

I'm wondering if describing Scram credentials should require READ privilege
rather than ALTER on the cluster?   Altering SCRAM credentials of course
requires ALTER privilege, and I can see the argument for requiring ALTER
privilege to describe them as well, but it did catch my eye as something
worth questioning/confirming.

I'm also now thinking that "UNKNOWN" shouldn't be listed in the
ScramMechanism enum.  I thought maybe it was a catch-all so we will always
be able to deserialize something regardless of what key actually appears,
but I just realized that SCRAM credentials and Client Quotas are mixed
together in the same JSON, so it will be up to the corresponding API to
ignore what it doesn't recognize -- i.e. if both client quotas and SCRAM
credentials are defined for a user, then invoking DescribeClientQuotas must
only describe the quota configs and invoking DescribeScramUsers must only
describe the SCRAM configs.  Also, note that invoking kafka-configs.sh
--bootstrap-server ... --entity-type user --describe will require the
invocation of two separate APIs -- one to describe quotas and one to
describe SCRAM credentials; I don't think this is a problem, but I did want
to call it out explicitly.

Finally, there is a lack of consistency regarding how we name things.  For
example, we are calling these APIs {Describe,Alter}ScramUsers and we have
declared the classes {Describe,Alter}ScramUserOptions, which matches up
fine.  We also have public class DescribeScramUsersResult, which also
matches.  But then we have public class AlterScramCredentialsResult,
interface ScramCredentialAlteration, public class ScramCredentialAddition,
and public class ScramCredentialDeletion, none of which match up in terms
of consistency of naming.  I wonder if we should always use
"UserScramCredential" everywhere since that is technically what these API's
are about: describing/altering Users' SCRAM credentials.  So the APis would
be {Describe,Alter}UserScramCredentials, and everything else that is
publiuc that now refers inconsistently to either ScramUsers or
ScramCredential would instead refer to UserScramCredentials (sometimes
singular rather than plural if warranted).  For example: public class {
Describe,Alter}UserScramCredentialsResult, interface User
ScramCredentialAlteration, public class UserScramCredentialAddition, and
public class UserScramCredentialDeletion

Ron


On Wed, Jul 15, 2020 at 5:53 PM Colin McCabe  wrote:

> Hi all,
>
> Thanks, everyone, for reviewing.
>
> Since we made a few changes to the RPCs in the last few days, I'm going to
> extend the vote until Monday and close it out then if it looks good.
>
> best,
> Colin
>
>
> On Wed, Jul 15, 2020, at 14:47, Colin McCabe wrote:
> > On Tue, Jul 14, 2020, at 16:23, Ron Dagostino wrote:
> > > Thanks, Colin.
> > >
> > > DescribeScramUsersResponse returns a list of ScramUser instances, which
> > > makes sense, but then each of the ScramUser instances only has a single
> > > ScramUserMechanismInfo instance.  I believe it needs a List of those?
> >
> > Hi Ron,
> >
> > Sorry, that was a typo in the response JSON.  Fixed.
> >
> > >
> > > Also, ScramUserMechanismInfo probably needs a better "about" value (it
> > > currently says "The user name.")
> > >
> >
> > Also fixed :)
> >
> > >
> > > Should both responses include ThrottleTimeMs fields?
> > >
> >
> > Good call.  I added this.
> >
> > best,
> > Colin
> >
> > >
> > > I haven't looked at the AlterScramUsers stuff yet; I'll look at that in
> > > detail tomorrow.
> > >
> > > Ron
> > >
> > >
> > > On Tue, Jul 14, 2020 at 4:11 PM Colin McCabe 
> wrote:
> > >
> > > > On Tue, Jul 14, 2020, at 07:57, Ron Dagostino wrote:
> > > > > Hi again, Colin.  I also just realized a couple of other
> > > > incompatibilities
> > > > > with the way kafka-configs works today that prevents
> --bootstrap-server
> > > > > from being a drop-in replacement.  This may or may not be a hard
> > > > > requirement, but we should explicitly decide on these one way or
> the
> > > > other.
> > > > >
> > > > > One issue is that it is legal to list the SCRAM credentials for a
> single
> > > > > user with kafka-configs (e.g. bin/kafka-configs.sh --zookeeper
> > > > > localhost:2181 --describe --entity-type users --entity-name
> alice).  The
> > > > > current ListScramUsersRequest API does not support specifying an
> > > > (optional)
> > > > > user name, so it always returns all users' SCRAM credentials.  We
> could
> > > > > filter the list on the client side, of course, but that seems
> inefficient.
> > > > >
> > > >
> > > > Hi Ron,
> > > >
> > > > Yes, I think we should allow listing just a particular scram user or
> > > > users.  I will 

[jira] [Created] (KAFKA-10278) kafka-configs does not show the current properties of running kafka broker upon describe.

2020-07-16 Thread kaushik srinivas (Jira)
kaushik srinivas created KAFKA-10278:


 Summary: kafka-configs does not show the current properties of 
running kafka broker upon describe.
 Key: KAFKA-10278
 URL: https://issues.apache.org/jira/browse/KAFKA-10278
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.4.1
Reporter: kaushik srinivas


kafka-configs.sh does not list the properties 
(read-only/per-broker/cluster-wide) with which the kafka broker is currently 
running.

The command returns nothing.

Only those properties added or updated via kafka-configs.sh are listed by the 
describe command.

bash-4.2$ env -i bin/kafka-configs.sh --bootstrap-server 
kf-test-0.kf-test-headless.test.svc.cluster.local:9092 --entity-type brokers 
--entity-default --describe
Default config for brokers in the cluster are:
  log.cleaner.threads=2 sensitive=false 
synonyms={DYNAMIC_DEFAULT_BROKER_CONFIG:log.cleaner.threads=2}

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-10160) Kafka MM2 consumer configuration

2020-07-16 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-10160.

Fix Version/s: 2.7.0
   Resolution: Fixed

> Kafka MM2 consumer configuration
> 
>
> Key: KAFKA-10160
> URL: https://issues.apache.org/jira/browse/KAFKA-10160
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.5.0, 2.4.1
>Reporter: Pavol Ipoth
>Assignee: sats
>Priority: Major
>  Labels: configuration, kafka, mirror-maker
> Fix For: 2.7.0
>
>
> [https://github.com/apache/kafka/blob/trunk/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorMakerConfig.java#L51]
>  according to this, producer/consumer-level properties should be configured 
> as e.g. somesource->sometarget.consumer.client.id. I try to set 
> somesource->sometarget.consumer.auto.offset.reset=latest, but without 
> success; the consumer always tries to fetch earliest. Not sure if this is a 
> bug or my misconfiguration, but then at least some update to the docs would 
> be useful



--
This message was sent by Atlassian Jira
(v8.3.4#803005)