Re: [DISCUSS] KIP-567: Kafka Cluster Audit (new discussion)

2020-09-17 Thread Dániel Urbán
Hi,

Thanks for the KIP.

If the auditor needs access to the details of the action, one could argue
that even the response should be passed down to the auditor.
Is it feasible to convert the Java requests and responses to public API?
If not, do we have another option to access this info in the auditor?
I know that the auditor could just send proper requests through the API to
the brokers, but that seems like an awful lot of overhead, and could
introduce timing issues as well.

Daniel
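
For context, a minimal sketch of the Auditor hook being discussed, pieced
together from the descriptions later in this thread. All names are
illustrative stand-ins rather than the KIP's final public API; AbstractRequest
is Kafka's internal request type and AuditInfo is the KIP's helper class, so
both are stubbed here to keep the example self-contained:

    import java.util.Map;

    class AbstractRequest {}                    // stand-in for the internal type

    interface AuthorizableRequestContext {      // subset of the real public API
        int requestType();                      // the ApiKey of the request
    }

    class AuditInfo {                           // illustrative helper class
        final boolean allowed;                  // authorization result
        final String error;                     // null when the call succeeded
        AuditInfo(boolean allowed, String error) {
            this.allowed = allowed;
            this.error = error;
        }
    }

    interface Auditor extends AutoCloseable {
        void configure(Map<String, ?> configs); // setup, as with Authorizer

        // Called after authorization; auditInfos holds per-resource results.
        void audit(AbstractRequest request,
                   AuthorizableRequestContext requestContext,
                   Map<String, AuditInfo> auditInfos);
    }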


Viktor Somogyi-Vass wrote (on Wed, Sep 16, 2020, at 17:17):

> One more after-thought on your second point (AbstractRequest): the reason I
> introduced it in the first place was that this way implementers can access
> request data. One use case is auditing a change in configuration or client
> quotas: not just acknowledging that such an event happened, but also
> capturing the change itself by peeking into the request. This can be useful
> when people want to trace back the order of events, and what happened when,
> rather than just acknowledge that there was an event of a certain kind. I
> also recognize that this might be a very loose interpretation of auditing,
> as it's not strictly related to authorization but rather a way of tracing
> admin actions within the cluster. It could therefore even be a separate
> API, but because of the variety of the Kafka APIs it's very hard to give a
> method that fits all, so it's easier to pass down the AbstractRequest and
> let the implementation extract the valuable info. So that's why I added
> this in the first place, and I'm interested in your thoughts.
>
> On Wed, Sep 16, 2020 at 4:41 PM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com>
> wrote:
>
> > Hi Mickael,
> >
> > Thanks for reviewing the KIP.
> >
> > 1.) I just wanted to follow the conventions used with the Authorizer, as it
> > is built in a similar fashion, although it's true that in KafkaServer we
> > call the configure() method and then start() on the next line. This would
> > be the same in Auditor, and even simpler, as there aren't any parameters to
> > start(), so I can remove it. If it turns out there is a need for it, we can
> > add it later.
> >
> > 2.) Yes, this is a very good point, I will remove it; however, in this case
> > I don't think we need to add the ApiKey as it is already available in
> > AuthorizableRequestContext.requestType(). One less parameter :).
> >
> > 3.) I'll add it. It will simply log important changes in the cluster, like
> > topic events (create, update, delete, partition or replication factor
> > change), ACL events, config changes, reassignment, altering log dirs,
> > offset delete, and group delete, along with the authorization info: who
> > initiated the call, whether it was authorized, and whether there were any
> > errors. Let me know if you think there are other APIs I should include.
> >
> > 4.) The builder is there mostly for easier usability, but actually,
> > thinking about it, it doesn't help much, so I removed it. The AuditInfo is
> > also a helper class, so I don't see any value in transforming it into an
> > interface, but if I simplify it (by removing the builder) it will be
> > cleaner. Would that work?
> >
> > I'll update the KIP to reflect my answers.
> >
> > Viktor
> >
> >
> > On Mon, Sep 14, 2020 at 6:02 PM Mickael Maison wrote:
> >
> >> Hi Viktor,
> >>
> >> Thanks for restarting the discussion on this KIP. Being able to easily
> >> audit usage of a Kafka cluster is a very valuable feature.
> >>
> >> Regarding the API, I have a few questions:
> >> 1) You introduced a start() method. I don't think any other interfaces
> >> have such a method. Users can do any setup they want in configure()
> >>
> >> 2) The first argument of audit is an AbstractRequest. Unfortunately
> >> this type is not part of the public API. But actually I'm not sure
> >> having the full request is really needed here. Maybe just passing the
> >> ApiKey would be enough, as we already have all the resources from the
> >> auditInfos field.
> >>
> >> 3) The KIP mentions a "LoggingAuditor" default implementation. What is
> >> it doing? Can you add more details about it?
> >>
> >> 4) Can fields of AuditInfo be null? I can see there's a constructor
> >> without an Errors and that sets the error field to None. However, with
> >> the builder pattern, if error is not set it's null.
> >>
> >> 5) Should AuditInfo be an interface?
> >>
> >> On Mon, Sep 14, 2020 at 3:26 PM Viktor Somogyi-Vass
> >>  wrote:
> >> >
> >> > Hi everyone,
> >> >
> >> > Changed the interface a little bit to better accommodate methods where
> >> > authorization happens for multiple operations, so the implementer of the
> >> > audit interface will receive all authorizations together.
> >> > I'll wait a few more days to allow people to react or give feedback, but
> >> > if there are no objections until then, I'll start a vote.
> >> >
> >> > Viktor
> >> >
> >> > On Tue, Sep 8, 2020 at 9:49 AM Vik
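
Viktor's point 3.) above describes what the default LoggingAuditor would
record; a hypothetical implementation against the Auditor sketch shown
earlier in this section (again purely illustrative, not the KIP's code):

    import java.util.Map;

    // Logs which API was called, per-resource authorization results, and any
    // errors, along the lines of point 3.) above.
    class LoggingAuditor implements Auditor {
        @Override
        public void configure(Map<String, ?> configs) {}

        @Override
        public void audit(AbstractRequest request,
                          AuthorizableRequestContext context,
                          Map<String, AuditInfo> auditInfos) {
            auditInfos.forEach((resource, info) ->
                System.out.printf("apiKey=%d resource=%s authorized=%s error=%s%n",
                                  context.requestType(), resource,
                                  info.allowed, info.error));
        }

        @Override
        public void close() {}
    }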

Re: [VOTE] KIP-631: The Quorum-based Kafka Controller

2020-09-17 Thread Unmesh Joshi
Thanks for the KIP.

+1 (non-binding)

On Tue, Sep 15, 2020 at 12:23 AM Colin McCabe  wrote:

> Hi all,
>
> I'd like to call a vote on KIP-631: the quorum-based Kafka Controller.
> The KIP is here:
>
> https://cwiki.apache.org/confluence/x/4RV4CQ
>
> The DISCUSS thread is here:
>
>
> https://lists.apache.org/thread.html/r1ed098a88c489780016d963b065e8cb450a9080a4736457cd25f323c%40%3Cdev.kafka.apache.org%3E
>
> Please take a look and vote if you can.
>
> best,
> Colin
>


Build failed in Jenkins: Kafka » kafka-trunk-jdk8 #69

2020-09-17 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Fix common struct `JsonConverter` and `Schema` generation 
(#9279)


--
[...truncated 6.53 MB...]
[test output elided: org.apache.kafka.streams.test.OutputVerifierTest, all listed tests PASSED; log truncated]

[jira] [Resolved] (KAFKA-10495) Fix spelling mistake

2020-09-17 Thread Jira


 [ 
https://issues.apache.org/jira/browse/KAFKA-10495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

欧阳能达 resolved KAFKA-10495.
--
Resolution: Won't Fix

> Fix spelling mistake
> 
>
> Key: KAFKA-10495
> URL: https://issues.apache.org/jira/browse/KAFKA-10495
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: 欧阳能达
>Priority: Trivial
>  Labels: newbie
>
> In the trunk branch,
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll
> (ConsumerCoordinator.java, line 465) has the Javadoc: @return true *iff* the
> operation succeeded.
> The word *iff* is a mistake.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Build failed in Jenkins: Kafka » kafka-trunk-jdk15 #69

2020-09-17 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Fix common struct `JsonConverter` and `Schema` generation 
(#9279)


--
[...truncated 3.29 MB...]

[test output elided: org.apache.kafka.streams.test.OutputVerifierTest and org.apache.kafka.streams.MockProcessorContextTest, all listed tests PASSED; log truncated]

Build failed in Jenkins: Kafka » kafka-trunk-jdk11 #70

2020-09-17 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Fix common struct `JsonConverter` and `Schema` generation 
(#9279)


--
[...truncated 3.29 MB...]
[test output elided: org.apache.kafka.streams.test.ConsumerRecordFactoryTest, all listed tests PASSED; log truncated]

Jenkins build is back to normal : Kafka » kafka-trunk-jdk15 #68

2020-09-17 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-17 Thread Justine Olshan
Hi Jun,
Thanks for the quick response!

12. I've decided to bump up the versions on the requests and updated the
KIP. I think it's good we thoroughly discussed the options here, so we know
we made a good choice. :)

13. This is an interesting situation. I think if this does occur we should
give a warning. I agree that it's hard to know the source of truth for sure
since the directory or the file could be manually modified. I guess the
directory could be used as the source of truth. To be honest, I'm not
really sure what happens in kafka when the log directory is renamed
manually in such a way. I'm also wondering if the situation is recoverable
in this scenario.

Thanks,
Justine
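
For reference, the partition metadata file under discussion is a small
key-value text file in each partition's log directory. A sketch of what it
could contain if the debated topic name and partition fields were included
(contents illustrative, based only on this thread's description of the
KIP-516 draft):

    # <log.dir>/topicA-1/partition.metadata  (illustrative)
    version: 0
    topic_id: 1TjmLkSfQ9yZhWqCuaxrNg
    # fields under debate; today they are implied by the directory name:
    topic_name: topicA
    partition: 1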

On Thu, Sep 17, 2020 at 4:28 PM Jun Rao  wrote:

> Hi, Justine,
>
> Thanks for the reply.
>
> 12. I don't have a strong preference either. However, if we need IBP
> anyway, maybe it's easier to just bump up the version for all inter broker
> requests and add the topic id field as a regular field. A regular field is
> a bit more concise in wire transfer than a flexible field.
>
> 13. The confusion that I was referring to is a mismatch of topic name and
> partition number between the log dir and the metadata file. For example, if
> the log dir is topicA-1 and the metadata file in it has topicB and
> partition 0 (say due to a bug or manual modification), which one do we use
> as the source of truth?
>
> Jun
>
> On Thu, Sep 17, 2020 at 3:43 PM Justine Olshan 
> wrote:
>
> > Hi Jun,
> > Thanks for the comments.
> >
> > 12. I bumped the LeaderAndIsrRequest because I removed the topic name field
> > in the response. It may be possible to avoid bumping the version without
> > that change, but I may be missing something.
> > I believe StopReplica is actually on version 3 now, but because version 2
> > is flexible, I kept that listed as version 2 on the KIP page. However, you
> > may be right in that we may need to bump the version on StopReplica to deal
> > with deletion differently as mentioned above. I don't know if I have a big
> > preference over using tagged fields or not.
> >
> > 13. I was thinking that in the case where the file and the request topic
> > ids don't match, it means that the broker's topic/the one in the file has
> > been deleted. In that case, we would need to delete the old topic and start
> > receiving the new version. If the topic name were to change, but the ids
> > still matched, the file would also need to update. Am I missing a case
> > where the file would be correct and not the request?
> >
> > Thanks,
> > Justine
> >
> > On Thu, Sep 17, 2020 at 3:18 PM Jun Rao  wrote:
> >
> > > Hi, Justine,
> > >
> > > Thanks for the reply. A couple more comments below.
> > >
> > > 12. ListOffset and OffsetForLeader currently don't support flexible fields.
> > > So, we have to bump up the version number and use IBP at least for these
> > > two requests. Note that it seems 2.7.0 will require IBP anyway because of
> > > changes in KAFKA-10435. Also, it seems that the versions for
> > > LeaderAndIsrRequest and StopReplica are bumped even though we only added a
> > > tagged field. But since IBP is needed anyway, we may want to revisit the
> > > overall tagged field choice.
> > >
> > > 13. The only downside is the potential confusion on which one is the source
> > > of truth if they don't match. Another option is to include those fields in
> > > the metadata file when we actually change the directory structure.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Sep 17, 2020 at 2:01 PM Justine Olshan 
> > > wrote:
> > >
> > > > Hello all,
> > > >
> > > > I've thought some more about removing the topic name field from some of
> > > > the requests. On closer inspection of the requests/responses, it seems
> > > > that the internal changes would be much larger than I expected. Some
> > > > protocols involve clients, so they would require changes too. I'm
> > > > thinking that, for now, removing the topic name from these requests and
> > > > responses is out of scope.
> > > >
> > > > I have decided to keep just the change to LeaderAndIsrResponse to remove
> > > > the topic name, and have updated the KIP to reflect this change. I have
> > > > also mentioned the other requests and responses in future work.
> > > >
> > > > I'm hoping to start the voting process soon, so let me know if there
> is
> > > > anything else we should discuss.
> > > >
> > > > Thank you,
> > > > Justine
> > > >
> > > > On Tue, Sep 15, 2020 at 3:57 PM Justine Olshan wrote:
> > > >
> > > > > Hello again,
> > > > > To follow up on some of the other comments:
> > > > >
> > > > > 10/11) We can remove the topic name from these requests/responses,
> > and
> > > > > that means we will just have to make a few internal changes to make
> > > > > partitions accessible by topic id and partition. I can update the
> KIP
> > > to
> > > > > remove them unless anyone thinks they should stay.
> > > > >
> 

Build failed in Jenkins: Kafka » kafka-trunk-jdk8 #68

2020-09-17 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Fix now that kafka.apache.org resolves to 3 IP addresses (#9294)


--
[...truncated 3.27 MB...]

[test output elided: org.apache.kafka.streams.TestTopicsTest, all listed tests PASSED, followed by Gradle task output for the streams upgrade-system-tests modules; log truncated]

Build failed in Jenkins: Kafka » kafka-trunk-jdk11 #69

2020-09-17 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Fix now that kafka.apache.org resolves to 3 IP addresses (#9294)


--
[...truncated 3.29 MB...]
[test output elided: org.apache.kafka.streams.test.ConsumerRecordFactoryTest, all listed tests PASSED; log truncated]

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-17 Thread Jun Rao
Hi, Justine,

Thanks for the reply.

12. I don't have a strong preference either. However, if we need IBP
anyway, maybe it's easier to just bump up the version for all inter broker
requests and add the topic id field as a regular field. A regular field is
a bit more concise in wire transfer than a flexible field.

13. The confusion that I was referring to is a mismatch of topic name and
partition number between the log dir and the metadata file. For example, if
the log dir is topicA-1 and the metadata file in it has topicB and
partition 0 (say due to a bug or manual modification), which one do we use
as the source of truth?

Jun
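
Jun's topicA-1 vs. topicB example can be made concrete as the kind of sanity
check discussed in this thread (give a warning on mismatch). This sketch
assumes the metadata file carries topic_name and partition keys, which is
exactly the open question here; none of it is actual Kafka code:

    // Compare the topic/partition parsed from the log dir name against the
    // (hypothetical) partition.metadata contents, and warn if they disagree.
    import java.io.Reader;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.Properties;

    class PartitionMetadataCheck {
        static void check(Path partitionDir) throws Exception {
            String dirName = partitionDir.getFileName().toString(); // e.g. "topicA-1"
            int dash = dirName.lastIndexOf('-');
            String dirTopic = dirName.substring(0, dash);
            int dirPartition = Integer.parseInt(dirName.substring(dash + 1));

            Properties meta = new Properties();
            try (Reader in = Files.newBufferedReader(
                    partitionDir.resolve("partition.metadata"))) {
                meta.load(in);                  // parses "key: value" lines
            }

            String fileTopic = meta.getProperty("topic_name");
            int filePartition = Integer.parseInt(meta.getProperty("partition"));
            if (!dirTopic.equals(fileTopic) || dirPartition != filePartition) {
                System.err.printf("WARN: log dir %s disagrees with metadata file (%s-%d)%n",
                                  dirName, fileTopic, filePartition);
            }
        }
    }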

On Thu, Sep 17, 2020 at 3:43 PM Justine Olshan  wrote:

> Hi Jun,
> Thanks for the comments.
>
> 12. I bumped the LeaderAndIsrRequest because I removed the topic name field
> in the response. It may be possible to avoid bumping the version without
> that change, but I may be missing something.
> I believe StopReplica is actually on version 3 now, but because version 2
> is flexible, I kept that listed as version 2 on the KIP page. However, you
> may be right in that we may need to bump the version on StopReplica to deal
> with deletion differently as mentioned above. I don't know if I have a big
> preference over using tagged fields or not.
>
> 13. I was thinking that in the case where the file and the request topic
> ids don't match, it means that the broker's topic/the one in the file has
> been deleted. In that case, we would need to delete the old topic and start
> receiving the new version. If the topic name were to change, but the ids
> still matched, the file would also need to update. Am I missing a case
> where the file would be correct and not the request?
>
> Thanks,
> Justine
>
> On Thu, Sep 17, 2020 at 3:18 PM Jun Rao  wrote:
>
> > Hi, Justine,
> >
> > Thanks for the reply. A couple more comments below.
> >
> > 12. ListOffset and OffsetForLeader currently don't support flexible fields.
> > So, we have to bump up the version number and use IBP at least for these
> > two requests. Note that it seems 2.7.0 will require IBP anyway because of
> > changes in KAFKA-10435. Also, it seems that the versions for
> > LeaderAndIsrRequest and StopReplica are bumped even though we only added a
> > tagged field. But since IBP is needed anyway, we may want to revisit the
> > overall tagged field choice.
> >
> > 13. The only downside is the potential confusion on which one is the source
> > of truth if they don't match. Another option is to include those fields in
> > the metadata file when we actually change the directory structure.
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Sep 17, 2020 at 2:01 PM Justine Olshan 
> > wrote:
> >
> > > Hello all,
> > >
> > > I've thought some more about removing the topic name field from some of
> > > the requests. On closer inspection of the requests/responses, it seems
> > > that the internal changes would be much larger than I expected. Some
> > > protocols involve clients, so they would require changes too. I'm thinking
> > > that, for now, removing the topic name from these requests and responses
> > > is out of scope.
> > >
> > > I have decided to keep just the change to LeaderAndIsrResponse to remove
> > > the topic name, and have updated the KIP to reflect this change. I have
> > > also mentioned the other requests and responses in future work.
> > >
> > > I'm hoping to start the voting process soon, so let me know if there is
> > > anything else we should discuss.
> > >
> > > Thank you,
> > > Justine
> > >
> > > On Tue, Sep 15, 2020 at 3:57 PM Justine Olshan 
> > > wrote:
> > >
> > > > Hello again,
> > > > To follow up on some of the other comments:
> > > >
> > > > 10/11) We can remove the topic name from these requests/responses,
> and
> > > > that means we will just have to make a few internal changes to make
> > > > partitions accessible by topic id and partition. I can update the KIP
> > to
> > > > remove them unless anyone thinks they should stay.
> > > >
> > > > 12) Addressed in the previous email. I've updated the KIP to include
> > > > tagged fields for the requests and responses. (More on that below)
> > > >
> > > > 13) I think part of the idea for including this information is to
> > prepare
> > > > for future changes. Perhaps the directory structure might change from
> > > > topicName_partitionNumber to something like topicID_partitionNumber.
> > Then
> > > > it would be useful to have the topic name in the file since it would
> > not
> > > be
> > > > in the directory structure. Supporting topic renames might be easier
> if
> > > the
> > > > other fields are included. Would there be any downsides to including
> > this
> > > > information?
> > > >
> > > > 14)  Yes, we would need to copy the partition metadata file in this
> > > > process. I've updated the KIP to include this.
> > > >
> > > > 15) I believe Lucas meant v1 and v2 here. He was referring to how the
> > > > requests would fall under different IBP and m

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-09-17 Thread Colin McCabe
Hi Unmesh,

That's a fair point.  I have moved the lease duration to the broker heartbeat
response.  That way lease durations can be changed just by reconfiguring the
controllers.

best,
Colin

On Wed, Sep 16, 2020, at 07:40, Unmesh Joshi wrote:
> Thanks Colin, the changes look good to me. One small thing:
> registration.lease.timeout.ms is a configuration on the controller side.
> It would be good to comment on how brokers learn about it, so they can send
> LeaseDurationMs in the heartbeat request; alternatively, it can be added to
> the heartbeat response for brokers to know about it.
> 
> Thanks,
> Unmesh
> 
> On Fri, Sep 11, 2020 at 10:32 PM Colin McCabe  wrote:
> 
> > Hi Unmesh,
> >
> > I think you're right that we should use a duration here rather than a
> > time.  As you said, the clock on the controller will probably not match the
> > one on the broker.  I have updated the KIP.
> >
> > > > It's important to keep in mind that messages may be delayed in the
> > > > network, or arrive out of order.  When this happens, we will use the
> > start
> > > > time specified in the request to determine if the request is stale.
> > > I am assuming that there will be a single TCP connection maintained
> > between
> > > broker and active controller. So, there won't be any out of order
> > requests?
> > > There will be a scenario of broker GC pause, which might cause connection
> > > timeout and broker might need to reestablish the connection. If the pause
> > > is too long, lease will expire and the heartbeat sent after the pause
> > will
> > > be treated as a new registration (similar to restart case), and a new
> > epoch
> > > number will be assigned to the broker.
> >
> > I agree with the end of this paragraph, but not with the start :)
> >
> > There can be out-of-order requests, since the broker will simply use a new
> > TCP connection if the old one has problems.  This can happen for a variety
> > of reasons.  I don't think GC pauses are the most common reason for this to
> > happen.  It's more common to see issues in the network itself that result
> > in connections getting dropped from time to time.
> >
> > So we have to assume that messages may arrive out of order, and possibly
> > be delayed.  I added a note that heartbeat requests should be ignored if
> > the metadata log offset they contain is smaller than a previous heartbeat.
> >
> > > When the active controller fails, the new active controller needs to be
> > > sure that it considers all the known brokers as alive till the lease
> > > expiration interval.  Because registration.lease.timeout.ms, is
> > configured
> > > on the controller, the new active controller will extend all the leases
> > by
> > > registration.lease.timeout.ms. I see that it won't need last heartbeat
> > > time.
> >
> > Agreed.
> >
> > best,
> > Colin
> >
> > >
> > > Thanks,
> > > Unmesh
> > >
> > > On Sat, Sep 5, 2020 at 1:28 AM Colin McCabe  wrote:
> > >
> > > > > Colin wrote:
> > > > > > The reason for including LeaseStartTimeMs in the request is to
> > ensure
> > > > > > that the time required to communicate with the controller gets
> > > > included in
> > > > > > the lease time.  Since requests can potentially be delayed in the
> > > > network
> > > > > > for a long time, this is important.
> > > >
> > > > On Mon, Aug 31, 2020, at 05:58, Unmesh Joshi wrote:
> > > > > The network time will be added anyway, because the lease timer on the
> > > > > active controller will start only after the heartbeat request
> > reaches the
> > > > > server.
> > > >
> > > > Hi Unmesh,
> > > >
> > > > If the start time is not specified in the request, then the network
> > time
> > > > is excluded from the heartbeat time.
> > > >
> > > > Here's an example:
> > > > Let's say broker A sends a heartbeat at time 100, and it arrives on the
> > > > controller at time 200, and the lease duration is 1000.
> > > >
> > > > The controller looks at the start time in the request, which is 100,
> > and
> > > > adds 1000 to it, getting 1100.  On the other hand, if start time is not
> > > > specified in the request, then the expiration will be at time 1200.
> > > > That is what I mean by "the network time is included in the expiration
> > > > time."
> > > >
> > > > > And I think, some assumption about network round trip time is
> > > > > needed anyway to decide on the frequency of the heartbeat (
> > > > > registration.heartbeat.interval.ms), and lease timeout (
> > > > > registration.lease.timeout.ms). So I think just having a leaseTTL
> > in the
> > > > > request is easier to understand and implement.
> > > >
> > > > It's important to keep in mind that messages may be delayed in the
> > > > network, or arrive out of order.  When this happens, we will use the
> > start
> > > > time specified in the request to determine if the request is stale.
> > > >
> > > > > > Yes, I agree that the lease timeout on the controller side should be
> > > > > > reset in the case of controller failover.  The alternative
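
The arithmetic in Colin's example above (heartbeat sent at time 100, received
at 200, lease duration 1000) comes down to a few lines; a sketch of the two
expiry rules being compared (names are illustrative, not from the KIP):

    // Sketch of the two lease-expiry rules discussed in this thread.
    public class LeaseExpiryExample {
        public static void main(String[] args) {
            long leaseStartTimeMs = 100;   // from the request (broker's clock)
            long receiveTimeMs = 200;      // controller's clock on arrival
            long leaseDurationMs = 1000;

            // Start time taken from the request: the network delay counts
            // against the lease, so it expires at 100 + 1000 = 1100.
            long expiryFromRequest = leaseStartTimeMs + leaseDurationMs;

            // No start time in the request: the controller uses its own
            // receive time, so the lease expires at 200 + 1000 = 1200.
            long expiryFromReceive = receiveTimeMs + leaseDurationMs;

            System.out.printf("from request: %d, from receive: %d%n",
                              expiryFromRequest, expiryFromReceive);
        }
    }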

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-17 Thread Justine Olshan
Hi Jun,
Thanks for the comments.

12. I bumped the LeaderAndIsrRequest because I removed the topic name field
in the response. It may be possible to avoid bumping the version without
that change, but I may be missing something.
I believe StopReplica is actually on version 3 now, but because version 2
is flexible, I kept that listed as version 2 on the KIP page. However, you
may be right in that we may need to bump the version on StopReplica to deal
with deletion differently as mentioned above. I don't know if I have a big
preference over using tagged fields or not.

13. I was thinking that in the case where the file and the request topic
ids don't match, it means that the broker's topic/the one in the file has
been deleted. In that case, we would need to delete the old topic and start
receiving the new version. If the topic name were to change, but the ids
still matched, the file would also need to update. Am I missing a case
where the file would be correct and not the request?

Thanks,
Justine

On Thu, Sep 17, 2020 at 3:18 PM Jun Rao  wrote:

> Hi, Justine,
>
> Thanks for the reply. A couple more comments below.
>
> 12. ListOffset and OffsetForLeader currently don't support flexible fields.
> So, we have to bump up the version number and use IBP at least for these
> two requests. Note that it seems 2.7.0 will require IBP anyway because of
> changes in KAFKA-10435. Also, it seems that the versions for
> LeaderAndIsrRequest and StopReplica are bumped even though we only added a
> tagged field. But since IBP is needed anyway, we may want to revisit the
> overall tagged field choice.
>
> 13. The only downside is the potential confusion on which one is the source
> of truth if they don't match. Another option is to include those fields in
> the metadata file when we actually change the directory structure.
>
> Thanks,
>
> Jun
>
> On Thu, Sep 17, 2020 at 2:01 PM Justine Olshan 
> wrote:
>
> > Hello all,
> >
> > I've thought some more about removing the topic name field from some of the
> > requests. On closer inspection of the requests/responses, it seems that the
> > internal changes would be much larger than I expected. Some protocols
> > involve clients, so they would require changes too. I'm thinking that, for
> > now, removing the topic name from these requests and responses is out of
> > scope.
> >
> > I have decided to keep just the change to LeaderAndIsrResponse to remove
> > the topic name, and have updated the KIP to reflect this change. I have
> > also mentioned the other requests and responses in future work.
> > I'm hoping to start the voting process soon, so let me know if there is
> > anything else we should discuss.
> >
> > Thank you,
> > Justine
> >
> > On Tue, Sep 15, 2020 at 3:57 PM Justine Olshan 
> > wrote:
> >
> > > Hello again,
> > > To follow up on some of the other comments:
> > >
> > > 10/11) We can remove the topic name from these requests/responses, and
> > > that means we will just have to make a few internal changes to make
> > > partitions accessible by topic id and partition. I can update the KIP
> to
> > > remove them unless anyone thinks they should stay.
> > >
> > > 12) Addressed in the previous email. I've updated the KIP to include
> > > tagged fields for the requests and responses. (More on that below)
> > >
> > > 13) I think part of the idea for including this information is to
> prepare
> > > for future changes. Perhaps the directory structure might change from
> > > topicName_partitionNumber to something like topicID_partitionNumber.
> Then
> > > it would be useful to have the topic name in the file since it would
> not
> > be
> > > in the directory structure. Supporting topic renames might be easier if
> > the
> > > other fields are included. Would there be any downsides to including
> this
> > > information?
> > >
> > > 14)  Yes, we would need to copy the partition metadata file in this
> > > process. I've updated the KIP to include this.
> > >
> > > 15) I believe Lucas meant v1 and v2 here. He was referring to how the
> > > requests would fall under different IBP and meant that older brokers
> > would
> > > have to use the older version of the request and the existing topic
> > > deletion process. At first, it seemed like tagged fields would resolve
> > > the IBP issue. However, we may need IBP for this request after all since
> > > the controller handles the topic deletion differently depending on the
> > > IBP version. In an older version, we can't just send a StopReplica to
> > > delete the topic immediately like we'd want to for this KIP.
> > >
> > > This makes me wonder if we want tagged fields on all the requests after
> > > all. Let me know your thoughts!
> > >
> > > Justine
> > >
> > > On Tue, Sep 15, 2020 at 1:03 PM Justine Olshan 
> > > wrote:
> > >
> > >> Hi all,
> > >> Jun brought up a good point in his last email about tagged fields, and
> > >> I've updated the KIP to reflect that the changes to requests and
> > responses
> > >> will be in the 

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-17 Thread Jun Rao
Hi, Justine,

Thanks for the reply. A couple more comments below.

12. ListOffset and OffsetForLeader currently don't support flexible fields.
So, we have to bump up the version number and use IBP at least for these
two requests. Note that it seems 2.7.0 will require IBP anyway because of
changes in KAFKA-10435. Also, it seems that the versions for
LeaderAndIsrRequest and StopReplica are bumped even though we only added a
tagged field. But since IBP is needed anyway, we may want to revisit the
overall tagged field choice.
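
To illustrate the trade-off, here is how the two choices would look in the
style of Kafka's protocol JSON definitions (the field name and versions shown
are illustrative, not the KIP's final schema):

    // Regular field: old brokers cannot parse the new version, so it needs a
    // version bump (gated by IBP), but it is encoded with no per-field tag.
    { "name": "TopicId", "type": "uuid", "versions": "5+" }

    // Tagged field: legal only in flexible versions; unknown tags are skipped
    // by old readers, so no version bump is needed, at the cost of an extra
    // tag and length on the wire.
    { "name": "TopicId", "type": "uuid", "versions": "5+",
      "tag": 0, "taggedVersions": "5+" }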

13. The only downside is the potential confusion on which one is the source
of truth if they don't match. Another option is to include those fields in
the metadata file when we actually change the directory structure.

Thanks,

Jun

On Thu, Sep 17, 2020 at 2:01 PM Justine Olshan  wrote:

> Hello all,
>
> I've thought some more about removing the topic name field from some of the
> requests. On closer inspection of the requests/responses, it seems that the
> internal changes would be much larger than I expected. Some protocols
> involve clients, so they would require changes too. I'm thinking that for
> now, removing the topic name from these requests and responses is out of
> scope.
>
> I have decided to keep just the change to LeaderAndIsrResponse to remove the
> topic name, and have updated the KIP to reflect this change. I have also
> mentioned the other requests and responses in future work.
>
> I'm hoping to start the voting process soon, so let me know if there is
> anything else we should discuss.
>
> Thank you,
> Justine
>
> On Tue, Sep 15, 2020 at 3:57 PM Justine Olshan 
> wrote:
>
> > Hello again,
> > To follow up on some of the other comments:
> >
> > 10/11) We can remove the topic name from these requests/responses, and
> > that means we will just have to make a few internal changes to make
> > partitions accessible by topic id and partition. I can update the KIP to
> > remove them unless anyone thinks they should stay.
> >
> > 12) Addressed in the previous email. I've updated the KIP to include
> > tagged fields for the requests and responses. (More on that below)
> >
> > 13) I think part of the idea for including this information is to prepare
> > for future changes. Perhaps the directory structure might change from
> > topicName_partitionNumber to something like topicID_partitionNumber. Then
> > it would be useful to have the topic name in the file since it would not
> be
> > in the directory structure. Supporting topic renames might be easier if
> the
> > other fields are included. Would there be any downsides to including this
> > information?
> >
> > 14)  Yes, we would need to copy the partition metadata file in this
> > process. I've updated the KIP to include this.
> >
> > 15) I believe Lucas meant v1 and v2 here. He was referring to how the
> > requests would fall under different IBP and meant that older brokers
> would
> > have to use the older version of the request and the existing topic
> > deletion process. At first, it seemed like tagged fields would resolve
> > the IBP issue. However, we may need IBP for this request after all since
> > the controller handles the topic deletion differently depending on the
> > IBP version. In an older version, we can't just send a StopReplica to
> > delete the topic immediately like we'd want to for this KIP.
> >
> > This makes me wonder if we want tagged fields on all the requests after
> > all. Let me know your thoughts!
> >
> > Justine
> >
> > On Tue, Sep 15, 2020 at 1:03 PM Justine Olshan 
> > wrote:
> >
> >> Hi all,
> >> Jun brought up a good point in his last email about tagged fields, and
> >> I've updated the KIP to reflect that the changes to requests and
> responses
> >> will be in the form of tagged fields to avoid changing IBP.
> >>
> >> Jun: I plan on sending a followup email to address some of the other
> >> points.
> >>
> >> Thanks,
> >> Justine
> >>
> >> On Mon, Sep 14, 2020 at 4:25 PM Jun Rao  wrote:
> >>
> >>> Hi, Justine,
> >>>
> >>> Thanks for the updated KIP. A few comments below.
> >>>
> >>> 10. LeaderAndIsr Response: Do we need the topic name?
> >>>
> >>> 11. For the changed request/response, other than LeaderAndIsr,
> >>> UpdateMetadata, Metadata, do we need to include the topic name?
> >>>
> >>> 12. It seems that upgrades don't require IBP. Does that mean the new
> >>> fields
> >>> in all the request/response are added as tagged fields without bumping
> up
> >>> the request version? It would be useful to make that clear.
> >>>
> >>> 13. Partition Metadata file: Do we need to include the topic name and
> the
> >>> partition id since they are implied in the directory name?
> >>>
> >>> 14. In the JBOD mode, we support moving a partition's data from one
> disk
> >>> to
> >>> another. Will the new partition metadata file be copied during that
> >>> process?
> >>>
> >>> 15. The KIP says "Remove deleted topics from replicas by sending
> >>> StopReplicaRequest V2 for any topics which do not contain a topic ID,

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-09-17 Thread Justine Olshan
Hello all,

I've thought some more about removing the topic name field from some of the
requests. On closer inspection of the requests/responses, it seems that the
internal changes would be much larger than I expected. Some protocols
involve clients, so they would require changes too. I'm thinking that for
now, removing the topic name from these requests and responses is out of
scope.

I have decided to just keep the change to LeaderAndIsrResponse to remove the
topic name, and have updated the KIP to reflect this change. I have also
mentioned the other requests and responses in future work.
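
For illustration, the broker-side bookkeeping change this implies is roughly
the following sketch (not the actual LeaderAndIsrResponse code; java.util.UUID
stands in for whatever ID type the final schema uses):

import java.util.Map;
import java.util.UUID;

class PartitionErrorIndex {
    // before: per-partition error codes keyed by topic name
    Map<String, Map<Integer, Short>> errorsByTopicName;
    // after: keyed by topic ID; the topic name can still be recovered
    // from local metadata when needed, e.g. for logging
    Map<UUID, Map<Integer, Short>> errorsByTopicId;
}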

I'm hoping to start the voting process soon, so let me know if there is
anything else we should discuss.

Thank you,
Justine

On Tue, Sep 15, 2020 at 3:57 PM Justine Olshan  wrote:

> Hello again,
> To follow up on some of the other comments:
>
> 10/11) We can remove the topic name from these requests/responses, and
> that means we will just have to make a few internal changes to make
> partitions accessible by topic id and partition. I can update the KIP to
> remove them unless anyone thinks they should stay.
>
> 12) Addressed in the previous email. I've updated the KIP to include
> tagged fields for the requests and responses. (More on that below)
>
> 13) I think part of the idea for including this information is to prepare
> for future changes. Perhaps the directory structure might change from
> topicName_partitionNumber to something like topicID_partitionNumber. Then
> it would be useful to have the topic name in the file since it would not be
> in the directory structure. Supporting topic renames might be easier if the
> other fields are included. Would there be any downsides to including this
> information?
>
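> For the record, a hypothetical example of the file contents being debated
> (the field names here are invented for illustration; the KIP defines the
> actual format):
>
> version: 0
> topic_id: 0e3e2bcd-4645-4bd7-8b5f-27ae3a1d2a3b
> topic_name: my-topic
> partition: 7
>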
> 14)  Yes, we would need to copy the partition metadata file in this
> process. I've updated the KIP to include this.
>
> 15) I believe Lucas meant v1 and v2 here. He was referring to how the
> requests would fall under different IBP and meant that older brokers would
> have to use the older version of the request and the existing topic
> deletion process. At first, it seemed like tagged fields would resolve
> the IBP issue. However, we may need IBP for this request after all since
> the controller handles the topic deletion differently depending on the IBP
> version. In an older version, we can't just send a StopReplica to delete the
> topic immediately like we'd want to for this KIP.
>
> This makes me wonder if we want tagged fields on all the requests after
> all. Let me know your thoughts!
>
> Justine
>
> On Tue, Sep 15, 2020 at 1:03 PM Justine Olshan 
> wrote:
>
>> Hi all,
>> Jun brought up a good point in his last email about tagged fields, and
>> I've updated the KIP to reflect that the changes to requests and responses
>> will be in the form of tagged fields to avoid changing IBP.
>>
>> Jun: I plan on sending a followup email to address some of the other
>> points.
>>
>> Thanks,
>> Justine
>>
>> On Mon, Sep 14, 2020 at 4:25 PM Jun Rao  wrote:
>>
>>> Hi, Justine,
>>>
>>> Thanks for the updated KIP. A few comments below.
>>>
>>> 10. LeaderAndIsr Response: Do we need the topic name?
>>>
>>> 11. For the changed request/response, other than LeaderAndIsr,
>>> UpdateMetadata, Metadata, do we need to include the topic name?
>>>
>>> 12. It seems that upgrades don't require IBP. Does that mean the new
>>> fields
>>> in all the request/response are added as tagged fields without bumping up
>>> the request version? It would be useful to make that clear.
>>>
>>> 13. Partition Metadata file: Do we need to include the topic name and the
>>> partition id since they are implied in the directory name?
>>>
>>> 14. In the JBOD mode, we support moving a partition's data from one disk
>>> to
>>> another. Will the new partition metadata file be copied during that
>>> process?
>>>
>>> 15. The KIP says "Remove deleted topics from replicas by sending
>>> StopReplicaRequest V2 for any topics which do not contain a topic ID, and
>>> V3 for any topics which do contain a topic ID.". However, it seems the
>>> updated controller will create all missing topic IDs first before doing
>>> other actions. So, is StopReplicaRequest V2 needed?
>>>
>>> Jun
>>>
>>> On Fri, Sep 11, 2020 at 10:31 AM John Roesler 
>>> wrote:
>>>
>>> > Thanks, Justine!
>>> >
>>> > Your response seems compelling to me.
>>> >
>>> > -John
>>> >
>>> > On Fri, 2020-09-11 at 10:17 -0700, Justine Olshan wrote:
>>> > > Hello all,
>>> > > Thanks for continuing the discussion! I have a few responses to your
>>> > points.
>>> > >
>>> > > Tom: You are correct in that this KIP has not mentioned the
>>> > > DeleteTopicsRequest. I think that this would be out of scope for
>>> now, but
>>> > > may be something worth adding in the future.
>>> > >
>>> > > John: We did consider sequence ids, but there are a few reasons to
>>> favor
>>> > > UUIDs. There are several cases where topics from different clusters
>>> may
>>> > > interact now and in the future. For example, Mirror Maker 2 may

[jira] [Created] (KAFKA-10499) 4 Unit Tests are breaking after addition of a new A record to "apache.org"

2020-09-17 Thread Prateek Agarwal (Jira)
Prateek Agarwal created KAFKA-10499:
---

 Summary: 4 Unit Tests are breaking after addition of a new A 
record to "apache.org"
 Key: KAFKA-10499
 URL: https://issues.apache.org/jira/browse/KAFKA-10499
 Project: Kafka
  Issue Type: Bug
  Components: unit tests
Affects Versions: 2.5.1
Reporter: Prateek Agarwal


{{apache.org}} used to resolve to only 2 A records: 95.216.24.32 and
40.79.78.1

 

With the addition of a new A record, 95.216.26.30, 4 unit tests that expect 
the count of DNS resolutions to be 2 have started failing, because the count 
is now 3.
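
For reference, the current resolution can be checked with a few lines of standard JDK code (a 
minimal sketch; the failing assertions follow below):

{code:java}
import java.net.InetAddress;

public class DnsCheck {
    public static void main(String[] args) throws Exception {
        // prints one line per address record currently returned for apache.org
        for (InetAddress addr : InetAddress.getAllByName("apache.org")) {
            System.out.println(addr.getHostAddress());
        }
    }
}
{code}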

 
{code:java}
org.apache.kafka.clients.ClusterConnectionStatesTest > 
testMultipleIPsWithUseAll FAILED
java.lang.AssertionError: expected:<2> but was:<3>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.kafka.clients.ClusterConnectionStatesTest.testMultipleIPsWithUseAll(ClusterConnectionStatesTest.java:241)


org.apache.kafka.clients.ClusterConnectionStatesTest > testHostResolveChange 
FAILED
java.lang.AssertionError: expected:<2> but was:<3>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.kafka.clients.ClusterConnectionStatesTest.testHostResolveChange(ClusterConnectionStatesTest.java:256)

org.apache.kafka.clients.ClusterConnectionStatesTest > 
testMultipleIPsWithDefault FAILED
java.lang.AssertionError: expected:<2> but was:<3>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.kafka.clients.ClusterConnectionStatesTest.testMultipleIPsWithDefault(ClusterConnectionStatesTest.java:231)

org.apache.kafka.clients.ClientUtilsTest > testResolveDnsLookupAllIps FAILED
java.lang.AssertionError: expected:<2> but was:<3>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.kafka.clients.ClientUtilsTest.testResolveDnsLookupAllIps(ClientUtilsTest.java:87)
 {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Serious Java Consumer performance issues / expectations vs. librdkafka

2020-09-17 Thread Adam Bellemare
Hi

I am trying to use a plain Java consumer (over SSL) to consume a very large
amount of historic data (20+TB across 20+ partitions). Consumption
performance is very low when fully parallelized.

We are seeing about *200k rec/s* with the Java consumer versus *950k rec/s*
with librdkafka.
We are seeing about *1 gigabit/s* with the Java consumer versus *5.3 gigabit/s*
with librdkafka.

Both applications are doing no-ops (e.g., consume, deserialize as byte
arrays, print a line for every 100 events). Both applications are using
defaults (including the same fetch sizes, maximums, batch sizes, etc.). The
Java processes do not appear to be starved for resources (CPU, memory, etc.),
nor do the kafkacat instances. Everything is being run in exactly the same
environment with the same resources, but the Java Kafka client is just
incredibly slow.
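
For concreteness, a minimal sketch of the kind of no-op Java consumer described above (broker
address, topic name, group id, and the SSL keystore/truststore settings are placeholders):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class NoOpConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9093");
        props.put("group.id", "perf-test");
        props.put("security.protocol", "SSL"); // plus the usual ssl.* configs
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("historic-topic"));
            long count = 0;
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    // no-op apart from counting and the occasional print
                    if (++count % 100 == 0) {
                        System.out.println("consumed " + count + " records");
                    }
                }
            }
        }
    }
}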

Java Kafka Client version 2.4.x
JDK 11 (I think there was an SSL performance issue that required upgrading
to at least JDK 11).

Am I doing something wrong here? The last time I tested the performance difference
between these two libraries was years ago, and it was something like
librdkafka was a bit faster in most cases, but certainly not 5x faster in a
no-op scenario. Is this in line with expectations?

Any thoughts or suggestions would be very much appreciated.

Thanks
Adam


Re: [DISCUSS] KIP-508: Make Suppression State Queriable - rebooted.

2020-09-17 Thread Dongjin Lee
Hi John,

I just reviewed KIP-307. As far as I understood, ...

1. There was the Materialized name initially.
2. With KIP-307, Named Operations were added.
3. Now we have two options for materializing suppression. If we take the
Materialized name here, we end up with two names for the same operation, which
is not desirable.

Do I understand correctly?

> Do you have a use case in mind for having two separate names for the
operation and the view?

No. I am now entirely convinced by your suggestion.

I just started to update the draft implementation. Please let me know whether
I have understood correctly; if so, I will update the KIP by adding the
discussion above.

Best,
Dongjin

On Thu, Sep 17, 2020 at 11:06 AM John Roesler  wrote:

> Hi Dongjin,
>
> Thanks for the reply. Yes, that’s correct, we added that method to name
> the operation. But the operation seems synonymous with the view produced
> by the operation, right?
>
> During KIP-307, I remember thinking that it's unfortunate that we had to
> have two different “name” concepts for the same thing just because setting
> the name on Materialized is equivalent both to making it queriable and
> actually materializing it.
>
> If we were to reconsider the API, it would be nice to treat these three as
> orthogonal:
> * specify a name
> * flag to make the view queriable
> * flag to materialize the view
>
> That was the context behind my suggestion. Do you have a use case in mind
> for having two separate names for the operation and the view?
>
> Thanks,
> John
>
> On Wed, Sep 16, 2020, at 11:43, Dongjin Lee wrote:
> > Hi John,
> >
> > It seems like the available alternatives at this point are clear:
> >
> > 1. Pass queriable name as a separate parameter (i.e.,
> > `KTable#suppress(Suppressed, String)`)
> > 2. Make use of the Suppression processor name as a queryable name by
> > adding an optional `enableQuery` flag to `Suppressed`.
> >
> > However, I have some doubts about the second approach; as far as I know,
> > the processor name was introduced in KIP-307[^1] to make debugging the
> > topology easy and understandable. Since the processor name is a concept
> > independent of the materialization, I feel the first approach is more
> > natural and consistent. Is there any specific reason that you prefer the
> > second approach?
> >
> > Thanks,
> > Dongjin
> >
> > [^1]:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-307%3A+Allow+to+define+custom+processor+names+with+KStreams+DSL
> >
> >
> >
> > On Wed, Sep 16, 2020 at 11:48 PM John Roesler 
> wrote:
> >
> > > Hi Dongjin,
> > >
> > > Yes, that's where I was leaning. Although, I'd prefer adding
> > > the option to Suppressed instead of adding a new argument to
> > > the method call.
> > >
> > > What do you think about:
> > >
> > > class Suppressed {
> > > +  public Suppressed enableQuery();
> > > }
> > >
> > > Since Suppressed already has `withName(String)`, it seems
> > > like all we need to add is a boolean flag.
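> > >
> > > To make it concrete: given some windowed counts table, usage would look
> > > roughly like this (enableQuery() being the proposed addition, not an
> > > existing API):
> > >
> > > KTable<Windowed<String>, Long> suppressed =
> > >     counts.suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded())
> > >                               .withName("suppressed-counts") // doubles as the store name
> > >                               .enableQuery());               // the proposed flag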
> > >
> > > Does that seem sensible to you?
> > >
> > > Thanks,
> > > -John
> > >
> > > On Wed, 2020-09-16 at 21:50 +0900, Dongjin Lee wrote:
> > > > Hi John,
> > > >
> > > > > Although it's not great to have "special snowflakes" in the API,
> > > Choice B
> > > > does seem safer in the short term. We would basically be proposing a
> > > > temporary API to make the suppressed view queriable without a
> > > Materialized
> > > > argument.
> > > >
> > > > Then, it seems like you prefer `KTable#suppress(Suppressed, String)`
> > > (i.e.,
> > > > queriable name only as a parameter) for this time, and refine API
> with
> > > the
> > > > other related KIPs later.
> > > >
> > > > Do I understand correctly?
> > > >
> > > > Thanks,
> > > > Dongjin
> > > >
> > > > On Wed, Sep 16, 2020 at 2:17 AM John Roesler 
> > > wrote:
> > > >
> > > > > Hi Dongjin,
> > > > >
> > > > > Thanks for presenting these options. The concern that
> > > > > Matthias brought up is a very deep problem that afflicts all
> > > > > operations downstream of windowing operations. It's the same
> > > > > thing that derailed KIP-300. For the larger context, I have
> > > > > developed a couple of approaches to resolve this situation,
> > > > > but I think it makes sense to finish up KIP-478 before
> > > > > presenting them.
> > > > >
> > > > > However, I don't think that we need in particular to block
> > > > > the current proposal on solving that long-running and deep
> > > > > issue with the DSL. Instead, we should make a top-level
> > > > > decision whether to:
> > > > >
> > > > > A: Make Suppress just like all the other KTable operations.
> > > > > It will have the same pathological behavior that the keyset
> > > > > is unbounded while the store implementation is only a
> > > > > KeyValueStore. Again, this exact pathology currently affects
> > > > > all KTable operations that follow from windowing operations.
> > > > > For example, it applies to the current workaround that
> > > > > Dongjin documented in the KIP:
> > > > > suppress().filter(Materialized). This is
> > >

Re: [DISCUSS] KIP-664: Provide tooling to detect and abort hanging transactions

2020-09-17 Thread Boyang Chen
Thanks for the updates Jason. I'm pretty satisfied with the overall
motivation and proposed solution, just a couple of more comments.

1. Why do we need to use type string for `StatesFilter` instead of a short
value, as we could translate it and save space?

2. I'm wondering whether the requirement for Describe permission on
TransactionalId works when we are heading towards
https://issues.apache.org/jira/browse/KAFKA-9454, where we could rely on
consumer group id instead of defining the transactional id. At first
glance, I think it should be OK, but I just want to raise this point.

3. Could the --find-hanging work with checking all brokers in the cluster,
or multiple brokers as a list?

4. Similar to transaction abortion, I guess there is a trade-off between
too-specific and too-general for the required number of arguments. However,
suppose I would like to wipe out all the transactions associated with a given
transactional id, or to clean up *all* hanging transactions in the cluster;
do I need to write the script on my own? Maybe we could discuss a bit whether
we would like to support a more holistic API, or whether this is good for now.


On Thu, Sep 10, 2020 at 7:53 AM Tom Bentley  wrote:

> Sounds good to me, thanks!
>
> On Wed, Sep 9, 2020 at 5:30 PM Jason Gustafson  wrote:
>
> > Hey Tom,
> >
> > Yeah, that's fair. I will update the proposal. I was also thinking of
> > adding a separate column for duration, just to save users the trouble of
> > computing it.
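> >
> > In the meantime, anyone scripting against the output can derive the age
> > from the epoch-ms value with the standard JDK time classes; a minimal
> > sketch with a made-up example value:
> >
> > import java.time.Duration;
> > import java.time.Instant;
> >
> > public class TxnAge {
> >     public static void main(String[] args) {
> >         long lastTimestampMs = 1600300000000L; // example LastTimestamp (ms since epoch)
> >         Duration age = Duration.between(Instant.ofEpochMilli(lastTimestampMs), Instant.now());
> >         System.out.println("last updated " + age.toMinutes() + " minutes ago");
> >     }
> > }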
> >
> > Thanks,
> > Jason
> >
> > On Wed, Sep 9, 2020 at 1:21 AM Tom Bentley  wrote:
> >
> > > Hi Jason,
> > >
> > > The KIP looks good to me, but I had one question. AFAIU the
> LastTimestamp
> > > column in the output of --describe-producers and --find-hanging is
> there
> > so
> > > the users of the tool know the txnLastUpdateTimestamp of the
> > > TransactionMetadata and from that and the (max) timeout can infer
> > something
> > > about the likelihood that this really is a stuck transaction. If that's
> > the
> > > case then what is the benefit in displaying it as a ms offset from the
> > unix
> > > epoch, rather than an actual date time?
> > >
> > > Thanks,
> > >
> > > Tom
> > >
> > > On Mon, Aug 31, 2020 at 11:28 PM Guozhang Wang 
> > wrote:
> > >
> > > > Thanks Jason, I do not have more comments on the KIP then.
> > > >
> > > > On Mon, Aug 31, 2020 at 3:19 PM Jason Gustafson 
> > > > wrote:
> > > >
> > > > > > Hmm, but the "TxnStartOffset" is not included in the
> > > DescribeProducers
> > > > > response either?
> > > > >
> > > > > Oh, I accidentally called it `CurrentTxnStartTimestamp` in the
> > schema.
> > > > > Fixed now!
> > > > >
> > > > > -Jason
> > > > >
> > > > > On Mon, Aug 31, 2020 at 3:04 PM Guozhang Wang 
> > > > wrote:
> > > > >
> > > > > > On Mon, Aug 31, 2020 at 12:28 PM Jason Gustafson <
> > ja...@confluent.io
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hey Guozhang,
> > > > > > >
> > > > > > > Thanks for the detailed comments. Responses inline:
> > > > > > >
> > > > > > > > 1. I'd like to clarify how we can make "--abort" work with
> old
> > > > > brokers,
> > > > > > > since without the additional field "Partitions" the tool needs
> to
> > > set
> > > > > the
> > > > > > > coordinator epoch correctly instead of "-1"? Arguably that's
> > still
> > > > > doable
> > > > > > > but would require different call paths, and it's not clear
> > whether
> > > > > that's
> > > > > > > worth doing for old versions.
> > > > > > >
> > > > > > > That's a good question. What I had in mind was to write the
> > marker
> > > > > using
> > > > > > > the last coordinator epoch that was used by the respective
> > > > ProducerId.
> > > > > I
> > > > > > > realized that I left the coordinator epoch out of the
> > > > > `DescribeProducers`
> > > > > > > response, so I have updated the KIP to include it. It is
> possible
> > > > that
> > > > > > > there is no coordinator epoch associated with a given
> ProducerId
> > > > (e.g.
> > > > > if
> > > > > > > it is the first transaction from that producer), but in this
> case
> > > we
> > > > > can
> > > > > > > use 0.
> > > > > > >
> > > > > > > As for whether this is worth doing, I guess I would be more
> > > inclined
> > > > to
> > > > > > > leave it out if users had a reasonable alternative today to
> > address
> > > > > this
> > > > > > > problem.
> > > > > > >
> > > > > > > > 2. Why do we have to enforce "DescribeProducers" to be sent
> to
> > > only
> > > > > > > leaders
> > > > > > > while ListTransactions can be sent to any brokers? Or is it
> > really
> > > > > > > "ListTransactions to be sent to coordinators only"? From the
> > > workflow
> > > > > > > you've described, based on the results back from
> > DescribeProducers,
> > > > we
> > > > > > > should just immediately send ListTransactions to the
> > > > > > > corresponding coordinators based on the collected producer ids,
> > > > instead
> > > > > > of
> > > > > > > trying to send to any brokers right?
> > > > > > >
> > > > > > > 

[jira] [Created] (KAFKA-10498) Consumer should do offset/epoch validation through `Fetch` when possible

2020-09-17 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-10498:
---

 Summary: Consumer should do offset/epoch validation through 
`Fetch` when possible
 Key: KAFKA-10498
 URL: https://issues.apache.org/jira/browse/KAFKA-10498
 Project: Kafka
  Issue Type: Improvement
Reporter: Jason Gustafson
Assignee: Chia-Ping Tsai


Currently the consumer has logic to detect truncations (as a result of unclean 
leader election for example) using the OffsetsForLeaderEpoch API. It is a 
rather cumbersome and expensive process since we have to check for the need to 
send this request on every poll(). We should be able to do better now that 
KIP-595 has built support for truncation detection directly into the `Fetch` 
protocol. This should allow us to skip validation when we know that the `Fetch` 
version is high enough. 
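
An illustrative sketch of the version gate this implies (the constant and method names here are 
invented, and the exact Fetch version that carries truncation detection is whatever the KIP-595 
implementation lands on):

{code:java}
public class FetchValidationGate {
    // assumption: the first Fetch version that can report log divergence directly
    private static final short FETCH_VERSION_WITH_TRUNCATION_DETECTION = 12;

    // the extra OffsetsForLeaderEpoch round trip is only needed on older Fetch versions
    public static boolean needsOffsetsForLeaderEpochValidation(short negotiatedFetchVersion) {
        return negotiatedFetchVersion < FETCH_VERSION_WITH_TRUNCATION_DETECTION;
    }
}
{code}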



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10497) Convert group/transaction coordinator metadata schemas to use generated protocol

2020-09-17 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-10497:
---

 Summary: Convert group/transaction coordinator metadata schemas to 
use generated protocol
 Key: KAFKA-10497
 URL: https://issues.apache.org/jira/browse/KAFKA-10497
 Project: Kafka
  Issue Type: Improvement
Reporter: Jason Gustafson
Assignee: Chia-Ping Tsai


We need to convert the internal schemas used for representing transaction/group 
metadata to the generated protocol. This opens the door for flexible version 
support on the next bump. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-663: API to Start and Shut Down Stream Threads and to Request Closing of Kafka Streams Clients

2020-09-17 Thread Bruno Cadonna

Hi Sophie,

Thank you for the feedback! I replied inline.

Best,
Bruno

On 16.09.20 19:19, Sophie Blee-Goldman wrote:


We guarantee that the metadata of the dead stream threads  will be
returned by KafkaStreams#localThreadsMetadata() at least until the next
call to KafkaStreams#addStreamThread() or
KafkaStreams#removeStreamThread() after the stream thread transited to
DEAD



This seems kind of tricky...personally I would find it pretty odd if I
queried the
local thread metadata and found two threads, A (alive) and B (dead), and
then
called removeStreamThread() and now suddenly I have zero. Or if I call
addStreamThread and now I still have two threads.



The behavior might be unusual, but it is well defined and not random by 
any means.



Both of those results seem to indicate that only live threads "count" and
are returned
by localThreadsMetadata(). But in reality we do temporarily keep the dead
thread,
but only for the arbitrary amount of time until the next time you want to
add or
remove some other stream thread? That seems like a weird side effect of the
add/removeStreamThread APIs.



This is not a side effect that just happens to occur. This is a 
guarantee that users get. It gives users the possibility to retrieve the 
metadata of the dead stream threads since the last call to 
add/removeStreamThread. Admittedly, this guarantee overlaps with the 
current/planned implementation. But that is more a coincidence.


I would be more concerned about add/removeStreamThread being called from 
different threads, which could happen if an uncaught exception handler that 
wants to replace a stream thread runs while another thread responsible for 
automated scaling is also active.



If we really think users might want to log the metadata of dead threads,
then
let's just do that for them or give them a way to do exactly that.



Logging the metadata of dead stream threads for the user is a valid 
alternative. Giving users a way to do exactly that is hard because the 
StreamThread class is not part of the public API. They would always need 
to call a method on the KafkaStreams object where we already have 
localThreadsMetadata().



I'm not that concerned about the backwards compatibility of removing dead
threads from the localThreadsMetadata, because I find it hard to believe
that
users do anything other than just skip over them in the list (set?) that
gets
returned. But maybe someone can chime in with an example use case.



I am also not too much concerned about backwards compatibility. That 
would indeed be a side effect of the current proposal.



I'm actually even a little skeptical that any users might want to log the
metadata of a
dead thread, since all of the metadata is only useful for IQ on live
threads or
already covered by other easily discoverable logging elsewhere, or both.



Having said all of the above, I actually agree with you that there is not that 
much information in the metadata of a dead stream thread that is 
interesting. The name of the stream thread is known in the uncaught 
exception handler. The names of the clients, like consumer etc., used by 
the stream thread can be derived from the name of the stream thread. 
Finally, the sets of active and standby tasks should be empty for a dead 
stream thread.


Hence, I backpedal and propose to filter out dead stream threads from 
localThreadsMetadata(). WDYT?
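
As a concrete example of why little is lost, an uncaught exception handler 
that replaces a dead thread only needs the Thread object itself; a sketch, 
assuming "streams" and "log" are in scope, and using the addStreamThread() 
method proposed in this KIP:

streams.setUncaughtExceptionHandler((thread, throwable) -> {
    // the dying thread's name is available directly, without localThreadsMetadata()
    log.error("Stream thread {} died unexpectedly", thread.getName(), throwable);
    streams.addStreamThread(); // spin up a replacement
});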



On Wed, Sep 16, 2020 at 2:07 AM Bruno Cadonna  wrote:


Hi again,

I just realized that if we filter out DEAD stream threads in
localThreadsMetadata(), users cannot log the metadata of dying stream
threads in the uncaught exception handler.

I realized this thanks to the example Guozhang requested in the KIP.
Thank you for that, Guozhang!

Hence, I adapted the KIP as follows:

- We do not filter out DEAD stream threads in
KafkaStreams#localThreadsMetadata()

- We guarantee that the metadata of the dead stream threads  will be
returned by KafkaStreams#localThreadsMetadata() at least until the next
call to KafkaStreams#addStreamThread() or
KafkaStreams#removeStreamThread() after the stream thread transited to
DEAD. Besides giving users the opportunity to log the metadata of a
dying stream thread in its uncaught exception handler, this guarantee
makes KafkaStreams#localThreadsMetadata() completely backward compatible
with the current behavior, because if KafkaStreams#addStreamThread() and
KafkaStreams#removeStreamThread() are never called,
KafkaStreams#localThreadsMetadata() will also return the metadata of all
stream threads that have ever died, which corresponds to the current
behavior.

- We guarantee that dead stream threads are removed from a Kafka Streams
client at the latest after the next call to KafkaStreams#addStreamThread()
or KafkaStreams#removeStreamThread() following the transition of the
stream thread to DEAD. This guarantees that the number of maintained
stream threads does not grow indefinitely.


Best,
Bruno



On 16.09.20 09:23, Bruno Cadonna wrote:

Hi Guozhang,

Good 

[jira] [Created] (KAFKA-10496) Create an in-memory DNS server for ClientUtilsTest and ClusterConnectionStatesTest

2020-09-17 Thread Justine Olshan (Jira)
Justine Olshan created KAFKA-10496:
--

 Summary: Create an in-memory DNS server for ClientUtilsTest and 
ClusterConnectionStatesTest
 Key: KAFKA-10496
 URL: https://issues.apache.org/jira/browse/KAFKA-10496
 Project: Kafka
  Issue Type: Improvement
Reporter: Justine Olshan


ClientUtilsTest and ClusterConnectionStatesTest currently use an external DNS 
to resolve the IP addresses for kafka.apache.org. Until recently there were 
only two IPv4 addresses, but now there is a third. 

These tests are pretty fragile when they rely on outside sources, so it would 
make sense to create an in-memory DNS server. This is what netty does for 
similar tests:
[https://github.com/netty/netty/blob/master/resolver-dns/src/test/java/io/netty/resolver/dns/TestDnsServer.java].
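
A lighter-weight alternative would be to put the lookup behind a small seam that tests can stub; 
a rough sketch (the interface here is hypothetical, not existing Kafka API):

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;

interface Resolver {
    InetAddress[] resolve(String host) throws UnknownHostException;
}

class FixedResolver implements Resolver {
    private final InetAddress[] addresses;

    FixedResolver(String... literalIps) throws UnknownHostException {
        addresses = new InetAddress[literalIps.length];
        for (int i = 0; i < literalIps.length; i++) {
            // literal IPs are parsed locally, so no real DNS lookup happens here
            addresses[i] = InetAddress.getByName(literalIps[i]);
        }
    }

    @Override
    public InetAddress[] resolve(String host) {
        return addresses; // deterministic, regardless of live DNS state
    }
}
{code}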



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: KIP-669: Preserve Source Partition in Kafka Streams from context

2020-09-17 Thread Balan k
Thanks, Matthias, for the reply.
I like the idea of being able to use the RecordContext in the default 
partitioner itself. I will join the discussion for KIP-478 to understand the 
context.

On 2020/09/11 21:31:46, "Matthias J. Sax"  wrote: 
> With regard to KIP-478, there is the idea to introduce a `RecordContext`
> class.
> 
> Thus, we could just change the `StreamPartitioner` to take this new
> class as parameter to `partition()`? This might actually kill two birds
> with one stone, because I could imagine use cases in which users want to
> partition based on header information that is currently not exposed either.
> 
> For this case, we don't even need to provide any default implementation
> of `StreamPartitioner` but users can just implement it themselves. The
> use case itself makes sense, but it does not seem to be generic enough
> that we need to provide an out-of-the-box implementation for it.
> 
> 
> -Matthias
> 
> On 9/10/20 3:59 PM, Sophie Blee-Goldman wrote:
> > Hey Balan, thanks for the KIP!
> > 
> > The motivation here makes sense to me, but I have a few questions about the
> > proposed API
> > 
> > I guess the main thing to point out is that if we just add new addSink()
> > overloads to Topology,
> > then only the lower level Processor API will benefit and users of the DSL
> > won't be able to utilize
> > this. This seems like a useful feature that we should make available to
> > anyone.
> > 
> > We could follow a similar approach and add new toStream overloads to the
> > KStream class, but
> > that would expand the surface area of the API pretty significantly. The
> > additional addSink()
> > overloads alone would do this. The addSink() methods already have a pretty
> > large number
> > of optional parameters which means more and more overloads every time a new
> > one is added.
> > We should avoid making this problem worse wherever possible.
> > 
> > Existing StreamPartitioner  in SinkNode will be made null when context
> >> partition is enabled
> > 
> > 
> > This line from your KIP gave me some idea that it might be avoidable in
> > this case. The implication
> > of this quote is that the StreamPartitioner and useContextPartition
> > parameter are inherently
> > incompatible since they are two ways of specifying the same thing, the
> > target partition. Well, if
> > that's the case, then we should be able to just leverage the existing
> > StreamPartitioner in some
> > way to specify that we want records to end up in the source partition,
> > without introducing a new
> > parameter.
> > 
> > One option would be to just let users pass in a null StreamPartitioner to
> > mean that it should
> > use the source partition, but that seems a bit subtle. Maybe a better API
> > would be to offer
> > a new out-of-the-box StreamPartitioner called SourceContextPartitioner (or
> > something),
> > and then users just have to pass in an instance of this. WDYT?
> > 
> > On Thu, Sep 10, 2020 at 8:00 AM Balan k  wrote:
> > 
> >>
> >> Forgot to add the link
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-669%3A+Preserve+Source+Partition+in+Kafka+Streams+from+context
> >>
> >>
> >>
> >> On 2020/09/10 13:40:02, satyanarayan komandur 
> >> wrote:
> >>> Hi,
> >>>
> >>> I have submitted a new KIP for preserving processor record context
> >> partition from source. I am looking for suggestions/comments.
> >>>
> >>> In most use cases, the source message is transformed and sent to a
> >>> target topic where
> >>> 1. the number of partitions on the source and sink topics is the same,
> >>> 2. there is no change to the key, and
> >>> 3. more importantly, we would like to preserve the partition as is;
> >>> taking it from the record context without re-deriving it would help.
> >>>
> >>> I am aware of one caveat where record processor context partition is not
> >> known in stream punctuation.
> >>>
> >>> Please look over the KIP and chime in more ideas
> >>>
> >>> Thanks
> >>> Balan
> >>>
> >>>
> >>>
> >>
> > 
> 
> 


Re: KIP-669: Preserve Source Partition in Kafka Streams from context

2020-09-17 Thread Balan k
Thanks for taking the time to reply.
I thought about the confusion with overloads. Creating another Partitioner 
seemed like a good idea to start with, but I soon realized the partitioner 
interface is written in a way that does not support anything other than 
key/value. It seems to me that using the RecordContext is a better option, as 
suggested by Matthias. That would also open up options for punctuate as 
well. I will look into KIP-478 further and see if we can merge.
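
For reference, the shape under discussion would be roughly the following (hypothetical only; the 
signature change depends on how RecordContext lands in KIP-478):

public interface StreamPartitioner<K, V> {
    Integer partition(String topic, K key, V value, RecordContext context, int numPartitions);
}

// with that, preserving the source partition needs no extra flag:
class SourceContextPartitioner<K, V> implements StreamPartitioner<K, V> {
    @Override
    public Integer partition(String topic, K key, V value, RecordContext context, int numPartitions) {
        return context.partition(); // assumes the sink has at least as many partitions as the source
    }
}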

On 2020/09/10 22:59:09, Sophie Blee-Goldman  wrote: 
> Hey Balan, thanks for the KIP!
> 
> The motivation here makes sense to me, but I have a few questions about the
> proposed API
> 
> I guess the main thing to point out is that if we just add new addSink()
> overloads to Topology,
> then only the lower level Processor API will benefit and users of the DSL
> won't be able to utilize
> this. This seems like a useful feature that we should make available to
> anyone.
> 
> We could follow a similar approach and add new toStream overloads to the
> KStream class, but
> that would expand the surface area of the API pretty significantly. The
> additional addSink()
> overloads alone would do this. The addSink() methods already have a pretty
> large number
> of optional parameters which means more and more overloads every time a new
> one is added.
> We should avoid making this problem worse wherever possible.
> 
> Existing StreamPartitioner  in SinkNode will be made null when context
> > partition is enabled
> 
> 
> This line from your KIP gave me some idea that it might be avoidable in
> this case. The implication
> of this quote is that the StreamPartitioner and useContextPartition
> parameter are inherently
> incompatible since they are two ways of specifying the same thing, the
> target partition. Well, if
> that's the case, then we should be able to just leverage the existing
> StreamPartitioner in some
> way to specify that we want records to end up in the source partition,
> without introducing a new
> parameter.
> 
> One option would be to just let users pass in a null StreamPartitioner to
> mean that it should
> use the source partition, but that seems a bit subtle. Maybe a better API
> would be to offer
> a new out-of-the-box StreamPartitioner called SourceContextPartitioner (or
> something),
> and then users just have to pass in an instance of this. WDYT?
> 
> On Thu, Sep 10, 2020 at 8:00 AM Balan k  wrote:
> 
> >
> > Forgot to add the link
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-669%3A+Preserve+Source+Partition+in+Kafka+Streams+from+context
> >
> >
> >
> > On 2020/09/10 13:40:02, satyanarayan komandur 
> > wrote:
> > > Hi,
> > >
> > > I have submitted a new KIP for preserving processor record context
> > partition from source. I am looking for suggestions/comments.
> > >
> > > In most use cases, the source message is transformed and sent to a
> > > target topic where
> > > 1. the number of partitions on the source and sink topics is the same,
> > > 2. there is no change to the key, and
> > > 3. more importantly, we would like to preserve the partition as is;
> > > taking it from the record context without re-deriving it would help.
> > >
> > > I am aware of one caveat where record processor context partition is not
> > known in stream punctuation.
> > >
> > > Please look over the KIP and chime in more ideas
> > >
> > > Thanks
> > > Balan
> > >
> > >
> > >
> >
> 


[jira] [Created] (KAFKA-10495) Fix spelling mistake

2020-09-17 Thread Jira
欧阳能达 created KAFKA-10495:


 Summary: Fix spelling mistake
 Key: KAFKA-10495
 URL: https://issues.apache.org/jira/browse/KAFKA-10495
 Project: Kafka
  Issue Type: Improvement
  Components: clients, consumer
Reporter: 欧阳能达


In trunk branch.

org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll (ConsumerCoordinator.java, line 465): @return true *iff* the operation succeeded

The word *iff* is a mistake.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10494) Streams: enableSendingOldValues should not call parent if node is itself materialized

2020-09-17 Thread Andy Coates (Jira)
Andy Coates created KAFKA-10494:
---

 Summary: Streams: enableSendingOldValues should not call parent if 
node is itself materialized
 Key: KAFKA-10494
 URL: https://issues.apache.org/jira/browse/KAFKA-10494
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Andy Coates


Calling `enableSendingOldValues` on many nodes, e.g. `KTableFilter`, 
unnecessarily calls `enableSendingOldValues` on the parent, even when the 
processor itself is materialized. This can force the parent table to be 
materialized unnecessarily.

 

For example:

```

StreamsBuilder builder = new StreamsBuilder();

builder
   .table("t1", Consumed.with(...))
   .filter(predicate, Materialized.as("t2"))
   ... // downStreamOps

```

If `downStreamOps` result in a call to `enableSendingOldValues` on the table 
returned by the `filter` call, i.e. `t2`, then it will result in `t1` being 
materialized unnecessarily.


This ticket was raised off the back of [comments in a 
PR](https://github.com/apache/kafka/pull/9156#discussion_r490152263) while 
working on https://issues.apache.org/jira/browse/KAFKA-10077.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)