[jira] [Resolved] (KAFKA-10270) Add a broker to controller channel manager to redirect AlterConfig

2020-07-29 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-10270.
-
Resolution: Fixed

> Add a broker to controller channel manager to redirect AlterConfig
> --
>
> Key: KAFKA-10270
> URL: https://issues.apache.org/jira/browse/KAFKA-10270
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> Per the KIP-590 requirement, we need to have a dedicated communication channel 
> from the broker to the controller.
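
For context, a rough sketch of the shape such a channel manager can take. This is illustrative Java only, not the actual change merged for this ticket; ControllerProvider and PendingRequest are invented names. A dedicated thread drains a queue of controller-bound requests, resolving the current controller before each send.

import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch only; ControllerProvider and PendingRequest are
// invented names, not the classes added by the real patch.
public class BrokerToControllerChannelSketch implements Runnable {

    /** Resolves the current controller's node id, if one is known. */
    interface ControllerProvider {
        Optional<Integer> controllerId();
    }

    /** A queued controller-bound request plus a completion callback. */
    static final class PendingRequest {
        final String apiName;
        final Runnable onComplete;
        PendingRequest(String apiName, Runnable onComplete) {
            this.apiName = apiName;
            this.onComplete = onComplete;
        }
    }

    private final BlockingQueue<PendingRequest> queue = new LinkedBlockingQueue<>();
    private final ControllerProvider controllerProvider;
    private volatile boolean running = true;

    BrokerToControllerChannelSketch(ControllerProvider controllerProvider) {
        this.controllerProvider = controllerProvider;
    }

    /** Called by request handlers to forward e.g. an AlterConfigs request. */
    void sendRequest(String apiName, Runnable onComplete) {
        queue.add(new PendingRequest(apiName, onComplete));
    }

    @Override
    public void run() {
        while (running) {
            try {
                PendingRequest request = queue.take();
                Optional<Integer> controller = controllerProvider.controllerId();
                if (controller.isPresent()) {
                    // The real manager would send this over the controller's
                    // internal endpoint; here we just resolve and complete.
                    System.out.printf("forwarding %s to controller %d%n",
                            request.apiName, controller.get());
                    request.onComplete.run();
                } else {
                    // No controller known yet: requeue and retry shortly.
                    queue.add(request);
                    Thread.sleep(100);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                running = false;
            }
        }
    }
}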



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10326) Both serializer and deserializer should be able to see the generated client id

2020-07-29 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-10326:
--

 Summary: Both serializer and deserializer should be able to see 
the generated client id
 Key: KAFKA-10326
 URL: https://issues.apache.org/jira/browse/KAFKA-10326
 Project: Kafka
  Issue Type: Bug
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


Producer and consumer generate a client id when users don't define one. The 
generated client id is passed to all configurable components (for example, the 
metrics reporter) except for the serializer/deserializer.
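
For illustration, this is roughly what the fix would let a serializer see, since Serializer.configure() already receives the full config map. The serializer class below is hypothetical, but CommonClientConfigs.CLIENT_ID_CONFIG is the real "client.id" key:

import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.serialization.Serializer;

// Hypothetical example: a serializer that reads the (possibly generated)
// client id out of the configs it is given in configure().
public class ClientIdAwareStringSerializer implements Serializer<String> {

    private String clientId = "unknown";

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // With the fix proposed here, this key would hold the generated
        // client id even when the user did not set one explicitly.
        Object id = configs.get(CommonClientConfigs.CLIENT_ID_CONFIG);
        if (id != null) {
            clientId = id.toString();
        }
    }

    @Override
    public byte[] serialize(String topic, String data) {
        // The client id could be used here for per-client metrics or tracing.
        return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
    }
}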



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

2020-07-29 Thread Boyang Chen
Thanks David for the feedback!

On Wed, Jul 29, 2020 at 7:53 AM David Jacot  wrote:

> Hi, Colin, Boyang,
>
> Colin, thanks for the clarification. Somehow, I thought that even if the
> controller is run independently, it
> would still run the listeners of the broker and thus would be accessible by
> redirecting on the loopback
> interface. My mistake.
>
> Boyang, I have a few questions/comments regarding the updated KIP:
>
> 1. I think that it would be great if we could clarify how old admin clients
> which are directly talking to the
> controller will work with this KIP. I read between the lines that, as we
> propose to provide a random
> broker Id as the controller Id in the metadata response, they will use a
> single node as a proxy. Is that
> correct? This deserves to be called out more explicitly in the design
> section instead of being hidden
> in the protocol bump of the metadata RPC.
>
Makes sense; I stressed this point in the compatibility section.


> 1.1 If I understand correctly, could we assume that old admin clients will
> stick to the same "fake controller"
> until they refresh their metadata? Refreshing the metadata usually happens
> when NOT_CONTROLLER
> is received but this won't happen anymore so they should change
> infrequently.
>
That is correct; old admin clients would not try to refresh their metadata
due to NOT_CONTROLLER, which can no longer happen with the new broker
cluster.
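
A minimal sketch of the legacy routing behavior under discussion (class and method names are invented; only Errors.NOT_CONTROLLER is the real enum constant): the cached controller id only changes when a NOT_CONTROLLER error forces a metadata refresh, so once brokers forward requests themselves the old client sticks with its randomly assigned "controller".

import java.util.function.IntSupplier;
import org.apache.kafka.common.protocol.Errors;

// Sketch of the legacy admin-client controller routing; Errors is the real
// org.apache.kafka.common.protocol.Errors enum, everything else is invented.
public class LegacyControllerRouting {

    private int cachedControllerId;

    LegacyControllerRouting(int initialControllerId) {
        this.cachedControllerId = initialControllerId;
    }

    /** Returns the node id the next controller-bound request will target. */
    int onResponse(Errors error, IntSupplier metadataRefresh) {
        if (error == Errors.NOT_CONTROLLER) {
            // Pre-KIP-590 behavior: refresh metadata and chase the controller.
            cachedControllerId = metadataRefresh.getAsInt();
        }
        // Post-KIP-590, brokers forward controller-bound requests themselves
        // and never return NOT_CONTROLLER, so the randomly assigned
        // "controller" id stays stable until a routine metadata refresh.
        return cachedControllerId;
    }
}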


> 2. For the new admin client, I suppose that we plan on using
> LeastLoadedNodeProvider for the
> requests that are using ControllerNodeProvider. We could perhaps mention
> it.
>
Sure, added.


> 3. Pre KIP-500, will we have a way to distinguish if a request that is
> received by the controller is
> coming directly from a client or from a broker? You mention that the
> listener can be used to do
> this but as you pointed out, it is not mandatory. Do we have another
> reliable method? I am asking
> in the context of KIP-599: with the current controller, we may need to
> throttle differently if the
> request comes from a client or from a broker.
>
The point of using the listener name is more for security purposes, to
detect any forged request on a best-effort basis.
For throttling I think we could just check the request header for the
presence of *InitialClientId* to distinguish whether to apply the
throttling strategy for a forwarded request or a direct request.
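
A hedged sketch of that throttling distinction (the header accessor below is an assumption; the exact request header schema is defined in the KIP): the presence of InitialClientId marks a broker-forwarded request, and the quota entity falls back to the direct client id otherwise.

import java.util.Optional;

// Illustrative only: RequestHeaderView is a stand-in for however the
// broker exposes the proposed InitialClientId header field.
public class ForwardedRequestThrottler {

    interface RequestHeaderView {
        Optional<String> initialClientId();
        String clientId();
    }

    enum ThrottleStrategy { FORWARDED, DIRECT }

    ThrottleStrategy strategyFor(RequestHeaderView header) {
        // Presence of InitialClientId marks a broker-forwarded request, so
        // quotas should be charged to the original client, not the
        // forwarding broker.
        return header.initialClientId().isPresent()
                ? ThrottleStrategy.FORWARDED
                : ThrottleStrategy.DIRECT;
    }

    String quotaEntity(RequestHeaderView header) {
        return header.initialClientId().orElse(header.clientId());
    }
}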


> 4. Could we add `InitialClientId` as well? This will be required for the
> quota as we can apply them
> by principal and/or clientId.
>
Sounds good, added.


> 5. A small remark regarding the structure of the KIP. It is a bit weird
> that requests that do not go
> to the controller are mentioned in the Proposed Design section and the
> requests that go to the
> controller are mentioned in the Public Interfaces. When one reads the
> Proposed Design, one does not
> get a full picture of the whole new routing proposal for old and new
> clients. It would be great if we
> could have a full overview in that section.
>
Good point, I will move the pieces around.


> Overall the change makes sense to me. I will work on drafting an addendum
> to KIP-599 to
> alter the design to cope with these changes. At a first glance, that seems
> doable if 1.1, 3
> and 4 are OK.
>
Thank you for the help!


> Thanks,
> David
>
> On Wed, Jul 29, 2020 at 5:29 AM Boyang Chen 
> wrote:
>
> > Thanks for the feedback Colin!
> >
> > On Tue, Jul 28, 2020 at 2:11 PM Colin McCabe  wrote:
> >
> > > Hi Boyang,
> > >
> > > Thanks for updating this.  A few comments below:
> > >
> > > In the "Routing Request Security" section, there is a reference to
> "older
> > > requests that need redirection."  But after these new revisions, both
> new
> > > and old requests need redirection.  So we should rephrase this.
> > >
> > > > In addition, to avoid exposing this forwarding power to the admin clients,
> > > > the routing request shall be forwarded towards the controller broker internal
> > > > endpoint which should be only visible to other brokers inside the cluster
> > > > in the KIP-500 controller. Any admin configuration request with broker
> > > > principal should not be going through the public endpoint and will be
> > > > rejected for security purposes.
> > >
> > > We should also describe how this will work in the pre-KIP-500 case.  In
> > > that case, CLUSTER_ACTION gets the extra permissions described here only
> > > when the message comes in on the inter-broker listener.  We should state
> > > that here.
> > >
> > > (I can see that you have this information later on in the "Security Access
> > > Changes" section, but it would be good to have it here as well, to avoid
> > > confusion.)
> > >
> > > > To be more strict about protecting controller information, the "ControllerId"
> > > > field in the new MetadataResponse shall be set to -1 when the original request
> > > > comes from a non-broker client and it is already on v10. We shall use the
> > > > 

Build failed in Jenkins: kafka-trunk-jdk8 #4751

2020-07-29 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10309: KafkaProducer's sendOffsetsToTransaction should not block


--
[...truncated 3.18 MB...]
org.apache.kafka.streams.TestTopicsTest > testKeyValueList PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputWithNullDriver STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputWithNullDriver PASSED

org.apache.kafka.streams.TestTopicsTest > testValueList STARTED

org.apache.kafka.streams.TestTopicsTest > testValueList PASSED

org.apache.kafka.streams.TestTopicsTest > testRecordList STARTED

org.apache.kafka.streams.TestTopicsTest > testRecordList PASSED

org.apache.kafka.streams.TestTopicsTest > testNonExistingInputTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testNonExistingInputTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMap STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMap PASSED

org.apache.kafka.streams.TestTopicsTest > testRecordsToList STARTED

org.apache.kafka.streams.TestTopicsTest > testRecordsToList PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValueListDuration STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValueListDuration PASSED

org.apache.kafka.streams.TestTopicsTest > testInputToString STARTED

org.apache.kafka.streams.TestTopicsTest > testInputToString PASSED

org.apache.kafka.streams.TestTopicsTest > testTimestamp STARTED

org.apache.kafka.streams.TestTopicsTest > testTimestamp PASSED

org.apache.kafka.streams.TestTopicsTest > testWithHeaders STARTED

org.apache.kafka.streams.TestTopicsTest > testWithHeaders PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValue STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValue PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateTopicWithNullTopicName STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateTopicWithNullTopicName PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldReturnIsPersistent STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldReturnIsPersistent PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardClose 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardClose 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardFlush 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardFlush 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardInit 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardInit 
PASSED

> Task :streams:upgrade-system-tests-0100:processTestResources NO-SOURCE
> Task :streams:upgrade-system-tests-0100:testClasses
> Task :streams:upgrade-system-tests-0100:checkstyleTest
> Task :streams:upgrade-system-tests-0100:spotbugsMain NO-SOURCE
> Task :streams:upgrade-system-tests-0100:test
> Task :streams:upgrade-system-tests-0101:compileJava NO-SOURCE
> Task :streams:upgrade-system-tests-0101:processResources NO-SOURCE
> Task :streams:upgrade-system-tests-0101:classes UP-TO-DATE
> Task :streams:upgrade-system-tests-0101:checkstyleMain NO-SOURCE
> Task :streams:upgrade-system-tests-0101:compileTestJava
> Task :streams:upgrade-system-tests-0101:processTestResources NO-SOURCE
> Task :streams:upgrade-system-tests-0101:testClasses
> Task :streams:upgrade-system-tests-0101:checkstyleTest
> Task :streams:upgrade-system-tests-0101:spotbugsMain NO-SOURCE
> Task :streams:upgrade-system-tests-0101:test
> Task :streams:upgrade-system-tests-0102:compileJava NO-SOURCE
> Task :streams:upgrade-system-tests-0102:processResources NO-SOURCE
> Task :streams:upgrade-system-tests-0102:classes UP-TO-DATE
> Task :streams:upgrade-system-tests-0102:checkstyleMain NO-SOURCE
> Task :streams:upgrade-system-tests-0102:compileTestJava
> Task :streams:upgrade-system-tests-0102:processTestResources NO-SOURCE
> Task :streams:upgrade-system-tests-0102:testClasses
> Task :streams:upgrade-system-tests-0102:checkstyleTest
> Task 

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-29 Thread Colin McCabe
On Thu, Jul 23, 2020, at 23:02, Boyang Chen wrote:
> Hey Colin,
> 
> some more questions I have about the proposal:
> 
> 1. We mentioned in the networking section that "The only time when clients
> should contact a controller node directly is when they are debugging system
> issues". But later we didn't talk about how to enable this debug mode,
> could you consider getting a section about that?
> 

Hi Boyang,

Thanks for the review.

There isn't a separate debug mode or anything like that.  The assumption here 
is that when debugging system issues, we are inside the cluster, or at least 
have access to the private controller network.  This is pretty similar to how 
most administrators run ZooKeeper.

>
> 2. “When the active controller decides that a standby controller should
> start a snapshot, it will communicate that information in its response to
> the periodic heartbeat sent by that node.“ In the KIP-595, we provide an
> RPC called `EndQuorumEpoch` which would transfer the leadership role to a
> dedicated successor, do you think we could reuse that method instead of
> piggy-backing on the heartbeat RPC?
> 

I thought about this a little bit more, and I think the active controller 
should not need to give up the leadership in order to snapshot.  So I removed 
the part about renouncing the controllership.

>
> 3. The `DeleteBroker` record is listed but not mentioned in details for the
> KIP. Are we going to support removing a broker in runtime, or this record
> is just for the sake of removing an obsolete broker due to heartbeat
> failure?
>

This record is about removing a broker from the cluster due to heartbeat 
failure, or because it's shutting down.  I have renamed it to FenceBroker to 
make that clearer.

> 
> 4. In the rejected alternatives, we mentioned that we don't want to combine
> heartbeats and fetch, and listed extra complexity as the reason.
> However, we should also mention some cons caused by this model; for example,
> we are doing 2X round trips to maintain liveness, whereas a regular
> follower should always send out fetches anyway. If we are combining the
> two, what are the heartbeat request fields we need to populate in the Fetch
> protocol to make it work? Could we piggy-back on the UpdateMetadata RPC to
> propagate the broker state change for listeners separately to the
> controller? I'm not advocating either approach here; I just hope we could
> list out more reasoning for separating the heartbeat RPC from Fetch, pros
> and cons.
> 

The UpdateMetadata RPC is not used in the post-KIP-500 world.  This is 
mentioned later in KIP-631 where it says that "we will no longer need to send 
out LeaderAndIsrRequest, UpdateMetadataRequest, and StopReplicaRequest"

We could combine the heartbeat with the fetch request.  It would basically mean 
moving all the heartbeat fields into the fetch request.  As the KIP says, this 
would be pretty messy.  Another reason why it would be messy is because of the 
timing.  Fetch requests can get delayed when they're fetching a lot of data.  
If this delays heartbeats then it could cause brokers to get fenced 
unnecessarily.  This is something that we've gone back and forth about, but 
overall I think it's good to at least implement the simple thing first.
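
To make the timing argument concrete, here is a toy model with made-up numbers (none of these values come from the KIP): a fetch that stalls while reading a lot of data also stalls a piggy-backed liveness signal, while a dedicated heartbeat does not.

// Toy model of the fencing-timing argument; the thresholds are invented.
public class HeartbeatTimingSketch {

    static final long LEASE_TIMEOUT_MS = 10_000;

    /** Liveness when heartbeats ride on fetches: a slow fetch delays them. */
    static boolean fencedWithPiggybackedHeartbeat(long lastFetchCompletedMs,
                                                  long nowMs) {
        return nowMs - lastFetchCompletedMs > LEASE_TIMEOUT_MS;
    }

    /** Liveness with a dedicated heartbeat: independent of fetch size. */
    static boolean fencedWithSeparateHeartbeat(long lastHeartbeatMs, long nowMs) {
        return nowMs - lastHeartbeatMs > LEASE_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        long now = 30_000;
        // A 12s fetch of a large read delayed the piggy-backed signal...
        System.out.println(fencedWithPiggybackedHeartbeat(18_000, now)); // true: fenced
        // ...while a 2s-old dedicated heartbeat keeps the broker alive.
        System.out.println(fencedWithSeparateHeartbeat(28_000, now));    // false
    }
}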

best,
Colin

>
> Boyang
> 
> On Wed, Jul 15, 2020 at 5:30 PM Colin McCabe  wrote:
> 
> > On Mon, Jul 13, 2020, at 11:08, Boyang Chen wrote:
> > > Hey Colin, some quick questions,
> > >
> > > 1. I looked around and didn't find a config for the broker heartbeat interval;
> > > are we piggy-backing on some existing configs?
> > >
> >
> > Good point.  I meant to add this, but I forgot.  I added
> > registration.heartbeat.interval.ms in the table.
> >
> > >
> > > 2. We only mentioned that the lease time is 10X of the heartbeat interval;
> > > could we also include why we chose this value?
> > >
> >
> > I will add registration.lease.timeout.ms so that this can be set
> > separately from registration.heartbeat.interval.ms.  The choice of value
> > is a balance between not timing out brokers too soon, and not keeping
> > unavailable brokers around too long.
> >
> > best,
> > Colin
> >
> > >
> > > On Mon, Jul 13, 2020 at 10:09 AM Jason Gustafson 
> > wrote:
> > >
> > > > Hi Colin,
> > > >
> > > > Thanks for the proposal. A few initial comments/questions below:
> > > >
> > > > 1. I don't follow why we need a separate configuration for
> > > > `controller.listeners`. The current listener configuration already allows
> > > > users to specify multiple listeners, which allows them to define internal
> > > > endpoints that are not exposed to clients. Can you explain what the new
> > > > configuration gives us that we don't already have?
> > > > 2. What is the advantage of creating a separate `controller.id` instead of
> > > > just using `broker.id`?
> > > > 3. It sounds like you are imagining a stop-the-world approach to
> > > > snapshotting, which is why we need the controller 

Build failed in Jenkins: kafka-trunk-jdk14 #326

2020-07-29 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10309: KafkaProducer's sendOffsetsToTransaction should not block


--
[...truncated 3.20 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = 

Build failed in Jenkins: kafka-trunk-jdk8 #4750

2020-07-29 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10270: A broker to controller channel manager (#9012)


--
[...truncated 6.37 MB...]

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldReturnIsPersistent STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldReturnIsPersistent PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardClose 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardClose 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardFlush 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardFlush 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardInit 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardInit 
PASSED

org.apache.kafka.streams.TestTopicsTest > testNonUsedOutputTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testNonUsedOutputTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testEmptyTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testEmptyTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testStartTimestamp STARTED

org.apache.kafka.streams.TestTopicsTest > testStartTimestamp PASSED

org.apache.kafka.streams.TestTopicsTest > testNegativeAdvance STARTED

org.apache.kafka.streams.TestTopicsTest > testNegativeAdvance PASSED

org.apache.kafka.streams.TestTopicsTest > shouldNotAllowToCreateWithNullDriver 
STARTED

org.apache.kafka.streams.TestTopicsTest > shouldNotAllowToCreateWithNullDriver 
PASSED

org.apache.kafka.streams.TestTopicsTest > testDuration STARTED

org.apache.kafka.streams.TestTopicsTest > testDuration PASSED

org.apache.kafka.streams.TestTopicsTest > testOutputToString STARTED

org.apache.kafka.streams.TestTopicsTest > testOutputToString PASSED

org.apache.kafka.streams.TestTopicsTest > testValue STARTED

org.apache.kafka.streams.TestTopicsTest > testValue PASSED

org.apache.kafka.streams.TestTopicsTest > testTimestampAutoAdvance STARTED

org.apache.kafka.streams.TestTopicsTest > testTimestampAutoAdvance PASSED

org.apache.kafka.streams.TestTopicsTest > testOutputWrongSerde STARTED

org.apache.kafka.streams.TestTopicsTest > testOutputWrongSerde PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputTopicWithNullTopicName STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputTopicWithNullTopicName PASSED

org.apache.kafka.streams.TestTopicsTest > testWrongSerde STARTED

org.apache.kafka.streams.TestTopicsTest > testWrongSerde PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMapWithNull STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMapWithNull PASSED

org.apache.kafka.streams.TestTopicsTest > testNonExistingOutputTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testNonExistingOutputTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testMultipleTopics STARTED

org.apache.kafka.streams.TestTopicsTest > testMultipleTopics PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValueList STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValueList PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputWithNullDriver STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputWithNullDriver PASSED

org.apache.kafka.streams.TestTopicsTest > testValueList STARTED

org.apache.kafka.streams.TestTopicsTest > testValueList PASSED

org.apache.kafka.streams.TestTopicsTest > testRecordList STARTED

org.apache.kafka.streams.TestTopicsTest > testRecordList PASSED

org.apache.kafka.streams.TestTopicsTest > testNonExistingInputTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testNonExistingInputTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMap STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMap PASSED

org.apache.kafka.streams.TestTopicsTest > testRecordsToList STARTED

org.apache.kafka.streams.TestTopicsTest > testRecordsToList PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValueListDuration STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValueListDuration PASSED

org.apache.kafka.streams.TestTopicsTest > testInputToString STARTED

org.apache.kafka.streams.TestTopicsTest > testInputToString PASSED

org.apache.kafka.streams.TestTopicsTest > testTimestamp STARTED

org.apache.kafka.streams.TestTopicsTest > testTimestamp PASSED

org.apache.kafka.streams.TestTopicsTest > testWithHeaders STARTED


[DISCUSS] KIP-649: Dynamic Client Configuration

2020-07-29 Thread Ryan Dielhenn
Hi everyone,

I would like to start a discussion on KIP-649:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-649%3A+Dynamic+Client+Configuration

This proposal specifies the mechanisms that will enable dynamic configuration 
of producers and consumers. We have cherry-picked a few producer and consumer 
configurations that we would like to provide dynamic support for, and they are 
outlined in this KIP. Please reply with any comments or suggestions.

Thank you!
Ryan Dielhenn


Build failed in Jenkins: kafka-trunk-jdk11 #1676

2020-07-29 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10270: A broker to controller channel manager (#9012)


--
[...truncated 6.41 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCleanUpPersistentStateStoresOnClose[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCleanUpPersistentStateStoresOnClose[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 

[jira] [Created] (KAFKA-10325) Implement KIP-649: Dynamic Client Configuration

2020-07-29 Thread Ryan Dielhenn (Jira)
Ryan Dielhenn created KAFKA-10325:
-

 Summary: Implement KIP-649: Dynamic Client Configuration
 Key: KAFKA-10325
 URL: https://issues.apache.org/jira/browse/KAFKA-10325
 Project: Kafka
  Issue Type: New Feature
Reporter: Ryan Dielhenn


Implement KIP-649: Dynamic Client Configuration



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Jenkins build is back to normal : kafka-trunk-jdk14 #325

2020-07-29 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-630: Kafka Raft Snapshot

2020-07-29 Thread Jose Garcia Sancio
Thanks Ron for the additional comments and suggestions.

Here are the changes to the KIP:
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=158864763&selectedPageVersions=17&selectedPageVersions=15

On Wed, Jul 29, 2020 at 8:44 AM Ron Dagostino  wrote:
>
> Thanks, Jose.  It's looking good.  Here is one minor correction:
>
> <<< If the Kafka topic partition leader receives a fetch request with an
> offset and epoch greater than or equal to the LBO (x + 1, a)
> >>> If the Kafka topic partition leader receives a fetch request with an
> offset and epoch greater than or equal to the LBO (x + 1, b)
>

Done.

> Here is one more question.  Is there an ability to evolve the snapshot
> format over time, and if so, how is that managed for upgrades? It would be
> both Controllers and Brokers that would depend on the format, correct?
> Those could be the same thing if the controller was running inside the
> broker JVM, but that is an option rather than a requirement, I think.
> Might the Controller upgrade have to be coordinated with the broker upgrade
> in the separate-JVM case?  Perhaps a section discussing this would be
> appropriate?
>

The content sent through the FetchSnapshot RPC is expected to be
compatible with future changes. In KIP-631 the Kafka Controller is
going to use the existing Kafka message format and versioning scheme;
see specifically the section "Record Format Versions". I added some
wording around this.
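
To illustrate the versioning idea (purely a sketch; the real snapshot content uses Kafka's generated message classes and their versioning scheme, and this hand-rolled format is an assumption): each record carries a version, and a reader rejects only versions newer than it understands, which is what keeps later field additions compatible.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch of version-gated snapshot record reading; not the real format.
public class VersionedSnapshotReader {

    static final short HIGHEST_SUPPORTED_VERSION = 1;

    static String readRecord(ByteBuffer buf) {
        short version = buf.getShort();
        if (version > HIGHEST_SUPPORTED_VERSION) {
            throw new IllegalStateException(
                "Snapshot record version " + version + " is newer than supported "
                + HIGHEST_SUPPORTED_VERSION + "; upgrade before reading");
        }
        // Fields added in later versions are simply absent at older versions,
        // so older data remains readable by newer code.
        int payloadLen = buf.getInt();
        byte[] payload = new byte[payloadLen];
        buf.get(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }
}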

Thanks!
-Jose


[jira] [Resolved] (KAFKA-9210) kafka stream loss data

2020-07-29 Thread John Roesler (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler resolved KAFKA-9210.
-
Resolution: Fixed

> kafka stream loss data
> --
>
> Key: KAFKA-9210
> URL: https://issues.apache.org/jira/browse/KAFKA-9210
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.1
>Reporter: panpan.liu
>Priority: Major
> Attachments: app.log, screenshot-1.png
>
>
> kafka broker: 2.0.1
> kafka stream client: 2.1.0
>  # two applications run at the same time
>  # after some days, I stopped one application (in k8s)
>  # The following log occurred, and when I checked the data I found that the 
> value is less than what I expected.
> {quote}
> Partitions [flash-app-xmc-worker-share-store-minute-repartition-1] for changelog flash-app-xmc-worker-share-store-minute-changelog-1
> 2019-11-19 05:50:49.816|WARN |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting StreamTasks stores to recreate from scratch. org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions: {flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)
> 2019-11-19 05:50:49.817|INFO |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 ProcessorTopology: KSTREAM-SOURCE-70: topics: [flash-app-xmc-worker-share-store-minute-repartition] children: [KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: [worker-share-store-minute] children: [KTABLE-TOSTREAM-71] KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] KSTREAM-SINK-72: topic: StaticTopicNameExtractor(xmc-worker-share-minute)
> Partitions [flash-app-xmc-worker-share-store-minute-repartition-1] for changelog flash-app-xmc-worker-share-store-minute-changelog-1
> 2019-11-19 05:50:49.842|WARN |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting StreamTasks stores to recreate from scratch. org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions: {flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)
> 2019-11-19 05:50:49.842|INFO |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 ProcessorTopology: KSTREAM-SOURCE-70: topics: [flash-app-xmc-worker-share-store-minute-repartition] children: [KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: [worker-share-store-minute] children: [KTABLE-TOSTREAM-71] KTABLE-TOSTREAM-71: children: [KSTREAM-SINK-72] KSTREAM-SINK-72: topic: StaticTopicNameExtractor(xmc-worker-share-minute)
> Partitions [flash-app-xmc-worker-share-store-minute-repartition-1] for changelog flash-app-xmc-worker-share-store-minute-changelog-1
> 2019-11-19 05:50:49.905|WARN |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|101|stream-thread [flash-client-xmc-StreamThread-3] Restoring StreamTasks failed. Deleting StreamTasks stores to recreate from scratch. org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions: {flash-app-xmc-worker-share-store-minute-changelog-1=6128684} at org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:987)
> 2019-11-19 05:50:49.906|INFO |flash-client-xmc-StreamThread-3|o.a.k.s.p.i.StoreChangelogReader|105|stream-thread [flash-client-xmc-StreamThread-3] Reinitializing StreamTask TaskId: 10_1 ProcessorTopology: KSTREAM-SOURCE-70: topics: [flash-app-xmc-worker-share-store-minute-repartition] children: [KSTREAM-AGGREGATE-67] KSTREAM-AGGREGATE-67: states: [worker-share-store-minute]
> {quote}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-595: A Raft Protocol for the Metadata Quorum

2020-07-29 Thread Jason Gustafson
Hey Jun,

I added a section on "Cluster Bootstrapping" which discusses clusterId
generation and the process through which brokers find the current leader.
The quick summary is that the first controller will be responsible for
generating the clusterId and persisting it in the metadata log. Before the
first leader has been elected, quorum APIs will skip clusterId validation.
This seems reasonable since this is primarily intended to prevent the
damage from misconfiguration after a cluster has been running for some
time. Upon startup, brokers begin by sending Fetch requests to find the
current leader. This will include the cluster.id from meta.properties if it
is present. The broker will shut down immediately if it receives
INVALID_CLUSTER_ID from the Fetch response.
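
A sketch of that validation rule (the class and result enum below are local to this sketch, not Kafka's real Errors class, and the storage detail is an assumption): validation is skipped while no clusterId has been persisted, and a mismatch afterwards is fatal to the fetching broker.

import java.util.Optional;

// Illustrative sketch of quorum-request clusterId validation.
public class ClusterIdValidation {

    enum ValidationResult { OK, INVALID_CLUSTER_ID }

    /** clusterId persisted in the metadata log; empty before the first leader. */
    private Optional<String> persistedClusterId = Optional.empty();

    /** Called when the first elected leader generates and persists the id. */
    void onFirstLeaderElected(String generatedClusterId) {
        persistedClusterId = Optional.of(generatedClusterId);
    }

    ValidationResult validate(Optional<String> requestClusterId) {
        if (persistedClusterId.isEmpty()) {
            // Before the first leader election there is nothing to compare
            // against, so the check is skipped.
            return ValidationResult.OK;
        }
        if (requestClusterId.isPresent()
                && !requestClusterId.get().equals(persistedClusterId.get())) {
            // A broker seeing this in a Fetch response shuts down
            // immediately, since it is pointed at the wrong cluster.
            return ValidationResult.INVALID_CLUSTER_ID;
        }
        return ValidationResult.OK;
    }
}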

I also added some details about our testing strategy, which you asked about
previously.

Thanks,
Jason

On Mon, Jul 27, 2020 at 10:46 PM Boyang Chen 
wrote:

> On Mon, Jul 27, 2020 at 4:58 AM Unmesh Joshi 
> wrote:
>
> > Just checked etcd and zookeeper code, and both support the leader stepping
> > down as a follower to make sure there are no two leaders if the leader has
> > been disconnected from the majority of the followers.
> > For etcd this is https://github.com/etcd-io/etcd/issues/3866
> > For Zookeeper it's https://issues.apache.org/jira/browse/ZOOKEEPER-1699
> > I was just thinking it might be difficult to implement in the pull-based
> > model, but I guess not. It is possibly the same way the ISR list is managed
> > currently: if the leader of the controller quorum loses a majority of the
> > followers, it should step down and become a follower, that way telling
> > clients in time that it was disconnected from the quorum, and not keep on
> > sending stale metadata to clients.
> >
> >
> > Thanks,
> > Unmesh
> >
> >
> > On Mon, Jul 27, 2020 at 9:31 AM Unmesh Joshi 
> > wrote:
> >
> > > >>Could you clarify on this question? Which part of the raft group doesn't
> > > >>know about leader dis-connection?
> > > The leader of the controller quorum is partitioned from the controller
> > > cluster, and a different leader is elected for the remaining controller
> > > cluster.
> >
> I see your concern. For the KIP-595 implementation, since there are no regular
> heartbeats sent from the leader to the followers, we decided to piggy-back on
> the fetch timeout, so that if the leader did not receive Fetch requests from a
> majority of the quorum for that amount of time, it would begin a new election
> and start sending VoteRequest to voter nodes in the cluster to understand the
> latest quorum. You could find more details in this section:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-595%3A+A+Raft+Protocol+for+the+Metadata+Quorum#KIP595:ARaftProtocolfortheMetadataQuorum-Vote
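
A sketch of that leader-side check (all names invented for illustration): the leader tracks the last fetch time per voter and starts a new election once a majority of the quorum has gone quiet for the fetch timeout.

import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the leader-side fetch-timeout check in KIP-595.
public class FetchTimeoutSketch {

    private final Map<Integer, Long> lastFetchTimeMs = new HashMap<>();
    private final int voterCount;
    private final long fetchTimeoutMs;

    FetchTimeoutSketch(int voterCount, long fetchTimeoutMs) {
        this.voterCount = voterCount;
        this.fetchTimeoutMs = fetchTimeoutMs;
    }

    void onFetchFromVoter(int voterId, long nowMs) {
        lastFetchTimeMs.put(voterId, nowMs);
    }

    /**
     * True when the leader has not heard fetches from a majority of the
     * quorum within the fetch timeout and should start a new election,
     * sending Vote requests to discover the latest quorum state.
     */
    boolean shouldResignAndElect(long nowMs) {
        long liveVoters = lastFetchTimeMs.values().stream()
                .filter(t -> nowMs - t <= fetchTimeoutMs)
                .count() + 1; // the leader counts itself
        return liveVoters < (voterCount / 2) + 1;
    }
}
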
>
>
> > > I think there are two things here,
> > > 1. The old leader will not know if it's disconnected from the rest of the
> > > controller quorum cluster unless it receives BeginQuorumEpoch from the new
> > > leader. So it will keep on serving stale metadata to the clients (Brokers,
> > > Producers and Consumers).
> > > 2. I assume the Broker Leases will be managed on the controller quorum
> > > leader. This partitioned leader will keep on tracking broker leases it has,
> > > while the new leader of the quorum will also start managing broker leases.
> > > So while the quorum leader is partitioned, there will be two membership
> > > views of the kafka brokers managed on two leaders.
> > > Unless broker heartbeats are also replicated as part of the Raft log,
> > > there is no way to solve this?
> > > I know the LogCabin implementation does replicate client heartbeats. I
> > > suspect that the same issue is there in Zookeeper, which does not replicate
> > > client Ping requests.
> > >
> > > Thanks,
> > > Unmesh
> > >
> > >
> > >
> > > On Mon, Jul 27, 2020 at 6:23 AM Boyang Chen <
> reluctanthero...@gmail.com>
> > > wrote:
> > >
> > >> Thanks for the questions Unmesh!
> > >>
> > >> On Sun, Jul 26, 2020 at 6:18 AM Unmesh Joshi 
> > >> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > In the FetchRequest handling, how do we make sure we handle scenarios
> > >> > where the leader might have been disconnected from the cluster, but
> > >> > doesn't know yet?
> > >> >
> > >> Could you clarify on this question? Which part of the raft group doesn't
> > >> know about leader dis-connection?
> > >>
> > >>
> > >> > As discussed in the Raft Thesis section 6.4, the linearizable semantics
> > >> > of read requests is implemented in LogCabin by sending a heartbeat to
> > >> > followers and waiting till the heartbeats are successful to make sure
> > >> > that the leader is still the leader.
> > >> > I think for the controller quorum to make sure none of the consumers get
> > >> > stale data, it's important to have linearizable semantics? In the pull
> > >> > based model, the leader will need to wait for heartbeats from 

[jira] [Created] (KAFKA-10324) Pre-0.11 consumers can get stuck when messages are downconverted from V2 format

2020-07-29 Thread Tommy Becker (Jira)
Tommy Becker created KAFKA-10324:


 Summary: Pre-0.11 consumers can get stuck when messages are 
downconverted from V2 format
 Key: KAFKA-10324
 URL: https://issues.apache.org/jira/browse/KAFKA-10324
 Project: Kafka
  Issue Type: Bug
Reporter: Tommy Becker


As noted in KAFKA-5443, the V2 message format preserves a batch's lastOffset 
even if that offset gets removed due to log compaction. If a pre-0.11 consumer 
seeks to such an offset and issues a fetch, it will get an empty batch, since 
offsets prior to the requested one are filtered out during down-conversion. 
KAFKA-5443 added consumer-side logic to advance the fetch offset in this case, 
but this leaves old consumers unable to consume these topics.

The exact behavior varies depending on consumer version. The 0.10.0.0 consumer 
throws RecordTooLargeException and dies, believing that the record must not 
have been returned because it was too large. The 0.10.1.0 consumer simply spins 
fetching the same empty batch over and over.
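
The consumer-side workaround from KAFKA-5443 boils down to a position-advance rule, sketched here with invented names (the real logic lives in the consumer's fetcher): skip past an empty batch whose lastOffset is at or beyond the fetch position. This is exactly the step pre-0.11 consumers cannot take.

// Sketch of the KAFKA-5443-style position advance; not the real Fetcher code.
public class EmptyBatchAdvance {

    /**
     * Returns the next fetch position. If down-conversion filtered out every
     * record (compaction removed offsets up to lastOffset), skip past the
     * batch; a pre-0.11 consumer lacking this logic re-fetches the same
     * empty batch forever (0.10.1) or throws RecordTooLargeException (0.10.0).
     */
    static long nextPosition(long fetchPosition, int recordCount, long batchLastOffset) {
        if (recordCount == 0 && batchLastOffset >= fetchPosition) {
            return batchLastOffset + 1;
        }
        return fetchPosition;
    }

    public static void main(String[] args) {
        // Compacted away: the batch covering [100..110] came back empty
        // for a consumer positioned at offset 100.
        System.out.println(nextPosition(100, 0, 110)); // prints 111
    }
}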



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-630: Kafka Raft Snapshot

2020-07-29 Thread Ron Dagostino
Thanks, Jose.  It's looking good.  Here is one minor correction:

<<< If the Kafka topic partition leader receives a fetch request with an
offset and epoch greater than or equal to the LBO (x + 1, a)
>>> If the Kafka topic partition leader receives a fetch request with an
offset and epoch greater than or equal to the LBO (x + 1, b)

Here is one more question.  Is there an ability to evolve the snapshot
format over time, and if so, how is that managed for upgrades? It would be
both Controllers and Brokers that would depend on the format, correct?
Those could be the same thing if the controller was running inside the
broker JVM, but that is an option rather than a requirement, I think.
Might the Controller upgrade have to be coordinated with the broker upgrade
in the separate-JVM case?  Perhaps a section discussing this would be
appropriate?

Ron


On Tue, Jul 28, 2020 at 11:14 PM Jose Garcia Sancio 
wrote:

> Thanks Ron. Your comments and suggestions were helpful. You can see my
> changes to the KIP here:
>
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=158864763&selectedPageVersions=15&selectedPageVersions=14
>
> My comments are below...
>
> On Mon, Jul 27, 2020 at 11:29 AM Ron Dagostino  wrote:
> >
> > Hi Jose.  Thanks for the KIP.  Here are some questions and some nit
> corrections.
> >
> > <<< In KIP-500 the Kafka Controller, which is the quorum leader from
> > KIP-595, will materialize the entries in the metadata log into memory.
> > Technically I think the quorum leader is referred to as the Active
> > Controller in KIP-500.  Maybe replace "Kafka Controller" with "Active
> > Controller"?  I think the term "Kafka Controller" is fine as used
> > throughout the rest of the KIP to refer to the entire thing, but when
> > referring specifically to the leader I think "Active Controller" is
> > the term that is defined in KIP-500.
> >
>
> Made those changes.
>
> >
> > <<< Each broker in KIP-500, which will be a replica of the metadata
> > log, will materialize the entries in the log into a metadata cache
> > This wording confused me because I assumed that "replica" was a formal
> > term and only (non-Active) Controllers are formally "replicas" of the
> > metadata log -- Kafka brokers would be clients that read the log and
> > then use the data for their own purpose as opposed to formally being
> > replicas with this understanding of the term "replica".  Is that
> > correct, and if so, maybe replace "replica" with "client"?
> >
>
> In KIP-595 we have two types of replicas: voters and observers. Voter
> replicas are Kafka Controllers and one of them will become the
> Active Controller. Observer replicas fetch from the log and attempt to
> keep up with the LEO of the Active Controller. I think you can
> consider all of them as "clients" of the replicated log.
>
> >
> > <<< The type of in-memory state machines what we plan to implement
> > >>> The type of in-memory state machines that we plan to implement
> > nit
> >
>
> Done.
>
> >
> > <<< doesn't map very well to an key and offset based clean up policy.
> > >>> doesn't map very well to a key and offset based clean up policy.
> > nit
> >
>
> Done.
>
> >
> > <<< When starting a broker either because it is a new broker, a broker
> > was upgraded or a failed broker is restarting. Loading the state
> > represented by the __cluster_metadata topic partition is required
> > before the broker is available
> > >>> When starting a broker either because it is a new broker or it is
> restarting, loading the state represented by the __cluster_metadata topic
> partition is required before the broker is available.
> > Reword for simplicity and clarity?
> >
>
> Done.
>
> >
> > <<< With snapshot based of the in-memory state Kafka can be much more
> aggressive
> > >>> By taking and transmitting a snapshot of the in-memory state as
> described below Kafka can be much more aggressive
> > Tough to refer to the concept of snapshot here without having
> > described what it is, so refer to "as described below" to help orient
> > the reader?
> >
>
> Made some changes to these sentences. I agree that fully understanding
> parts of the motivated section requires reading the rest of the
> document. I wanted to make sure that we had this in the motivation
> section.
>
> >
> > <<< In the future this mechanism will also be useful for
> > high-throughput topic partitions like the Group Coordinator and
> > Transaction Coordinator.
> > >>> In the future this mechanism may also be useful for high-throughput
> topic partitions like the Group Coordinator and Transaction Coordinator.
> > Tough to say "will" when that is an assumption that would depend on a
> KIP?
> >
>
> Yeah. Changed it.
>
> >
> > << > __cluster_metadata topic partition then the ... Kafka Controller will
> > need to replicate 3.81 MB to each broker in the cluster (10) or 38.14
> > MB.
> > It might be good to append a sentence that explicitly states how much
> > data is replicated for the delta/event -- right now it is implied to
> > be 

Re: [VOTE] KIP-648: Renaming getter method for Interactive Queries

2020-07-29 Thread Bruno Cadonna

Thanks John,

+1 (non-binding)

Best,
Bruno

On 29.07.20 01:02, John Thomas wrote:

Hello everyone,

I'd like to kick-off a vote for KIP-648 : Renaming getter method for 
Interactive Queries
https://cwiki.apache.org/confluence/display/KAFKA/KIP-648%3A+Renaming+getter+method+for+Interactive+Queries

It's a straightforward change to include new getters and deprecate the 
existing ones for the class KeyQueryMetadata.
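
For readers who haven't opened the KIP, the change follows the usual deprecate-and-rename pattern. A simplified sketch (assuming the rename targets the get-prefixed accessors; see the KIP for the authoritative signatures and deprecation details):

import java.util.Set;
import org.apache.kafka.streams.state.HostInfo;

// Simplified sketch of the KIP-648 rename on KeyQueryMetadata.
public class KeyQueryMetadataSketch {

    private final HostInfo activeHost;
    private final Set<HostInfo> standbyHosts;
    private final int partition;

    public KeyQueryMetadataSketch(HostInfo activeHost,
                                  Set<HostInfo> standbyHosts,
                                  int partition) {
        this.activeHost = activeHost;
        this.standbyHosts = standbyHosts;
        this.partition = partition;
    }

    /** New Kafka-Streams-style getter without the "get" prefix. */
    public HostInfo activeHost() {
        return activeHost;
    }

    /** @deprecated Use {@link #activeHost()} instead. */
    @Deprecated
    public HostInfo getActiveHost() {
        return activeHost;
    }

    public Set<HostInfo> standbyHosts() {
        return standbyHosts;
    }

    /** @deprecated Use {@link #standbyHosts()} instead. */
    @Deprecated
    public Set<HostInfo> getStandbyHosts() {
        return standbyHosts;
    }

    public int partition() {
        return partition;
    }

    /** @deprecated Use {@link #partition()} instead. */
    @Deprecated
    public int getPartition() {
        return partition;
    }
}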

Thanks,
John




Re: [VOTE] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

2020-07-29 Thread David Jacot
Hi, Colin, Boyang,

Colin, thanks for the clarification. Somehow, I thought that even if the
controller is run independently, it
would still run the listeners of the broker and thus would be accessible by
redirecting on the loopback
interface. My mistake.

Boyang, I have a few questions/comments regarding the updated KIP:

1. I think that it would be great if we could clarify how old admin clients
which are directly talking to the
controller will work with this KIP. I read between the lines that, as we
propose to provide a random
broker Id as the controller Id in the metadata response, they will use a
single node as a proxy. Is that
correct? This deserves to be called out more explicitly in the design
section instead of being hidden
in the protocol bump of the metadata RPC.

1.1 If I understand correctly, could we assume that old admin clients will
stick to the same "fake controller"
until they refresh their metadata? Refreshing the metadata usually happens
when NOT_CONTROLLER
is received but this won't happen anymore so they should change
infrequently.

2. For the new admin client, I suppose that we plan on using
LeastLoadedNodeProvider for the
requests that are using ControllerNodeProvider. We could perhaps mention it.

3. Pre KIP-500, will we have a way to distinguish if a request that is
received by the controller is
coming directly from a client or from a broker? You mention that the
listener can be used to do
this but as you pointed out, it is not mandatory. Do we have another
reliable method? I am asking
in the context of KIP-599: with the current controller, we may need to
throttle differently if the
request comes from a client or from a broker.

4. Could we add `InitialClientId` as well? This will be required for the
quota as we can apply them
by principal and/or clientId.

5. A small remark regarding the structure of the KIP. It is a bit weird
that requests that do not go
to the controller are mentioned in the Proposed Design section and the
requests that go to the
controller are mentioned in the Public Interfaces. When one reads the
Proposed Design, one does not
get a full picture of the whole new routing proposal for old and new
clients. It would be great if we
could have a full overview in that section.

Overall the change makes sense to me. I will work on drafting an addendum
to KIP-599 to
alter the design to cope with these changes. At a first glance, that seems
doable if 1.1, 3
and 4 are OK.

Thanks,
David

On Wed, Jul 29, 2020 at 5:29 AM Boyang Chen 
wrote:

> Thanks for the feedback Colin!
>
> On Tue, Jul 28, 2020 at 2:11 PM Colin McCabe  wrote:
>
> > Hi Boyang,
> >
> > Thanks for updating this.  A few comments below:
> >
> > In the "Routing Request Security" section, there is a reference to "older
> > requests that need redirection."  But after these new revisions, both new
> > and old requests need redirection.  So we should rephrase this.
> >
> > > In addition, to avoid exposing this forwarding power to the admin clients,
> > > the routing request shall be forwarded towards the controller broker internal
> > > endpoint which should be only visible to other brokers inside the cluster
> > > in the KIP-500 controller. Any admin configuration request with broker
> > > principal should not be going through the public endpoint and will be
> > > rejected for security purposes.
> >
> > We should also describe how this will work in the pre-KIP-500 case.  In
> > that case, CLUSTER_ACTION gets the extra permissions described here only
> > when the message comes in on the inter-broker listener.  We should state
> > that here.
> >
> > (I can see that you have this information later on in the "Security Access
> > Changes" section, but it would be good to have it here as well, to avoid
> > confusion.)
> >
> > > To be more strict about protecting controller information, the "ControllerId"
> > > field in the new MetadataResponse shall be set to -1 when the original request
> > > comes from a non-broker client and it is already on v10. We shall use the
> > > request listener name to distinguish whether a given request is inter-broker,
> > > or from the client.
> >
> > I'm not sure why we would need to distinguish between broker clients and
> > non-broker clients.  Brokers don't generally send MetadataRequests to other
> > brokers, do they?  Brokers learn about metadata from UpdateMetadataRequest
> > and LeaderAndIsrRequest, not by sending MetadataRequests to other brokers.
> >
> We do have one use case where the MetadataRequest gets sent between the
> brokers, which is the InterBrokerSendThread. Currently we don't rely on it
> to get the controller id, so I guess your suggestion should be good to
> enforce. We could use some meta comment on the NetworkClient that it should
> not be used to get the controller location.
>
> > Probably what we want here is: v0-v9: return a random broker in the cluster
> > as the controller ID.  v10: no controllerID present in the
> > MetadataResponse. 

[jira] [Created] (KAFKA-10323) NullPointerException during rebalance

2020-07-29 Thread yazgoo (Jira)
yazgoo created KAFKA-10323:
--

 Summary: NullPointerException during rebalance
 Key: KAFKA-10323
 URL: https://issues.apache.org/jira/browse/KAFKA-10323
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 2.5.0
Reporter: yazgoo


*confluent platform version: 5.5.0-ccs*

connector used: s3

Connector stops after rebalancing:

ERROR [Worker clientId=connect-1, groupId=connect] Couldn't instantiate task 
 because it has an invalid task configuration. This task will not 
execute until reconfigured.

java.lang.NullPointerException
 at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:427)
 at 
org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:1147)
 at 
org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:126)
 at 
org.apache.kafka.connect.runtime.distributed.DistributedHerder$12.call(DistributedHerder.java:1162)
 at 
org.apache.kafka.connect.runtime.distributed.DistributedHerder$12.call(DistributedHerder.java:1158)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-07-29 Thread Unmesh Joshi
Hi,

In the BrokerHeartbeat request and response, what is the reason to have
LeaseStartTimeMs and LeaseEndTimeMs, respectively? There are two points I
was thinking of:

1. The time used to track lease expiry will be the monotonic clock on the
active controller, so that value won't be useful on the broker. (It's the
monotonic clock of the active controller, and I assume it won't be tracked
on the follower controllers unless we have some notion of a 'cluster clock'
as used in LogCabin; see the sketch below.)

2. The broker, while registering, will want to get the lease as soon as
possible, so what's the purpose of this time in the request?
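
To illustrate point 1, a minimal sketch (hypothetical names and lease
length; it only shows why an absolute lease time is controller-local):

// Hypothetical sketch: lease bookkeeping lives entirely on the active
// controller and is keyed to its local monotonic clock.
class LeaseSketch {
    public static void main(String[] args) {
        long nowNs = System.nanoTime();      // controller-local, monotonic
        long leaseDurationMs = 18_000L;      // assumed lease length
        long expiryNs = nowNs + leaseDurationMs * 1_000_000L;
        // Shipping expiryNs to a broker is meaningless: the broker's clock
        // has a different origin. Only "renew within leaseDurationMs" is
        // transferable between machines.
        boolean expired = System.nanoTime() > expiryNs; // controller-side only
        System.out.println("expired=" + expired);
    }
}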

Thanks,
Unmesh



On Mon, Jul 27, 2020 at 9:51 PM Jun Rao  wrote:

> Hi, Colin,
>
> Thanks for the KIP. A few comments below.
>
> 10. Some of the choices in this KIP are not consistent with KIP-595. It
> would be useful to make consistent choices between the two KIPs.
> 10.1 KIP-595 doesn't use a separate Heartbeat request and heartbeat is
> piggybacked through the Fetch request.
> 10.2 The discussion in KIP-595 still assumes a separate controlled shutdown
> request instead of heartbeat.
> 10.3 My understanding is that the controller is just the leader in the Raft
> quorum. If so, do we need process.roles and controller.connect  in this KIP
> given quorum.voters in KIP-595?
>
> 11. Fencing: It would be useful to clarify whether the fencing is 1-way or
> 2-way. In ZK, the fencing is 1-way. ZK server determines if a ZK session is
> expired or not. An expired ZK client doesn't know it's fenced until it can
> connect to ZK server. It seems in this KIP, the proposal is for the fencing
> to work in both ways, i.e., the controller can fence a broker and a broker
> can fence itself based on heartbeat independently. There are some tradeoffs
> between these two approaches. It would be useful to document the benefits
> and the limitations of the proposed approach. For example, I wonder what
> happens if the controller and the broker make inconsistent fencing
> decisions in the new approach.
>
> 12. BrokerRecord:
> 12.1 Currently, BrokerEpoch is the ZK session id. How is BrokerEpoch
> generated without ZK?
> 12.2 KIP-584 is in progress. So, we need to include the features field.
>
> 13. PartitionRecord/IsrChange. IsrChange seems to be representing an
> incremental change to ISR in PartitionRecord. For consistency, should we
> have a separate record for representing incremental change to replicas?
> Currently RemovingReplicas/AddingReplicas are included with many other
> fields in PartitionRecord?
>
> 14. "When the active controller decides that a standby controller should
> start a snapshot, it will communicate that information in its response to
> the periodic heartbeat sent by that node.  When the active controller
> decides that it itself should create a snapshot, it will first try to give
> up the leadership of the Raft quorum in order to avoid unnecessary delays
> while writing the snapshot." Is it truly necessary to only do snapshotting
> in the follower? It seems it's simpler to just let every replica do
> snapshotting in a background thread.
>
> 15. Currently, we store SCRAM hashes and delegation tokens in ZooKeeper.
> Should we add records to account for those?
>
> 16. The description of leaderEpoch says "An epoch that gets incremented
> each time we change the ISR." Currently, we only increment leaderEpoch when
> the leader changes.
>
> 17. Metrics
> 17.1 "kafka.controller:type=KafkaController,name=MetadataSnapshotLag The
> offset delta between the latest stable offset of the metadata topic and the
> offset of the last snapshot (or 0 if there are no snapshots)". 0 could be a
> valid lag. So using that to represent no snapshots can cause confusion.
> 17.2 kafka.controller:type=KafkaController,name=ControllerRequestsRate: We
> already have a rateAndTime metric per ControllerState. Do we need this new
> metric?
>
> 18. Do we need a separate DeletePartition record? This could be useful to
> represent the successful deletion of a single partition.
>
> 19. Do we need brokerEpoch in DeleteBroker?
>
> 20. controller.id: I had the same feeling as Jason. Requiring the user to
> configure a separate controller id for each broker seems to add more
> complexity. So, we need a good reason to do that. So far, it seems that's
> just for having a unique id when creating the NetworkClient for the
> controller. That's internal and there could be other ways to achieve this.
>
> Thanks,
>
> Jun
>
>
> On Thu, Jul 23, 2020 at 11:02 PM Boyang Chen 
> wrote:
>
> > Hey Colin,
> >
> > some more questions I have about the proposal:
> >
> > 1. We mentioned in the networking section that "The only time when
> > clients should contact a controller node directly is when they are
> > debugging system issues". But later we didn't talk about how to enable
> > this debug mode; could you consider adding a section about that?
> >
> > 2. “When the active controller 

Re: [VOTE] KIP-648: Renaming getter method for Interactive Queries

2020-07-29 Thread John Roesler
Hi John,

Thanks for picking this up! I’m +1 (binding). 

-John

On Wed, Jul 29, 2020, at 09:14, Jorge Esteban Quilcate Otoya wrote:
> +1 (non-binding).
> 
> Thanks John!
> 
> On Wed, Jul 29, 2020 at 3:00 PM Navinder Brar
>  wrote:
> 
> > +1 (non-binding). Thanks John, looks good to me.
> >
> > ~Navinder
> >
> > On Wednesday, 29 July, 2020, 04:32:25 am IST, John Thomas <
> > johnthote...@live.com> wrote:
> >
> >  Hello everyone,
> >
> > I'd like to kick-off a vote for KIP-648 : Renaming getter method for
> > Interactive Queries
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-648%3A+Renaming+getter+method+for+Interactive+Queries
> >
> > It's a straightforward change to include new getters and deprecate the
> > existing ones for the class KeyQueryMetadata.
> >
> > Thanks,
> > John
> >
>


Re: [VOTE] KIP-648: Renaming getter method for Interactive Queries

2020-07-29 Thread Jorge Esteban Quilcate Otoya
+1 (non-binding).

Thanks John!

On Wed, Jul 29, 2020 at 3:00 PM Navinder Brar
 wrote:

> +1 (non-binding). Thanks John, looks good to me.
>
> ~Navinder
>
> On Wednesday, 29 July, 2020, 04:32:25 am IST, John Thomas <
> johnthote...@live.com> wrote:
>
>  Hello everyone,
>
> I'd like to kick-off a vote for KIP-648 : Renaming getter method for
> Interactive Queries
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-648%3A+Renaming+getter+method+for+Interactive+Queries
>
> It's a straightforward change to include new getters and deprecate the
> existing ones for the class KeyQueryMetadata.
>
> Thanks,
> John
>


Re: [VOTE] KIP-648: Renaming getter method for Interactive Queries

2020-07-29 Thread Navinder Brar
+1 (non-binding). Thanks John, looks good to me.

~Navinder

On Wednesday, 29 July, 2020, 04:32:25 am IST, John Thomas
 wrote:
 
 Hello everyone,

I'd like to kick-off a vote for KIP-648 : Renaming getter method for 
Interactive Queries
https://cwiki.apache.org/confluence/display/KAFKA/KIP-648%3A+Renaming+getter+method+for+Interactive+Queries

It's a straightforward change to include new getters and deprecate the
existing ones for the class KeyQueryMetadata.

Thanks,
John
  

Re: [VOTE] KIP-450: Sliding Window Aggregations in the DSL

2020-07-29 Thread Jorge Esteban Quilcate Otoya
Thanks Leah!
This will be a great addition.

+1 (non-binding)

Very happy that KIP-617 is being used already :D

Cheers,
Jorge.

On Wed, Jul 29, 2020 at 2:28 PM John Roesler  wrote:

> Thanks for the awesome KIP, Leah,
>
> I’m +1 (binding)
>
> Thanks,
> John
>
> On Tue, Jul 28, 2020, at 19:10, Guozhang Wang wrote:
> > +1 (binding)
> >
> > On Tue, Jul 28, 2020 at 4:44 PM Matthias J. Sax 
> wrote:
> >
> > > +1 (binding)
> > >
> > > On 7/28/20 4:35 PM, Sophie Blee-Goldman wrote:
> > > > Thanks for the KIP! It's been an enlightening discussion
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Sophie
> > > >
> > > > On Tue, Jul 28, 2020 at 8:03 AM Leah Thomas 
> > > wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> I'd like to kick-off the vote for KIP-450
> > > >> <
> > > >>
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-450%3A+Sliding+Window+Aggregations+in+the+DSL
> > > >>> ,
> > > >> adding sliding window aggregations to the DSL. The discussion
> thread is
> > > >> here
> > > >> <
> > > >>
> > >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/202007.mbox/%3ccabug4nfjkrroe_rf4ht2p1wu1pt7o-qd74h_0l7a4bnsmmg...@mail.gmail.com%3e
> > > >>>
> > > >> .
> > > >>
> > > >> Cheers,
> > > >> Leah
> > > >>
> > > >
> > >
> > >
> >
> > --
> > -- Guozhang
> >
>


Kafka zk1 password

2020-07-29 Thread Lukac, Dominik
Hello, I have a question regarding running Kafka with the help of Vagrant. I
have tried to connect to the zk1 VM using WinSCP, but it looks like it
requires a password, which I cannot find. Is there a file that documents the
passwords for the different VMs, or are they all the same? Either way, what
is the password?

Thanks in advance
Dominik





Re: [DISCUSS] KIP-450: Sliding Windows

2020-07-29 Thread Leah Thomas
Thanks for the nits Matthias, I've updated the examples and language
accordingly.

Leah

On Tue, Jul 28, 2020 at 6:43 PM Matthias J. Sax  wrote:

> Thanks Leah. Overall LGTM.
>
> A few nits:
>
> - the first figure shows window [9,19] but the window is not aligned
> properly (it should be 1ms to the right; right now, it's aligned to
> window [8,18])
>
> - in the second figure, a hopping window would create more windows, i.e.,
> the first window would be [-6,14) and the last window would be [19,29),
> thus it's not just 10 windows but 26 windows (if I did not miscount)
>
> - "Two new windows will be created by the late record"
>
> late -> out-of-order
>
>
> -Matthias
>
>
>
> On 7/28/20 4:34 PM, Sophie Blee-Goldman wrote:
> > Thanks for the update Leah -- I think that all makes sense.
> >
> > Cheers,
> > Sophie
> >
> > On Tue, Jul 28, 2020 at 3:55 PM Leah Thomas 
> wrote:
> >
> >> Another minor tweak: instead of defining the window by its *size*, it
> >> will be defined by *timeDifference*, which is the maximum time
> >> difference between two events in the same window. This is a more
> >> precise way to define a window with inclusive ends, while still
> >> allowing the user to create the window they expect. This definition
> >> fits the current examples, where a record at *10* falls into a window
> >> of time difference *5* spanning *[5,10]*. That window contains any
> >> records at 5, 6, 7, 8, 9, and 10, which is six instants rather than
> >> five. This semantic difference is why I've shifted *size* to
> >> *timeDifference*.
> >>
> >> The new builder will be *withTimeDifferenceAndGrace*, in keeping with
> >> the other builder conventions; a short sketch of the resulting API is
> >> below.
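> >>
> >> For illustration, a minimal sketch of how the new builder would read
> >> inside a topology (names as discussed in this thread; the final
> >> signatures may differ, and the topic name and serdes are made up):
> >>
> >> import java.time.Duration;
> >> import org.apache.kafka.common.serialization.Serdes;
> >> import org.apache.kafka.streams.StreamsBuilder;
> >> import org.apache.kafka.streams.kstream.*;
> >>
> >> // inside some method:
> >> StreamsBuilder builder = new StreamsBuilder();
> >> builder.stream("events", Consumed.with(Serdes.String(), Serdes.Long()))
> >>        .groupByKey()
> >>        .windowedBy(SlidingWindows.withTimeDifferenceAndGrace(
> >>            Duration.ofMillis(5), Duration.ofMinutes(1)))
> >>        .count();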
> >>
> >> Let me know if there are any concerns! The vote thread is open as well
> >> here:
> >> http://mail-archives.apache.org/mod_mbox/kafka-dev/202007.mbox/browser
> >>
> >> Best,
> >> Leah
> >>
> >> On Mon, Jul 27, 2020 at 3:58 PM Leah Thomas 
> wrote:
> >>
> >>> A small tweak: to make it clearer to users that grace is required, as
> >>> well as to clean up some of the confusing grammar semantics of windows,
> >>> the main builder method for *SlidingWindows* will be *withSizeAndGrace*
> >>> instead of *of*. This looks like it'll be the last change (for now) on
> >>> the public API. If anyone has any comments or thoughts let me know.
> >>> Otherwise, I'll take this to vote shortly.
> >>>
> >>> Best,
> >>> Leah
> >>>
> >>> On Fri, Jul 24, 2020 at 3:45 PM Leah Thomas 
> >> wrote:
> >>>
>  To accommodate the change to a final class, I've added another
>  *windowedBy()* function in *CogroupedKStream.java *to handle
>  SlidingWindows.
> 
>  As far as the discussion goes, I think this is the last change we've
>  talked about. If anyone has other comments or concerns, please let me
> >> know!
> 
>  Cheers,
>  Leah
> 
>  On Thu, Jul 23, 2020 at 7:34 PM Leah Thomas 
> >> wrote:
> 
> > Thanks for the discussion about extending TimeWindows. I agree that
> > making it future proof is important, and the implementation of
> > SlidingWindows is unique enough that it seems logical to make it its
> >> own
> > final class.
> >
> > On that note, I've updated the KIP to make SlidingWindows a stand-alone
> > final class, and added the *windowedBy()* API in *KGroupedStream* to
> > handle SlidingWindows. It seems that SlidingWindows would still be able
> > to leverage *TimeWindowedKStream* by creating a SlidingWindows version
> > of *TimeWindowedKStreamImpl* that implements *TimeWindowedKStream*. If
> > anyone sees issues with this implementation, please let me know.
> >
> > Best,
> > Leah
> >
> > On Wed, Jul 22, 2020 at 10:47 PM John Roesler 
> > wrote:
> >
> >> Thanks for the reply, Sophie.
> >>
> >> That all sounds about right to me.
> >>
> >> The Windows “interface”/algorithm is quite flexible, so it makes
> sense
> >> for it to be extensible. Different implementations can (and do)
> >> enumerate
> >> different windows to suit different use cases.
> >>
> >> On the other hand, I can’t think of any way to extend SessionWindows
> >> to
> >> do something different using the same algorithm, so it makes sense
> >> for it
> >> to stay final.
> >>
> >> If we think SlidingWindows is similarly not usefully extensible,
> then
> >> we can make it final. It’s easy to remove final later, but adding it
> >> is not
> >> possible. Or we could go the other route and just make it an
> >> interface, on
> >> general principle. Both of these choices are safe API design.
> >>
> >> Thanks again,
> >> John
> >>
> >> On Wed, Jul 22, 2020, at 21:54, Sophie Blee-Goldman wrote:
> 
>  Users could pass in a custom `SessionWindows` as
>  long as the session algorithm works correctly for it.
> >>>
> >>>
> >>> Well not really, SessionWindows is a final class. TimeWindows is
> >> also
> 

Re: [VOTE] KIP-450: Sliding Window Aggregations in the DSL

2020-07-29 Thread John Roesler
Thanks for the awesome KIP, Leah,

I’m +1 (binding)

Thanks,
John

On Tue, Jul 28, 2020, at 19:10, Guozhang Wang wrote:
> +1 (binding)
> 
> On Tue, Jul 28, 2020 at 4:44 PM Matthias J. Sax  wrote:
> 
> > +1 (binding)
> >
> > On 7/28/20 4:35 PM, Sophie Blee-Goldman wrote:
> > > Thanks for the KIP! It's been an enlightening discussion
> > >
> > > +1 (non-binding)
> > >
> > > Sophie
> > >
> > > On Tue, Jul 28, 2020 at 8:03 AM Leah Thomas 
> > wrote:
> > >
> > >> Hi all,
> > >>
> > >> I'd like to kick-off the vote for KIP-450
> > >> <
> > >>
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-450%3A+Sliding+Window+Aggregations+in+the+DSL
> > >>> ,
> > >> adding sliding window aggregations to the DSL. The discussion thread is
> > >> here
> > >> <
> > >>
> > http://mail-archives.apache.org/mod_mbox/kafka-dev/202007.mbox/%3ccabug4nfjkrroe_rf4ht2p1wu1pt7o-qd74h_0l7a4bnsmmg...@mail.gmail.com%3e
> > >>>
> > >> .
> > >>
> > >> Cheers,
> > >> Leah
> > >>
> > >
> >
> >
> 
> -- 
> -- Guozhang
>


Re: [VOTE] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-07-29 Thread Jorge Esteban Quilcate Otoya
Thanks everyone for voting!

With 3 binding votes (Matthias, Guozhang, and John) and 2 non-binding
votes (Leah and Sophie), I will mark this KIP as accepted.

Thanks,

Jorge.


On Tue, Jul 28, 2020 at 3:27 AM Matthias J. Sax  wrote:

> +1 (binding)
>
> On 7/27/20 4:55 PM, Guozhang Wang wrote:
> > +1. Thanks Jorge for bringing in this KIP!
> >
> > Guozhang
> >
> > On Mon, Jul 27, 2020 at 10:07 AM Leah Thomas 
> wrote:
> >
> >> Hi Jorge,
> >>
> >> Looks great. +1 (non-binding)
> >>
> >> Best,
> >> Leah
> >>
> >> On Thu, Jul 16, 2020 at 6:39 PM Sophie Blee-Goldman <
> sop...@confluent.io>
> >> wrote:
> >>
> >>> Hey Jorge,
> >>>
> >>> Thanks for the reminder -- +1 (non-binding)
> >>>
> >>> Cheers,
> >>> Sophie
> >>>
> >>> On Thu, Jul 16, 2020 at 4:06 PM Jorge Esteban Quilcate Otoya <
> >>> quilcate.jo...@gmail.com> wrote:
> >>>
>  Bumping this vote thread to check if there's any feedback.
> 
>  Cheers,
>  Jorge.
> 
>  On Sat, Jul 4, 2020 at 6:20 PM John Roesler 
> >> wrote:
> 
> > Thanks Jorge,
> >
> > I’m +1 (binding)
> >
> > -John
> >
> > On Fri, Jul 3, 2020, at 10:26, Jorge Esteban Quilcate Otoya wrote:
> >> Hola everyone,
> >>
> >> I'd like to start a new thread to vote for KIP-617 as there have
> >> been
> >> significant changes since the previous vote started.
> >>
> >> KIP wiki page:
> >>
> >
> 
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
> >>
> >> Many thanks!
> >>
> >> Jorge.
> >>
> >
> 
> >>>
> >>
> >
> >
>
>


Re: [DISCUSS] KIP-431: Support of printing additional ConsumerRecord fields in DefaultMessageFormatter

2020-07-29 Thread Badai Aqrandista
Hi all

I have created a PR for KIP-431 against the latest trunk:
https://github.com/apache/kafka/pull/9099

Please review.

Regards
Badai

On Tue, Jul 21, 2020 at 2:13 AM Matthias J. Sax  wrote:
>
> Thanks Badai. LGTM.
>
> On 7/19/20 4:26 PM, Badai Aqrandista wrote:
> > Hi all
> >
> > I have made a small change to KIP-431 to make it clearer which one is
> > "Partition" and "Offset". Also, I have moved the key field to the back,
> > just before the value:
> >
> > $ kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic
> > test --from-beginning --property print.partition=true --property
> > print.key=true --property print.timestamp=true --property
> > print.offset=true --property print.headers=true --property
> > key.separator='|'
> >
> > CreateTime:1592475472398|Partition:0|Offset:3|h1:v1,h2:v2|key1|value1
> > CreateTime:1592475472456|Partition:0|Offset:4|NO_HEADERS|key2|value2
> >
> > Regards
> > Badai
> >
> > On Sun, Jun 21, 2020 at 11:39 PM Badai Aqrandista  
> > wrote:
> >>
> >> Excellent.
> >>
> >> Would like to hear more feedback from others.
> >>
> >> On Sat, Jun 20, 2020 at 1:27 AM David Jacot  wrote:
> >>>
> >>> Hi Badai,
> >>>
> >>> Thanks for your reply.
> >>>
> >>> 2. Yes, that makes sense.
> >>>
> >>> Best,
> >>> David
> >>>
> >>> On Thu, Jun 18, 2020 at 2:08 PM Badai Aqrandista  
> >>> wrote:
> >>>
>  David
> 
>  Thank you for replying
> 
>  1. It seems that `print.partition` is already implemented. Do you 
>  confirm?
>  BADAI: Yes, you are correct. I have removed it from the KIP.
> 
>  2. Will `null.literal` be only used when the value of the message
>  is NULL or for any fields? Also, it seems that we print out "null"
>  today when the key or the value is empty. Shall we use "null" as
>  a default instead of ""?
>  BADAI: For any fields. Do you think this is useful?
> 
>  3. Could we add a small example of the output in the KIP?
>  BADAI: Yes, I have updated the KIP to add a couple of example.
> 
>  4. When there are no headers, are we going to print something
>  to indicate it to the user? For instance, we print out NO_TIMESTAMP
>  where there is no timestamp.
>  BADAI: Yes, good idea. I have updated the KIP to print NO_HEADERS.
> 
>  Thanks
>  Badai
> 
> 
>  On Thu, Jun 18, 2020 at 7:25 PM David Jacot  wrote:
> >
> > Hi Badai,
> >
> > Thanks for resuming this. I have few small comments:
> >
> > 1. It seems that `print.partition` is already implemented. Do you
>  confirm?
> >
> > 2. Will `null.literal` be only used when the value of the message
> > is NULL or for any fields? Also, it seems that we print out "null"
> > today when the key or the value is empty. Shall we use "null" as
> > a default instead of ""?
> >
> > 3. Could we add a small example of the output in the KIP?
> >
> > 4. When there are no headers, are we going to print something
> > to indicate it to the user? For instance, we print out NO_TIMESTAMP
> > where there is no timestamp.
> >
> > Best,
> > David
> >
> > On Wed, Jun 17, 2020 at 4:53 PM Badai Aqrandista 
>  wrote:
> >
> >> Hi all,
> >>
> >> I have contacted Mateusz separately and he is ok for me to take over
> >> KIP-431:
> >>
> >>
> >>
>  https://cwiki.apache.org/confluence/display/KAFKA/KIP-431%3A+Support+of+printing+additional+ConsumerRecord+fields+in+DefaultMessageFormatter
> >>
> >> I have updated it a bit. Can anyone give a quick look at it again and
> >> give me some feedback?
> >>
> >> This feature will be very helpful for people supporting Kafka in
> >> operations.
> >>
> >> If it is ready for a vote, please let me know.
> >>
> >> Thanks
> >> Badai
> >>
> >> On Sat, Jun 13, 2020 at 10:59 PM Badai Aqrandista 
> >> wrote:
> >>>
> >>> Mateusz
> >>>
> >>> This KIP would be very useful for debugging. But the last discussion
> >>> is in Feb 2019.
> >>>
> >>> Are you ok if I take over this KIP?
> >>>
> >>> --
> >>> Thanks,
> >>> Badai
> >>
> >>
> >>
> >> --
> >> Thanks,
> >> Badai
> >>
> 
> 
> 
>  --
>  Thanks,
>  Badai
> 
> >>
> >>
> >>
> >> --
> >> Thanks,
> >> Badai
> >
> >
> >
>


-- 
Thanks,
Badai


[jira] [Created] (KAFKA-10322) InMemoryWindowStore restore keys format incompatibility (lack of sequenceNumber in keys on topic)

2020-07-29 Thread Jira
Tomasz Bradło created KAFKA-10322:
-

 Summary: InMemoryWindowStore restore keys format incompatibility 
(lack of sequenceNumber in keys on topic)
 Key: KAFKA-10322
 URL: https://issues.apache.org/jira/browse/KAFKA-10322
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.5.0
 Environment: windows/linux
Reporter: Tomasz Bradło


I have a regular groupBy stream configuration:
{code:java}
fun addStream(kStreamBuilder: StreamsBuilder) {
    val storeSupplier = Stores.inMemoryWindowStore(
        "count-store",
        Duration.ofDays(10),
        Duration.ofDays(1),
        false)
    val storeBuilder: StoreBuilder<WindowStore<CountableEvent, Long>> =
        Stores.windowStoreBuilder(
            storeSupplier, JsonSerde(CountableEvent::class.java), Serdes.Long())

    kStreamBuilder
        .stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()))
        .map { _, jsonRepresentation ->
            KeyValue(eventsCountingDeserializer.deserialize(jsonRepresentation), null) }
        .groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofDays(1)))
        .count(Materialized.with(JsonSerde(CountableEvent::class.java), Serdes.Long()))
        .toStream()
        .to("topic1-count")

    val storeConsumed = Consumed.with(
        WindowedSerdes.TimeWindowedSerde(
            JsonSerde(CountableEvent::class.java), Duration.ofDays(1).toMillis()),
        Serdes.Long())
    kStreamBuilder.addGlobalStore(
        storeBuilder, "topic1-count", storeConsumed, passThroughProcessorSupplier)
}{code}
While producing to "topic1-count", the key is serialized with 
[TimeWindowedSerializer|https://github.com/apache/kafka/blob/b1aa1912f08765d9914e3d036deee6b71ea009dd/streams/src/main/java/org/apache/kafka/streams/kstream/TimeWindowedSerializer.java],
 which uses 
[WindowKeySchema.toBinary|https://github.com/apache/kafka/blob/b1aa1912f08765d9914e3d036deee6b71ea009dd/streams/src/main/java/org/apache/kafka/streams/state/internals/WindowKeySchema.java#L112],
 so the message key format is:
{code:java}
real_grouping_key + timestamp(8bytes){code}
 

Everything works, and I can get correct values from the state store. But in a 
recovery scenario, when 
[GlobalStateManagerImpl|https://github.com/apache/kafka/blob/2.5/streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateManagerImpl.java#L317]
 enters the offset < highWatermark loop, the 
[InMemoryWindowStore stateRestoreCallback|https://github.com/apache/kafka/blob/2.5/streams/src/main/java/org/apache/kafka/streams/state/internals/InMemoryWindowStore.java#L105]
 reads from "topic1-count" and fails to extract a valid key and timestamp using 
[WindowKeySchema.extractStoreKeyBytes|https://github.com/apache/kafka/blob/1ff25663a8beb8d7a67e6b58bc7ee0020651b75a/streams/src/main/java/org/apache/kafka/streams/state/internals/WindowKeySchema.java#L188]
 and 
[WindowKeySchema.extractStoreTimestamp|https://github.com/apache/kafka/blob/1ff25663a8beb8d7a67e6b58bc7ee0020651b75a/streams/src/main/java/org/apache/kafka/streams/state/internals/WindowKeySchema.java#L201].
 It fails because it expects the format:
{code:java}
real_grouping_key + timestamp(8bytes) + sequence_number(4bytes) {code}
How is this supposed to work in this case?
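
To make the mismatch concrete, here is a minimal standalone illustration (the
class name and sample values are made up; only the two byte layouts follow the
formats quoted above, not the real Kafka classes):
{code:java}
import java.nio.ByteBuffer;

public class KeyFormatMismatch {
    public static void main(String[] args) {
        // Key as written by TimeWindowedSerializer (WindowKeySchema.toBinary):
        //   [ key bytes ][ 8-byte timestamp ]   (no 4-byte sequence number)
        byte[] key = "group-key".getBytes();
        long timestamp = 1_595_980_800_000L;
        byte[] binaryKey = ByteBuffer.allocate(key.length + 8)
                .put(key).putLong(timestamp).array();

        // Restore path (extractStoreKeyBytes / extractStoreTimestamp) assumes
        //   [ key bytes ][ 8-byte timestamp ][ 4-byte seqnum ]
        // so it strips 12 trailing bytes instead of 8:
        final int TS = 8, SEQ = 4;
        byte[] storeKey = new byte[binaryKey.length - TS - SEQ]; // 4 bytes short
        System.arraycopy(binaryKey, 0, storeKey, 0, storeKey.length);
        long ts = ByteBuffer.wrap(binaryKey).getLong(binaryKey.length - TS - SEQ);

        System.out.println(new String(storeKey)); // "group" (truncated key)
        System.out.println(ts == timestamp);      // false   (corrupted timestamp)
    }
}
{code}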



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka-site] scott-confluent edited a comment on pull request #269: Kafka nav and hompeage redesign

2020-07-29 Thread GitBox


scott-confluent edited a comment on pull request #269:
URL: https://github.com/apache/kafka-site/pull/269#issuecomment-663676137

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (KAFKA-10321) shouldDieOnInvalidOffsetExceptionWhileRunning would block forever on JDK11

2020-07-29 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10321:
---

 Summary: shouldDieOnInvalidOffsetExceptionWhileRunning would block 
forever on JDK11
 Key: KAFKA-10321
 URL: https://issues.apache.org/jira/browse/KAFKA-10321
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Boyang Chen


I have spotted two definite cases where the test 
shouldDieOnInvalidOffsetExceptionWhileRunning fails to stop during the whole 
test suite run:
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/7604/consoleFull]

[https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/7602/console]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)