Build failed in Jenkins: Kafka » kafka-trunk-jdk8 #271

2020-12-08 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10634; Adding LeaderId to voters list in LeaderChangeMessage 
along with granting voters (#9539)

[github] MINOR: Using primitive data types for loop index (#9705)


--
[...truncated 3.46 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 

Re: [DISCUSS] KIP-687: Automatic Reloading of Security Store

2020-12-08 Thread Boyang Chen
Hey Gwen, thanks for the feedback.

On Sun, Dec 6, 2020 at 10:06 PM Gwen Shapira  wrote:

> Agree with Igor. IIRC, we also encountered cases where filewatch was
> not triggered as expected. An interval will give us a better
> worst-case scenario that is easily controlled by the Kafka admin.
>
Are the cases you were referring to happening in the cloud environment?
Should we investigate instead of simply assuming the standard API won't
work? I checked around and found a similar complaint here.

I partially agree that we want a reliable approach across different operating
systems in general, but it would be great if we could get a quantitative measure
of the file-watch success rate to help make the call. Ultimately, the benefits of
the file-watch approach are a faster reaction time and less configuration on the
broker.
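
For illustration, here is a minimal sketch of the file-watch approach using the
standard java.nio.file.WatchService; the keystore path, the 30-second poll
timeout, and the reloadSecurityStore() hook are assumptions for the example, not
part of the KIP:

import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

public class KeystoreWatcher {
    public static void main(String[] args) throws Exception {
        Path keystore = Paths.get("/etc/kafka/ssl/server.keystore.jks");
        WatchService watcher = FileSystems.getDefault().newWatchService();
        // Register the parent directory; events carry the name of the changed file.
        keystore.getParent().register(watcher, StandardWatchEventKinds.ENTRY_MODIFY);
        while (true) {
            // poll() with a timeout leaves room for a periodic fallback check,
            // which is the interval-based safety net discussed in this thread.
            WatchKey key = watcher.poll(30, TimeUnit.SECONDS);
            if (key == null) {
                continue; // timed out; an explicit staleness check could go here
            }
            for (WatchEvent<?> event : key.pollEvents()) {
                if (keystore.getFileName().equals(event.context())) {
                    System.out.println("Keystore changed, reloading security store");
                    // reloadSecurityStore();  // hypothetical broker-side reload hook
                }
            }
            key.reset();
        }
    }
}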

> Gwen
>
> On Sun, Dec 6, 2020 at 8:17 AM Igor Soarez  wrote:
> >
> >
> > > > The proposed change relies on a file watch, why not also have a
> polling
> > > > interval to check the file for changes?
> > > >
> > > > The periodical check could work, the slight downside is that we need
> > > additional configurations to schedule the interval. Do you think the
> > > file-watch approach has any extra overhead compared to the interval-based
> solution?
> >
> > I don't think so. The reason I'm asking this is the KIP currently
> includes:
> >
> >   "When the file watch does not work for unknown reason, user could
> still try to change the store path in an explicit AlterConfig call in the
> worst case."
> >
> > Having the interval in addition to the file watch could result in a
> better worst case scenario.
> > I understand it would require introducing at least one new configuration
> for the interval, so maybe this doesn't have to be solved in this KIP.
> >
> > --
> > Igor
> >
> > On Fri, Dec 4, 2020, at 5:14 PM, Boyang Chen wrote:
> > > Hey Igor, thanks for the feedback.
> > >
> > > On Fri, Dec 4, 2020 at 5:24 AM Igor Soarez  wrote:
> > >
> > > > Hi Boyang,
> > > >
> > >
> > >
> > > > What happens if the file is changed into an invalid store? Does the
> > > > previous store stay in use?
> > > >
> > > > If the reload fails, the previous store should be effective. I will
> state
> > > that in the KIP.
> > >
> > >
> > > > Thanks,
> > > >
> > > > --
> > > > Igor
> > > >
> > > > On Fri, Dec 4, 2020, at 1:28 AM, Boyang Chen wrote:
> > > > > Hey there,
> > > > >
> > > > > I would like to start the discussion thread for KIP-687:
> > > > >
> > > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-687%3A+Automatic+Reloading+of+Security+Store
> > > > >
> > > > > This KIP is trying to deprecate the AlterConfigs API support of
> updating
> > > > > the security store by reloading path in-place, and replace with a
> > > > > file-watch mechanism inside the broker. Let me know what you think.
> > > > >
> > > > > Best,
> > > > > Boyang
> > > > >
> > > >
> > >
>
>
>
> --
> Gwen Shapira
> Engineering Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>


Jenkins build is back to normal : Kafka » kafka-trunk-jdk11 #294

2020-12-08 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-696: Update Streams FSM to clarify ERROR state meaning

2020-12-08 Thread Walker Carlson
Thanks for the feedback Guozhang!

I clarified some of the points in the Proposed Changes section, so hopefully
it is clearer what is going on now. I also agree with your
suggestion about the possible call to close() on ERROR, so I added this
line:
"Close() called on ERROR will be idempotent and not throw an exception, but
we will log a warning."
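
To make the compatibility point concrete, here is a rough sketch of the kind of
user code Guozhang mentioned: a state listener that closes the client once ERROR
is reached. With this change the close() call becomes an idempotent no-op that
only logs a warning (topology and props are assumed to be defined elsewhere):

KafkaStreams streams = new KafkaStreams(topology, props);
streams.setStateListener((newState, oldState) -> {
    // Pre-existing user logic that reacts to reaching the ERROR state.
    if (newState == KafkaStreams.State.ERROR) {
        streams.close();  // after this KIP: no exception, just a logged warning
    }
});
streams.start();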

I have linked those tickets and I will leave a comment trying to explain
how these changes will affect their issue.

walker

On Tue, Dec 8, 2020 at 4:57 PM Guozhang Wang  wrote:

> Hello Walker,
>
> Thanks for the KIP! Overall it looks reasonable to me. Just a few minor
> comments for the wiki page itself:
>
> 1) Could you clarify the conditions when RUNNING / REBALANCING ->
> PENDING_ERROR will happen; and when PENDING_ERROR -> ERROR will happen.
> E.g. when I read "Streams will only reach ERROR state in the event of an
> exceptional failure in which the `StreamsUncaughtExceptionHandler` chose to
> either shutdown the application or the client." I thought the first
> transition would happen before the handler, and the second transition would
> happen immediately after the handler returns "shutdown client" or "shutdown
> application", until I read the last statement regarding "SHUTDOWN_CLIENT".
>
> 2) A compatibility issue: today it is possible that users would call
> Streams APIs like shutdown in the global state transition listener. And
> it's common to try shutting down the application automatically when
> transiting to ERROR (assuming it was not a terminating state). I think we
> could consider making this call a no-op and log a warning.
>
> 3) Could you link the following JIRAs in the "JIRA" field?
>
> https://issues.apache.org/jira/browse/KAFKA-10555
> https://issues.apache.org/jira/browse/KAFKA-9638
> https://issues.apache.org/jira/browse/KAFKA-6520
>
> And maybe we can also leave a comment on those tickets explaining what would
> happen to tackle the issues after this KIP.
>
>
> Guozhang
>
>
> On Tue, Dec 8, 2020 at 12:16 PM Walker Carlson 
> wrote:
>
> > Hello all,
> >
> > I'd like to propose KIP-696 to clarify the meaning of ERROR state in the
> > KafkaStreams Client State Machine. This will update the States to be
> > consistent with changes in KIP-671 and KIP-663.
> >
> > Here are the details: https://cwiki.apache.org/confluence/x/lCvZCQ
> >
> > Thanks,
> > Walker
> >
>
>
> --
> -- Guozhang
>


[jira] [Created] (KAFKA-10831) Consider increasing default session timeout for consumer

2020-12-08 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-10831:
---

 Summary: Consider increasing default session timeout for consumer
 Key: KAFKA-10831
 URL: https://issues.apache.org/jira/browse/KAFKA-10831
 Project: Kafka
  Issue Type: Improvement
Reporter: Jason Gustafson


The default session timeout is 10s, which is probably a bit low for cloud 
environments (which are becoming the standard). Moreover, the value is 
inconsistent with other defaults such as request.timeout.ms (30s) and 
default.api.timeout.ms (60s). Perhaps the default should be set to 60s so that 
it is consistent with the api timeout and so that we can give the client enough 
time to time out a pending request and retry.

Note that we made a similar default change for the Zookeeper session timeout 
here: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-537%3A+Increase+default+zookeeper+session+timeout.
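
For reference, a hedged sketch of what the relevant consumer settings look like today if a user raises the timeout by hand; the bootstrap server and group id are placeholders:
{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-group");
// What this ticket proposes as the new default; today it must be set explicitly.
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 60_000);
// Existing defaults, shown for comparison.
props.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30_000);
props.put(ConsumerConfig.DEFAULT_API_TIMEOUT_MS_CONFIG, 60_000);
Consumer<String, String> consumer =
    new KafkaConsumer<>(props, new StringDeserializer(), new StringDeserializer());
{code}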





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10830) Kafka Producer API should throw unwrapped exceptions

2020-12-08 Thread Guozhang Wang (Jira)
Guozhang Wang created KAFKA-10830:
-

 Summary: Kafka Producer API should throw unwrapped exceptions
 Key: KAFKA-10830
 URL: https://issues.apache.org/jira/browse/KAFKA-10830
 Project: Kafka
  Issue Type: Improvement
  Components: producer 
Reporter: Guozhang Wang


Today, in various KafkaProducer APIs (especially send and transaction related) 
we wrap many of the underlying exceptions in a KafkaException. In some nested 
calls we may even wrap them more than once. Although the initial goal was to not 
expose the root cause directly to users, this also confuses advanced users' error 
handling, since a root cause wrapped in a KafkaException may need to be handled 
differently.

Since all of those exceptions are public classes anyway (one can still get them 
via the wrapped exception's cause) and they all inherit from KafkaException, I'd 
suggest we stop wrapping and throw the exceptions directly. For users who just 
catch all KafkaExceptions and handle them uniformly, this is still compatible; 
for users who want to handle specific exceptions differently, it provides an 
easier way.
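
A hedged sketch of the difference from the caller's point of view; the specific subclass below is only an example, and producer/record are assumed to exist:
{code:java}
try {
    producer.beginTransaction();
    producer.send(record);
    producer.commitTransaction();
} catch (ProducerFencedException e) {
    // With unwrapped exceptions, the concrete subclass can be caught by type...
    producer.close();
} catch (KafkaException e) {
    // ...while code that catches the base KafkaException keeps working unchanged.
    // Today, finding the concrete cause typically means inspecting e.getCause().
    producer.abortTransaction();
}
{code}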



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10829) Kafka Streams handle produce exception improvement

2020-12-08 Thread Guozhang Wang (Jira)
Guozhang Wang created KAFKA-10829:
-

 Summary: Kafka Streams handle produce exception improvement
 Key: KAFKA-10829
 URL: https://issues.apache.org/jira/browse/KAFKA-10829
 Project: Kafka
  Issue Type: Improvement
  Components: producer , streams
Reporter: Guozhang Wang


A summary of some recent discussions on how we should improve on embedded 
producer exception handling.

Note that the baseline logic below guarantees that our correctness 
semantics are not violated; the optimizations sit on top of the baseline to 
reduce the user's burden by letting the library auto-handle certain types of 
exceptions.

1) ``Producer.send()`` throws an exception directly:

1.a) The baseline (correctness-preserving) logic is to always wrap it as a 
StreamsException; this causes the thread to shut down and the exception handler 
to be triggered. The handler can look into the wrapped exception and decide 
whether the shut-down thread can be restarted.

1.b) The optimization is to look at the exception and decide whether it can be 
wrapped as a TaskMigratedException instead (e.g. ProducerFenced). This would then 
be auto-handled by losing all tasks and re-joining.

2) ``Producer.send()`` Callback has an exception:

2.a) The baseline is to first check whether the exception is an instance of 
RetriableException.

If not retriable, pass it to the producer exception handler to decide whether 
to throw or to continue with the record dropped. If it decides to throw, always 
wrap it as a StreamsException and remember it locally; at the same time, do not 
send more records from the caller. In the next send call, check the remembered 
exception and throw it. This causes the thread to shut down and the exception 
handler to be triggered.

If the exception is retriable, always throw it as a fatal StreamsException.

2.b) Optimization one: if the non-retriable exception can be translated to a 
TaskMigratedException, then do not wrap it as a StreamsException, so that the 
library can handle it internally.

2.c) Optimization two: if the retriable exception is a timeout exception, then do 
not pass it to the producer exception handler and treat it as TaskMigrated.

3) ``Producer.XXXTxn`` APIs except ``AbortTxn`` throw exceptions directly:

3.a) The baseline logic is to capture all KafkaExceptions except TimeoutException 
and handle them as *TaskCorrupted* (which includes aborting the transaction, 
resetting the state, and re-joining the group). TimeoutException would be 
rethrown.

3.b) Optimization: some exceptions can be treated as TaskMigrated, which is 
handled in a lighter way.

4) ``Producer.abortTxn`` throws an exception directly:

4.a) The baseline logic is to capture all KafkaExceptions except TimeoutException 
as a fatal StreamsException. TimeoutException would be rethrown.

4.b) Optimization: some exceptions can be ignored (e.g. invalidTxnTransition 
means the abort did not succeed).
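
To make case 2.a more concrete, here is an illustrative sketch of the remember-and-rethrow pattern, showing only the relevant methods. The names are not the actual Streams internals, a Producer<byte[], byte[]> field named `producer` is assumed, and the production exception handler step is omitted for brevity:
{code:java}
private volatile StreamsException sendException;  // failure remembered from the callback

void send(final ProducerRecord<byte[], byte[]> record) {
    checkForException();  // 2.a: rethrow a previously remembered failure before sending more
    producer.send(record, (metadata, exception) -> {
        if (exception == null) {
            return;
        }
        if (exception instanceof TimeoutException) {
            // 2.c optimization: a retriable timeout is treated as a task migration
            sendException = new TaskMigratedException("producer timed out", exception);
        } else {
            // 2.a baseline: wrap as a fatal StreamsException and keep it locally
            sendException = new StreamsException("error sending record", exception);
        }
    });
}

void checkForException() {
    if (sendException != null) {
        throw sendException;
    }
}
{code}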



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-660: Pluggable ReplicaAssignor

2020-12-08 Thread Colin McCabe
On Thu, Dec 3, 2020, at 07:41, Edoardo Comar wrote:
> Hi Colin
> thanks for your comments.
> I think your objections to creating an interface for replica placement
> could be used against similar server-side plug-ins (Authorizer,
> QuotaCallback).

Hi Edoardo,

The Authorizer API is not really comparable, I think.  Most organizations have 
some kind of identity management solution in place already (ActiveDirectory, 
LDAP, Kerberos, whatever) and it makes sense for Kafka to integrate with that 
existing system.  In contrast, nobody has an existing replica placement system 
sitting around and wants to integrate Kafka with that.

best,
Colin
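
(As a side note, the client-side placement mentioned further down in the quoted
message is already expressible with the Admin API; a minimal sketch, where the
topic name, partition count, and broker ids are made up:)

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class ManualPlacement {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // partition -> replica broker ids, leader first
            Map<Integer, List<Integer>> assignment = new HashMap<>();
            assignment.put(0, Arrays.asList(1, 2));
            assignment.put(1, Arrays.asList(2, 3));
            admin.createTopics(Collections.singleton(
                new NewTopic("manually-placed-topic", assignment))).all().get();
        }
    }
}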


> They too are on sensitive code paths, can cause problems if badly
> written or poorly tested, and may hamper the evolution of Kafka if
> backward compatibility is promised.
> 
> On backward compatibility, I'm all for marking the interfaces as
> implement-at-your-risk :-) so that a minority of implementers won't keep
> the ecosystem on a handbrake.
> 
> On 'users', I think that the implementers of these interfaces are teams
> running Kafka clusters, not 'users' publishing and consuming messages or
> streams.
> Therefore I'd expect them to use these plug-in point implementations
> with due care, or else they risk extinction by survival of the fittest ...
> :-)
> 
> cheers
> Edoardo
> 
> On Wed, 2 Dec 2020 at 18:46, Colin McCabe  wrote:
> 
> > Hi Mickael,
> >
> > To be honest, I think it would be better not to make replica placement
> > pluggable.
> >
> > When I worked on the Hadoop Distributed Filesystem, we supported pluggable
> > replica placement policies, and it never worked out well.  Users would
> > write plugin code that ran in a very sensitive context, and it would crash
> > or perform poorly, causing problems which they often asked us to
> > troubleshoot.  The plugin interface became very complex because people kept
> > adding stuff, and nothing could be removed because it would break API
> > compatibility.
> >
> > Our advice to users was always not to do this, but to do placement on the
> > client side if they really wanted to control placement.  In general I think
> > we made a mistake by creating that API at all, and we should have just
> > provided more configuration knobs for placement instead.  (We did add a few
> > such knobs over time, and they were much better at solving people's
> > problems than the placement API...)
> >
> > The reality is that in order to do a good job with replica placement, your
> > code needs to be very well integrated with the rest of the system.  As you
> > said, you need to know about what other partitions and brokers exist.  This
> > information can be quite large, if you have a large cluster, and it is
> > constantly changing.  More than that, if you want to do better than the
> > current built-in policy, you need additional information like metrics about
> > how much load each broker is under, how much disk space it has, reads vs.
> > writes, etc. etc.  If you want to treat some nodes specially, you need a
> > place to store that metadata.  I think people who want to customize this
> > would be better off forking.
> >
> > If we are absolutely convinced we want to do this, we should at least mark
> > any internal classes we're exposing here as Evolving, rather than Stable,
> > so that we can change them at will.  Basically not give a compatibility
> > guarantee for this.
> >
> > best,
> > Colin
> >
> >
> > On Thu, Nov 26, 2020, at 08:47, Mickael Maison wrote:
> > > Thanks Gwen for following up.
> > >
> > > With this extra bit of context, David's comments make more sense.
> > >
> > > If we move the replica placement plugin to the controller, I think
> > > most of the API can stay the same. However, as David mentioned,
> > > Cluster may be problematic.
> > > In a replica placement plugin, you'd typically want to retrieve the
> > > list of brokers and the list of partitions (including leader and
> > > replicas) so it should not be too hard to come up with a new interface
> > > instead of using Cluster.
> > >
> > > But until KIP-631 is done, new types used for metadata in the
> > > controller are not known. So I wonder if we need to wait for that KIP
> > > before we can make further progress here or if we should be fine
> > > having a pretty generic Metadata interface.
> > >
> > > Maybe Colin/Ismael can comment and advise us here?
> > >
> > > On Tue, Nov 24, 2020 at 8:47 PM Gwen Shapira  wrote:
> > > >
> > > > Michael,
> > > >
> > > > I talked a bit to Colin and Ismael offline and got some clarity about
> > > > KIP-500. Basically, replica placement requires an entire metadata map
> > > > - all the placements of all replicas. Since one of the goals of
> > > > KIP-500 is to go to 1M or even 10M partitions on the cluster, this
> > > > will be a rather large map. Since brokers normally don't need to know
> > > > all the placements (they just need to know about the subset of
> > > > replicas that they lead or follow), the idea 

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-08 Thread Colin McCabe
On Thu, Dec 3, 2020, at 16:37, Jun Rao wrote:
> Hi, Colin,
> 
> Thanks for the updated KIP. A few more comments below.
> 

Hi Jun,

Thanks again for the reviews.

> 80.2 For deprecated configs, we need to include zookeeper.* and
> broker.id.generation.enable.
> 

Added.

> 83.1 If a broker is down, does the controller keep the previously
> registered broker epoch forever? If not, how long does the controller keep
> it? What does the controller do when receiving a broker heartbeat request
> with an unfound broker epoch?
> 

Yes, the controller keeps the previous registration forever.

Broker heartbeat requests with an incorrect broker epoch will be rejected with 
STALE_BROKER_EPOCH.

> 100. Have you figured out if we need to add a new record type for reporting
> partitions on failed disks?
> 

I added FailedReplicaRecord to reflect the case where a JBOD directory has 
failed, leading to failed replicas.

> 102. For debugging purposes, sometimes it's useful to read the metadata
> topic using tools like console-consumer. Should we support that and if so,
> how?
> 

For now, we have the ability to read the metadata logs with the dump-logs tool. 
 I think we will come up with some other tools in the future as we get 
experience.

> 200. "brokers which are fenced will not appear in MetadataResponses. The
> broker will not respond to these requests-- instead, it will simply
> disconnect." If the controller is partitioned off from the brokers, this
> design will cause every broker to stop accepting new client requests. In
> contrast, if ZK is partitioned off, the existing behavior is that the
> brokers can continue to work based on the last known metadata. So, I am not
> sure if we should change the existing behavior because of the bigger impact
> in the new one. Another option is to keep the existing behavior and expose
> a metric for fenced brokers so that the operator could be alerted.
> 

I'm skeptical about how well running without ZK currently works.  However, I 
will move the broker-side fencing into a follow-up KIP.  This KIP is already 
pretty large and there is no hard dependency on this.  There may also be other 
ways of accomplishing the positive effects of what broker-side fencing, so more 
discussion is needed.

> 201. I read Ron's comment, but I am still not sure the benefit of keeping
> broker.id and controller.id in meta.properties. It seems that we are just
> duplicating the same info in two places and have the additional burden of
> making sure the values in the two places are consistent.
> 

I think the reasoning is that having broker.id protects us against accidentally 
bringing up a broker with a disk from a different broker.  I don't feel 
strongly about this but it seemed simpler to keep it.

> 202. controller.connect.security.protocol: Is this needed since
> controller.listener.names and listener.security.protocol.map imply the
> security protocol already?
> 

You're right, this isn't needed.  I'll remove it.

> 203. registration.heartbeat.interval.ms: It defaults to 2k. ZK uses 1/3 of
> the session timeout for heartbeat. So, given the default 18k for
> registration.lease.timeout.ms, should we default
> registration.heartbeat.interval.ms to 6k?
> 

6 seconds seems like a pretty long time between heartbeats.  It might be useful 
to know sooner than that when a broker is missing heartbeats.  I 
provisionally set it to 3 seconds (we can always change it later...)

I also changed the name of these configurations to 
"broker.heartbeat.interval.ms" and "broker.registration.timeout.ms" to try to 
clarify them a bit.

> 204. "The highest metadata offset which the broker has not reached." It
> seems this should be "has reached".
> 

I changed this to "one more than the highest metadata offset which the broker 
has reached."

> 205. UnfenceBrokerRecord and UnregisterBrokerRecord: To me, they seem to be
> the same. Do we need both?
> 

Unregistration means that the broker has been removed from the cluster.  That 
is different than unfencing, which marks the broker as active.

> 206. TopicRecord: The Deleting field is used to indicate that the topic is
> being deleted. I am wondering if this is really needed since RemoveTopic
> already indicates the same thing.
> 

RemoveTopic is the last step; it scrubs all metadata about the topic.  In 
order to get to that last step, the topic data needs to be removed from all 
brokers (after each broker notices that the topic is being deleted).

best,
Colin

> Jun
> 
> On Wed, Dec 2, 2020 at 2:50 PM Colin McCabe  wrote:
> 
> > On Wed, Dec 2, 2020, at 14:07, Ron Dagostino wrote:
> > > Hi Colin.  Thanks for the updates.  It's now clear to me that brokers
> > > keep their broker epoch for the life of their JVM -- they register
> > > once, get their broker epoch in the response, and then never
> > > re-register again.  Brokers may get fenced, but they keep the same
> > > broker epoch for the life of their JVM.  The incarnation ID is also
> > > kept for the life 

[jira] [Created] (KAFKA-10828) Consider using "acknowledged" over "endorsing" for voters which have recognized the current leader

2020-12-08 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-10828:
---

 Summary: Consider using "acknowledged" over "endorsing" for voters 
which have recognized the current leader
 Key: KAFKA-10828
 URL: https://issues.apache.org/jira/browse/KAFKA-10828
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jason Gustafson


We use the term "endorsing voter" to indicate a voter who has acknowledged the 
current leader by either responding to a `BeginQuorumEpoch` request from the 
leader or by beginning to send `Fetch` requests. This terminology seemed to 
cause a little confusion in https://github.com/apache/kafka/pull/9539/files. An 
alternative to "endorsing" that we were considering is "acknowledging." 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-696: Update Streams FSM to clarify ERROR state meaning

2020-12-08 Thread Guozhang Wang
Hello Walker,

Thanks for the KIP! Overall it looks reasonable to me. Just a few minor
comments for the wiki page itself:

1) Could you clarify the conditions when RUNNING / REBALANCING ->
PENDING_ERROR will happen; and when PENDING_ERROR -> ERROR will happen.
E.g. when I read "Streams will only reach ERROR state in the event of an
exceptional failure in which the `StreamsUncaughtExceptionHandler` chose to
either shutdown the application or the client." I thought the first
transition would happen before the handler, and the second transition would
happen immediately after the handler returns "shutdown client" or "shutdown
application", until I read the last statement regarding "SHUTDOWN_CLIENT".

2) A compatibility issue: today it is possible that users would call
Streams APIs like shutdown in the global state transition listener. And
it's common to try shutting down the application automatically when
transiting to ERROR (assuming it was not a terminating state). I think we
could consider making this call a no-op and log a warning.

3) Could you link the following JIRAs in the "JIRA" field?

https://issues.apache.org/jira/browse/KAFKA-10555
https://issues.apache.org/jira/browse/KAFKA-9638
https://issues.apache.org/jira/browse/KAFKA-6520

And maybe we can also leave a comment on those tickets explaining what would
happen to tackle the issues after this KIP.


Guozhang


On Tue, Dec 8, 2020 at 12:16 PM Walker Carlson 
wrote:

> Hello all,
>
> I'd like to propose KIP-696 to clarify the meaning of ERROR state in the
> KafkaStreams Client State Machine. This will update the States to be
> consistent with changes in KIP-671 and KIP-663.
>
> Here are the details: https://cwiki.apache.org/confluence/x/lCvZCQ
>
> Thanks,
> Walker
>


-- 
-- Guozhang


[jira] [Created] (KAFKA-10827) Consumer group coordinator node never gets updated for manual partition assignment

2020-12-08 Thread Jaebin Yoon (Jira)
Jaebin Yoon created KAFKA-10827:
---

 Summary: Consumer group coordinator node never gets updated for 
manual partition assignment
 Key: KAFKA-10827
 URL: https://issues.apache.org/jira/browse/KAFKA-10827
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 2.2.1
Reporter: Jaebin Yoon


We've run into a situation where the coordinator node in the consumer never 
gets updated with the new coordinator when the coordinator broker gets replaced 
with a new instance. Once the consumer gets into this mode, the consumer keeps 
trying to connect to the old coordinator and never recovers unless restarted.

This happens when the consumer uses manual partition assignment and commits 
offsets very infrequently (every 5 minutes) and the coordinator broker is not 
reachable (ip address, hostname are gone in a cloud environment).
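
A hedged sketch of the usage pattern described above (manual assignment plus infrequent commits); the topic, partition, and five-minute interval are illustrative, and consumer/running/process are assumed to exist:
{code:java}
consumer.assign(Collections.singletonList(new TopicPartition("events", 0)));
long lastCommitMs = System.currentTimeMillis();
while (running) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
    process(records);
    if (System.currentTimeMillis() - lastCommitMs >= 5 * 60 * 1000) {
        consumer.commitSync();  // the only call that touches the group coordinator here
        lastCommitMs = System.currentTimeMillis();
    }
}
{code}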

The exception the consumer keeps getting is 
{code:java}
Offset commit failed with a retriable exception. You should retry committing 
the latest consumed offsets. Caused by: 
org.apache.kafka.common.errors.TimeoutException: Failed to send request after 
12 ms.
{code}
We could see a bunch of connections in the *SYN_SENT* TCP state from the 
consumer app to the old hostname in this error condition.

In the current manual partition assignment scenario, the only way for the 
coordinator to get updated is through checkAndGetCoordinator in 
AbstractCoordinator, but this gets called only when committing offsets, which is 
every 5 minutes in our case.

The current logic of checkAndGetCoordinator uses 
ConsumerNetworkClient.isUnavailable, but that returns false unless the network 
client is within its reconnect backoff, which is currently configured with the 
default values (reconnect.backoff.ms (50), reconnect.backoff.max.ms (1000)) while 
request.timeout.ms is 12. In this scenario, 
ConsumerNetworkClient.isUnavailable for the old coordinator node always returns 
false, so checkAndGetCoordinator keeps the old coordinator node forever.

It seems checkAndGetCoordinator should not rely on 
ConsumerNetworkClient.isUnavailable but should just check client.connectionFailed.

We had to restart the consumer to recover from this condition.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Build failed in Jenkins: Kafka » kafka-trunk-jdk11 #293

2020-12-08 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10264; Fix Flaky Test 
TransactionsTest.testBumpTransactionalEpoch (#9291)

[github] MINOR: Configure reconnect backoff in 
`BrokerToControllerChannelManager` (#9709)


--
[...truncated 6.98 MB...]
org.apache.kafka.streams.TestTopicsTest > testValue PASSED

org.apache.kafka.streams.TestTopicsTest > testTimestampAutoAdvance STARTED

org.apache.kafka.streams.TestTopicsTest > testTimestampAutoAdvance PASSED

org.apache.kafka.streams.TestTopicsTest > testOutputWrongSerde STARTED

org.apache.kafka.streams.TestTopicsTest > testOutputWrongSerde PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputTopicWithNullTopicName STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputTopicWithNullTopicName PASSED

org.apache.kafka.streams.TestTopicsTest > testWrongSerde STARTED

org.apache.kafka.streams.TestTopicsTest > testWrongSerde PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMapWithNull STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMapWithNull PASSED

org.apache.kafka.streams.TestTopicsTest > testNonExistingOutputTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testNonExistingOutputTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testMultipleTopics STARTED

org.apache.kafka.streams.TestTopicsTest > testMultipleTopics PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValueList STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValueList PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputWithNullDriver STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputWithNullDriver PASSED

org.apache.kafka.streams.TestTopicsTest > testValueList STARTED

org.apache.kafka.streams.TestTopicsTest > testValueList PASSED

org.apache.kafka.streams.TestTopicsTest > testRecordList STARTED

org.apache.kafka.streams.TestTopicsTest > testRecordList PASSED

org.apache.kafka.streams.TestTopicsTest > testNonExistingInputTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testNonExistingInputTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMap STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMap PASSED

org.apache.kafka.streams.TestTopicsTest > testRecordsToList STARTED

org.apache.kafka.streams.TestTopicsTest > testRecordsToList PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValueListDuration STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValueListDuration PASSED

org.apache.kafka.streams.TestTopicsTest > testInputToString STARTED

org.apache.kafka.streams.TestTopicsTest > testInputToString PASSED

org.apache.kafka.streams.TestTopicsTest > testTimestamp STARTED

org.apache.kafka.streams.TestTopicsTest > testTimestamp PASSED

org.apache.kafka.streams.TestTopicsTest > testWithHeaders STARTED

org.apache.kafka.streams.TestTopicsTest > testWithHeaders PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValue STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValue PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateTopicWithNullTopicName STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateTopicWithNullTopicName PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldReturnIsPersistent STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldReturnIsPersistent PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldForwardDeprecatedInit STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldForwardDeprecatedInit PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardClose 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardClose 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardFlush 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardFlush 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardInit 
STARTED


[DISCUSS] KIP-696: Update Streams FSM to clarify ERROR state meaning

2020-12-08 Thread Walker Carlson
Hello all,

I'd like to propose KIP-696 to clarify the meaning of ERROR state in the
KafkaStreams Client State Machine. This will update the States to be
consistent with changes in KIP-671 and KIP-663.

Here are the details: https://cwiki.apache.org/confluence/x/lCvZCQ

Thanks,
Walker


[jira] [Created] (KAFKA-10826) Ensure raft io thread wakes up after linger expiration

2020-12-08 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-10826:
---

 Summary: Ensure raft io thread wakes up after linger expiration
 Key: KAFKA-10826
 URL: https://issues.apache.org/jira/browse/KAFKA-10826
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jason Gustafson
Assignee: Jason Gustafson


When scheduling an append, we currently only wake up the IO thread after the 
batch is ready to drain. If the IO thread is blocking in `poll()`, there is no 
guarantee that it will get woken up by a subsequent append. We need to ensure 
that the thread gets woken up at least once when the linger timer starts 
ticking so that the IO thread will be ready when the batch is ready to drain.
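
An illustrative sketch of the intended fix, not the actual KafkaRaftClient code; the field, type, and method names below are made up:
{code:java}
private final AtomicLong lingerDeadlineMs = new AtomicLong(Long.MAX_VALUE);

public void scheduleAppend(List<Record> records, long nowMs, long lingerMs) {
    unflushed.addAll(records);
    // Only the append that starts the linger timer needs to wake the IO thread;
    // a thread blocked in poll() can then recompute its timeout from the new deadline.
    if (lingerDeadlineMs.compareAndSet(Long.MAX_VALUE, nowMs + lingerMs)) {
        messageQueue.wakeup();  // hypothetical handle that unblocks poll()
    }
}
{code}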



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10825) Consolidate code between ZK and AlterISR for ISR updates

2020-12-08 Thread David Arthur (Jira)
David Arthur created KAFKA-10825:


 Summary: Consolidate code between ZK and AlterISR for ISR updates
 Key: KAFKA-10825
 URL: https://issues.apache.org/jira/browse/KAFKA-10825
 Project: Kafka
  Issue Type: Task
Reporter: David Arthur
Assignee: David Arthur


It would be nice to consolidate the two code paths that are used for ISR 
updates in Partition.scala. If we can totally (or mostly) abstract away the ZK 
code, it will simplify some of the KIP-500 work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-10756) Add missing unit test for `UnattachedState`

2020-12-08 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-10756.
-
Resolution: Fixed

> Add missing unit test for `UnattachedState`
> ---
>
> Key: KAFKA-10756
> URL: https://issues.apache.org/jira/browse/KAFKA-10756
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: dengziming
>Assignee: dengziming
>Priority: Minor
>
> Add unit test for UnattachedState, similar to KAFKA-10519



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-693: Client-side Circuit Breaker for Partition Write Errors

2020-12-08 Thread Justine Olshan
Hi George,
I've been looking at the discussion on improving the sticky partitioner,
and one of the potential issues we discussed is how we could get
information to the partitioner to tell it not to choose certain partitions.
Currently, the partitioner can only use availablePartitionsForTopic. I took
a quick look at your KIP and it seemed that your KIP would change what
partitions are returned with this method. This seems like a step in the
right direction for solving that issue too.
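
For context, a minimal sketch of the partitioner-side view today; the round-robin
choice and class name are illustrative, but availablePartitionsForTopic is the
only per-partition "health" signal a custom Partitioner currently gets:

import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

public class AvailableOnlyPartitioner implements Partitioner {
    private final AtomicInteger counter = new AtomicInteger();

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> available = cluster.availablePartitionsForTopic(topic);
        if (available.isEmpty()) {
            return Math.floorMod(counter.getAndIncrement(),
                                 cluster.partitionsForTopic(topic).size());
        }
        return available.get(Math.floorMod(counter.getAndIncrement(), available.size()))
                        .partition();
    }

    @Override
    public void close() {}

    @Override
    public void configure(Map<String, ?> configs) {}
}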

I agree with Jun that looking at both of these issues and the proposed
solutions would be very helpful.
Justine

On Tue, Dec 8, 2020 at 10:07 AM Jun Rao  wrote:

> Hi, George,
>
> Thanks for submitting the KIP. There was an earlier discussion on improving
> the sticky partitioner in the producer (
>
> https://lists.apache.org/thread.html/rae8d2d5587dae57ad9093a85181e0cb4256f10d1e57138ecdb3ef287%40%3Cdev.kafka.apache.org%3E
> ).
> It seems to be solving a very similar issue. It would be useful to analyze
> both approaches and see which one solves the problem better.
>
> Jun
>
> On Tue, Dec 8, 2020 at 8:05 AM georgeshu(舒国强) 
> wrote:
>
> > Hello,
> >
> > We write up a KIP based on a straightforward mechanism implemented and
> > tested in order to solve a practical issue in production.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-693%3A+Client-side+Circuit+Breaker+for+Partition+Write+Errors
> > Look forward to hearing feedback and suggestions.
> >
> > Thanks!
> >
> >
>


Re: [DISCUSS] KIP-693: Client-side Circuit Breaker for Partition Write Errors

2020-12-08 Thread Jun Rao
Hi, George,

Thanks for submitting the KIP. There was an earlier discussion on improving
the sticky partitioner in the producer (
https://lists.apache.org/thread.html/rae8d2d5587dae57ad9093a85181e0cb4256f10d1e57138ecdb3ef287%40%3Cdev.kafka.apache.org%3E).
It seems to be solving a very similar issue. It would be useful to analyze
both approaches and see which one solves the problem better.

Jun

On Tue, Dec 8, 2020 at 8:05 AM georgeshu(舒国强)  wrote:

> Hello,
>
> We write up a KIP based on a straightforward mechanism implemented and
> tested in order to solve a practical issue in production.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-693%3A+Client-side+Circuit+Breaker+for+Partition+Write+Errors
> Look forward to hearing feedback and suggestions.
>
> Thanks!
>
>


Re: [VOTE] KIP-679: Producer will enable the strongest delivery guarantee by default

2020-12-08 Thread Cheng Tan
Thank you all for discussing and voting on this KIP. It has now been approved 
with 4 binding votes.

Best, - Cheng

> On Dec 8, 2020, at 12:52 AM, David Jacot  wrote:
> 
> +1 (binding)
> 
> Thanks for the KIP, Cheng!
> 
> On Tue, Dec 8, 2020 at 12:23 AM Ismael Juma  wrote:
> 
>> Thanks, +1 (binding).
>> 
>> On Mon, Dec 7, 2020 at 1:40 PM Cheng Tan  wrote:
>> 
>>> Hi Ismael,
>>> 
>>> Yes, adding a deprecation warning for `IDEMPOTENT_WRITE` in 3.0 makes sense.
>>> I’ve updated the KIP’s “IDEMPOTENT_WRITE Deprecation” section to reflect
>>> your suggestion. Please let me know if you have more suggestions. Thanks.
>>> 
>>> 
>>> Best, - Cheng Tan
>>> 
>>> 
 On Dec 7, 2020, at 6:42 AM, Ismael Juma  wrote:
 
 Thanks for the KIP Cheng. One suggestion: should we add the
>> `kafka-acls`
 deprecation warnings for `IDEMPOTENT_WRITE` in 3.0? That would give
>> time
 for authorizer implementations to be updated.
 
 Ismael
 
 On Fri, Dec 4, 2020 at 11:00 AM Cheng Tan >> c...@confluent.io>> wrote:
 
> Hi all,
> 
> I’m proposing a new KIP for enabling the strongest delivery guarantee
>> by
> default. Today Kafka support EOS and N-1 concurrent failure tolerance
>>> but
> the default settings haven’t bring them out of the box. The proposal
> changes include the producer defaults change to `ack=all` and
> `enable.idempotence=true`. Also, the ACL operation type
>>> IDEMPOTENCE_WRITE
> will be deprecated. If a producer has WRITE permission to any topic,
>> it
> will be able to request a producer id and perform idempotent produce.
> 
> KIP here:
> 
> 
>>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-679%3A+Producer+will+enable+the+strongest+delivery+guarantee+by+default
> <
> 
>>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-679:+Producer+will+enable+the+strongest+delivery+guarantee+by+default
>>> <
>>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-679:+Producer+will+enable+the+strongest+delivery+guarantee+by+default
 
>> 
> 
> Please vote in this mail thread.
> 
> Thanks
> 
> - Cheng Tan
>>> 
>>> 
>> 



[jira] [Resolved] (KAFKA-10264) Flaky Test TransactionsTest.testBumpTransactionalEpoch

2020-12-08 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-10264.
-
Resolution: Fixed

> Flaky Test TransactionsTest.testBumpTransactionalEpoch
> --
>
> Key: KAFKA-10264
> URL: https://issues.apache.org/jira/browse/KAFKA-10264
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: A. Sophie Blee-Goldman
>Assignee: Chia-Ping Tsai
>Priority: Major
>  Labels: flaky-test, unit-test
>
> h3. Stacktrace
> java.lang.AssertionError: Unexpected exception cause 
> org.apache.kafka.common.KafkaException: The client hasn't received 
> acknowledgment for some previously sent messages and can no longer retry 
> them. It is safe to abort the transaction and continue. at 
> org.junit.Assert.fail(Assert.java:89) at 
> org.junit.Assert.assertTrue(Assert.java:42) at 
> org.apache.kafka.test.TestUtils.assertFutureThrows(TestUtils.java:557) at 
> kafka.api.TransactionsTest.testBumpTransactionalEpoch(TransactionsTest.scala:637)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[VOTE] KIP-695: Improve Streams Time Synchronization

2020-12-08 Thread John Roesler
Hello all,

There hasn't been much discussion on KIP-695 so far, so I'd
like to go ahead and call for a vote.

As a reminder, the purpose of KIP-695 to improve on the
"task idling" feature we introduced in KIP-353. This KIP
will allow Streams to offer deterministic time semantics in
join-type topologies. For example, it makes sure that
when you join two topics, that we collate the topics by
timestamp. That was always the intent with task idling (KIP-
353), but it turns out the previous mechanism couldn't
provide the desired semantics.
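
For illustration only, a hedged sketch of the kind of join topology this affects,
using the existing task-idling config from KIP-353; the topic names, serdes, and
the 10-second value are assumptions for the example:

import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;

public class CollatedJoinExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "collated-join-example");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // Wait up to 10s on an empty input partition before advancing stream time.
        props.put(StreamsConfig.MAX_TASK_IDLE_MS_CONFIG, 10_000L);

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> left = builder.stream("left-topic");
        KStream<String, String> right = builder.stream("right-topic");
        left.join(right, (l, r) -> l + "," + r, JoinWindows.of(Duration.ofMinutes(5)))
            .to("joined-topic");

        new KafkaStreams(builder.build(), props).start();
    }
}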

The details are here:
https://cwiki.apache.org/confluence/x/JSXZCQ

Thanks,
-John



Re: [DISCUSS] KIP-695: Improve Streams Time Synchronization

2020-12-08 Thread John Roesler
Thanks for taking a look, Bruno!

You have a sharp eye. All I meant by that is that
we don't want to draw conclusions from metadata that
we received a long time ago (for example, if fetches have
been failing), but we also don't want to enforce waiting on
a new fetch every single time we process a task with an
empty partition.

I didn't want to introduce a new configuration option to
govern the staleness of fetch metadata because it doesn't
fundamentally affect the guarantees of task idling and it
should also be possible to make a heuristic based on other
configs.

The main point of the KIP is to force us to wait for data
that we know to be on the brokers. When it comes to waiting
for new data to be produced to the brokers, we can only
provide an approximation. This interaction is free of
synchronization between the clients and brokers, so even
with an extremely strict bound on staleness, we could never
guarantee to poll records that were produced in close
proximity to the idling timeout.

Therefore, I wanted to leave a little liberty in the KIP to
let us adjust the heuristic dynamically, or in response to
benchmarks or field feedback, etc.

I'll update the KIP to specify that the exact determination
of staleness is left as an implementation detail.

Thanks for the feedback,
-John

On Tue, 2020-12-08 at 10:21 +0100, Bruno Cadonna wrote:
> Thank you for the KIP, John!
> 
> Overall, the KIP looks good to me.
> I was just wondering what you mean by "too stale". Could you define 
> "too stale"?
> 
> Best,
> Bruno
> 
> 
> On 04.12.20 23:39, John Roesler wrote:
> > Hello all,
> > 
> > I'd like to propose KIP-695 to improve on the "task idling" feature we
> > introduced in KIP-353. This KIP will allow Streams to offer deterministic
> > time semantics in join-type topologies. For example, it makes sure that
> > when you join two topics, that we collate the topics by timestamp. That
> > was always the intent with task idling (KIP-353), but it turns out the
> > previous mechanism couldn't provide the desired semantics.
> > 
> > The details are here:
> > https://cwiki.apache.org/confluence/x/JSXZCQ
> > 
> > I look forward to your feedback!
> > 
> > Thanks,
> > -John
> > 




[DISCUSS] KIP-693: Client-side Circuit Breaker for Partition Write Errors

2020-12-08 Thread 舒国强
Hello,

We have written up a KIP based on a straightforward mechanism, implemented and 
tested, in order to solve a practical issue in production.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-693%3A+Client-side+Circuit+Breaker+for+Partition+Write+Errors
Look forward to hearing feedback and suggestions.

Thanks!



[DISCUSS] KIP-694: Support Reducing Partitions for Topics

2020-12-08 Thread 舒国强
Hello,

We have written up a KIP based on a straightforward mechanism, implemented and 
tested, in order to solve a practical issue in production.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-694%3A+Support+Reducing+Partitions+for+Topics
Look forward to hearing feedback and suggestions.

Thanks!




[VOTE] KIP-690: Add additional configuration to control MirrorMaker 2 internal topics naming convention

2020-12-08 Thread Omnia Ibrahim
Hi everyone,
I’m proposing a new KIP for MirrorMaker 2 to add the ability to control the
naming convention of its internal topics. The proposal details are here
https://cwiki.apache.org/confluence/display/KAFKA/KIP-690%3A+Add+additional+configuration+to+control+MirrorMaker+2+internal+topics+naming+convention

Please vote in this thread.
Thanks
Omnia


[jira] [Created] (KAFKA-10824) Connection to node -1 could not be established.

2020-12-08 Thread Mohammad Abboud (Jira)
Mohammad Abboud created KAFKA-10824:
---

 Summary: Connection to node -1 could not be established. 
 Key: KAFKA-10824
 URL: https://issues.apache.org/jira/browse/KAFKA-10824
 Project: Kafka
  Issue Type: Bug
  Components: consumer, KafkaConnect, network, offset manager
Affects Versions: 2.1.1, 2.0.0
Reporter: Mohammad Abboud






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-695: Improve Streams Time Synchronization

2020-12-08 Thread Bruno Cadonna

Thank you for the KIP, John!

Overall, the KIP looks good to me.
I was just wondering what you mean by "too stale". Could you define 
"too stale"?


Best,
Bruno


On 04.12.20 23:39, John Roesler wrote:

Hello all,

I'd like to propose KIP-695 to improve on the "task idling" feature we
introduced in KIP-353. This KIP will allow Streams to offer deterministic
time semantics in join-type topologies. For example, it makes sure that
when you join two topics, that we collate the topics by timestamp. That
was always the intent with task idling (KIP-353), but it turns out the
previous mechanism couldn't provide the desired semantics.

The details are here:
https://cwiki.apache.org/confluence/x/JSXZCQ

I look forward to your feedback!

Thanks,
-John



Re: [VOTE] KIP-679: Producer will enable the strongest delivery guarantee by default

2020-12-08 Thread David Jacot
+1 (binding)

Thanks for the KIP, Cheng!

On Tue, Dec 8, 2020 at 12:23 AM Ismael Juma  wrote:

> Thanks, +1 (binding).
>
> On Mon, Dec 7, 2020 at 1:40 PM Cheng Tan  wrote:
>
> > Hi Ismael,
> >
> > Yes, adding a deprecation warning for `IDEMPOTENT_WRITE` in 3.0 makes sense.
> > I’ve updated the KIP’s “IDEMPOTENT_WRITE Deprecation” section to reflect
> > your suggestion. Please let me know if you have more suggestions. Thanks.
> >
> >
> > Best, - Cheng Tan
> >
> >
> > > On Dec 7, 2020, at 6:42 AM, Ismael Juma  wrote:
> > >
> > > Thanks for the KIP Cheng. One suggestion: should we add the
> `kafka-acls`
> > > deprecation warnings for `IDEMPOTENT_WRITE` in 3.0? That would give
> time
> > > for authorizer implementations to be updated.
> > >
> > > Ismael
> > >
> > > On Fri, Dec 4, 2020 at 11:00 AM Cheng Tan  > c...@confluent.io>> wrote:
> > >
> > >> Hi all,
> > >>
> > >> I’m proposing a new KIP for enabling the strongest delivery guarantee
> by
> > >> default. Today Kafka support EOS and N-1 concurrent failure tolerance
> > but
> > >> the default settings haven’t bring them out of the box. The proposal
> > >> changes include the producer defaults change to `ack=all` and
> > >> `enable.idempotence=true`. Also, the ACL operation type
> > IDEMPOTENCE_WRITE
> > >> will be deprecated. If a producer has WRITE permission to any topic,
> it
> > >> will be able to request a producer id and perform idempotent produce.
> > >>
> > >> KIP here:
> > >>
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-679%3A+Producer+will+enable+the+strongest+delivery+guarantee+by+default
> > >> <
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-679:+Producer+will+enable+the+strongest+delivery+guarantee+by+default
> > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-679:+Producer+will+enable+the+strongest+delivery+guarantee+by+default
> > >
> > >>>
> > >>
> > >> Please vote in this mail thread.
> > >>
> > >> Thanks
> > >>
> > >> - Cheng Tan
> >
> >
>