[jira] [Comment Edited] (KAFKA-4668) Mirrormaker should default to auto.offset.reset=earliest

2020-04-10 Thread Jeff Widman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081137#comment-17081137
 ] 

Jeff Widman edited comment on KAFKA-4668 at 4/11/20, 4:44 AM:
--

Going to try to get this across the line, three years later. :D

I filed 
[KIP-592|https://cwiki.apache.org/confluence/display/KAFKA/KIP-592%3A++Replicate+MirrorMaker+topics+from+earliest]
 for this.


was (Author: jeffwidman):
Going to try to get this across the line, three years later. :D

I filed 
[KIP-592|[https://cwiki.apache.org/confluence/display/KAFKA/KIP-592%3A++Replicate+MirrorMaker+topics+from+earliest]]
 for this.

> Mirrormaker should default to auto.offset.reset=earliest
> 
>
> Key: KAFKA-4668
> URL: https://issues.apache.org/jira/browse/KAFKA-4668
> Project: Kafka
>  Issue Type: Improvement
>  Components: mirrormaker
>Reporter: Jeff Widman
>Assignee: Jeff Widman
>Priority: Major
>
> Mirrormaker currently inherits the default value for auto.offset.reset, which 
> is latest (new consumer) / largest (old consumer). 
> While for most consumers this is a sensible default, mirrormakers are 
> specifically designed for replication, so they should default to replicating 
> topics from the beginning.
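
For context, the workaround today is to override the default by hand in the consumer config passed to MirrorMaker. A minimal sketch of that override (file name and broker address are illustrative, not from this ticket):

{code}
# consumer.properties, passed via: bin/kafka-mirror-maker.sh --consumer.config consumer.properties ...
bootstrap.servers=source-cluster:9092
group.id=mirror-maker-group
# Without this explicit override, MirrorMaker inherits the consumer default of "latest"
# and silently skips any data produced before it first connected.
auto.offset.reset=earliest
{code}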



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KAFKA-4668) Mirrormaker should default to auto.offset.reset=earliest

2020-04-10 Thread Jeff Widman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081137#comment-17081137
 ] 

Jeff Widman edited comment on KAFKA-4668 at 4/11/20, 4:44 AM:
--

Going to try to get this across the line, three years later. :D

I filed 
[KIP-592|[https://cwiki.apache.org/confluence/display/KAFKA/KIP-592%3A++Replicate+MirrorMaker+topics+from+earliest]]
 for this.


was (Author: jeffwidman):
Going to try to get this across the line, three years later. :D

I filed 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-592%3A++Replicate+MirrorMaker+topics+from+earliest
 for this.

> Mirrormaker should default to auto.offset.reset=earliest
> 
>
> Key: KAFKA-4668
> URL: https://issues.apache.org/jira/browse/KAFKA-4668
> Project: Kafka
>  Issue Type: Improvement
>  Components: mirrormaker
>Reporter: Jeff Widman
>Assignee: Jeff Widman
>Priority: Major
>
> Mirrormaker currently inherits the default value for auto.offset.reset, which 
> is latest (new consumer) / largest (old consumer). 
> While for most consumers this is a sensible default, mirrormakers are 
> specifically designed for replication, so they should default to replicating 
> topics from the beginning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-4668) Mirrormaker should default to auto.offset.reset=earliest

2020-04-10 Thread Jeff Widman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081137#comment-17081137
 ] 

Jeff Widman commented on KAFKA-4668:


Going to try to get this across the line, three years later. :D

I filed 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-592%3A++Replicate+MirrorMaker+topics+from+earliest
 for this.

> Mirrormaker should default to auto.offset.reset=earliest
> 
>
> Key: KAFKA-4668
> URL: https://issues.apache.org/jira/browse/KAFKA-4668
> Project: Kafka
>  Issue Type: Improvement
>  Components: mirrormaker
>Reporter: Jeff Widman
>Assignee: Jeff Widman
>Priority: Major
>
> Mirrormaker currently inherits the default value for auto.offset.reset, which 
> is latest (new consumer) / largest (old consumer). 
> While for most consumers this is a sensible default, mirrormakers are 
> specifically designed for replication, so they should default to replicating 
> topics from the beginning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9765) Could not add partitions to transaction due to errors

2020-04-10 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081060#comment-17081060
 ] 

Guozhang Wang commented on KAFKA-9765:
--

Hello [~waykarp], I looked at your uploaded stack trace, especially 

{code}
at 
kafka.coordinator.transaction.TransactionMetadata.throwStateTransitionFailure(TransactionMetadata.scala:391)
at 
kafka.coordinator.transaction.TransactionMetadata.completeTransitionTo(TransactionMetadata.scala:326)
{code}

Comparing them with the 2.3 source code, I think you hit one of the root causes 
of KAFKA-8803, as [~bchen225242] mentioned. Note that this is not easily 
reproducible, as it requires a clock time shift scenario.

I'd suggest applying the patch 
https://github.com/apache/kafka/pull/8278 and seeing whether you hit this issue 
again in the future.
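
For reference, the usage pattern the KafkaProducer javadoc describes (and which the report below follows) looks roughly like this. This is a minimal sketch with placeholder topic names and configs, not the reporter's actual code:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.AuthorizationException;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;
import org.apache.kafka.common.errors.ProducerFencedException;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "example-txn-id");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

KafkaProducer<String, String> producer = new KafkaProducer<>(props);
producer.initTransactions();
try {
    producer.beginTransaction();
    producer.send(new ProducerRecord<>("example-topic", "key", "value"));
    producer.commitTransaction();
} catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
    // Fatal errors: the producer cannot continue and must be closed.
    producer.close();
} catch (KafkaException e) {
    // Other errors (such as the AddPartitionsToTxn failure above): abort, after
    // which the caller may retry the whole transaction.
    producer.abortTransaction();
}
{code}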

> Could not add partitions to transaction due to errors
> -
>
> Key: KAFKA-9765
> URL: https://issues.apache.org/jira/browse/KAFKA-9765
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 2.3.1
>Reporter: Prashant Waykar
>Priority: Blocker
>
> I am following the producer-with-transactions example in 
> [https://kafka.apache.org/23/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  and on KafkaException, I use abortTransaction and retry. 
> I am seeing these exceptions. Has anyone experienced this before? Please 
> suggest.
> {code:java}
> // code placeholder
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.KafkaException: Could not add partitions to 
> transaction due to errors: {nfvAlarmJob-0=UNKNOWN_SERVER_ERROR}
> at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:98)
> at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:67)
> at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
> at 
> com.vmware.vchs.hybridity.messaging.kafka.KafkaProducerDelegate.publishMessageWithTransaction(KafkaProducerDelegate.java:197)
> at 
> com.vmware.vchs.hybridity.messaging.kafka.KafkaProducerDelegate.publish(KafkaProducerDelegate.java:164)
> at 
> com.vmware.vchs.hybridity.messaging.kafka.KafkaProducerDelegate.publish(KafkaProducerDelegate.java:158)
> at 
> com.vmware.vchs.hybridity.messaging.adapter.JobManagerJobPublisher.publish(JobManagerJobPublisher.java:140)
> at 
> com.vmware.vchs.hybridity.messaging.adapter.JobManager.queueJob(JobManager.java:1720)
> at 
> com.vmware.vchs.hybridity.messaging.adapter.JobManagementAdapter.queueJob(JobManagementAdapter.java:80)
> at 
> com.vmware.vchs.hybridity.messaging.adapter.JobManagementAdapter.queueJob(JobManagementAdapter.java:70)
> at 
> com.vmware.vchs.hybridity.messaging.adapter.JobManagedService.queueJob(JobManagedService.java:168)
> at 
> com.vmware.hybridity.nfvm.alarm.UpdateVcenterAlarmsJob.run(UpdateVcenterAlarmsJob.java:67)
> at 
> com.vmware.vchs.hybridity.messaging.LoggingJobWrapper.run(LoggingJobWrapper.java:41)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.KafkaException: Could not add partitions 
> to transaction due to errors: {nfvAlarmJob-0=UNKNOWN_SERVER_ERROR}
> at 
> org.apache.kafka.clients.producer.internals.TransactionManager$AddPartitionsToTxnHandler.handleResponse(TransactionManager.java:1230)
> at 
> org.apache.kafka.clients.producer.internals.TransactionManager$TxnRequestHandler.onComplete(TransactionManager.java:1069)
> at 
> org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109)
> at 
> org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:561)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:553)
> at 
> org.apache.kafka.clients.producer.internals.Sender.maybeSendAndPollTransactionalRequest(Sender.java:425)
> at 
> org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:311)
> at 
> 

[jira] [Commented] (KAFKA-9852) Lower block duration in BufferPoolTest to cut down on overall test runtime

2020-04-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/KAFKA-9852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081056#comment-17081056
 ] 

Sönke Liebau commented on KAFKA-9852:
-

[~junrao] as just discussed in the 
[PR|https://github.com/apache/kafka/pull/8399#discussion_r406964670] to 
KAFKA-3720

> Lower block duration in BufferPoolTest to cut down on overall test runtime
> --
>
> Key: KAFKA-9852
> URL: https://issues.apache.org/jira/browse/KAFKA-9852
> Project: Kafka
>  Issue Type: Improvement
>  Components: unit tests
>Reporter: Sönke Liebau
>Assignee: Sönke Liebau
>Priority: Trivial
>
> In BufferPoolTest we use a global setting for the maximum duration that calls 
> can block (max.block.ms) of [2000ms 
> |https://github.com/apache/kafka/blob/e032a360708cec2284f714e4cae388066064d61c/clients/src/test/java/org/apache/kafka/clients/producer/internals/BufferPoolTest.java#L54]
> Since this is wall clock time that might be waited on and could potentially 
> come into play multiple times while this class is executed this is a very 
> long timeout for testing.
> We should reduce this timeout to a much lower value to cut back on test 
> runtimes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9852) Lower block duration in BufferPoolTest to cut down on overall test runtime

2020-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081052#comment-17081052
 ] 

ASF GitHub Bot commented on KAFKA-9852:
---

soenkeliebau commented on pull request #8464: KAFKA-9852: Change the max 
duration that calls to the buffer pool can block from 2000ms to 10ms
URL: https://github.com/apache/kafka/pull/8464
 
 
   This is to reduce overall test runtime, as this is wallclock time.
   
   Adjusted one assert condition on a test case, as its success depended on 
thread runtimes and the much tighter tolerances from the reduced timeout broke 
this test.
   
   Ran a couple thousand iterations of the test class on my machine without 
failed test cases.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Lower block duration in BufferPoolTest to cut down on overall test runtime
> --
>
> Key: KAFKA-9852
> URL: https://issues.apache.org/jira/browse/KAFKA-9852
> Project: Kafka
>  Issue Type: Improvement
>  Components: unit tests
>Reporter: Sönke Liebau
>Assignee: Sönke Liebau
>Priority: Trivial
>
> In BufferPoolTest we use a global setting for the maximum duration that calls 
> can block (max.block.ms) of [2000ms 
> |https://github.com/apache/kafka/blob/e032a360708cec2284f714e4cae388066064d61c/clients/src/test/java/org/apache/kafka/clients/producer/internals/BufferPoolTest.java#L54]
> Since this is wall clock time that might be waited on and could potentially 
> come into play multiple times while this class is executed this is a very 
> long timeout for testing.
> We should reduce this timeout to a much lower value to cut back on test 
> runtimes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-3720) Remove BufferExhaustedException from doSend() in KafkaProducer

2020-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081045#comment-17081045
 ] 

ASF GitHub Bot commented on KAFKA-3720:
---

junrao commented on pull request #8399: KAFKA-3720: Change TimeoutException to 
BufferExhaustedException when no memory can be allocated for a record within 
max.block.ms
URL: https://github.com/apache/kafka/pull/8399
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Remove BufferExhaustedException from doSend() in KafkaProducer
> --
>
> Key: KAFKA-3720
> URL: https://issues.apache.org/jira/browse/KAFKA-3720
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 2.4.1
>Reporter: Mayuresh Gharat
>Assignee: Sönke Liebau
>Priority: Minor
>
> KafkaProducer no longer throws BufferExhaustException. We should remove it 
> from the catch clause. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8124) Beginning offset is after the ending offset for topic partition

2020-04-10 Thread Rishabh (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081023#comment-17081023
 ] 

Rishabh commented on KAFKA-8124:


+1

> Beginning offset is after the ending offset for topic partition
> ---
>
> Key: KAFKA-8124
> URL: https://issues.apache.org/jira/browse/KAFKA-8124
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 1.1.0
> Environment: OS : Rhel 7
> server : VM
>Reporter: suseendramani
>Priority: Major
>
>  
> We are getting this issue in production and the Spark consumer is dying 
> because of an offset issue.
> We observed the following error in Kafka Broker ( that has problems)
> --
> [2019-03-18 14:40:14,100] WARN Unable to reconnect to ZooKeeper service, 
> session 0x1692e9ff4410004 has expired (org.apache.zookeeper.ClientCnxn)
>  [2019-03-18 14:40:14,100] INFO Unable to reconnect to ZooKeeper service, 
> session 0x1692e9ff4410004 has expired, closing socket connection 
> (org.apache.zook
>  eeper.ClientCnxn)
> ---
> Error from other broker when talking to the problematic broker.
>  [2019-03-18 14:40:14,107] INFO [ReplicaFetcher replicaId=3, leaderId=5, 
> fetcherId=0] Error sending fetch request (sessionId=2127346653, 
> epoch=27048427) to
>  node 5: java.nio.channels.ClosedSelectorException. 
> (org.apache.kafka.clients.FetchSessionHandler)
>  
> 
>  
> All topics were having replication factor of 3 and this issue happens when 
> one of the broker was having issues. We are using SCRAM authentication 
> (SHA-256) and SSL.
>  
> The Spark job died with the following error:
> ERROR 2019-03-18 07:40:57,178 7924 org.apache.spark.executor.Executor 
> [Executor task launch worker for task 16] Exception in task 27.0 in stage 0.0 
> (TID 16)
>  java.lang.AssertionError: assertion failed: Beginning offset 115204574 is 
> after the ending offset 115204516 for topic  partition 37. You 
> either provided an invalid fromOffset, or the Kafka topic has been damaged
>  at scala.Predef$.assert(Predef.scala:170)
>  at org.apache.spark.streaming.kafka010.KafkaRDD.compute(KafkaRDD.scala:175)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:109)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  
> ---
>  
> please let me know if you need more details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9846) Race condition can lead to severe lag underestimate for active tasks

2020-04-10 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080989#comment-17080989
 ] 

Vinoth Chandar commented on KAFKA-9846:
---

Understood, I was always wary of grabbing those maps at will to build this 
out.

Anyway, here's what I found.

The original test did not fail for this reason: the test did not wait for the 
instance to transition into RUNNING, it just waited until the lags went 
down to zero. That only guarantees onRestoreStart() has been 
called (otherwise restoration could not have started and the lag would not have reached zero). 
It can happen that the actual lag reaches 0 while the thread is still on its way to 
the RUNNING state, finishing up the onRestoreEnd() call. The test could then check the 
restoreEnd lags before that and hit the NPE.

 
{code:java}
java.lang.NullPointerException
at 
org.apache.kafka.streams.integration.LagFetchIntegrationTest.shouldFetchLagsDuringRestoration(LagFetchIntegrationTest.java:306)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 {code}
 

On the issue reported in this ticket itself, we need a scenario where 
streamThread.allStreamsTasks() returns a task in the `created` or `suspended` 
lists, for example. I wrote a thread to constantly poll 
streamThread.allStreamsTasks() while an active task was starting up, restoring, and 
transitioning into RUNNING. The first time I got any tasks back 
from allStreamsTasks() was only after PARTITIONS_ASSIGNED, and by then it was already a 
restoring task. 

Do you have suggestions on reproducing this in a test? [~ableegoldman], you 
mentioned the tasks can sometimes stay in CREATED for a long time? Anyway, I 
have posted a patch here: [https://github.com/apache/kafka/pull/8462/files] 

This issue does not happen on master, so if we are really targeting this for 
2.6, we can close it. 

 

> Race condition can lead to severe lag underestimate for active tasks
> 
>
> Key: KAFKA-9846
> URL: https://issues.apache.org/jira/browse/KAFKA-9846
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Sophie Blee-Goldman
>Assignee: Vinoth Chandar
>Priority: Critical
> Fix For: 2.6.0
>
>
> In KIP-535 we added the ability to query still-restoring and standby tasks. 
> To give users control over how out of date the data they fetch can be, we 
> added an API to KafkaStreams that fetches the end offsets for all changelog 
> partitions and computes the lag for each local state store.
> During this lag computation, we check whether an active task is in RESTORING 
> and calculate the actual lag if so. If not, we assume it's in RUNNING and 
> return a lag of zero. However, tasks may be in other states besides running 
> and restoring; notably they first pass through the CREATED state before 
> getting to RESTORING. A CREATED task may happen to be caught-up to the end 
> offset, but in many cases it is likely to be lagging or even completely 
> uninitialized.
> This introduces a race condition where users may be led to believe that a 
> task has zero lag and is "safe" to query even with the strictest correctness 
> guarantees, while the task is actually lagging by some unknown amount.  
> During transfer of ownership of the task between different threads on the 
> same machine, tasks can actually spend a while in CREATED while the new owner 
> waits to acquire the task directory lock. So, this race condition may not be 
> particularly rare in multi-threaded Streams applications
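
For readers following along, the lag API in question (from KIP-535) is KafkaStreams#allLocalStorePartitionLags(). A minimal usage sketch, assuming a running KafkaStreams instance named {{streams}} and an illustrative store name:

{code:java}
import java.util.Collections;
import java.util.Map;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.LagInfo;

// Lags for every local state store, keyed by store name and then partition.
Map<String, Map<Integer, LagInfo>> allLags = streams.allLocalStorePartitionLags();

// "my-store" is an illustrative store name; per the bug above, a task still in
// CREATED may show up here with a lag of zero even though it is not caught up.
Map<Integer, LagInfo> storeLags = allLags.getOrDefault("my-store", Collections.emptyMap());
storeLags.forEach((partition, lag) ->
    System.out.printf("partition %d: position=%d end=%d lag=%d%n",
        partition, lag.currentOffsetPosition(), lag.endOffsetPosition(), lag.offsetLag()));
{code}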



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9642) "BigDecimal(double)" should not be used

2020-04-10 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-9642:
--
Fix Version/s: 2.6.0

> "BigDecimal(double)" should not be used
> ---
>
> Key: KAFKA-9642
> URL: https://issues.apache.org/jira/browse/KAFKA-9642
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Soontaek Lim
>Assignee: Soontaek Lim
>Priority: Minor
> Fix For: 2.6.0
>
>
> I recommend not to use the BigDecimal(double) constructor. Because of 
> floating point imprecision, we're unlikely to get the value we expect from 
> that constructor.
> Instead, we should use BigDecimal.valueOf, which uses a string under the 
> covers to eliminate floating-point rounding errors, or the constructor that 
> takes a String argument.
>  
> From JavaDocs
> The results of this constructor can be somewhat unpredictable. One might 
> assume that writing new BigDecimal(0.1) in Java creates a BigDecimal which is 
> exactly equal to 0.1 (an unscaled value of 1, with a scale of 1), but it is 
> actually equal to 0.1000000000000000055511151231257827021181583404541015625. 
> This is because 0.1 cannot be represented exactly as a double (or, for that 
> matter, as a binary fraction of any finite length). Thus, the value that is 
> being passed in to the constructor is not exactly equal to 0.1, appearances 
> notwithstanding.
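
A small illustrative sketch of the difference described above (not taken from the Connect codebase):

{code:java}
import java.math.BigDecimal;

public class BigDecimalExample {
    public static void main(String[] args) {
        // Carries the binary rounding error of the double literal 0.1:
        System.out.println(new BigDecimal(0.1));
        // prints 0.1000000000000000055511151231257827021181583404541015625

        // Goes through the canonical string representation of the double:
        System.out.println(BigDecimal.valueOf(0.1));   // prints 0.1

        // Exact decimal value as written:
        System.out.println(new BigDecimal("0.1"));     // prints 0.1
    }
}
{code}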



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9846) Race condition can lead to severe lag underestimate for active tasks

2020-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080978#comment-17080978
 ] 

ASF GitHub Bot commented on KAFKA-9846:
---

vinothchandar commented on pull request #8462: KAFKA-9846: Filter active tasks 
for running state in KafkaStreams#allLocalStorePartitionLags()
URL: https://github.com/apache/kafka/pull/8462
 
 
   
 - Added check that only treats running active tasks as having 0 lag
 - Tasks that are neither restoring, nor running will report 0 as 
currentoffset position
 - Fixed LagFetchIntegrationTest to wait till thread/instance reaches 
RUNNING before checking lag
   
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Race condition can lead to severe lag underestimate for active tasks
> 
>
> Key: KAFKA-9846
> URL: https://issues.apache.org/jira/browse/KAFKA-9846
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Sophie Blee-Goldman
>Assignee: Vinoth Chandar
>Priority: Critical
> Fix For: 2.6.0
>
>
> In KIP-535 we added the ability to query still-restoring and standby tasks. 
> To give users control over how out of date the data they fetch can be, we 
> added an API to KafkaStreams that fetches the end offsets for all changelog 
> partitions and computes the lag for each local state store.
> During this lag computation, we check whether an active task is in RESTORING 
> and calculate the actual lag if so. If not, we assume it's in RUNNING and 
> return a lag of zero. However, tasks may be in other states besides running 
> and restoring; notably they first pass through the CREATED state before 
> getting to RESTORING. A CREATED task may happen to be caught-up to the end 
> offset, but in many cases it is likely to be lagging or even completely 
> uninitialized.
> This introduces a race condition where users may be led to believe that a 
> task has zero lag and is "safe" to query even with the strictest correctness 
> guarantees, while the task is actually lagging by some unknown amount.  
> During transfer of ownership of the task between different threads on the 
> same machine, tasks can actually spend a while in CREATED while the new owner 
> waits to acquire the task directory lock. So, this race condition may not be 
> particularly rare in multi-threaded Streams applications



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-4668) Mirrormaker should default to auto.offset.reset=earliest

2020-04-10 Thread Jeff Widman (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Widman reassigned KAFKA-4668:
--

Assignee: Jeff Widman

> Mirrormaker should default to auto.offset.reset=earliest
> 
>
> Key: KAFKA-4668
> URL: https://issues.apache.org/jira/browse/KAFKA-4668
> Project: Kafka
>  Issue Type: Improvement
>  Components: mirrormaker
>Reporter: Jeff Widman
>Assignee: Jeff Widman
>Priority: Major
>
> Mirrormaker currently inherits the default value for auto.offset.reset, which 
> is latest (new consumer) / largest (old consumer). 
> While for most consumers this is a sensible default, mirrormakers are 
> specifically designed for replication, so they should default to replicating 
> topics from the beginning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9603) Number of open files keeps increasing in Streams application

2020-04-10 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080887#comment-17080887
 ] 

Guozhang Wang commented on KAFKA-9603:
--

Great, thanks! I'll be waiting for you to share the mini project then :)

> Number of open files keeps increasing in Streams application
> 
>
> Key: KAFKA-9603
> URL: https://issues.apache.org/jira/browse/KAFKA-9603
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.4.0, 2.3.1
> Environment: Spring Boot 2.2.4, OpenJDK 13, Centos image
>Reporter: Bruno Iljazovic
>Priority: Major
>
> Problem appeared when upgrading from *2.0.1* to *2.3.1*. 
> Relevant Kafka Streams code:
> {code:java}
> KStream events1 =
> builder.stream(FIRST_TOPIC_NAME, Consumed.with(stringSerde, event1Serde, 
> event1TimestampExtractor(), null))
>.mapValues(...);
> KStream events2 =
> builder.stream(SECOND_TOPIC_NAME, Consumed.with(stringSerde, event2Serde, 
> event2TimestampExtractor(), null))
>.mapValues(...);
> var joinWindows = JoinWindows.of(Duration.of(1, MINUTES).toMillis())
>  .until(Duration.of(1, HOURS).toMillis());
> events2.join(events1, this::join, joinWindows, Joined.with(stringSerde, 
> event2Serde, event1Serde))
>.foreach(...);
> {code}
> Number of open *.sst files keeps increasing until eventually it hits the os 
> limit (65536) and causes this exception:
> {code:java}
> Caused by: org.rocksdb.RocksDBException: While open a file for appending: 
> /.../0_8/KSTREAM-JOINOTHER-10-store/KSTREAM-JOINOTHER-10-store.157943520/001354.sst:
>  Too many open files
>   at org.rocksdb.RocksDB.flush(Native Method)
>   at org.rocksdb.RocksDB.flush(RocksDB.java:2394)
> {code}
> Here are example files that are opened and never closed:
> {code:java}
> /.../0_27/KSTREAM-JOINTHIS-09-store/KSTREAM-JOINTHIS-09-store.158245920/000114.sst
> /.../0_27/KSTREAM-JOINOTHER-10-store/KSTREAM-JOINOTHER-10-store.158245920/65.sst
> /.../0_29/KSTREAM-JOINTHIS-09-store/KSTREAM-JOINTHIS-09-store.158215680/000115.sst
> /.../0_29/KSTREAM-JOINTHIS-09-store/KSTREAM-JOINTHIS-09-store.158245920/000112.sst
> /.../0_31/KSTREAM-JOINTHIS-09-store/KSTREAM-JOINTHIS-09-store.158185440/51.sst
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9592) Safely abort Producer transactions during application shutdown

2020-04-10 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080875#comment-17080875
 ] 

Guozhang Wang commented on KAFKA-9592:
--

I would think so: the current javadoc of abortTxn says

{code}
 * This call will throw an exception immediately if any prior {@link 
#send(ProducerRecord)} calls failed with a
 * {@link ProducerFencedException} or an instance of {@link 
org.apache.kafka.common.errors.AuthorizationException}.
{code}

which would no longer be the case in our new proposal. So effectively we are 
indeed changing the contract we are offering to the users.

> Safely abort Producer transactions during application shutdown
> --
>
> Key: KAFKA-9592
> URL: https://issues.apache.org/jira/browse/KAFKA-9592
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 2.5.0
>Reporter: Boyang Chen
>Assignee: Xiang Zhang
>Priority: Major
>  Labels: help-wanted, needs-kip, newbie
> Fix For: 2.6.0
>
>
> Today if a transactional producer hits a fatal exception, the caller usually 
> catches the exception and handle it by closing the producer, and abort the 
> transaction:
>  
> {code:java}
> try {
>   producer.beginTxn();
>   producer.send(xxx);
>   producer.sendOffsets(xxx);
>   producer.commit();
> } catch (ProducerFenced | UnknownPid e) {
>   ...
>   producer.abortTxn();
>   producer.close();
> }{code}
> This is what the current API suggests user to do. Another scenario is during 
> an informed shutdown, people with EOS producer would also like to end an 
> ongoing transaction before closing the producer as it sounds more clean.
> The tricky scenario is that `abortTxn` is not a safe call when the producer 
> is already in an error state, which means user has to do another try-catch 
> with the first layer catch block, making the error handling pretty annoying. 
> There are several ways to make this API robust and guide user to a safe usage:
>  # Internally abort any ongoing transaction within `producer.close`, and 
> comment on `abortTxn` call to warn user not to do it manually. 
>  # Similar to 1, but get a new `close(boolean abortTxn)` API call in case 
> some users want to handle transaction state by themselves.
>  # Introduce a new abort transaction API with a boolean flag indicating 
> whether the producer is in error state, instead of throwing exceptions
>  # Introduce a public API `isInError` on producer for user to validate before 
> doing any transactional API calls
> I personally favor 1 & 2 most as it is simple and does not require any API 
> change. Considering the change scope, I would still recommend a small KIP.
>  
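
To illustrate the extra handling the description calls annoying: abortTransaction() itself can throw once the producer is already in a fatal error state, so a defensive shutdown today needs its own guard around it. A rough sketch (producer and record are placeholders):

{code:java}
try {
    producer.beginTransaction();
    producer.send(record);
    producer.commitTransaction();
} catch (KafkaException e) {
    try {
        producer.abortTransaction();
    } catch (KafkaException abortFailure) {
        // The producer was already fenced or otherwise in a fatal error state;
        // there is no transaction left that can be aborted.
    } finally {
        producer.close();
    }
}
{code}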



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9832) Extend EOS system tests for EOS-beta

2020-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080856#comment-17080856
 ] 

ASF GitHub Bot commented on KAFKA-9832:
---

mjsax commented on pull request #8443: KAFKA-9832: Extend Streams system tests 
for EOS-beta
URL: https://github.com/apache/kafka/pull/8443
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Extend EOS system tests for EOS-beta
> 
>
> Key: KAFKA-9832
> URL: https://issues.apache.org/jira/browse/KAFKA-9832
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-6145) Warm up new KS instances before migrating tasks - potentially a two phase rebalance

2020-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080851#comment-17080851
 ] 

ASF GitHub Bot commented on KAFKA-6145:
---

vvcephei commented on pull request #8436: KAFKA-6145: KIP-441 avoid unnecessary 
movement of standbys
URL: https://github.com/apache/kafka/pull/8436
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Warm up new KS instances before migrating tasks - potentially a two phase 
> rebalance
> ---
>
> Key: KAFKA-6145
> URL: https://issues.apache.org/jira/browse/KAFKA-6145
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Antony Stubbs
>Assignee: Sophie Blee-Goldman
>Priority: Major
>  Labels: needs-kip
>
> Currently when expanding the KS cluster, the new node's partitions will be 
> unavailable during the rebalance, which for large states can take a very long 
> time, or for small state stores even more than a few ms can be a deal breaker 
> for micro service use cases.
> One workaround would be two execute the rebalance in two phases:
> 1) start running state store building on the new node
> 2) once the state store is fully populated on the new node, only then 
> rebalance the tasks - there will still be a rebalance pause, but would be 
> greatly reduced
> Relates to: KAFKA-6144 - Allow state stores to serve stale reads during 
> rebalance



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Description: 
I have a particular blog to explain the whole context at here 
[https://medium.com/@jiangtaoliu/a-kafka-pitfall-when-to-set-log-message-timestamp-type-to-createtime-c17846813ca3]


What's the issue?
{quote}There were log segments, which can not be deleted over configured 
retention hours.
{quote}
What are impacts? 
{quote} # Log space keep in increasing and finally cause space shortage.
 # There are lots of log segment rolled with a small size. e.g log segment may 
be only 50mb, not the expected 1gb.
 # Kafka stream or client may experience missing data.{quote}
How to reproduce it?
 # 
{quote}Make sure Kafka client and server are not on the same machine.{quote}
 # 
{quote}Configure log.message.timestamp.type as *CreateTime*, not 
LogAppendTime.{quote}
 # 
{quote}Hack Kafka client's system clock time as a *future time*, e.g 
03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]{quote}
 # 
{quote}Send message from Kafka client to server.{quote}

What's Next?
 # 
{quote}Check the timestamp in log time index and record in log segment(e.g 
035957300794.timeindex). You will see all the timestamp values in 
*.timeindex are messed up with a future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You will also see the new rolled log segment's size smaller than the 
configured log segment size after waiting for a while.
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and the rest of new 
rolled log segments will not be deleted over retention hours.
{quote}

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.{color:#172b4d} {color}
{quote}

  was:
What's the issue?
{quote}There were log segments, which can not be deleted over configured 
retention hours.
{quote}
What are impacts? 
{quote} # Log space keep in increasing and finally cause space shortage.
 # There are lots of log segment rolled with a small size. e.g log segment may 
be only 50mb, not the expected 1gb.
 # Kafka stream or client may experience missing data.{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time as a future time, e.g 
03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]

Send message to Kafka.
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}
 


> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> I have a particular blog to explain the whole context at here 
> [https://medium.com/@jiangtaoliu/a-kafka-pitfall-when-to-set-log-message-timestamp-type-to-createtime-c17846813ca3]
> What's the issue?
> {quote}There were log segments, which can not be deleted over configured 
> retention hours.
> {quote}
> What are impacts? 
> {quote} # Log space keep in increasing and finally cause space shortage.
>  # There are lots of log segment rolled with a small size. e.g log segment 
> may be only 50mb, not the expected 1gb.
>  # Kafka stream or client may experience missing data.{quote}
> How to reproduce it?
>  # 
> {quote}Make sure Kafka client and server are not on the same machine.{quote}
>  # 
> 

[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Description: 
What's the issue?
{quote}There were log segments, which can not be deleted over configured 
retention hours.
{quote}
What are impacts? 
{quote} # Log space keep in increasing and finally cause space shortage.
 # There are lots of log segment rolled with a small size. e.g log segment may 
be only 50mb, not the expected 1gb.
 # Kafka stream or client may experience missing data.{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time as a future time, e.g 
03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]

Send message to Kafka.
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}
 

  was:
What's the issue?
{quote}There were log segments, which can not be deleted over configured 
retention hours.
{quote}
What are impacts? 
{quote} # Log space keep in increasing and finally cause space shortage.
 # There are lots of log segment rolled with a small size. e.g log segment may 
be only 50mb, not the expected 1gb.
 # Kafka stream or client may experience missing data.{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}
 


> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}There were log segments, which can not be deleted over configured 
> retention hours.
> {quote}
> What are impacts? 
> {quote} # Log space keep in increasing and finally cause space shortage.
>  # There are lots of log segment rolled with a small size. e.g log segment 
> may be only 50mb, not the expected 1gb.
>  # Kafka stream or client may experience missing data.{quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time as a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> Send message to Kafka.
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> 

[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Description: 
What's the issue?
{quote}There were log segments, which can not be deleted over configured 
retention hours.
{quote}
What are impacts? 
{quote} # Log space keep in increasing and finally cause space shortage.
 # There are lots of log segment rolled with a small size. e.g log segment may 
be only 50mb, not the expected 1gb.
 # Kafka stream or client may experience missing data.{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}
 

  was:
What's the issue?
{quote}There were log segments, which can not be deleted over configured 
retention hours.
{quote}
What are impacts? 
{quote} # Log space keep in increasing and finally cause space shortage.
 # There are lots of log segment rolled with a small size. e.g log segment may 
be only 50mb, not the expected 1gb.
 # Kafka stream or client may experience missing data.{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}
 


> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}There were log segments, which can not be deleted over configured 
> retention hours.
> {quote}
> What are impacts? 
> {quote} # Log space keep in increasing and finally cause space shortage.
>  # There are lots of log segment rolled with a small size. e.g log segment 
> may be only 50mb, not the expected 1gb.
>  # Kafka stream or client may experience missing data.{quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment 

[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Description: 
What's the issue?
{quote}There were log segments, which can not be deleted over configured 
retention hours.
{quote}
What are impacts? 
{quote} # Log space keep in increasing and finally cause space shortage.
 # There are lots of log segment rolled with a small size. e.g log segment may 
be only 50mb, not the expected 1gb.
 # Kafka stream or client may experience missing data.{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}
 

  was:
What's the issue?
{quote}There were log segments, which can not be deleted over configured 
retention hours.
{quote}
What are impacts? 
{quote} # Log space keep in increasing and finally cause space shortage.
 # There are lots of log segment rolled with a small size. e.g log segment may 
be only 50mb, not the expected 1gb.
 # Kafka stream or client may experience missing data.{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}


> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}There were log segments, which can not be deleted over configured 
> retention hours.
> {quote}
> What are impacts? 
> {quote} # Log space keep in increasing and finally cause space shortage.
>  # There are lots of log segment rolled with a small size. e.g log segment 
> may be only 50mb, not the expected 1gb.
>  # Kafka stream or client may experience missing data.{quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment 

[jira] [Commented] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080668#comment-17080668
 ] 

Jiangtao Liu commented on KAFKA-8270:
-

Please comment if you need more context.

> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}There were log segments, which can not be deleted over configured 
> retention hours.
> {quote}
> What are impacts? 
> {quote} # Log space keep in increasing and finally cause space shortage.
>  # There are lots of log segment rolled with a small size. e.g log segment 
> may be only 50mb, not the expected 1gb.
>  # Kafka stream or client may experience missing data.{quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
> {quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.
> {quote}
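
To make the reproduction concrete: under CreateTime the broker trusts whatever timestamp the client supplies (or the client's clock, if none is given), so a skewed clock flows straight into the .timeindex and into time-based retention. A minimal sketch, with an illustrative topic name and an existing producer instance assumed:

{code:java}
import org.apache.kafka.clients.producer.ProducerRecord;

// 03/04/2025 3:25:52 PM GMT-08:00, i.e. a timestamp from a machine whose clock is in the future.
long skewedTimestampMs = 1741130752000L;

// With log.message.timestamp.type=CreateTime this client-provided timestamp is what the
// broker indexes, so the segment's max timestamp jumps far into the future and the
// retention check never considers the segment old enough to delete.
producer.send(new ProducerRecord<>("example-topic", null, skewedTimestampMs, "key", "value"));
{code}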



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080665#comment-17080665
 ] 

Jiangtao Liu edited comment on KAFKA-8270 at 4/10/20, 5:58 PM:
---

A short summary of this issue:

The log/time index timestamp comes from the Kafka client when CreateTime is 
configured as log.message.timestamp.type.

The log/time index timestamp is critical for Kafka system and user functions. 
If you don't trust the Kafka client's clock, you may need to set 
LogAppendTime instead of CreateTime as log.message.timestamp.type.
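
As an editor's illustration (not from the ticket), a minimal sketch of switching a single topic to LogAppendTime with the Java Admin client. The topic-level key is message.timestamp.type, while log.message.timestamp.type is the broker-wide default; the topic name and bootstrap address are placeholders.

```
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class SetLogAppendTime {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            // Use broker time instead of the (possibly skewed) client time for message timestamps.
            AlterConfigOp op = new AlterConfigOp(
                new ConfigEntry("message.timestamp.type", "LogAppendTime"),
                AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(
                Collections.singletonMap(topic, Collections.singletonList(op))).all().get();
        }
    }
}
```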


was (Author: tony2011):
In a short summary of this issue:
{noformat}
Log/Time index's timestamp is from Kafka client when to configure CreateTime as 
Log.Message.Timestamp.Type.{noformat}
{noformat}
The log/time index's timestamp is critical for Kafka system and user functions. 
If you don't have confidence with Kafka client's time, You may need to set 
LogAppendTime instead of CreateTime as Log.Message.Timestamp.Type.{noformat}

> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}There were log segments, which can not be deleted over configured 
> retention hours.
> {quote}
> What are impacts? 
> {quote} # Log space keep in increasing and finally cause space shortage.
>  # There are lots of log segment rolled with a small size. e.g log segment 
> may be only 50mb, not the expected 1gb.
>  # Kafka stream or client may experience missing data.{quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
> {quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080665#comment-17080665
 ] 

Jiangtao Liu edited comment on KAFKA-8270 at 4/10/20, 5:57 PM:
---

In a short summary of this issue:
{noformat}
Log/Time index's timestamp is from Kafka client when to configure CreateTime as 
Log.Message.Timestamp.Type.{noformat}
{noformat}
The log/time index's timestamp is critical for Kafka system and user functions. 
If you don't have confidence with Kafka client's time, You may need to set 
LogAppendTime instead of CreateTime as Log.Message.Timestamp.Type.{noformat}


was (Author: tony2011):
Anyone if you have or may run into the issue, you can take a look this:

```
The log/time index's timestamp is critical for Kafka system and user functions. 
If you don't have confidence with Kafka client's time, You may need to set 
*LogAppendTime* instead of *CreateTime* as *Log.Message.Timestamp.Type*.```

> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}There were log segments, which can not be deleted over configured 
> retention hours.
> {quote}
> What are impacts? 
> {quote} # Log space keep in increasing and finally cause space shortage.
>  # There are lots of log segment rolled with a small size. e.g log segment 
> may be only 50mb, not the expected 1gb.
>  # Kafka stream or client may experience missing data.{quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
> {quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080665#comment-17080665
 ] 

Jiangtao Liu commented on KAFKA-8270:
-

If you have run into, or may run into, this issue, take a look at this:

```
The log/time index timestamp is critical for Kafka system and user functions. 
If you don't trust the Kafka client's clock, you may need to set 
*LogAppendTime* instead of *CreateTime* as *log.message.timestamp.type*.
```

> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}There were log segments, which can not be deleted over configured 
> retention hours.
> {quote}
> What are impacts? 
> {quote} # Log space keep in increasing and finally cause space shortage.
>  # There are lots of log segment rolled with a small size. e.g log segment 
> may be only 50mb, not the expected 1gb.
>  # Kafka stream or client may experience missing data.{quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
> {quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.
> {quote}
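
A minimal sketch (editor's addition) of the reproduction quoted above: with CreateTime, whatever timestamp the producer supplies, including one from a clock set to 2025, is exactly what ends up in the log and time index. The topic name and bootstrap address are placeholders.

```
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class FutureTimestampProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // 03/04/2025, 3:25:52 PM GMT-08:00 == 1741130752 seconds since the epoch (see the link above).
        long futureTimestampMs = 1741130752000L;

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // With message.timestamp.type=CreateTime the broker stores this timestamp as-is,
            // which is what a client with a skewed system clock would produce implicitly.
            producer.send(new ProducerRecord<>("my-topic", null, futureTimestampMs, "key", "value"));
        }
    }
}
```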



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822636#comment-16822636
 ] 

Jiangtao Liu edited comment on KAFKA-8270 at 4/10/20, 5:53 PM:
---

[~Yohan123]

Since I cloned this ticket from an existing one, I don't have a way to edit 
the owner. If you are not the owner of this reported issue, could you assign 
it to the relevant team to have a look? 


was (Author: tony2011):
[~Yohan123]

Since I cloned this ticket from an existing one and I don't have a way to edit 
the owner, so if you are not the owner of my reported issue, can you assign to 
the relative team to have a look? 

 

 

> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}There were log segments, which can not be deleted over configured 
> retention hours.
> {quote}
> What are impacts? 
> {quote} # Log space keep in increasing and finally cause space shortage.
>  # There are lots of log segment rolled with a small size. e.g log segment 
> may be only 50mb, not the expected 1gb.
>  # Kafka stream or client may experience missing data.{quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
> {quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Description: 
What's the issue?
{quote}There were log segments that could not be deleted after the configured 
retention hours.
{quote}
What are impacts? 
{quote} # Log space keeps increasing and eventually causes a space shortage.
 # Many log segments are rolled at a small size, e.g. a log segment may 
be only 50 MB instead of the expected 1 GB.
 # Kafka Streams applications or clients may experience missing data.{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset the Kafka client's system clock to a future time, e.g. 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Check the timestamps in the log time index and records (e.g. 
035957300794.log|timeindex). You should see that all the 
timestamp values are set to future times after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see that newly rolled log segments are smaller than the 
configured log segment size.
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}[private def deletableSegments(predicate: (LogSegment, Option[LogSegment]) => 
Boolean)|https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]
 will always return an empty set of deletable log segments.
{quote}

  was:
What's the issue?
{quote}There were log segments, which can not deleted after over configured 
retention hours.
{quote}
What are impacts? 
{quote} # Log space keep in increasing and finally cause space shortage.
 # There are lots of log segment rolled with a small size. e.g log segment may 
be only 50mb, not the expected 1gb.
 # Kafka stream or client may experience missing data.{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}


> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}There were log segments, which can not be deleted over configured 
> retention hours.
> {quote}
> What are impacts? 
> {quote} # Log space keep in increasing and finally cause space shortage.
>  # There are lots of log segment rolled with a small size. e.g log segment 
> may be only 50mb, not the expected 1gb.
>  # Kafka stream or client may experience missing data.{quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment 

[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Description: 
What's the issue?
{quote}There were log segments, which can not deleted after over configured 
retention hours.
{quote}
What are impacts? 
{quote} # Log space keep in increasing and finally cause space shortage.
 # There are lots of log segment rolled with a small size. e.g log segment may 
be only 50mb, not the expected 1gb.
 # Kafka stream or client may experience missing data.{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}

  was:
What's the issue?
{quote}We have log retention with 12 hours. there are log segments not deleted 
after 12 hours.
{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}


> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}There were log segments, which can not deleted after over configured 
> retention hours.
> {quote}
> What are impacts? 
> {quote} # Log space keep in increasing and finally cause space shortage.
>  # There are lots of log segment rolled with a small size. e.g log segment 
> may be only 50mb, not the expected 1gb.
>  # Kafka stream or client may experience missing data.{quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
> {quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> 

[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Component/s: logging
 log cleaner

> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}We have log retention with 12 hours. there are log segments not 
> deleted after 12 hours.
> {quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
> {quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Affects Version/s: 1.1.1

> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log, log cleaner, logging
>Affects Versions: 1.1.1
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}We have log retention with 12 hours. there are log segments not 
> deleted after 12 hours.
> {quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
> {quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's timestamp is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Description: 
What's the issue?
{quote}We have log retention with 12 hours. there are log segments not deleted 
after 12 hours.
{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}

  was:
What's the issue and impact?
{quote}We have log retention with 12 hours. there are log segments not deleted 
after 12 hours. I outlined 
{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}


> Kafka timestamp-based retention policy is not working when Kafka client's 
> timestamp is not reliable.
> 
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}We have log retention with 12 hours. there are log segments not 
> deleted after 12 hours.
> {quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
> {quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's time is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Summary: Kafka timestamp-based retention policy is not working when Kafka 
client's time is not reliable.  (was: Kafka timestamp-based retention policy is 
not working when Kafka client's timestamp is not reliable.)

> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> ---
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue?
> {quote}We have log retention with 12 hours. there are log segments not 
> deleted after 12 hours.
> {quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
> {quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's timestamp is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Description: 
What's the issue and impact?
{quote}We have log retention with 12 hours. there are log segments not deleted 
after 12 hours. I outlined 
{quote}
How to reproduce it?
 # 
{quote}Configure message.timestamp.type as CreateTime.
{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
{quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).
{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.
{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.
{quote}

  was:
what's the issue?
{quote}we have log retention with 12 hours. there are log segments not deleted 
after 12 hours.
{quote}
How to reproduce?
 # 
{quote}Configure message.timestamp.type as CreateTime.{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  {quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.{quote}


> Kafka timestamp-based retention policy is not working when Kafka client's 
> timestamp is not reliable.
> 
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> What's the issue and impact?
> {quote}We have log retention with 12 hours. there are log segments not 
> deleted after 12 hours. I outlined 
> {quote}
> How to reproduce it?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.
> {quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> {quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
> {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
> {quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
> {quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's timestamp is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Description: 
what's the issue?
{quote}we have log retention with 12 hours. there are log segments not deleted 
after 12 hours.
{quote}
How to reproduce?
 # 
{quote}Configure message.timestamp.type as CreateTime.{quote}
 # 
{quote}Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]{quote}

What's Next?
 # 
{quote}Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  {quote}
 # 
{quote}You may also see the new rolled log segment's size smaller than the 
configured log segment size).{quote}
 # 
{quote}Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.{quote}

 

What's the particular logic to cause this issue?
 # 
{quote}private def deletableSegments(predicate: (LogSegment, 
Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.{quote}

  was:
what's the issue?

we have log retention with 12 hours. there are log segments not deleted after 
12 hours.

How to reproduce?
 # Configure message.timestamp.type as CreateTime.
 # Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]

What's Next?
 # Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
 # You may also see the new rolled log segment's size smaller than the 
configured log segment size).
 # Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.

 

what's the particular logic to cause this issue?
 # private def deletableSegments(predicate: (LogSegment, Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.


> Kafka timestamp-based retention policy is not working when Kafka client's 
> timestamp is not reliable.
> 
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> what's the issue?
> {quote}we have log retention with 12 hours. there are log segments not 
> deleted after 12 hours.
> {quote}
> How to reproduce?
>  # 
> {quote}Configure message.timestamp.type as CreateTime.{quote}
>  # 
> {quote}Reset Kafka client's system clock time a future time, e.g 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]{quote}
> What's Next?
>  # 
> {quote}Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  {quote}
>  # 
> {quote}You may also see the new rolled log segment's size smaller than the 
> configured log segment size).{quote}
>  # 
> {quote}Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.{quote}
>  
> What's the particular logic to cause this issue?
>  # 
> {quote}private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.{quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's timestamp is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Description: 
what's the issue?

we have log retention with 12 hours. there are log segments not deleted after 
12 hours.

How to reproduce?
 # Configure message.timestamp.type as CreateTime.
 # Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]

What's Next?
 # Go to check the timestamp in log time index and record (e.g 
035957300794.log|timeindex). You should see all the value of the 
timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
[GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
 # You may also see the new rolled log segment's size smaller than the 
configured log segment size).
 # Log segment (035957300794.log|timeindex) and any of rolled log 
segments will not be deleted.

 

what's the particular logic to cause this issue?
 # private def deletableSegments(predicate: (LogSegment, Option[LogSegment]) => 
Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
 will always return empty deletable log segments.

  was:
Kafka l cannot be deleted after the configured retention hours (12 hours for 
log retention). 

What's our Kafka cluster look like?

There are 6 brokers deployed with Kafka version 1.1.1.

 

Is it reproducible?
 I am not sure since our Kafka cluster is working well over 1.5 years without 
retention issue until 4/13/2019 ~ 4/20/2019.  

 

is it related to the active segment?
 as I know Kafka will not delete an active segment, my case those old logs are 
not activated, they should be inactivated. 

 

What's the current status?

Those old logs have been deleted after I manually ran rolling restart Kafka 
servers with retention hours adjustment (Ideally I tried this solution aimed 
to{color:#33} force retention hours work, not really want to adjust the 
retention hours, finally the solution it's working, but not immediately, I 
remember the retention start work after couples of hours after applying the 
change and rolling restart Kafka servers{color}.), now our Kafka storage is 
back to normal, please check the screenshot attached with this ticket.

A sample old log added here for better understanding of the retention hours not 
working issue.

```

// it has been there from 4/12
 -rw-r--r-- 1 root root 136866627{color:#d04437} Apr 12{color} 04:33 
002581377820.log 

// It was still being opened by Kafka when I check it with the tool lsof on 
{color:#d04437}4/19/2019 before server rolling restart with retention hours 
adjustment{color}.
 java 20281 0 1678u REG 202,32 136866627 1074562295 
/kafka/data/.../002581377820.log
 ```

 


> Kafka timestamp-based retention policy is not working when Kafka client's 
> timestamp is not reliable.
> 
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> what's the issue?
> we have log retention with 12 hours. there are log segments not deleted after 
> 12 hours.
> How to reproduce?
>  # Configure message.timestamp.type as CreateTime.
>  # Reset Kafka client's system clock time a future time, e.g 03/04/*2025*, 
> 3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
> What's Next?
>  # Go to check the timestamp in log time index and record (e.g 
> 035957300794.log|timeindex). You should see all the value of the 
> timestamp is messed up with future time after `03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.  
>  # You may also see the new rolled log segment's size smaller than the 
> configured log segment size).
>  # Log segment (035957300794.log|timeindex) and any of rolled log 
> segments will not be deleted.
>  
> what's the particular logic to cause this issue?
>  # private def deletableSegments(predicate: (LogSegment, Option[LogSegment]) 
> => 
> Boolean)|[https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]]
>  will always return empty deletable log segments.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9851) Revoking Connect tasks due to connectivity issues should also clear running assignment

2020-04-10 Thread Konstantine Karantasis (Jira)
Konstantine Karantasis created KAFKA-9851:
-

 Summary: Revoking Connect tasks due to connectivity issues should 
also clear running assignment
 Key: KAFKA-9851
 URL: https://issues.apache.org/jira/browse/KAFKA-9851
 Project: Kafka
  Issue Type: Bug
Reporter: Konstantine Karantasis
Assignee: Konstantine Karantasis


https://issues.apache.org/jira/browse/KAFKA-9184 fixed an issue with workers 
continuing to run tasks even after they had lost connectivity with the broker 
coordinator and detected that they were out of the group. 

 

However, because the revocation of tasks in this case is voluntary and does not 
come with an explicit assignment (containing revoked tasks) from the leader 
worker, the worker that quits running its tasks due to connectivity issues 
needs to also clear its running task assignment snapshot. 

This will allow the stopped tasks to be properly restarted after the worker 
rejoins the group when connectivity returns and is assigned the same 
connectors or tasks. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9846) Race condition can lead to severe lag underestimate for active tasks

2020-04-10 Thread Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080636#comment-17080636
 ] 

Sophie Blee-Goldman commented on KAFKA-9846:


True, the literal Task.State was introduced for 2.6, but there was always a 
concept of task state. It was just relatively poorly managed and much more 
difficult to keep track of – hence the refactoring that introduced Task.State. 
The TaskManager delegated to the now-removed AssignedStreamTasks class, which 
kept track of task state by adding/removing tasks from the "created", 
"restoring", "running", and "suspended" maps. Suspended probably also causes 
problems for IQ, as a task in suspended is basically already revoked

> Race condition can lead to severe lag underestimate for active tasks
> 
>
> Key: KAFKA-9846
> URL: https://issues.apache.org/jira/browse/KAFKA-9846
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Sophie Blee-Goldman
>Assignee: Vinoth Chandar
>Priority: Critical
> Fix For: 2.6.0
>
>
> In KIP-535 we added the ability to query still-restoring and standby tasks. 
> To give users control over how out of date the data they fetch can be, we 
> added an API to KafkaStreams that fetches the end offsets for all changelog 
> partitions and computes the lag for each local state store.
> During this lag computation, we check whether an active task is in RESTORING 
> and calculate the actual lag if so. If not, we assume it's in RUNNING and 
> return a lag of zero. However, tasks may be in other states besides running 
> and restoring; notably they first pass through the CREATED state before 
> getting to RESTORING. A CREATED task may happen to be caught-up to the end 
> offset, but in many cases it is likely to be lagging or even completely 
> uninitialized.
> This introduces a race condition where users may be led to believe that a 
> task has zero lag and is "safe" to query even with the strictest correctness 
> guarantees, while the task is actually lagging by some unknown amount.  
> During transfer of ownership of the task between different threads on the 
> same machine, tasks can actually spend a while in CREATED while the new owner 
> waits to acquire the task directory lock. So, this race condition may not be 
> particularly rare in multi-threaded Streams applications
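
A minimal sketch (editor's addition) of the KIP-535 lag API the description above refers to; `streams` is an assumed running KafkaStreams instance. A caller that treats a reported lag of zero as "safe to query" is exposed to the race described above while the task is still in CREATED.

```
import java.util.Map;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.LagInfo;

public class StoreLagCheck {
    // Prints the local lag per store and partition; a zero lag may be an underestimate
    // if the owning task has not yet reached RESTORING (the race described in this ticket).
    static void printLags(KafkaStreams streams) {
        Map<String, Map<Integer, LagInfo>> lags = streams.allLocalStorePartitionLags();
        lags.forEach((store, partitions) ->
            partitions.forEach((partition, lag) ->
                System.out.printf("store=%s partition=%d offsetLag=%d%n",
                    store, partition, lag.offsetLag())));
    }
}
```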



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9850) Move KStream#repartition operator validation during Topology build process

2020-04-10 Thread Levani Kokhreidze (Jira)
Levani Kokhreidze created KAFKA-9850:


 Summary: Move KStream#repartition operator validation during 
Topology build process 
 Key: KAFKA-9850
 URL: https://issues.apache.org/jira/browse/KAFKA-9850
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Levani Kokhreidze


The `KStream#repartition` operation performs most of its validation regarding 
joining, co-partitioning, etc. only after the Kafka Streams instance has started. 
Some of this validation could be performed much earlier, specifically during topology 
`build()`, as in the sketch below.
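
A minimal sketch (editor's addition, assuming the KStream#repartition API from KIP-221) of the kind of topology where such validation could already run inside `build()`; topic names and the repartition name are placeholders.

```
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Repartitioned;

public class RepartitionTopology {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("input-topic");

        source.repartition(Repartitioned.<String, String>as("my-repartition")
                               .withNumberOfPartitions(4))
              .to("output-topic");

        // Today most repartition-related checks happen after KafkaStreams#start();
        // this ticket proposes surfacing what can be detected here, at build time.
        Topology topology = builder.build();
        System.out.println(topology.describe());
    }
}
```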



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9323) Refactor Streams' upgrade system tests

2020-04-10 Thread Nikolay Izhikov (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolay Izhikov reassigned KAFKA-9323:
--

Assignee: (was: Nikolay Izhikov)

> Refactor Streams'  upgrade system tests
> ---
>
> Key: KAFKA-9323
> URL: https://issues.apache.org/jira/browse/KAFKA-9323
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Sophie Blee-Goldman
>Priority: Major
>
> With the introduction of version probing in 2.0 and cooperative rebalancing 
> in 2.4, the specific upgrade path depends heavily on the to & from version. 
> This can be a complex operation, and we should make sure to test a realistic 
> upgrade scenario across all possible combinations. The current system tests 
> have gaps however, which have allowed bugs in the upgrade path to slip by 
> unnoticed for several versions. 
> Our current system tests include a "simple upgrade downgrade" test, a 
> metadata upgrade test, a version probing test, and a cooperative upgrade 
> test. This has a few drawbacks and current issues:
> a) only the version probing test tests "forwards compatibility" (upgrade from 
> latest to future version)
> b) nothing tests version probing "backwards compatibility" (upgrade from 
> older version to latest), except:
> c) the cooperative rebalancing test actually happens to involve a version 
> probing step, and so coincidentally DOES test VP (but only starting with 2.4)
> d) each test itself tries to test the upgrade across different versions, 
> meaning there may be overlap and/or unnecessary tests. For example, currently 
> both the metadata_upgrade and cooperative_upgrade tests will test the upgrade 
> of 1.0 -> 2.4
> e) worse, a number of (to, from) pairs are not tested according to the 
> correct upgrade path at all, which has led to easily reproducible bugs 
> slipping past for several versions.
> f) we have a test_simple_upgrade_downgrade test which does not actually do a 
> downgrade, and for some reason only tests upgrading within the range [0.10.1 
> - 1.1]
> g) as new versions are released, it is unclear to those not directly involved 
> in these tests and/or projects whether and what needs to be updated (eg 
> should this new version be added to the cooperative test? should the old 
> version be added to the metadata test?)
> We should definitely fill in the testing gap here, but how to do so is of 
> course up for discussion.
> I would propose to refactor the upgrade tests, and rather than maintain 
> different lists of versions to pass as input to each different test, we 
> should have a single matrix that we update with each new version that 
> specifies which upgrade path that version combination actually requires. We 
> can then loop through each version combination and test only the actual 
> upgrade path that users would actually need to follow. This way we can be 
> sure we are not missing anything, as each and every possible upgrade would be 
> tested.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-6910) Ability to specify a default state store type or factory

2020-04-10 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-6910.

Resolution: Duplicate

> Ability to specify a default state store type or factory
> 
>
> Key: KAFKA-6910
> URL: https://issues.apache.org/jira/browse/KAFKA-6910
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Affects Versions: 1.1.0, 1.1.1
>Reporter: Antony Stubbs
>Priority: Major
>
> For large projects, it's a huge pain and not really practical at all to use 
> a custom state store everywhere just to use in-memory stores or avoid 
> RocksDB, for example for running a test suite on Windows.
> It would be great to be able to set a global config for KS so that it uses a 
> different state store implementation everywhere.
> Blocked by KAFKA-4730 - Streams does not have an in-memory windowed store. 
> Also blocked by not having an in-memory session store implementation. A 
> simple in-memory window and session store version that's not suitable for 
> production would still be very useful for running test suites.
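
For context, a minimal sketch (editor's addition) of the per-store workaround the description above calls impractical at scale: overriding a single store with an in-memory implementation via Materialized. The topic and store names are placeholders.

```
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.Stores;

public class InMemoryStoreWorkaround {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Every stateful operator needs an override like this today; the ticket asks for
        // a single global default instead of repeating the override store by store.
        KTable<String, String> table = builder.table("input-topic",
            Materialized.<String, String>as(Stores.inMemoryKeyValueStore("my-store")));

        builder.build();
    }
}
```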



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9846) Race condition can lead to severe lag underestimate for active tasks

2020-04-10 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080554#comment-17080554
 ] 

Vinoth Chandar commented on KAFKA-9846:
---

hi [~ableegoldman], if you were talking about Task.State, then you are in the 
future. 
https://github.com/apache/kafka/blame/b02bdd3227f08eef78080ef471c0950a4f77e5fb/streams/src/main/java/org/apache/kafka/streams/processor/internals/Task.java

I don't think we had that in the 2.5 branch, and it was one of the harder things to 
reason about then. IIUC, pre-2.5 there is state only at the KafkaStreams level and 
the thread level; the thread level is what we checked before opening stores. Let me 
see how to do a localized fix in 2.5 for this. 
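
A minimal sketch (editor's addition) of the kind of instance-level check available in 2.5 before opening a store; `streams` and the store name are assumptions. Because the check is not task-level, a store can still be backed by a CREATED task even while the instance reports RUNNING, which is the gap discussed above.

```
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class InstanceLevelCheck {
    // Opens the store only when the whole instance reports RUNNING; this is coarser
    // than per-task state, which is why the race described above can slip through.
    static String lookup(KafkaStreams streams, String key) {
        if (streams.state() != KafkaStreams.State.RUNNING) {
            return null;
        }
        ReadOnlyKeyValueStore<String, String> store = streams.store(
            StoreQueryParameters.fromNameAndType("my-store", QueryableStoreTypes.keyValueStore()));
        return store.get(key);
    }
}
```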

> Race condition can lead to severe lag underestimate for active tasks
> 
>
> Key: KAFKA-9846
> URL: https://issues.apache.org/jira/browse/KAFKA-9846
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Sophie Blee-Goldman
>Assignee: Vinoth Chandar
>Priority: Critical
> Fix For: 2.6.0
>
>
> In KIP-535 we added the ability to query still-restoring and standby tasks. 
> To give users control over how out of date the data they fetch can be, we 
> added an API to KafkaStreams that fetches the end offsets for all changelog 
> partitions and computes the lag for each local state store.
> During this lag computation, we check whether an active task is in RESTORING 
> and calculate the actual lag if so. If not, we assume it's in RUNNING and 
> return a lag of zero. However, tasks may be in other states besides running 
> and restoring; notably they first pass through the CREATED state before 
> getting to RESTORING. A CREATED task may happen to be caught-up to the end 
> offset, but in many cases it is likely to be lagging or even completely 
> uninitialized.
> This introduces a race condition where users may be led to believe that a 
> task has zero lag and is "safe" to query even with the strictest correctness 
> guarantees, while the task is actually lagging by some unknown amount. During 
> transfer of ownership of the task between different threads on the same 
> machine, tasks can spend a while in CREATED while the new owner waits to 
> acquire the task directory lock. So, this race condition may not be 
> particularly rare in multi-threaded Streams applications.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy is not working when Kafka client's timestamp is not reliable.

2020-04-10 Thread Jiangtao Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:

Summary: Kafka timestamp-based retention policy is not working when Kafka 
client's timestamp is not reliable.  (was: Kafka timestamp-based retention 
policy is not working.)

> Kafka timestamp-based retention policy is not working when Kafka client's 
> timestamp is not reliable.
> 
>
> Key: KAFKA-8270
> URL: https://issues.apache.org/jira/browse/KAFKA-8270
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Reporter: Jiangtao Liu
>Priority: Major
>  Labels: storage
> Attachments: Screen Shot 2019-04-20 at 10.57.59 PM.png
>
>
> Kafka logs cannot be deleted after the configured retention hours (12 hours 
> for log retention).
> What does our Kafka cluster look like?
> There are 6 brokers deployed with Kafka version 1.1.1.
>  
> Is it reproducible?
> I am not sure, since our Kafka cluster had been working well for over 1.5 
> years without retention issues until 4/13/2019 ~ 4/20/2019.
>  
> Is it related to the active segment?
> As far as I know, Kafka will not delete an active segment, but in my case 
> those old segments were no longer active, so they should have been eligible 
> for deletion.
>  
> What's the current status?
> Those old logs were deleted after I manually performed a rolling restart of 
> the Kafka servers with an adjusted retention-hours setting. (I tried this 
> only to force retention to kick in, not because I really wanted to change 
> the retention hours. The workaround did work, but not immediately; retention 
> started working a couple of hours after applying the change and restarting 
> the servers.) Our Kafka storage is now back to normal; please check the 
> screenshot attached to this ticket.
> A sample old log segment is included here to better illustrate the 
> retention-hours issue:
> ```
> // it had been there since 4/12
> -rw-r--r-- 1 root root 136866627 Apr 12 04:33 002581377820.log
> // It was still held open by Kafka when I checked with lsof on 4/19/2019, 
> // before the rolling restart with the retention-hours adjustment.
> java 20281 0 1678u REG 202,32 136866627 1074562295 
> /kafka/data/.../002581377820.log
> ```
>  
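For readers hitting the same symptom, a hedged sketch of the broker settings that decouple time-based retention from client-supplied timestamps (the values are illustrative only, not a recommendation for this particular cluster):

{code}
# server.properties (illustrative values)
# With LogAppendTime the broker stamps each record at append time, so segment
# retention no longer depends on the clocks of producing clients.
log.message.timestamp.type=LogAppendTime
log.retention.hours=12
{code}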



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9744) SchemaProjector fails to handle change of record namespace in Avro schema

2020-04-10 Thread Jurgis Pods (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080426#comment-17080426
 ] 

Jurgis Pods commented on KAFKA-9744:


I've updated my bug report to be clearer. In short: Avro does allow namespace 
changes, but the Connect API does not; the two should be consistent.
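To make the inconsistency concrete, a minimal sketch (the field name and record contents are hypothetical) that reproduces the rejection on the Connect side: the two Connect schemas differ only in the record's fully-qualified name, mirroring an Avro namespace change, yet SchemaProjector refuses to project between them:

{code:java}
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.SchemaProjector;
import org.apache.kafka.connect.data.Struct;

public class NamespaceMismatchSketch {
    public static void main(String[] args) {
        Schema source = SchemaBuilder.struct()
                .name("my.example.Record")
                .field("id", Schema.STRING_SCHEMA)
                .build();
        Schema target = SchemaBuilder.struct()
                .name("my.example.sub.Record")      // only the namespace differs
                .field("id", Schema.STRING_SCHEMA)
                .build();

        Struct record = new Struct(source).put("id", "42");

        // Throws SchemaProjectorException: "Schema name mismatch. source name:
        // my.example.Record and target name: my.example.sub.Record"
        SchemaProjector.project(source, record, target);
    }
}
{code}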

> SchemaProjector fails to handle change of record namespace in Avro schema
> -
>
> Key: KAFKA-9744
> URL: https://issues.apache.org/jira/browse/KAFKA-9744
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3.1
>Reporter: Jurgis Pods
>Priority: Major
>
> _Note_: This bug report is for CP 5.3.1 / Kafka 2.3.1, but it most likely 
> affects all versions.
> We recently changed the namespace of inner records in Avro schemas in the 
> Confluent Schema Registry. Those changes were accepted as 
> backwards-compatible. However, when redeploying Kafka S3 connectors consuming 
> the relevant topics, we received errors from SchemaProjector.project(), 
> causing the connectors to crash and stop producing data:
> {code:java}
> org.apache.kafka.connect.errors.SchemaProjectorException: Schema name 
> mismatch. source name: my.example.Record and target name: 
> my.example.sub.Record {code}
> A change of a record's namespace is compatible according to the Schema 
> Registry (which internally uses a check from the Avro library), but not for 
> the Connect API. I would argue that the namespace/package name should not 
> affect compatibility, as it says nothing about the contained data and its 
> schema. The Avro library itself does allow for differing namespaces, if the 
> record data itself is compatible.
> Would it be possible to have a more consistent (and less restrictive) check 
> in Connect API, so that a namespace change in the producer can be made more 
> confidently, without fear of breaking the consuming connectors? If, on the 
> other hand, you want to be strict about namespaces, then at least the schema 
> registry should report a namespace change as incompatible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9744) SchemaProjector fails to handle change of record namespace in Avro schema

2020-04-10 Thread Jurgis Pods (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jurgis Pods updated KAFKA-9744:
---
Description: 
_Note_: This bug report is for CP 5.3.1 / Kafka 2.3.1, but it most likely 
affects all versions.

We recently changed the namespace of inner records in Avro schemas in the 
Confluent Schema Registry. Those changes were accepted as backwards-compatible. 
However, when redeploying Kafka S3 connectors consuming the relevant topics, we 
received errors from SchemaProjector.project(), causing the connectors to crash 
and stop producing data:
{code:java}
org.apache.kafka.connect.errors.SchemaProjectorException: Schema name mismatch. 
source name: my.example.Record and target name: my.example.sub.Record {code}
A change of a record's namespace is compatible according to the Schema Registry 
(which internally uses a check from the Avro library), but not for the Connect 
API. I would argue that the namespace/package name should not affect 
compatibility, as it says nothing about the contained data and its schema. The 
Avro library itself does allow for differing namespaces, if the record data 
itself is compatible.

Would it be possible to have a more consistent (and less restrictive) check in 
Connect API, so that a namespace change in the producer can be made more 
confidently, without fear of breaking the consuming connectors? If, on the 
other hand, you want to be strict about namespaces, then at least the schema 
registry should report a namespace change as incompatible.

  was:
_Note_: This bug report is for CP 5.3.1 / Kafka 2.3.1, but it most likely 
affects all versions.

We recently changed the namespace of inner records in Avro schemas in the 
Confluent Schema Registry. Those changes were accepted as backwards-compatible. 
However, when redeploying Kafka S3 connectors consuming the relevant topics, we 
received errors from SchemaProjector.project(), causing the connectors to crash 
and stop producing data:
{code:java}
org.apache.kafka.connect.errors.SchemaProjectorException: Schema name mismatch. 
source name: my.example.Record and target name: my.example.sub.Record {code}
A change of a record's namespace is compatible according to the Schema Registry 
(which internally uses a check from the Avro library), but not for the Connect 
API. I would argue that the namespace/package name should not affect 
compatibility, as it says nothing about the contained data and its schema.

Would it be possible to have a more consistent (and less restrictive) check in 
Connect API, so that a namespace change in the producer can be made more 
confidently, without fear of breaking the consuming connectors? If, on the 
other hand, you want to be strict about namespaces, then at least the schema 
registry should report a namespace change as incompatible.


> SchemaProjector fails to handle change of record namespace in Avro schema
> -
>
> Key: KAFKA-9744
> URL: https://issues.apache.org/jira/browse/KAFKA-9744
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3.1
>Reporter: Jurgis Pods
>Priority: Major
>
> _Note_: This bug report is for CP 5.3.1 / Kafka 2.3.1, but it most likely 
> affects all versions.
> We recently changed the namespace of inner records in Avro schemas in the 
> Confluent Schema Registry. Those changes were accepted as 
> backwards-compatible. However, when redeploying Kafka S3 connectors consuming 
> the relevant topics, we received errors from SchemaProjector.project(), 
> causing the connectors to crash and stop producing data:
> {code:java}
> org.apache.kafka.connect.errors.SchemaProjectorException: Schema name 
> mismatch. source name: my.example.Record and target name: 
> my.example.sub.Record {code}
> A change of a record's namespace is compatible according to the Schema 
> Registry (which internally uses a check from the Avro library), but not for 
> the Connect API. I would argue that the namespace/package name should not 
> affect compatibility, as it says nothing about the contained data and its 
> schema. The Avro library itself does allow for differing namespaces, if the 
> record data itself is compatible.
> Would it be possible to have a more consistent (and less restrictive) check 
> in Connect API, so that a namespace change in the producer can be made more 
> confidently, without fear of breaking the consuming connectors? If, on the 
> other hand, you want to be strict about namespaces, then at least the schema 
> registry should report a namespace change as incompatible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9744) SchemaProjector fails to handle change of record namespace in Avro schema

2020-04-10 Thread Jurgis Pods (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jurgis Pods updated KAFKA-9744:
---
Summary: SchemaProjector fails to handle change of record namespace in Avro 
schema  (was: SchemaProjector fails to handle backwards-compatible schema 
change)

> SchemaProjector fails to handle change of record namespace in Avro schema
> -
>
> Key: KAFKA-9744
> URL: https://issues.apache.org/jira/browse/KAFKA-9744
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3.1
>Reporter: Jurgis Pods
>Priority: Major
>
> _Note_: This bug report is for CP 5.3.1 / Kafka 2.3.1, but it most likely 
> affects all versions.
> We recently changed the namespace of inner records in Avro schemas in the 
> Confluent Schema Registry. Those changes were accepted as 
> backwards-compatible. However, when redeploying Kafka S3 connectors consuming 
> the relevant topics, we received errors from SchemaProjector.project(), 
> causing the connectors to crash and stop producing data:
> {code:java}
> org.apache.kafka.connect.errors.SchemaProjectorException: Schema name 
> mismatch. source name: my.example.Record and target name: 
> my.example.sub.Record {code}
> A change of a record's namespace is compatible according to the Schema 
> Registry (which internally uses a check from the Avro library), but not for 
> the Connect API. I would argue that the namespace/package name should not 
> affect compatibility, as it says nothing about the contained data and its 
> schema.
> Would it be possible to have a more consistent (and less restrictive) check 
> in Connect API, so that a namespace change in the producer can be made more 
> confidently, without fear of breaking the consuming connectors? If, on the 
> other hand, you want to be strict about namespaces, then at least the schema 
> registry should report a namespace change as incompatible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9744) SchemaProjector fails to handle backwards-compatible schema change

2020-04-10 Thread Jurgis Pods (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jurgis Pods updated KAFKA-9744:
---
Description: 
_Note_: This bug report is for CP 5.3.1 / Kafka 2.3.1, but it most likely 
affects all versions.

We recently changed the namespace of inner records in Avro schemas in the 
Confluent Schema Registry. Those changes were accepted as backwards-compatible. 
However, when redeploying Kafka S3 connectors consuming the relevant topics, we 
received errors from SchemaProjector.project(), causing the connectors to crash 
and stop producing data:
{code:java}
org.apache.kafka.connect.errors.SchemaProjectorException: Schema name mismatch. 
source name: my.example.Record and target name: my.example.sub.Record {code}
A change of a record's namespace is compatible according to the Schema Registry 
(which internally uses a check from the Avro library), but not for the Connect 
API. I would argue that the namespace/package name should not affect 
compatibility, as it says nothing about the contained data and its schema.

Would it be possible to have a more consistent (and less restrictive) check in 
Connect API, so that a namespace change in the producer can be made more 
confidently, without fear of breaking the consuming connectors? If, on the 
other hand, you want to be strict about namespaces, then at least the schema 
registry should report a namespace change as incompatible.

  was:
_Note_: This bug report is for CP 5.3.1 / Kafka 2.3.1, but it most likely 
affects all versions.

We recently made a number of backwards-compatible changes to our Avro schemas 
in the Confluent Schema Registry. Those changes were accepted as 
backwards-compatible. However, when redeploying Kafka S3 connectors consuming 
the relevant topics, we noticed two separate instances of failures in 
SchemaProjector.project(), causing the connectors to crash and stop producing 
data:

1) Changed namespace of record:
{code:java}
org.apache.kafka.connect.errors.SchemaProjectorException: Schema name mismatch. 
source name: my.example.Record and target name: my.example.sub.Record {code}
A change of a record's namespace is compatible according to the Schema 
Registry, but not for the Connect API. I would argue that the namespace/package 
name should not affect compatibility, as it says nothing about the contained 
data and its schema.

2) Change of type from 1-element union to primitive field:
{code:java}
Schema type mismatch. source type: STRUCT and target type: STRING {code}
This happened when changing the corresponding field's Avro schema from
{code:java}
 name": "myfield", "type": ["string"] {code}
to
{code:java}
 name": "myfield", "type": "string"{code}
In this case, I am less convinced that those two schemas should be compatible 
(they are semantically identical - however, a Union is not a String). But it is 
unfortunate that the Schema Registry sees the above change as compatible, while 
the Connect API does not.

*Summary*:
We made two Avro schema changes which were accepted as compatible by the Schema 
Registry, but were rejected at runtime by the Kafka S3 connectors. Would it be 
possible to have a more consistent (and less restrictive) check in Connect API, 
so that a schema change in the producer can be made more confidently, without 
fear of breaking the consuming connectors?


> SchemaProjector fails to handle backwards-compatible schema change
> --
>
> Key: KAFKA-9744
> URL: https://issues.apache.org/jira/browse/KAFKA-9744
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3.1
>Reporter: Jurgis Pods
>Priority: Major
>
> _Note_: This bug report is for CP 5.3.1 / Kafka 2.3.1, but it most likely 
> affects all versions.
> We recently changed the namespace of inner records in Avro schemas in the 
> Confluent Schema Registry. Those changes were accepted as 
> backwards-compatible. However, when redeploying Kafka S3 connectors consuming 
> the relevant topics, we received errors from SchemaProjector.project(), 
> causing the connectors to crash and stop producing data:
> {code:java}
> org.apache.kafka.connect.errors.SchemaProjectorException: Schema name 
> mismatch. source name: my.example.Record and target name: 
> my.example.sub.Record {code}
> A change of a record's namespace is compatible according to the Schema 
> Registry (which internally uses a check from the Avro library), but not for 
> the Connect API. I would argue that the namespace/package name should not 
> affect compatibility, as it says nothing about the contained data and its 
> schema.
> Would it be possible to have a more consistent (and less restrictive) check 
> in Connect API, so that a namespace change in the producer can be made more 
> confidently, without fear of breaking the consuming connectors? If, on the 
> other hand, you want to be strict about namespaces, then at least the schema 
> registry should report a namespace change as incompatible.

[jira] [Commented] (KAFKA-9592) Safely abort Producer transactions during application shutdown

2020-04-10 Thread Xiang Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080280#comment-17080280
 ] 

Xiang Zhang commented on KAFKA-9592:


[~bchen225242] Thanks. Do we still need a small KIP for this?

> Safely abort Producer transactions during application shutdown
> --
>
> Key: KAFKA-9592
> URL: https://issues.apache.org/jira/browse/KAFKA-9592
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 2.5.0
>Reporter: Boyang Chen
>Assignee: Xiang Zhang
>Priority: Major
>  Labels: help-wanted, needs-kip, newbie
> Fix For: 2.6.0
>
>
> Today if a transactional producer hits a fatal exception, the caller usually 
> catches the exception and handles it by aborting the transaction and closing 
> the producer:
>  
> {code:java}
> try {
>   producer.beginTxn();
>   producer.send(xxx);
>   producer.sendOffsets(xxx);
>   producer.commit();
> } catch (ProducerFenced | UnknownPid e) {
>   ...
>   producer.abortTxn();
>   producer.close();
> }{code}
> This is what the current API suggests users do. Another scenario is an 
> informed shutdown, where people with an EOS producer would also like to end 
> an ongoing transaction before closing the producer, as that is cleaner.
> The tricky scenario is that `abortTxn` is not a safe call when the producer 
> is already in an error state, which means the user has to nest another 
> try-catch inside the first catch block, making the error handling pretty 
> annoying (see the sketch below). There are several ways to make this API 
> robust and guide users toward safe usage:
>  # Internally abort any ongoing transaction within `producer.close`, and 
> add a comment on the `abortTxn` call warning users not to call it manually. 
>  # Similar to 1, but add a new `close(boolean abortTxn)` API call in case 
> some users want to handle transaction state by themselves.
>  # Introduce a new abort-transaction API with a boolean flag indicating 
> whether the producer is in an error state, instead of throwing exceptions.
>  # Introduce a public API `isInError` on the producer for users to validate 
> before doing any transactional API calls.
> I personally favor 1 & 2 most, as they are simple and do not require any API 
> change. Considering the change scope, I would still recommend a small KIP.
>  
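A hedged sketch of the double error handling described above; the producer/record setup is omitted and the surrounding structure is illustrative only, though the method and exception names are the public producer API:

{code:java}
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.AuthorizationException;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;
import org.apache.kafka.common.errors.ProducerFencedException;

public class SafeAbortSketch {
    static void sendTransactionally(KafkaProducer<String, String> producer,
                                    ProducerRecord<String, String> record) {
        try {
            producer.beginTransaction();
            producer.send(record);
            producer.commitTransaction();
        } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException fatal) {
            // Fatal states: aborting is not possible; only close() is safe.
            producer.close();
        } catch (KafkaException e) {
            // The "annoying" part: abortTransaction() can itself throw if the
            // producer has meanwhile moved into an error state, so the cleanup
            // needs its own try/catch.
            try {
                producer.abortTransaction();
            } catch (KafkaException abortFailure) {
                // nothing left to do but close
            } finally {
                producer.close();
            }
        }
    }
}
{code}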



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9592) Safely abort Producer transactions during application shutdown

2020-04-10 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080274#comment-17080274
 ] 

Boyang Chen commented on KAFKA-9592:


[~iamabug] Yes, I think your understanding is correct.

> Safely abort Producer transactions during application shutdown
> --
>
> Key: KAFKA-9592
> URL: https://issues.apache.org/jira/browse/KAFKA-9592
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 2.5.0
>Reporter: Boyang Chen
>Assignee: Xiang Zhang
>Priority: Major
>  Labels: help-wanted, needs-kip, newbie
> Fix For: 2.6.0
>
>
> Today if a transactional producer hits a fatal exception, the caller usually 
> catches the exception and handles it by aborting the transaction and closing 
> the producer:
>  
> {code:java}
> try {
>   producer.beginTxn();
>   producer.send(xxx);
>   producer.sendOffsets(xxx);
>   producer.commit();
> } catch (ProducerFenced | UnknownPid e) {
>   ...
>   producer.abortTxn();
>   producer.close();
> }{code}
> This is what the current API suggests users do. Another scenario is an 
> informed shutdown, where people with an EOS producer would also like to end 
> an ongoing transaction before closing the producer, as that is cleaner.
> The tricky scenario is that `abortTxn` is not a safe call when the producer 
> is already in an error state, which means the user has to nest another 
> try-catch inside the first catch block, making the error handling pretty 
> annoying. There are several ways to make this API robust and guide users 
> toward safe usage:
>  # Internally abort any ongoing transaction within `producer.close`, and 
> add a comment on the `abortTxn` call warning users not to call it manually. 
>  # Similar to 1, but add a new `close(boolean abortTxn)` API call in case 
> some users want to handle transaction state by themselves.
>  # Introduce a new abort-transaction API with a boolean flag indicating 
> whether the producer is in an error state, instead of throwing exceptions.
>  # Introduce a public API `isInError` on the producer for users to validate 
> before doing any transactional API calls.
> I personally favor 1 & 2 most, as they are simple and do not require any API 
> change. Considering the change scope, I would still recommend a small KIP.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)