[jira] [Created] (KAFKA-15375) When running in KRaft mode, LogManager may create CleanShutdown file by mistake

2023-08-17 Thread Vincent Jiang (Jira)
Vincent Jiang created KAFKA-15375:
-

 Summary: When running in KRaft mode, LogManager may create 
CleanShutdown file by mistake 
 Key: KAFKA-15375
 URL: https://issues.apache.org/jira/browse/KAFKA-15375
 Project: Kafka
  Issue Type: Bug
Reporter: Vincent Jiang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14497) LastStableOffset is advanced prematurely when a log is reopened.

2022-12-15 Thread Vincent Jiang (Jira)
Vincent Jiang created KAFKA-14497:
-

 Summary: LastStableOffset is advanced prematurely when a log is 
reopened.
 Key: KAFKA-14497
 URL: https://issues.apache.org/jira/browse/KAFKA-14497
 Project: Kafka
  Issue Type: Bug
Reporter: Vincent Jiang


In the test case below, the last stable offset of the log is advanced prematurely 
after reopen:
 # producer #1 appends transactional records to the leader. offsets = [0, 1, 2, 3]
 # producer #2 appends transactional records to the leader. offsets = [4, 5, 6, 7]
 # all records are replicated to followers and the high watermark is advanced to 8.
 # at this point, lastStableOffset = 0 (the first offset of an open transaction).
 # producer #1 aborts its transaction by writing an abort marker at offset 8.  
ProducerStateManager.unreplicatedTxns contains the aborted transaction 
(firstOffset=0, lastOffset=8).
 # the log is then closed and reopened.
 # after reopen, log.lastStableOffset is initialized to 4, because 
ProducerStateManager.unreplicatedTxns is empty after reopening the log.

 

We should rebuild ProducerStateManager.unreplicatedTxns when reloading a log, 
so that lastStableOffset remains unchanged before and after reopen.
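For reference, a minimal sketch of how the last stable offset is derived, assuming 
the LSO is the minimum of the high watermark and the first offset of every 
transaction that is either still open or completed but not yet replicated (the 
latter being what unreplicatedTxns tracks). This is a simplification, not the 
actual UnifiedLog code:
{code:scala}
// The LSO is held back by both open transactions and completed-but-unreplicated
// ones; losing unreplicatedTxns on reload removes the second source and lets
// the LSO jump forward.
def lastStableOffset(highWatermark: Long,
                     ongoingTxnFirstOffsets: Iterable[Long],
                     unreplicatedTxnFirstOffsets: Iterable[Long]): Long =
  (ongoingTxnFirstOffsets ++ unreplicatedTxnFirstOffsets)
    .fold(highWatermark)(_ min _)

// Scenario above: before close -> min(8, 4, 0) = 0
//                 after reopen -> min(8, 4)    = 4  (unreplicatedTxns lost)
{code}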



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14347) deleted records may be kept unexpectedly when leader changes while adding a new replica

2022-11-01 Thread Vincent Jiang (Jira)
Vincent Jiang created KAFKA-14347:
-

 Summary: deleted records may be kept unexpectedly when leader 
changes while adding a new replica
 Key: KAFKA-14347
 URL: https://issues.apache.org/jira/browse/KAFKA-14347
 Project: Kafka
  Issue Type: Improvement
Reporter: Vincent Jiang


Consider a compacted topic with three replicas, _r1_, _r2_ and _r3_, in which a 
regular record _k1=v1_ is deleted by a later tombstone record _k1=null_. Imagine 
that log compaction has made different progress on the replicas:
- on replica _r1_, log compaction has not cleaned _k1=v1_ or _k1=null_ yet.
- on replica _r2_, log compaction has cleaned and removed both _k1=v1_ and 
_k1=null_.

In this case, the following sequence can cause record _k1=v1_ to be kept 
unexpectedly:
1. Replica _r3_ is re-assigned to a different node and starts to replicate 
data from the leader.
2. At the beginning, _r1_ is the leader, so _r3_ replicates record _k1=v1_ from 
_r1_.
3. Before _k1=null_ is replicated from _r1_, leadership changes to _r2_.
4. _r3_ replicates data from _r2_. Because the _k1=null_ record has been 
cleaned on _r2_, it is never replicated.

As a result, _r3_ has record _k1=v1_ but not _k1=null_.
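A toy model of this divergence (illustrative Scala only, not Kafka code; 
compaction and replication are reduced to list operations):
{code:scala}
// Each replica's log is a sequence of (key, optional value) records; compact()
// keeps only the latest record per key and, once tombstones are old enough,
// drops records whose value is null.
object CompactionDivergence extends App {
  type Record = (String, Option[String])

  def compact(log: Vector[Record], dropTombstones: Boolean): Vector[Record] = {
    val latestPerKey = log.groupBy(_._1).values.map(_.last).toVector
    if (dropTombstones) latestPerKey.filter(_._2.nonEmpty) else latestPerKey
  }

  val original = Vector("k1" -> Option("v1"), "k1" -> Option.empty[String])

  val r1 = original                                  // compaction hasn't run yet
  val r2 = compact(original, dropTombstones = true)  // k1=v1 and k1=null both gone

  // r3 fetches k1=v1 from leader r1, then leadership moves to r2, which has
  // nothing left for k1 -- so the tombstone is never replicated to r3.
  val r3 = r1.take(1) ++ r2
  println(r3)  // Vector((k1,Some(v1))): the deleted record survives on r3
}
{code}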



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14151) Add additional validation to protect on-disk log segment data from being corrupted

2022-08-09 Thread Vincent Jiang (Jira)
Vincent Jiang created KAFKA-14151:
-

 Summary: Add additional validation to protect on-disk log segment 
data from being corrupted
 Key: KAFKA-14151
 URL: https://issues.apache.org/jira/browse/KAFKA-14151
 Project: Kafka
  Issue Type: Improvement
  Components: log
Reporter: Vincent Jiang


We received escalations reporting bad records being written to log segment 
on-disk data due to environmental issues (a bug in an old JVM version's JIT). We 
should consider adding additional validation to protect the on-disk data from 
being corrupted by inadvertent bugs or environmental issues.
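As one possibility, a minimal sketch of such a pass, assuming it would re-verify 
each record batch's CRC right before the append to the segment file (the exact 
hook point in the broker is left open):
{code:scala}
import scala.jdk.CollectionConverters._
import org.apache.kafka.common.errors.CorruptRecordException
import org.apache.kafka.common.record.MemoryRecords

// Re-verify every batch's checksum just before writing it out, so corruption
// introduced in-memory (e.g. by a JIT bug) is caught instead of persisted.
def validateBeforeAppend(records: MemoryRecords): Unit =
  records.batches().asScala.foreach { batch =>
    if (!batch.isValid)  // recomputes the batch CRC and compares to the stored one
      throw new CorruptRecordException(
        s"CRC mismatch in record batch starting at offset ${batch.baseOffset}")
  }
{code}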



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14005) LogCleaner doesn't clean log if there is no dirty range

2022-06-16 Thread Vincent Jiang (Jira)
Vincent Jiang created KAFKA-14005:
-

 Summary: LogCleaner doesn't clean log if there is no dirty range
 Key: KAFKA-14005
 URL: https://issues.apache.org/jira/browse/KAFKA-14005
 Project: Kafka
  Issue Type: Bug
Reporter: Vincent Jiang


When there is no dirty range to clean (firstDirtyOffset == 
firstUncleanableOffset), buildOffsetMap for the dirty range returns an empty 
offset map, with map.latestOffset = -1.

 

The target cleaning offset range then becomes [startOffset, map.latestOffset + 1) 
= [startOffset, 0), hence no segments are cleaned.

 

The correct cleaning offset range should be [startOffset, firstDirtyOffset), so 
that the log can be cleaned again to remove abort/commit markers or tombstones.
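A minimal sketch of the suggested fix to the end-offset computation (variable 
names taken from this report, not the actual LogCleaner code):
{code:scala}
// Fall back to firstDirtyOffset when the offset map is empty (latestOffset == -1),
// so the cleanable range stays [startOffset, firstDirtyOffset) instead of
// collapsing to the degenerate [startOffset, 0).
def cleaningEndOffset(mapLatestOffset: Long, firstDirtyOffset: Long): Long =
  if (mapLatestOffset < 0) firstDirtyOffset
  else mapLatestOffset + 1

// cleaningEndOffset(-1L, 100L) == 100L; the unpatched logic yields -1 + 1 = 0.
{code}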

 

The LogCleanerTest.FakeOffsetMap.clear() method has a bug: it doesn't reset 
lastOffset. This bug causes test cases like testAbortMarkerRemoval() to pass 
false-positively.
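The corresponding test fix might look like this (a sketch; the field names inside 
FakeOffsetMap are assumed):
{code:scala}
// Inside LogCleanerTest.FakeOffsetMap: clear() must also reset lastOffset,
// otherwise a stale latestOffset makes an "empty" map look non-empty and
// masks the degenerate-range bug above.
override def clear(): Unit = {
  map.clear()
  lastOffset = -1L
}
{code}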

  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (KAFKA-13717) KafkaConsumer.close throws authorization exception even when commit offsets are empty

2022-03-07 Thread Vincent Jiang (Jira)
Vincent Jiang created KAFKA-13717:
-

 Summary: KafkaConsumer.close throws authorization exception even 
when commit offsets are empty
 Key: KAFKA-13717
 URL: https://issues.apache.org/jira/browse/KAFKA-13717
 Project: Kafka
  Issue Type: Bug
  Components: unit tests
Reporter: Vincent Jiang


When offsets are empty and the coordinator is unknown, KafkaConsumer.close didn't 
throw an exception before commit 
https://github.com/apache/kafka/commit/4b468a9d81f7380f7197a2a6b859c1b4dca84bd9. 
After this commit, KafkaConsumer.close may throw an authorization exception.

 

The root cause is that the commit changed the logic to call lookupCoordinator 
even when offsets are empty.

 

Even if a consumer doesn't have access to a group or a topic, it might be better 
not to throw an authorization exception in this case, because the close() call 
doesn't actually access any resource.
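A sketch of the suggested behavior (shown in Scala for consistency with the other 
sketches here; the real consumer code is Java, and the method and field names 
below are assumed):
{code:scala}
// Skip the coordinator lookup entirely when there is nothing to commit, so
// close() cannot surface an authorization error for an empty commit.
// (subscriptions, lookupCoordinator, commitOffsetsSync are assumed names.)
def maybeAutoCommitOffsetsOnClose(): Unit = {
  val offsets = subscriptions.allConsumed()  // offsets to commit; possibly empty
  if (offsets.isEmpty)
    return              // nothing to commit: no lookupCoordinator, no auth check
  lookupCoordinator()   // only now can an authorization error surface
  commitOffsetsSync(offsets)
}
{code}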



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13706) org.apache.kafka.test.MockSelector doesn't remove closed connections from its 'ready' field

2022-03-03 Thread Vincent Jiang (Jira)
Vincent Jiang created KAFKA-13706:
-

 Summary: org.apache.kafka.test.MockSelector doesn't remove closed 
connections from its 'ready' field
 Key: KAFKA-13706
 URL: https://issues.apache.org/jira/browse/KAFKA-13706
 Project: Kafka
  Issue Type: Bug
  Components: unit tests
Reporter: Vincent Jiang


The MockSelector.close(String id) method doesn't remove the closed connection 
from the "ready" field.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13461) KafkaController stops functioning as active controller after ZooKeeperClient auth failure

2021-11-17 Thread Vincent Jiang (Jira)
Vincent Jiang created KAFKA-13461:
-

 Summary: KafkaController stops functioning as active controller 
after ZooKeeperClient auth failure
 Key: KAFKA-13461
 URL: https://issues.apache.org/jira/browse/KAFKA-13461
 Project: Kafka
  Issue Type: Bug
  Components: zkclient
Reporter: Vincent Jiang


When java.security.auth.login.config is present but there is no "Client" 
section, ZookeeperSaslClient creation fails and raises a LoginException, 
resulting in a warning log:
{code:java}
WARN SASL configuration failed: javax.security.auth.login.LoginException: No 
JAAS configuration section named 'Client' was found in specified JAAS 
configuration file: '***'. Will continue connection to Zookeeper server without 
SASL authentication, if Zookeeper server allows it.{code}
When this happens after initial startup, ClientCnxn enqueues an AuthFailed 
event, which triggers the following sequence:
 # zkclient reinitialization is triggered.
 # The controller resigns.
 # Before the controller's ZK session expires, the controller successfully 
reconnects to ZK and maintains the current session.
 # In KafkaController.elect(), the controller sets activeControllerId to itself 
and short-circuits the rest of elect(). Since the controller resigned earlier 
and the short-circuit also skips the call to onControllerFailover(), the 
controller is not actually functioning as the active controller (e.g. the 
necessary ZK watchers haven't been registered). See the sketch below.
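A simplified sketch of that short-circuit (not the full KafkaController code; 
zkClient, activeControllerId and onControllerFailover are real names, everything 
else is elided):
{code:scala}
private def elect(): Unit = {
  activeControllerId = zkClient.getControllerId.getOrElse(-1)
  if (activeControllerId != -1)
    return  // a broker that resigned but kept its ZK session stops here and
            // never re-runs onControllerFailover(), so no watchers get registered
  // ... otherwise: register this broker as controller, then onControllerFailover()
}
{code}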

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13305) NullPointerException in LogCleanerManager "uncleanable-bytes" gauge

2021-09-15 Thread Vincent Jiang (Jira)
Vincent Jiang created KAFKA-13305:
-

 Summary: NullPointerException in LogCleanerManager 
"uncleanable-bytes" gauge
 Key: KAFKA-13305
 URL: https://issues.apache.org/jira/browse/KAFKA-13305
 Project: Kafka
  Issue Type: Bug
  Components: log cleaner
Reporter: Vincent Jiang


We've seen the following exception in a production environment:
{quote}java.lang.NullPointerException: Cannot invoke 
"kafka.log.UnifiedLog.logStartOffset()" because "log" is null
at kafka.log.LogCleanerManager$.cleanableOffsets(LogCleanerManager.scala:599)
{quote}
It looks like uncleanablePartitions never has partitions removed from it to 
reflect partition deletion or reassignment.

 

We should fix the NullPointerException and remove deleted partitions from 
uncleanablePartitions.
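A sketch of a defensive version of the gauge computation (names are assumed from 
the stack trace; this is not the actual LogCleanerManager code):
{code:scala}
import org.apache.kafka.common.TopicPartition
import kafka.log.UnifiedLog

// Skip entries whose log has been deleted or reassigned instead of
// dereferencing null; ideally such entries are also pruned from the set.
def uncleanableBytesGauge(uncleanablePartitions: Set[TopicPartition],
                          logs: TopicPartition => Option[UnifiedLog]): Long =
  uncleanablePartitions.toSeq
    .flatMap(logs)  // None when the partition is gone: no NPE
    .map(log => log.logEndOffset - log.logStartOffset)  // stand-in for the real
    .sum                                                // cleanable-offset math
{code}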

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)