[ 
https://issues.apache.org/jira/browse/KAFKA-12560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rufus Skyfii updated KAFKA-12560:
---------------------------------
    Description: 
These two log files have 0 byte size and last created modified or updated  
along time ago and these files seem to be created on startup.

When disks fill up, sometimes admins go through old log files and delete them. 
In this case, these two .log were also picked up. Note that I acknowledge this 
is a mistake as it should be filtered by extension but nevertheless anything in 
/var/log should not cause process to have issues and if the file doesn't exist, 
the process should simply recreate it.

The observed effect of this on a HA kafka cluster with 3 nodes, the replica 
members were out of sync with just one member being leader and not in sync with 
rest of the nodes while some topics had two in sync instead of 3 and the leader 
had a -1 status. Restarting kafka did not resolve so in order to resolve, kafka 
was flushed in one of the node where it had lots of -1 leader status and then 
rest of the nodes restarted.  However, this should not happen even if someone 
deleted kafka-authorizer.log and kafka-request.log and should be more resistant 
of breaking down due to file system changes.

Issue occurs several hours later.

 
{code:java}
root@ip-172-17-1-63:/var/log/kafka# find /var/log/kafka -mtime +30 -print | 
xargs ls -ll |grep '.log$'
-rw-r--r-- 1 root root        0 Nov 26 05:38 /var/log/kafka/kafka-authorizer.log
-rw-r--r-- 1 root root        0 Nov 26 05:38 /var/log/kafka/kafka-request.log
{code}
{code:java}
root@ip-172-17-1-63:/var/log/kafka# stat kafka-request.log                      
                                                                                
                                                   
  File: kafka-request.log                                                       
                                                                                
                                                   
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file  
                                                                                
                                                   
Device: 34h/52d Inode: 768131      Links: 1                                     
                                                                                
                                                   
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)        
                                                                                
                                                   
Access: 2020-11-26 05:38:04.724561118 +0000                                     
                                                                                
                                                   
Modify: 2020-11-26 05:38:04.724561118 +0000                                     
                                                                                
                                                   
Change: 2020-11-26 05:38:04.724561118 +0000                                     
                                                                                
                                                   
 Birth: -                                                                       
                                                                                
                                                   
root@ip-172-17-1-63:/var/log/kafka# stat kafka-authorizer.log                   
                                                                                
                                                   
  File: kafka-authorizer.log                                                    
                                                                                
                                                   
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file  
                                                                                
                                                   
Device: 34h/52d Inode: 768132      Links: 1                                     
                                                                                
                                                   
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)        
                                                                                
                                                   
Access: 2020-11-26 05:38:04.724561118 +0000                                     
                         
Modify: 2020-11-26 05:38:04.724561118 +0000                                     
                         
Change: 2020-11-26 05:38:04.724561118 +0000                                     
                         
 Birth: -                                                                       
                         
{code}
Checking file handles in staging, I can see these files while not updated in 30 
days are part of java process and the pid was current:
{code:java}
root@ip-172-17-1-63:/var/log/kafka# lsof kafka-authorizer.log
COMMAND PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
java     28 root   86w   REG   0,52        0 768132 kafka-authorizer.log
root@ip-172-17-1-63:/var/log/kafka# lsof kafka-request.log
COMMAND PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
java     28 root   85w   REG   0,52        0 768131 kafka-request.log
root@ip-172-17-1-63:/var/log/kafka# ps -ef | grep java
root        28    11 83 20:12 ?        02:32:37 /usr/local/openjdk-11/bin/java 
-Xmx2G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 
-XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent 
-Djava.awt.headless=true -Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.port=9999 -Dkafka.logs.dir=/var/log/kafka 
-Dlog4j.configuration=file:/opt/kafka_2.12-2.2.1/bin/../config/log4j.properties 
-cp 
/opt/kafka_2.12-2.2.1/bin/../libs/activation-1.1.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/aopalliance-repackaged-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/argparse4j-0.7.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/audience-annotations-0.5.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/commons-lang3-3.8.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-api-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-basic-auth-extension-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-file-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-json-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-runtime-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-transforms-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/guava-20.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/hk2-api-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/hk2-locator-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/hk2-utils-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-annotations-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-core-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-databind-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-datatype-jdk8-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-jaxrs-base-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-jaxrs-json-provider-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-module-jaxb-annotations-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javassist-3.22.0-CR2.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.annotation-api-1.2.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.inject-1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.inject-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.servlet-api-3.1.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.ws.rs-api-2.1.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.ws.rs-api-2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jaxb-api-2.3.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-client-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-common-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-container-servlet-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-container-servlet-core-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-hk2-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-media-jaxb-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-server-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-client-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-continuation-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-http-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-io-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-security-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-server-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-servlet-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-servlets-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-util-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jopt-simple-5.0.4.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-clients-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-log4j-appender-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-examples-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-scala_2.12-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-test-utils-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-tools-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka_2.12-2.2.1-sources.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka_2.12-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/log4j-1.2.17.jar:/opt/kafka_2.12-2.2.1/bin/../libs/lz4-java-1.5.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/maven-artifact-3.6.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/metrics-core-2.2.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/osgi-resource-locator-1.0.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/plexus-utils-3.1.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/reflections-0.9.11.jar:/opt/kafka_2.12-2.2.1/bin/../libs/rocksdbjni-5.15.10.jar:/opt/kafka_2.12-2.2.1/bin/../libs/scala-library-2.12.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/scala-logging_2.12-3.9.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/scala-reflect-2.12.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/slf4j-api-1.7.25.jar:/opt/kafka_2.12-2.2.1/bin/../libs/slf4j-log4j12-1.7.25.jar:/opt/kafka_2.12-2.2.1/bin/../libs/snappy-java-1.1.7.2.jar:/opt/kafka_2.12-2.2.1/bin/../libs/validation-api-1.1.0.Final.jar:/opt/kafka_2.12-2.2.1/bin/../libs/zkclient-0.11.jar:/opt/kafka_2.12-2.2.1/bin/../libs/zookeeper-3.4.13.jar:/opt/kafka_2.12-2.2.1/bin/../libs/zstd-jni-1.3.8-1.jar
 kafka.Kafka /opt/kafka_2.12-2.2.1/config/server.properties
root        29    10  0 20:12 ?        00:00:38 /usr/bin/java 
-Dzookeeper.log.dir=/var/log/zookeeper -Dzookeeper.root.logger=INFO,ROLLINGFILE 
-cp 
/etc/zookeeper/conf:/usr/share/java/jline.jar:/usr/share/java/log4j-1.2.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/xmlParserAPIs.jar:/usr/share/java/netty.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-log4j12.jar:/usr/share/java/zookeeper.jar
 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=true 
org.apache.zookeeper.server.quorum.QuorumPeerMain /etc/zookeeper/conf/zoo.cfg
{code}
Below confirms that all extensions that do are not current log (identified by 
just ".log") and match these ".log.x", ".log.x.gz", ".backup" or ".gz" do not 
have open file handles and can be safely cleaned. To verify, I checked lsof for 
all files matching logrotate extensions modified in last 2 days:
{code:java}
root@ip-172-17-1-63:/var/log/kafka# find /var/log/kafka -mtime -2 -ls | egrep 
-e '.log.$|.backup$|.gz$'
   768472    136 -rw-r--r--   1 root     root       137614 Mar 25 06:25 
/var/log/kafka/server.log.1.gz
   768517     16 -rw-r--r--   1 root     root        13823 Mar 24 04:58 
/var/log/kafka/log-cleaner.log.2.gz
   768431  10248 -rw-r--r--   1 root     root     10485918 Mar 24 01:27 
/var/log/kafka/server.log.1-2021032406.backup
   768507    688 -rw-r--r--   1 root     root       701207 Mar 24 06:24 
/var/log/kafka/server.log.2.gz
   768401  51508 -rw-r--r--   1 root     root     52742072 Mar 24 23:59 
/var/log/kafka/controller.log.1-2021032506.backup
   768459  10248 -rw-r--r--   1 root     root     10485913 Mar 25 05:51 
/var/log/kafka/server.log.1-2021032506.backup
   768432   1664 -rw-r--r--   1 root     root      1701962 Mar 24 05:54 
/var/log/kafka/state-change.log.2.gz
   768071  51208 -rw-r--r--   1 root     root     52429175 Mar 24 20:16 
/var/log/kafka/state-change.log.1-2021032506.backup
   768522     20 -rw-r--r--   1 root     root        16598 Mar 25 06:23 
/var/log/kafka/log-cleaner.log.1.gz
   768465   1052 -rw-r--r--   1 root     root      1075433 Mar 25 06:08 
/var/log/kafka/state-change.log.1.gz
   768562   3544 -rw-r--r--   1 root     root      3625106 Mar 25 06:24 
/var/log/kafka/controller.log.1.gz
root@ip-172-17-1-63:/var/log/kafka# find /var/log/kafka -mtime -2 -ls | egrep 
-e '.log.$|.backup$|.gz$' | awk '{print $11}' | xargs lsof
{code}
However, I still think accidental deletion of any log files should not affect 
operational state of the Kafka cluster.

  was:
These two log files have 0 byte size and last created modified or updated  
along time ago and these files seem to be created on startup.

When disks fill up, sometimes admins go through old log files and delete them. 
In this case, these two .log were also picked up. Note that I acknowledge this 
is a mistake as it should be filtered by extension but nevertheless anything in 
/var/log should not cause process to have issues and if the file doesn't exist, 
the process should simply recreate it.

The observed effect of this on a HA kafka cluster with 3 nodes, the replica 
members were out of sync with just one member being leader and not in sync with 
rest of the nodes while some topics had two in sync instead of 3 and the leader 
had a -1 status. Restarting kafka did not resolve so in order to resolve, kafka 
was flushed in one of the node where it had lots of -1 leader status and then 
rest of the nodes restarted.  However, this should not happen even if someone 
deleted kafka-authorizer.log and kafka-request.log and should be more resistant 
of breaking down due to file system changes.

 
{code:java}
root@ip-172-17-1-63:/var/log/kafka# find /var/log/kafka -mtime +30 -print | 
xargs ls -ll |grep '.log$'
-rw-r--r-- 1 root root        0 Nov 26 05:38 /var/log/kafka/kafka-authorizer.log
-rw-r--r-- 1 root root        0 Nov 26 05:38 /var/log/kafka/kafka-request.log
{code}
{code:java}
root@ip-172-17-1-63:/var/log/kafka# stat kafka-request.log                      
                                                                                
                                                   
  File: kafka-request.log                                                       
                                                                                
                                                   
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file  
                                                                                
                                                   
Device: 34h/52d Inode: 768131      Links: 1                                     
                                                                                
                                                   
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)        
                                                                                
                                                   
Access: 2020-11-26 05:38:04.724561118 +0000                                     
                                                                                
                                                   
Modify: 2020-11-26 05:38:04.724561118 +0000                                     
                                                                                
                                                   
Change: 2020-11-26 05:38:04.724561118 +0000                                     
                                                                                
                                                   
 Birth: -                                                                       
                                                                                
                                                   
root@ip-172-17-1-63:/var/log/kafka# stat kafka-authorizer.log                   
                                                                                
                                                   
  File: kafka-authorizer.log                                                    
                                                                                
                                                   
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file  
                                                                                
                                                   
Device: 34h/52d Inode: 768132      Links: 1                                     
                                                                                
                                                   
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)        
                                                                                
                                                   
Access: 2020-11-26 05:38:04.724561118 +0000                                     
                         
Modify: 2020-11-26 05:38:04.724561118 +0000                                     
                         
Change: 2020-11-26 05:38:04.724561118 +0000                                     
                         
 Birth: -                                                                       
                         
{code}
Checking file handles in staging, I can see these files while not updated in 30 
days are part of java process and the pid was current:
{code:java}
root@ip-172-17-1-63:/var/log/kafka# lsof kafka-authorizer.log
COMMAND PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
java     28 root   86w   REG   0,52        0 768132 kafka-authorizer.log
root@ip-172-17-1-63:/var/log/kafka# lsof kafka-request.log
COMMAND PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
java     28 root   85w   REG   0,52        0 768131 kafka-request.log
root@ip-172-17-1-63:/var/log/kafka# ps -ef | grep java
root        28    11 83 20:12 ?        02:32:37 /usr/local/openjdk-11/bin/java 
-Xmx2G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 
-XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent 
-Djava.awt.headless=true -Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.port=9999 -Dkafka.logs.dir=/var/log/kafka 
-Dlog4j.configuration=file:/opt/kafka_2.12-2.2.1/bin/../config/log4j.properties 
-cp 
/opt/kafka_2.12-2.2.1/bin/../libs/activation-1.1.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/aopalliance-repackaged-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/argparse4j-0.7.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/audience-annotations-0.5.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/commons-lang3-3.8.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-api-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-basic-auth-extension-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-file-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-json-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-runtime-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-transforms-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/guava-20.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/hk2-api-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/hk2-locator-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/hk2-utils-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-annotations-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-core-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-databind-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-datatype-jdk8-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-jaxrs-base-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-jaxrs-json-provider-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-module-jaxb-annotations-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javassist-3.22.0-CR2.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.annotation-api-1.2.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.inject-1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.inject-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.servlet-api-3.1.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.ws.rs-api-2.1.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.ws.rs-api-2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jaxb-api-2.3.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-client-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-common-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-container-servlet-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-container-servlet-core-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-hk2-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-media-jaxb-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-server-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-client-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-continuation-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-http-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-io-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-security-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-server-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-servlet-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-servlets-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-util-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jopt-simple-5.0.4.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-clients-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-log4j-appender-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-examples-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-scala_2.12-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-test-utils-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-tools-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka_2.12-2.2.1-sources.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka_2.12-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/log4j-1.2.17.jar:/opt/kafka_2.12-2.2.1/bin/../libs/lz4-java-1.5.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/maven-artifact-3.6.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/metrics-core-2.2.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/osgi-resource-locator-1.0.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/plexus-utils-3.1.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/reflections-0.9.11.jar:/opt/kafka_2.12-2.2.1/bin/../libs/rocksdbjni-5.15.10.jar:/opt/kafka_2.12-2.2.1/bin/../libs/scala-library-2.12.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/scala-logging_2.12-3.9.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/scala-reflect-2.12.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/slf4j-api-1.7.25.jar:/opt/kafka_2.12-2.2.1/bin/../libs/slf4j-log4j12-1.7.25.jar:/opt/kafka_2.12-2.2.1/bin/../libs/snappy-java-1.1.7.2.jar:/opt/kafka_2.12-2.2.1/bin/../libs/validation-api-1.1.0.Final.jar:/opt/kafka_2.12-2.2.1/bin/../libs/zkclient-0.11.jar:/opt/kafka_2.12-2.2.1/bin/../libs/zookeeper-3.4.13.jar:/opt/kafka_2.12-2.2.1/bin/../libs/zstd-jni-1.3.8-1.jar
 kafka.Kafka /opt/kafka_2.12-2.2.1/config/server.properties
root        29    10  0 20:12 ?        00:00:38 /usr/bin/java 
-Dzookeeper.log.dir=/var/log/zookeeper -Dzookeeper.root.logger=INFO,ROLLINGFILE 
-cp 
/etc/zookeeper/conf:/usr/share/java/jline.jar:/usr/share/java/log4j-1.2.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/xmlParserAPIs.jar:/usr/share/java/netty.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-log4j12.jar:/usr/share/java/zookeeper.jar
 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=true 
org.apache.zookeeper.server.quorum.QuorumPeerMain /etc/zookeeper/conf/zoo.cfg
{code}
Below confirms that all extensions that do are not current log (identified by 
just ".log") and match these ".log.x", ".log.x.gz", ".backup" or ".gz" do not 
have open file handles and can be safely cleaned. To verify, I checked lsof for 
all files matching logrotate extensions modified in last 2 days:
{code:java}
root@ip-172-17-1-63:/var/log/kafka# find /var/log/kafka -mtime -2 -ls | egrep 
-e '.log.$|.backup$|.gz$'
   768472    136 -rw-r--r--   1 root     root       137614 Mar 25 06:25 
/var/log/kafka/server.log.1.gz
   768517     16 -rw-r--r--   1 root     root        13823 Mar 24 04:58 
/var/log/kafka/log-cleaner.log.2.gz
   768431  10248 -rw-r--r--   1 root     root     10485918 Mar 24 01:27 
/var/log/kafka/server.log.1-2021032406.backup
   768507    688 -rw-r--r--   1 root     root       701207 Mar 24 06:24 
/var/log/kafka/server.log.2.gz
   768401  51508 -rw-r--r--   1 root     root     52742072 Mar 24 23:59 
/var/log/kafka/controller.log.1-2021032506.backup
   768459  10248 -rw-r--r--   1 root     root     10485913 Mar 25 05:51 
/var/log/kafka/server.log.1-2021032506.backup
   768432   1664 -rw-r--r--   1 root     root      1701962 Mar 24 05:54 
/var/log/kafka/state-change.log.2.gz
   768071  51208 -rw-r--r--   1 root     root     52429175 Mar 24 20:16 
/var/log/kafka/state-change.log.1-2021032506.backup
   768522     20 -rw-r--r--   1 root     root        16598 Mar 25 06:23 
/var/log/kafka/log-cleaner.log.1.gz
   768465   1052 -rw-r--r--   1 root     root      1075433 Mar 25 06:08 
/var/log/kafka/state-change.log.1.gz
   768562   3544 -rw-r--r--   1 root     root      3625106 Mar 25 06:24 
/var/log/kafka/controller.log.1.gz
root@ip-172-17-1-63:/var/log/kafka# find /var/log/kafka -mtime -2 -ls | egrep 
-e '.log.$|.backup$|.gz$' | awk '{print $11}' | xargs lsof
{code}
However, I still think accidental deletion of any log files should not affect 
operational state of the Kafka cluster.


> Accidental delete of some log files kafka-authorizer.log and 
> kafka-request.log can break topics in cluster
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-12560
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12560
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 2.2.1
>         Environment: AWS EC2 i3.xlarge
>            Reporter: Rufus Skyfii
>            Priority: Minor
>
> These two log files have 0 byte size and last created modified or updated  
> along time ago and these files seem to be created on startup.
> When disks fill up, sometimes admins go through old log files and delete 
> them. In this case, these two .log were also picked up. Note that I 
> acknowledge this is a mistake as it should be filtered by extension but 
> nevertheless anything in /var/log should not cause process to have issues and 
> if the file doesn't exist, the process should simply recreate it.
> The observed effect of this on a HA kafka cluster with 3 nodes, the replica 
> members were out of sync with just one member being leader and not in sync 
> with rest of the nodes while some topics had two in sync instead of 3 and the 
> leader had a -1 status. Restarting kafka did not resolve so in order to 
> resolve, kafka was flushed in one of the node where it had lots of -1 leader 
> status and then rest of the nodes restarted.  However, this should not happen 
> even if someone deleted kafka-authorizer.log and kafka-request.log and should 
> be more resistant of breaking down due to file system changes.
> Issue occurs several hours later.
>  
> {code:java}
> root@ip-172-17-1-63:/var/log/kafka# find /var/log/kafka -mtime +30 -print | 
> xargs ls -ll |grep '.log$'
> -rw-r--r-- 1 root root        0 Nov 26 05:38 
> /var/log/kafka/kafka-authorizer.log
> -rw-r--r-- 1 root root        0 Nov 26 05:38 /var/log/kafka/kafka-request.log
> {code}
> {code:java}
> root@ip-172-17-1-63:/var/log/kafka# stat kafka-request.log                    
>                                                                               
>                                                        
>   File: kafka-request.log                                                     
>                                                                               
>                                                        
>   Size: 0               Blocks: 0          IO Block: 4096   regular empty 
> file                                                                          
>                                                            
> Device: 34h/52d Inode: 768131      Links: 1                                   
>                                                                               
>                                                        
> Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)      
>                                                                               
>                                                        
> Access: 2020-11-26 05:38:04.724561118 +0000                                   
>                                                                               
>                                                        
> Modify: 2020-11-26 05:38:04.724561118 +0000                                   
>                                                                               
>                                                        
> Change: 2020-11-26 05:38:04.724561118 +0000                                   
>                                                                               
>                                                        
>  Birth: -                                                                     
>                                                                               
>                                                        
> root@ip-172-17-1-63:/var/log/kafka# stat kafka-authorizer.log                 
>                                                                               
>                                                        
>   File: kafka-authorizer.log                                                  
>                                                                               
>                                                        
>   Size: 0               Blocks: 0          IO Block: 4096   regular empty 
> file                                                                          
>                                                            
> Device: 34h/52d Inode: 768132      Links: 1                                   
>                                                                               
>                                                        
> Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)      
>                                                                               
>                                                        
> Access: 2020-11-26 05:38:04.724561118 +0000                                   
>                            
> Modify: 2020-11-26 05:38:04.724561118 +0000                                   
>                            
> Change: 2020-11-26 05:38:04.724561118 +0000                                   
>                            
>  Birth: -                                                                     
>                            
> {code}
> Checking file handles in staging, I can see these files while not updated in 
> 30 days are part of java process and the pid was current:
> {code:java}
> root@ip-172-17-1-63:/var/log/kafka# lsof kafka-authorizer.log
> COMMAND PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
> java     28 root   86w   REG   0,52        0 768132 kafka-authorizer.log
> root@ip-172-17-1-63:/var/log/kafka# lsof kafka-request.log
> COMMAND PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
> java     28 root   85w   REG   0,52        0 768131 kafka-request.log
> root@ip-172-17-1-63:/var/log/kafka# ps -ef | grep java
> root        28    11 83 20:12 ?        02:32:37 
> /usr/local/openjdk-11/bin/java -Xmx2G -Xms1G -server -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.port=9999 -Dkafka.logs.dir=/var/log/kafka 
> -Dlog4j.configuration=file:/opt/kafka_2.12-2.2.1/bin/../config/log4j.properties
>  -cp 
> /opt/kafka_2.12-2.2.1/bin/../libs/activation-1.1.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/aopalliance-repackaged-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/argparse4j-0.7.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/audience-annotations-0.5.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/commons-lang3-3.8.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-api-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-basic-auth-extension-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-file-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-json-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-runtime-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/connect-transforms-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/guava-20.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/hk2-api-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/hk2-locator-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/hk2-utils-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-annotations-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-core-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-databind-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-datatype-jdk8-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-jaxrs-base-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-jaxrs-json-provider-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jackson-module-jaxb-annotations-2.9.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javassist-3.22.0-CR2.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.annotation-api-1.2.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.inject-1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.inject-2.5.0-b42.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.servlet-api-3.1.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.ws.rs-api-2.1.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/javax.ws.rs-api-2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jaxb-api-2.3.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-client-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-common-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-container-servlet-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-container-servlet-core-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-hk2-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-media-jaxb-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jersey-server-2.27.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-client-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-continuation-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-http-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-io-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-security-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-server-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-servlet-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-servlets-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jetty-util-9.4.14.v20181114.jar:/opt/kafka_2.12-2.2.1/bin/../libs/jopt-simple-5.0.4.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-clients-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-log4j-appender-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-examples-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-scala_2.12-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-streams-test-utils-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka-tools-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka_2.12-2.2.1-sources.jar:/opt/kafka_2.12-2.2.1/bin/../libs/kafka_2.12-2.2.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/log4j-1.2.17.jar:/opt/kafka_2.12-2.2.1/bin/../libs/lz4-java-1.5.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/maven-artifact-3.6.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/metrics-core-2.2.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/osgi-resource-locator-1.0.1.jar:/opt/kafka_2.12-2.2.1/bin/../libs/plexus-utils-3.1.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/reflections-0.9.11.jar:/opt/kafka_2.12-2.2.1/bin/../libs/rocksdbjni-5.15.10.jar:/opt/kafka_2.12-2.2.1/bin/../libs/scala-library-2.12.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/scala-logging_2.12-3.9.0.jar:/opt/kafka_2.12-2.2.1/bin/../libs/scala-reflect-2.12.8.jar:/opt/kafka_2.12-2.2.1/bin/../libs/slf4j-api-1.7.25.jar:/opt/kafka_2.12-2.2.1/bin/../libs/slf4j-log4j12-1.7.25.jar:/opt/kafka_2.12-2.2.1/bin/../libs/snappy-java-1.1.7.2.jar:/opt/kafka_2.12-2.2.1/bin/../libs/validation-api-1.1.0.Final.jar:/opt/kafka_2.12-2.2.1/bin/../libs/zkclient-0.11.jar:/opt/kafka_2.12-2.2.1/bin/../libs/zookeeper-3.4.13.jar:/opt/kafka_2.12-2.2.1/bin/../libs/zstd-jni-1.3.8-1.jar
>  kafka.Kafka /opt/kafka_2.12-2.2.1/config/server.properties
> root        29    10  0 20:12 ?        00:00:38 /usr/bin/java 
> -Dzookeeper.log.dir=/var/log/zookeeper 
> -Dzookeeper.root.logger=INFO,ROLLINGFILE -cp 
> /etc/zookeeper/conf:/usr/share/java/jline.jar:/usr/share/java/log4j-1.2.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/xmlParserAPIs.jar:/usr/share/java/netty.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-log4j12.jar:/usr/share/java/zookeeper.jar
>  -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.local.only=true 
> org.apache.zookeeper.server.quorum.QuorumPeerMain /etc/zookeeper/conf/zoo.cfg
> {code}
> Below confirms that all extensions that do are not current log (identified by 
> just ".log") and match these ".log.x", ".log.x.gz", ".backup" or ".gz" do not 
> have open file handles and can be safely cleaned. To verify, I checked lsof 
> for all files matching logrotate extensions modified in last 2 days:
> {code:java}
> root@ip-172-17-1-63:/var/log/kafka# find /var/log/kafka -mtime -2 -ls | egrep 
> -e '.log.$|.backup$|.gz$'
>    768472    136 -rw-r--r--   1 root     root       137614 Mar 25 06:25 
> /var/log/kafka/server.log.1.gz
>    768517     16 -rw-r--r--   1 root     root        13823 Mar 24 04:58 
> /var/log/kafka/log-cleaner.log.2.gz
>    768431  10248 -rw-r--r--   1 root     root     10485918 Mar 24 01:27 
> /var/log/kafka/server.log.1-2021032406.backup
>    768507    688 -rw-r--r--   1 root     root       701207 Mar 24 06:24 
> /var/log/kafka/server.log.2.gz
>    768401  51508 -rw-r--r--   1 root     root     52742072 Mar 24 23:59 
> /var/log/kafka/controller.log.1-2021032506.backup
>    768459  10248 -rw-r--r--   1 root     root     10485913 Mar 25 05:51 
> /var/log/kafka/server.log.1-2021032506.backup
>    768432   1664 -rw-r--r--   1 root     root      1701962 Mar 24 05:54 
> /var/log/kafka/state-change.log.2.gz
>    768071  51208 -rw-r--r--   1 root     root     52429175 Mar 24 20:16 
> /var/log/kafka/state-change.log.1-2021032506.backup
>    768522     20 -rw-r--r--   1 root     root        16598 Mar 25 06:23 
> /var/log/kafka/log-cleaner.log.1.gz
>    768465   1052 -rw-r--r--   1 root     root      1075433 Mar 25 06:08 
> /var/log/kafka/state-change.log.1.gz
>    768562   3544 -rw-r--r--   1 root     root      3625106 Mar 25 06:24 
> /var/log/kafka/controller.log.1.gz
> root@ip-172-17-1-63:/var/log/kafka# find /var/log/kafka -mtime -2 -ls | egrep 
> -e '.log.$|.backup$|.gz$' | awk '{print $11}' | xargs lsof
> {code}
> However, I still think accidental deletion of any log files should not affect 
> operational state of the Kafka cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to