[ https://issues.apache.org/jira/browse/CASSANDRA-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612998#comment-17612998 ]
Stefan Miklosovic edited comment on CASSANDRA-17933 at 10/5/22 1:36 PM: ------------------------------------------------------------------------ [~maedhroz] I debugged the behavior of metadata.cq4t + bin files a bit. This will be a little bit longer. metadata.cq4t (metadata) are not immutable. It seems that the allocated size is same but its content is changing. One can dump the content of metadata and regular cq4 files (logs) by a class in ChronicleQueue called "DumpMain". If we dump metadata via this, it prints something like this: {code:java} --- !!meta-data #binary header: !STStore { wireType: !WireType BINARY_LIGHT, metadata: !SCQMeta { roll: !SCQSRoll { length: !short 1000, format: yyyyMMdd-HHmmss'T', epoch: 0 }, deltaCheckpointInterval: 64, sourceId: 0 } } --- !!data #binary listing.highestCycle: 1664967434 --- !!data #binary listing.lowestCycle: 1664965537 --- !!data #binary listing.modCount: 3 --- !!data #binary chronicle.write.lock: -9223372036854775808 --- !!data #binary chronicle.append.lock: -9223372036854775808 --- !!data #binary chronicle.lastIndexReplicated: -1 --- !!data #binary chronicle.lastAcknowledgedIndexReplicated: -1 # 130596 bytes remaining {code} At the very beginning we see that that roll cycle is 1 second (1000 ms). The most crucial fields are: listing.highestCycle - that is unix timestamp from '70 which is equal to the time when the last bin log file was created listing.lowestCycle - that is unix timestamp from '70 which is equal to the time when the first bin log file was created listing.modCount - I do not know what "mod" stands for but here it is 3, this number is same as the number of bin log files (called "cycles" in ChronicleQueue terminology). So, when fourth file is created, that modCount value is increased to 4 and so on. On disk, it might look like this: {code:java} 20221005-102537T.cq4 <-- listing.lowestCycle 20221005-102937T.cq4 20221005-102939T.cq4 <-- listing.highestCycle metadata.cq4t - <-- listing.modCount: 3 because there are three files {code} If we add one more file / cycle, modCount will be 4 and listing.highestCycle will change its value (lowestCycle is unchanged). I played with various scenarios: 1) Imagine there is 10 cq4 + metadata and I remove metadata.cq4t. When BinLog starts, it will create new metadata file and it starts to create new log files. Lets say that it will create 5 logs while it runs. After dumping the metadata file, listing.lowestCycle will have timestamp of the oldest file, listing.highestCycle will have timestamp of the newest file, but, and this is interesting, listing.modCount will be "5" again, not 15. In other words, when new metadata is created, it will take into account lowest and highest timestamps of ALL files in the log dir, but listing.modCount of new metadata file will be equal to the number of newly allocated files, not the number of old files PLUS new ones. New metadata starts to count from 0 and it does not take into account any previous logs even though it takes into account their lowest and timestamp. 2) If I remove the files between lowest and highest cycles, there is a "hole" in files, but metadata file do not care. Only modCount will not be the correct anymore. It seems to me that after I dumped all logs, even numbers in metadata were not aligned, it did not matter. It just dumped the content of the logs. It will dump them without metadata being present at all, actually. When I put an empty file into log dir and started Cassandra, it failed, but I was testing if empty file will increase modCount field in metadata.cq4t file - and it did not. It was still 3 even I manufactured empty, fourth file. After dumping metadata, it was not 4 but still 3. If there were legit 4 files in the log dir and modCount was be 4 and I stopped Cassandra, removed fourth file and touched an empty one, it failed to start Cassandra. If I removed that empty file by hand and started Cassandra again, it increased the counter in metadata.cq4t to 5 even though there are 4 files. All of this tells me that it is quite harmless to remove empty files and keep metadata there. The only issue so far I see is that lowestCycle and highestCycle are updated dynamically based on what it finds on the disk upon start (and updating it as it creates new files), however might happen that modCount will not reflect reality anymore. (1) [https://github.com/OpenHFT/Chronicle-Queue/blob/ea/src/main/java/net/openhft/chronicle/queue/main/DumpMain.java] was (Author: smiklosovic): [~maedhroz] I debugged the behavior of metadata.cq4t + bin files a bit. This will be a little bit longer. metadata.cq4t (metadata) are not immutable. It seems that the allocated size is same but its content is changing. One can dump the content of metadata and regular cq4 files (logs) by a class in ChronicleQueue called "DumpMain". If we dump metadata via this, it prints something like this: {code:java} --- !!meta-data #binary header: !STStore { wireType: !WireType BINARY_LIGHT, metadata: !SCQMeta { roll: !SCQSRoll { length: !short 1000, format: yyyyMMdd-HHmmss'T', epoch: 0 }, deltaCheckpointInterval: 64, sourceId: 0 } } --- !!data #binary listing.highestCycle: 1664967434 --- !!data #binary listing.lowestCycle: 1664965537 --- !!data #binary listing.modCount: 3 --- !!data #binary chronicle.write.lock: -9223372036854775808 --- !!data #binary chronicle.append.lock: -9223372036854775808 --- !!data #binary chronicle.lastIndexReplicated: -1 --- !!data #binary chronicle.lastAcknowledgedIndexReplicated: -1 # 130596 bytes remaining {code} At the very beginning we see that that roll cycle is 1 second (1000 ms). The most crucial fields are: listing.highestCycle - that is unix timestamp from '70 which is equal to the time when the last bin log file was created listing.lowestCycle - that is unix timestamp from '70 which is equal to the time when the first bin log file was created listing.modCount - I do not know what "mod" stands for but here it is 3, this number is same as the number of bin log files (called "cycles" in ChronicleQueue terminology). So, when fourth file is created, that modCount value is increased to 4 and so on. On disk, it might look like this: {code:java} 20221005-102537T.cq4 <-- listing.lowestCycle 20221005-102937T.cq4 20221005-102939T.cq4 <-- listing.highestCycle metadata.cq4t - <-- listing.modCount: 3 because there are three files {code} If we add one more file / cycle, modCount will be 4 and listing.highestCycle will change its value (lowestCycle is unchanged). I played with various scenarios: 1) Imagine there is 10 cq4 + metadata and I remove metadata.cq4t. When BinLog starts, it will create new metadata file and it starts to create new log files. Lets say that it will create 5 logs while it runs. After dumping the metadata file, listing.lowestCycle will have timestamp of the oldest file, listing.highestCycle will have timestamp of the newest file, but, and this is interesting, listing.modCount will be "5" again, not 15. In other words, when new metadata is created, it will take into account lowest and highest timestamps of ALL files in the log dir, but listing.modCount of new metadata file will be equal to the number of newly allocated files, not the number of old files PLUS new ones. New metadata starts to count from 0 and it does not take into account any previous logs even though it takes into account their lowest and timestamp. 2) If I remove the files between lowest and highest cycles, there is a "hole" in files, but metadata file do not care. Only modCount will not be the correct anymore. It seems to me that after I dumped all logs, even numbers in metadata were not aligned, it did not matter. It just dumped the content of the logs. It will dump them without metadata being present at all, actually. When I put an empty file into log dir and started Cassandra, it failed, but I was testing if empty file will increase modCount field in metadata.cq4t file - and it did not. It was still 3 even I manufactured empty, fourth file. After dumping metadata, it was not 4 but still 3. If there were legit 4 files in the log dir and modCount was be 4 and I stopped Cassandra, removed fourth file and touched an empty one, it failed to start Cassandra. If I removed that empty file by hand and started Cassandra again, it increased the counter in metadata.cq4t to 5 even though there are 4 files. All of this tells me that it is quite harmless to remove empty files and keep metadata there. Unless lowestCycle and highestCycle are updated dynamically based on what it finds on the disk upon start (and updating it as it creates new files), it might happen that modCount will not reflect reality anymore. (1) [https://github.com/OpenHFT/Chronicle-Queue/blob/ea/src/main/java/net/openhft/chronicle/queue/main/DumpMain.java] > Zero length file in Audit log folder, prevents a node from starting > ------------------------------------------------------------------- > > Key: CASSANDRA-17933 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17933 > Project: Cassandra > Issue Type: Bug > Components: Local/Startup and Shutdown > Reporter: Andrew Hogg > Assignee: Stefan Miklosovic > Priority: Normal > Fix For: 4.0.x, 4.1.x, 4.x > > > We have encountered a 4.0.3 cluster where the audit log folder had a zero > byte length file within it after the node had stopped. It is not clear how > Cassandra got to the point of this file existing. On restarting the node, the > node will not start and throws the following stack trace. > {code:java} > ERROR [main] 2022-09-26 14:01:27,892 CassandraDaemon.java:911 - Exception > encountered during startup > java.lang.ExceptionInInitializerError: null > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:468) > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:765) > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:889) > Caused by: org.apache.cassandra.exceptions.ConfigurationException: Unable to > create instance of IAuditLogger. > at > org.apache.cassandra.utils.FBUtilities.newAuditLogger(FBUtilities.java:686) > at > org.apache.cassandra.audit.AuditLogManager.getAuditLogger(AuditLogManager.java:95) > at > org.apache.cassandra.audit.AuditLogManager.<init>(AuditLogManager.java:74) > at > org.apache.cassandra.audit.AuditLogManager.<clinit>(AuditLogManager.java:60) > ... 3 common frames omitted > Caused by: java.lang.reflect.InvocationTargetException: null > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.cassandra.utils.FBUtilities.newAuditLogger(FBUtilities.java:682) > ... 6 common frames omitted > Caused by: java.nio.channels.OverlappingFileLockException: null > at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:255) > at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:152) > at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:1068) > at java.nio.channels.FileChannel.lock(FileChannel.java:1053) > at > net.openhft.chronicle.bytes.MappedFile.resizeRafIfTooSmall(MappedFile.java:369) > at > net.openhft.chronicle.bytes.MappedFile.acquireByteStore(MappedFile.java:307) > at > net.openhft.chronicle.bytes.MappedFile.acquireByteStore(MappedFile.java:269) > at > net.openhft.chronicle.bytes.MappedBytes.acquireNextByteStore0(MappedBytes.java:434) > at > net.openhft.chronicle.bytes.MappedBytes.readVolatileInt(MappedBytes.java:792) > at > net.openhft.chronicle.queue.impl.single.SingleChronicleQueue$StoreSupplier.headerRecovery(SingleChronicleQueue.java:1027) > at > net.openhft.chronicle.queue.impl.single.SingleChronicleQueue$StoreSupplier.acquire(SingleChronicleQueue.java:981) > at > net.openhft.chronicle.queue.impl.WireStorePool.acquire(WireStorePool.java:53) > at > net.openhft.chronicle.queue.impl.single.SingleChronicleQueue.cleanupStoreFilesWithNoData(SingleChronicleQueue.java:821) > at > net.openhft.chronicle.queue.impl.single.StoreAppender.<init>(StoreAppender.java:75) > at > net.openhft.chronicle.queue.impl.single.SingleChronicleQueue.newAppender(SingleChronicleQueue.java:422) > at > net.openhft.chronicle.core.threads.CleaningThreadLocal.initialValue(CleaningThreadLocal.java:54) > at java.lang.ThreadLocal.setInitialValue(ThreadLocal.java:180) > at java.lang.ThreadLocal.get(ThreadLocal.java:170) > at > net.openhft.chronicle.core.threads.CleaningThreadLocal.get(CleaningThreadLocal.java:59) > at > net.openhft.chronicle.queue.impl.single.SingleChronicleQueue.acquireAppender(SingleChronicleQueue.java:441) > at org.apache.cassandra.utils.binlog.BinLog.<init>(BinLog.java:133) > at org.apache.cassandra.utils.binlog.BinLog.<init>(BinLog.java:65) > at > org.apache.cassandra.utils.binlog.BinLog$Builder.build(BinLog.java:453) > at > org.apache.cassandra.audit.BinAuditLogger.<init>(BinAuditLogger.java:55) > ... 11 common frames omitted {code} > To reproduce, we place a zero length file and attempted to start the node, > and saw the same stack trace. > {code:java} > ll ../logs/audit/ > total 4 > rw-rw-r-. 1 automaton automaton 0 Sep 28 13:00 20220928-12.cq4 > rw-rw-r-. 1 automaton automaton 131072 Sep 28 13:00 metadata.cq4t {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org