Denis Chudov created IGNITE-15295:
-------------------------------------

             Summary: Server node that has an empty checkpoint file-XXX-START.bin does not start
                 Key: IGNITE-15295
                 URL: https://issues.apache.org/jira/browse/IGNITE-15295
             Project: Ignite
          Issue Type: Improvement
            Reporter: Denis Chudov
            Assignee: Denis Chudov

A server node that has an empty checkpoint file-XXX-START.bin fails to start:

{code:java}
2021-06-08 16:00:33.383[ERROR][Thread-19][o.a.i.i.IgniteKernal%DPL_GRID%DplGridNodeName] Exception during start processors, node will be stopped and close connections
java.nio.BufferUnderflowException: null
    at java.nio.Buffer.nextGetIndex(Buffer.java:532)
    at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:417)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointMarkersStorage.readPointer(CheckpointMarkersStorage.java:301)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointMarkersStorage.readCheckpointStatus(CheckpointMarkersStorage.java:218)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointManager.readCheckpointStatus(CheckpointManager.java:265)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readCheckpointStatus(GridCacheDatabaseSharedManager.java:1642)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:584)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetaStorageSubscribersOnReadyForRead(GridCacheDatabaseSharedManager.java:2999)
    at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1205)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2105)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1768)
    at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1147)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:667)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:593)
    at org.apache.ignite.Ignition.start(Ignition.java:319)
    at com.sbt.ignite.factory.IgniteFactory.getOrStartIgnite(IgniteFactory.java:139)
    at com.sbt.ignite.factory.IgniteFactory.getOrStartIgnite(IgniteFactory.java:91)
    at com.sbt.ignite.manager.IgniteLifecycleManagerImpl.startIgnite(IgniteLifecycleManagerImpl.java:82)
    at com.sbt.ignite.manager.IgniteLifecycleManagerImpl.init(IgniteLifecycleManagerImpl.java:73)
    at com.sbt.dpl.gridgain.container.DPLManagerLifecycleManager.initIgniteServiceHolder(DPLManagerLifecycleManager.java:170)
    at com.sbt.dpl.gridgain.container.DPLManagerLifecycleManager.dplContextInit(DPLManagerLifecycleManager.java:145)
    at com.sbt.dpl.gridgain.container.ContainerDPLFactory.<init>(ContainerDPLFactory.java:80)
    at com.sbt.dpl.gridgain.springsupport.SpringDPLFactory.init(SpringDPLFactory.java:74)
{code}

A checkpoint marker is always written in full to a temporary file first, and only then is that file renamed (see
{noformat}
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointMarkersStorage#writeCheckpointEntry(java.nio.ByteBuffer, org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry, org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntryType, boolean)
{noformat}
). So the root cause of this error is unclear, unless the file was modified by something else after it was written.

We need extended diagnostic information if such an error happens again. In this case we have nothing to analyze, because the LFS was cleared by the customer right after the error happened.

At the same time, we can't guarantee correct operation when checkpoint markers are inconsistent. We can't simply ignore broken markers, and recovering from the previous checkpoint is not that simple either. But it seems reasonable to catch all reading-related exceptions in org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointMarkersStorage#readPointer.
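
The proposal to catch reading-related exceptions in readPointer could look roughly like the sketch below. This is not the Ignite code: the class MarkerReadSketch, the Pointer holder, the 16-byte layout (long index, int offset, int length), the default byte order and the use of IOException as the wrapper are all assumptions made for illustration. The point is only that a BufferUnderflowException from an empty or truncated marker file gets rethrown with the file path and its actual size, so there is something to analyze next time.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch, not the actual CheckpointMarkersStorage. */
final class MarkerReadSketch {
    /** Assumed marker payload size: long index + int file offset + int length. */
    static final int POINTER_SIZE = 16;

    /** Hypothetical stand-in for a WAL pointer. */
    static final class Pointer {
        final long idx;
        final int fileOff;
        final int len;

        Pointer(long idx, int fileOff, int len) {
            this.idx = idx;
            this.fileOff = fileOff;
            this.len = len;
        }
    }

    /** Reads the pointer stored in a marker file, wrapping any read failure with diagnostics. */
    static Pointer readPointer(Path marker) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(POINTER_SIZE);

        try (FileChannel ch = FileChannel.open(marker, StandardOpenOption.READ)) {
            // Read until the buffer is full or the end of the file is reached.
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // No-op: the loop condition does the reading.
            }

            buf.flip();

            // For an empty or truncated file these calls throw BufferUnderflowException.
            return new Pointer(buf.getLong(), buf.getInt(), buf.getInt());
        }
        catch (IOException | RuntimeException e) {
            long size = -1; // -1 means the size could not be determined.

            try {
                size = Files.size(marker);
            }
            catch (IOException ignored) {
                // Keep the original failure as the cause; size stays unknown.
            }

            throw new IOException("Failed to read checkpoint pointer from marker file [file="
                + marker.toAbsolutePath() + ", size=" + size + ", expectedSize=" + POINTER_SIZE + ']', e);
        }
    }
}
```

With something like this, the node would still refuse to start on a broken marker, which matches the point above that broken markers can't simply be ignored, but the log would name the offending file and its size instead of a bare BufferUnderflowException.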