[jira] [Created] (IGNITE-19994) Safe time message batching
Vladislav Pyatkov created IGNITE-19994: -- Summary: Safe time message batching Key: IGNITE-19994 URL: https://issues.apache.org/jira/browse/IGNITE-19994 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov *Motivation* Safe time is propagated from each primary to every replication group member at the same time. This generates too many network messages. In particular, we can send several safe time propagation messages intended for different replication groups to the same node. *Implementation notes* Create a batch safe time message that contains a list of groups and a paired timestamp for each group that is required to propagate safe time. Handle this message on each member and advance the safe time if possible. *Definition of done* The safe time is passed in a batch message. -- This message was sent by Atlassian Jira (v8.20.10#820010)
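The batching idea can be sketched roughly as follows. This is a hypothetical illustration, not the actual Ignite API: the names ({{SafeTimeBatchMessage}}, {{SafeTimeTracker}}, {{onBatch}}) are invented for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// One batched message carries (group -> timestamp) pairs for every
// replication group hosted on the target node, instead of one message
// per group. All names are illustrative.
class SafeTimeBatchMessage {
    final Map<String, Long> timestampsByGroup;

    SafeTimeBatchMessage(Map<String, Long> timestampsByGroup) {
        this.timestampsByGroup = timestampsByGroup;
    }
}

class SafeTimeTracker {
    private final Map<String, Long> safeTimeByGroup = new ConcurrentHashMap<>();

    // Handle the batch on the receiver: advance safe time per group,
    // but only forward, since safe time must stay monotonic.
    void onBatch(SafeTimeBatchMessage msg) {
        msg.timestampsByGroup.forEach((group, ts) ->
            safeTimeByGroup.merge(group, ts, Math::max));
    }

    long safeTime(String group) {
        return safeTimeByGroup.getOrDefault(group, 0L);
    }
}
```

Merging with {{Math::max}} keeps safe time monotonic per group even if batches arrive out of order.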
[jira] [Updated] (IGNITE-19993) Reduce sending safe time command
[ https://issues.apache.org/jira/browse/IGNITE-19993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladislav Pyatkov updated IGNITE-19993: --- Description: *Motivation* Safe time propagation is based on messages that are sent at a particular interval. A safe-time direct message is less expensive than a command. The purpose of the algorithm is to reduce sending commands to propagate safe time. *Implementation notes* Send the safe time command by the following rule: {code} if (tSend == null) { // Nothing was sent at this lease interval sendSafeTimePropagatingCommand(tCurrent) } else if (tCurrent - stFreq < tSend && tCurrent - stFreq > tSuc) { // An operation was performed in the previous stFreq interval, but it was not a success sendSafeTimePropagatingCommand(tCurrent) if (tSend < tCurrent) sendSafeTimePropagatingMessage(tSend, tCurrent) } else if (tCurrent - stFreq > tSend && tSend != tSuc) { // The last command was unsuccessful sendSafeTimePropagatingCommand(tCurrent) if (tSend < tCurrent) sendSafeTimePropagatingMessage(tSend, tCurrent) } else if (tSend < tCurrent) { // It is not necessary to send a safe time command because safe time is moving with the load sendSafeTimePropagatingMessage(tSend, tCurrent) } {code} *Definition of done* Safe time propagates through messages under regular load. The safe time command is used only in the specific cases described above. was: *Motivation* Safe time propagation is based on messages that are sent at a particular interval. A safe-time direct message is less expensive than a command. The purpose of the algorithm is to reduce sending commands to propagate safe time. 
*Implementation notes* Send the safe time command by the following rule: {code} if (tSend == null) { // Nothing was sent at this lease interval sendSafeTimePropagatingCommand(tCurrent) if (tSend < tCurrent) sendSafeTimePropagatingMessage(tSend, tCurrent) } else if (tCurrent - stFreq < tSend && tCurrent - stFreq > tSuc) { // An operation was performed in the previous stFreq interval, but it was not a success sendSafeTimePropagatingCommand(tCurrent) if (tSend < tCurrent) sendSafeTimePropagatingMessage(tSend, tCurrent) } else if (tCurrent - stFreq > tSend && tSend != tSuc) { // The last command was unsuccessful sendSafeTimePropagatingCommand(tCurrent) if (tSend < tCurrent) sendSafeTimePropagatingMessage(tSend, tCurrent) } else if (tSend < tCurrent) { // It is not necessary to send a safe time command because safe time is moving with the load sendSafeTimePropagatingMessage(tSend, tCurrent) } {code} *Definition of done* Safe time propagates through messages under regular load. The safe time command is used only in the specific cases described above.
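The decision rule above can be sketched as plain Java. The variable and method names mirror the pseudocode ({{tSend}}: last send attempt, or null if nothing was sent this lease interval; {{tSuc}}: last successful send; {{stFreq}}: safe-time propagation interval); the surrounding bookkeeping that updates {{tSend}}/{{tSuc}} when a send completes is not specified by the ticket and is omitted, so this illustrates the branch logic only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative-only sketch of the branch logic from the ticket's pseudocode.
class SafeTimeSender {
    Long tSend; // Timestamp of the last send attempt; null if nothing was sent this lease interval.
    long tSuc;  // Timestamp of the last successful send.

    // Recorded actions, so the chosen branch is observable in tests.
    final List<String> actions = new ArrayList<>();

    void sendSafeTimePropagatingCommand(long t) { actions.add("command:" + t); }

    void sendSafeTimePropagatingMessage(long from, long to) { actions.add("message:" + from + "-" + to); }

    void tick(long tCurrent, long stFreq) {
        if (tSend == null) {
            // Nothing was sent at this lease interval.
            sendSafeTimePropagatingCommand(tCurrent);
        } else if (tCurrent - stFreq < tSend && tCurrent - stFreq > tSuc) {
            // An operation was performed in the previous stFreq interval, but it was not a success.
            sendSafeTimePropagatingCommand(tCurrent);
            if (tSend < tCurrent)
                sendSafeTimePropagatingMessage(tSend, tCurrent);
        } else if (tCurrent - stFreq > tSend && tSend != tSuc) {
            // The last command was unsuccessful.
            sendSafeTimePropagatingCommand(tCurrent);
            if (tSend < tCurrent)
                sendSafeTimePropagatingMessage(tSend, tCurrent);
        } else if (tSend < tCurrent) {
            // Safe time is already moving with the load; a cheap message suffices.
            sendSafeTimePropagatingMessage(tSend, tCurrent);
        }
    }
}
```

In the common case (last command succeeded and load keeps safe time moving) only the last branch fires, which is exactly the "messages under regular load, commands only in specific cases" outcome the ticket targets.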
[jira] [Commented] (IGNITE-19980) NPE within OOM exception during defragmentation.
[ https://issues.apache.org/jira/browse/IGNITE-19980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743959#comment-17743959 ] Mikhail Petrov commented on IGNITE-19980: - [~vladsz83] Thank you for the contribution. > NPE within OOM exception during defragmentation. > > > Key: IGNITE-19980 > URL: https://issues.apache.org/jira/browse/IGNITE-19980 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.12 >Reporter: Vladimir Steshin >Assignee: Vladimir Steshin >Priority: Minor > Fix For: 2.16 > > Attachments: IgnitePdsDefragmentationTest.java > > Time Spent: 1h 20m > Remaining Estimate: 0h > > When defragmenting a partition, we can run out of the defragmentation > region. Then, an OOM exception is raised by > {code:java} > IgniteOutOfMemoryException PageMemoryImpl#oomException(String reason) { > DataRegionConfiguration dataRegionCfg = > getDataRegionConfiguration(); > return new IgniteOutOfMemoryException("Failed to find a page for > eviction (" + reason + ") [" + > "segmentCapacity=" + loadedPages.capacity() + > ", loaded=" + loadedPages.size() + > ", maxDirtyPages=" + maxDirtyPages + > ", dirtyPages=" + dirtyPagesCntr + > ", cpPages=" + (checkpointPages() == null ? 0 : > checkpointPages().size()) + > ", pinned=" + acquiredPages() + > ']' + U.nl() + "Out of memory in data region [" + > "name=" + dataRegionCfg.getName() + > ", initSize=" + > U.readableSize(dataRegionCfg.getInitialSize(), false) + > ", maxSize=" + U.readableSize(dataRegionCfg.getMaxSize(), > false) + > ", persistenceEnabled=" + > dataRegionCfg.isPersistenceEnabled() + "] Try the following:" + U.nl() + > " ^-- Increase maximum off-heap memory size > (DataRegionConfiguration.maxSize)" + U.nl() + > " ^-- Enable eviction or expiration policies" > ); > } > {code} > The problem is that > {code:java} > DataRegionConfiguration dataRegionCfg = getDataRegionConfiguration(); > {code} > is actually null. @see `PageMemoryImpl#getDataRegionConfiguration()`. 
> Stacktrace: > {code:java} > [defragmentation-thread][CachePartitionDefragmentationManager] > Defragmentation process failed. > > org.apache.ignite.internal.processors.cache.persistence.freelist.CorruptedFreeListException: > Failed to insert data row > at > org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.insertDataRow(AbstractFreeList.java:600) > ~[classes/:?] > at > org.apache.ignite.internal.processors.cache.persistence.freelist.CacheFreeList.insertDataRow(CacheFreeList.java:74) > ~[classes/:?] > at > org.apache.ignite.internal.processors.cache.persistence.freelist.CacheFreeList.insertDataRow(CacheFreeList.java:35) > ~[classes/:?] > at > org.apache.ignite.internal.processors.cache.persistence.defragmentation.CachePartitionDefragmentationManager.lambda$copyPartitionData$4(CachePartitionDefragmentationManager.java:741) > ~[classes/:?] > at > org.apache.ignite.internal.processors.cache.persistence.defragmentation.TreeIterator.iterate(TreeIterator.java:83) > ~[classes/:?] > at > org.apache.ignite.internal.processors.cache.persistence.defragmentation.CachePartitionDefragmentationManager.copyPartitionData(CachePartitionDefragmentationManager.java:718) > ~[classes/:?] > at > org.apache.ignite.internal.processors.cache.persistence.defragmentation.CachePartitionDefragmentationManager.defragmentOnePartition(CachePartitionDefragmentationManager.java:565) > ~[classes/:?] > at > org.apache.ignite.internal.processors.cache.persistence.defragmentation.CachePartitionDefragmentationManager.lambda$executeDefragmentation$66155109$1(CachePartitionDefragmentationManager.java:382) > ~[classes/:?] > at > org.apache.ignite.internal.util.IgniteUtils.lambda$doInParallel$3(IgniteUtils.java:11661) > ~[classes/:?] > at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > ~[?:?] 
> at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > ~[?:?] > at java.lang.Thread.run(Thread.java:829) [?:?] > Suppressed: > org.apache.ignite.internal.processors.cache.persistence.freelist.CorruptedFreeListException: > Failed to insert data row > at > org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.insertDataRow(AbstractFreeList.java:600) > ~[classes/:?] > at > org.apache.ignite.internal.processors
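A minimal sketch of the kind of defensive fix the ticket implies: guard against a missing data region configuration when composing the OOM message. {{DataRegionConfiguration}} here is a stripped-down stand-in for the real Ignite class, and {{regionInfo}} is an invented helper, shown only to illustrate the null guard.

```java
// Stand-in for org.apache.ignite.configuration.DataRegionConfiguration,
// reduced to the two fields used in the message. Illustrative only.
class DataRegionConfiguration {
    final String name;
    final long maxSize;

    DataRegionConfiguration(String name, long maxSize) {
        this.name = name;
        this.maxSize = maxSize;
    }
}

class OomMessageBuilder {
    // Build the data-region part of the OOM message. During defragmentation
    // the configuration may be absent, so fall back to a generic text
    // instead of dereferencing null (the NPE from the ticket).
    static String regionInfo(DataRegionConfiguration cfg) {
        if (cfg == null)
            return "Out of memory in data region [no configuration available]";

        return "Out of memory in data region [name=" + cfg.name
            + ", maxSize=" + cfg.maxSize + ']';
    }
}
```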
[jira] [Updated] (IGNITE-19980) NPE within OOM exception during defragmentation.
[ https://issues.apache.org/jira/browse/IGNITE-19980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Petrov updated IGNITE-19980: Fix Version/s: 2.16
[jira] [Updated] (IGNITE-19980) NPE within OOM exception during defragmentation.
[ https://issues.apache.org/jira/browse/IGNITE-19980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Steshin updated IGNITE-19980: -- Release Note: Lack of memory in the defragmentation region doesn't produce an NPE any more. (was: Out of memory of the defragmentation region doesn't produce NPE exception any more.)
[jira] [Updated] (IGNITE-19980) NPE within OOM exception during defragmentation.
[ https://issues.apache.org/jira/browse/IGNITE-19980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Steshin updated IGNITE-19980: -- Release Note: Out of memory in the defragmentation region doesn't produce an NPE any more.
[jira] [Assigned] (IGNITE-19980) NPE within OOM exception during defragmentation.
[ https://issues.apache.org/jira/browse/IGNITE-19980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Steshin reassigned IGNITE-19980: - Assignee: Vladimir Steshin
[jira] [Commented] (IGNITE-19980) NPE within OOM exception during defragmentation.
[ https://issues.apache.org/jira/browse/IGNITE-19980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743954#comment-17743954 ] Ignite TC Bot commented on IGNITE-19980: {panel:title=Branch: [pull/10844/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel} {panel:title=Branch: [pull/10844/head] Base: [master] : No new tests found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}{panel} [TeamCity *--> Run :: All* Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7261634&buildTypeId=IgniteTests24Java8_RunAll]
[jira] [Created] (IGNITE-19993) Reduce sending safe time command
Vladislav Pyatkov created IGNITE-19993: -- Summary: Reduce sending safe time command Key: IGNITE-19993 URL: https://issues.apache.org/jira/browse/IGNITE-19993 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov *Motivation* Safe time propagation is based on messages that are sent at a particular interval. A safe-time direct message is less expensive than a command. The purpose of the algorithm is to reduce the number of commands sent to propagate safe time. *Implementation notes* Send the safe time command by the following rule:
{code}
if (tSend == null) {
    // Nothing was sent in this lease interval
    sendSafeTimePropagatingCommand(tCurrent)
} else if (tCurrent - stFreq < tSend && tCurrent - stFreq > tSuc) {
    // An operation was performed in the previous stFreq interval, but it was not a success
    sendSafeTimePropagatingCommand(tCurrent)

    if (tSend < tCurrent)
        sendSafeTimePropagatingMessage(tSend, tCurrent)
} else if (tCurrent - stFreq > tSend && tSend != tSuc) {
    // The last command was unsuccessful
    sendSafeTimePropagatingCommand(tCurrent)

    if (tSend < tCurrent)
        sendSafeTimePropagatingMessage(tSend, tCurrent)
} else if (tSend < tCurrent) {
    // It is not necessary to send a safe time command, because safe time is moving with the load
    sendSafeTimePropagatingMessage(tSend, tCurrent)
}
{code}
*Definition of done* Safe time propagates through messages under regular load. The safe time command is used only in the specific cases described above. -- This message was sent by Atlassian Jira (v8.20.10#820010)
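The decision rule above can be sketched as plain Java. This is a minimal illustrative model, not Ignite code: the class name `SafeTimeSender`, the `Long`/`long` representation of the timestamps, and the returned `"command"`/`"both"`/`"message"` labels (standing for which of the two send operations would be invoked) are all assumptions made for the sketch.

```java
// Hypothetical model of the safe time send rule; tSend, tSuc, and stFreq
// follow the names used in the ticket's pseudocode.
public class SafeTimeSender {
    private final long stFreq; // safe time propagation interval
    Long tSend;                // timestamp of the last sent command (null = nothing sent)
    long tSuc;                 // timestamp of the last successful command

    SafeTimeSender(long stFreq) {
        this.stFreq = stFreq;
    }

    /** Returns which send path the rule selects for the current timestamp. */
    String decide(long tCurrent) {
        if (tSend == null) {
            // Nothing was sent in this lease interval: send the command.
            return "command";
        } else if (tCurrent - stFreq < tSend && tCurrent - stFreq > tSuc) {
            // A command was sent in the previous interval but did not succeed.
            return tSend < tCurrent ? "both" : "command";
        } else if (tCurrent - stFreq > tSend && tSend != tSuc) {
            // The last command is stale and was unsuccessful.
            return tSend < tCurrent ? "both" : "command";
        } else if (tSend < tCurrent) {
            // Safe time is already moving with the load; a cheap message suffices.
            return "message";
        }
        return "none";
    }
}
```

Under this model, a command is sent only on the first operation of a lease interval or after a failure; steady load degrades to messages only.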
[jira] [Created] (IGNITE-19992) Sql. Rework execution of 2-phase set operators
Konstantin Orlov created IGNITE-19992: - Summary: Sql. Rework execution of 2-phase set operators Key: IGNITE-19992 URL: https://issues.apache.org/jira/browse/IGNITE-19992 Project: Ignite Issue Type: Improvement Components: sql Reporter: Konstantin Orlov As of now, both {{IntersectNode}} and {{MinusNode}} use complex structures as the result of the MAP phase (see {{org.apache.ignite.internal.sql.engine.exec.rel.AbstractSetOpNode.Grouping#getOnMapper}}; it emits rows with a cardinality of 2, where the first column is the entire group key and the second column is an array of counters). This prevents us from migrating the SQL runtime to BinaryTuple-based rows, because currently BinaryTuple does not support nested structures. Let's rework those nodes to inline the group key and counters into a plain row. -- This message was sent by Atlassian Jira (v8.20.10#820010)
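The proposed inlining can be illustrated in a few lines of Java. This is a sketch only: the class `FlatSetOpRow`, the `Object[]` row representation, and the helper names are hypothetical, not the actual sql-engine types; the point is that key columns and per-input counters occupy flat columns instead of a nested array.

```java
// Illustrative sketch (not the Ignite sql-engine code): inline the group key
// columns and the per-input counters into one flat row, so no nested array is
// needed and the row could later be carried in a BinaryTuple.
public class FlatSetOpRow {
    /** Builds a flat MAP-phase row: key columns first, then one counter per input. */
    static Object[] flatten(Object[] groupKey, int[] counters) {
        Object[] row = new Object[groupKey.length + counters.length];
        System.arraycopy(groupKey, 0, row, 0, groupKey.length);
        for (int i = 0; i < counters.length; i++)
            row[groupKey.length + i] = counters[i]; // counters become plain columns
        return row;
    }

    /** Reads the counter of the given input back from a flat row. */
    static int counter(Object[] row, int keyLen, int input) {
        return (Integer) row[keyLen + input];
    }
}
```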
[jira] [Updated] (IGNITE-19992) Sql. Rework execution of 2-phase set operators
[ https://issues.apache.org/jira/browse/IGNITE-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Orlov updated IGNITE-19992: -- Epic Link: IGNITE-19860 > Sql. Rework execution of 2-phase set operators > -- > > Key: IGNITE-19992 > URL: https://issues.apache.org/jira/browse/IGNITE-19992 > Project: Ignite > Issue Type: Improvement > Components: sql >Reporter: Konstantin Orlov >Priority: Major > Labels: ignite-3 > > As of now, both {{IntersectNode}} and {{MinusNode}} use complex structures as > the result of the MAP phase (see > {{org.apache.ignite.internal.sql.engine.exec.rel.AbstractSetOpNode.Grouping#getOnMapper}}; > it emits rows with a cardinality of 2, where the first column is the entire group key and > the second column is an array of counters). This prevents us from migrating the SQL > runtime to BinaryTuple-based rows, because currently BinaryTuple does not > support nested structures. > Let's rework those nodes to inline the group key and counters into a plain row. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-19991) Safe time message
[ https://issues.apache.org/jira/browse/IGNITE-19991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladislav Pyatkov updated IGNITE-19991: --- Description: *Motivation* The specific direct message type is required to replace the safe time propagation command. The message should be sent with the same trigger as the command. *Implementation notes* * Create a network message {{SafeTimePropagationMessage}} with two timestamps (t1, t2; t2 > t1). * t1 is determined by {{sendTimestamp}} (IGNITE-19990); t2 is determined by {{clock.now()}}. * Ensure that t2 > t1 before trying to send the message (otherwise, the safe time is transferred with load and the message sending is skipped). * Send the safe time message to each member of the replication group that matches this primary. * The safe time message is applied if the replica's local safe time is greater than or equal to t1. *Definition of done* A safe time message is sent together with a safe time command. was: *Motivation* The specific direct message type is required to replace the safe time propagation command. The message should be sent with the same trigger as the command. *Implementation notes* * Create a network message {{SafeTimePropagationMessage}} with two timestamps (t1, t2; t2 > t1). * t1 is determined by {{sendTimestamp}}; t2 is determined by {{clock.now()}}. * Ensure that t2 > t1 before trying to send the message (otherwise, the safe time is transferred with load and the message sending is skipped). * Send the safe time message to each member of the replication group that matches this primary. * The safe time message is applied if the replica's local safe time is greater than or equal to t1. *Definition of done* A safe time message is sent together with a safe time command. 
> Safe time message > - > > Key: IGNITE-19991 > URL: https://issues.apache.org/jira/browse/IGNITE-19991 > Project: Ignite > Issue Type: Improvement >Reporter: Vladislav Pyatkov >Priority: Major > > *Motivation* > The specific direct message type is required to replace the safe time > propagation command. The message should be sent with the same trigger as the > command. > *Implementation notes* > * Create a network message {{SafeTimePropagationMessage}} with two timestamps > (t1, t2; t2 > t1). > * t1 is determined by {{sendTimestamp}} (IGNITE-19990); t2 is determined by > {{clock.now()}}. > * Ensure that t2 > t1 before trying to send the message (otherwise, the safe > time is transferred with load and the message sending is skipped). > * Send the safe time message to each member of the replication group that > matches this primary. > * The safe time message is applied if the replica's local safe time is > greater than or equal to t1. > *Definition of done* > A safe time message is sent together with a safe time command. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-19991) Safe time message
Vladislav Pyatkov created IGNITE-19991: -- Summary: Safe time message Key: IGNITE-19991 URL: https://issues.apache.org/jira/browse/IGNITE-19991 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov *Motivation* The specific direct message type is required to replace the safe time propagation command. The message should be sent with the same trigger as the command. *Implementation notes* * Create a network message {{SafeTimePropagationMessage}} with two timestamps (t1, t2; t2 > t1). * t1 is determined by {{sendTimestamp}}; t2 is determined by {{clock.now()}}. * Ensure that t2 > t1 before trying to send the message (otherwise, the safe time is transferred with load and the message sending is skipped). * Send the safe time message to each member of the replication group that matches this primary. * The safe time message is applied if the replica's local safe time is greater than or equal to t1. *Definition of done* A safe time message is sent together with a safe time command. -- This message was sent by Atlassian Jira (v8.20.10#820010)
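The send-side precondition (skip the message when t2 <= t1) can be sketched as follows. This is a hedged illustration, not Ignite code: `SafeTimeMessageSender` and the plain-`long` timestamps are stand-ins for the real network message and hybrid-clock types.

```java
// Hypothetical sketch of the send-side check described in the ticket.
public class SafeTimeMessageSender {
    /**
     * Returns the (t1, t2) pair that would go into the message, or null when
     * t2 <= t1, in which case safe time is already transferred with the load
     * and the send is skipped.
     */
    static long[] prepare(long sendTimestamp /* t1 */, long now /* t2 */) {
        if (now <= sendTimestamp)
            return null; // safe time moves with the load; skip the message

        return new long[] {sendTimestamp, now};
    }
}
```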
[jira] [Commented] (IGNITE-19399) ODBC 3.0: Support transactions
[ https://issues.apache.org/jira/browse/IGNITE-19399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743818#comment-17743818 ] Dmitrii Zabotlin commented on IGNITE-19399: --- LGTM > ODBC 3.0: Support transactions > -- > > Key: IGNITE-19399 > URL: https://issues.apache.org/jira/browse/IGNITE-19399 > Project: Ignite > Issue Type: New Feature > Components: odbc >Reporter: Igor Sapego >Assignee: Igor Sapego >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 20m > Remaining Estimate: 0h > > Need to implement transactions support for ODBC (Autocommit off) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-19990) Replication command timestamps
Vladislav Pyatkov created IGNITE-19990: -- Summary: Replication command timestamps Key: IGNITE-19990 URL: https://issues.apache.org/jira/browse/IGNITE-19990 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov *Motivation* Replication command send and success timestamps are supposed to be cached in the primary replica. These timestamps are used to choose which method the safe time propagetion is supposed. *Implementation notes* * When the node becomes a primary replica for any partition, it initiates {{sendTimestamp = null}} and {{successTimestmp=0}}. * Store the latest timestamp of sending commands in {{sendTimestamp}}. * Store the latest timestamp of successfully executed commands in {{successTimestmp}}. * The updet of both timestamps is supposed to be atomic: {code} public boolean setIfGreater(AtomicLong sentTime, long ts) { return sentTime.updateAndGet(x -> x < ts ? ts : x) == ts; } {code} *Disinition of done* If a primary replica has not applied any operations, {{sendTimestamp}} is {{null}}. Otherwise, in an arbitrary moment, {{sendTimestamp}} has to be greater than or equivalent to {{successTimestmp}} or it is {{null}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
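The "advance if greater" update from the ticket behaves as shown below. The wrapper class `MonotonicTimestamp` is introduced only to make the snippet self-contained; note that `updateAndGet` returns the new value, so the method reports true exactly when `ts` is now the stored timestamp (including when it was already equal).

```java
import java.util.concurrent.atomic.AtomicLong;

// Usage sketch of the atomic monotonic-timestamp update from the ticket;
// the wrapper class name is illustrative.
public class MonotonicTimestamp {
    /** Atomically advances sentTime to ts; true iff ts is now the stored value. */
    public static boolean setIfGreater(AtomicLong sentTime, long ts) {
        return sentTime.updateAndGet(x -> x < ts ? ts : x) == ts;
    }
}
```

Because the lambda inside `updateAndGet` may be retried under contention, it must be a pure function of its argument, which the max-style comparison above satisfies.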
[jira] [Resolved] (IGNITE-19131) ODBC 3.0 Basic functionality
[ https://issues.apache.org/jira/browse/IGNITE-19131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Sapego resolved IGNITE-19131. -- Resolution: Fixed Done. > ODBC 3.0 Basic functionality > > > Key: IGNITE-19131 > URL: https://issues.apache.org/jira/browse/IGNITE-19131 > Project: Ignite > Issue Type: Epic > Components: odbc >Reporter: Igor Sapego >Assignee: Igor Sapego >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > We need to implement basic ODBC driver for Ignite 3. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-18621) Packaging: DEB and RPM packages for ODBC
[ https://issues.apache.org/jira/browse/IGNITE-18621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pochatkin reassigned IGNITE-18621: -- Assignee: Mikhail Pochatkin > Packaging: DEB and RPM packages for ODBC > > > Key: IGNITE-18621 > URL: https://issues.apache.org/jira/browse/IGNITE-18621 > Project: Ignite > Issue Type: New Feature > Components: odbc >Reporter: Igor Sapego >Assignee: Mikhail Pochatkin >Priority: Major > Labels: ignite-3 > > We need to implement ODBC packaging for Linux. Let's consider RPM and DEB > packages. It will be better to use gradle for creating packages, but cmake is > OK too if required. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-19746) control.sh --performance-statistics status doesn't print actual status
[ https://issues.apache.org/jira/browse/IGNITE-19746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Korotkov updated IGNITE-19746: - Labels: IEP-81 ise (was: IEP-81) > control.sh --performance-statistics status doesn't print actual status > -- > > Key: IGNITE-19746 > URL: https://issues.apache.org/jira/browse/IGNITE-19746 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Korotkov >Assignee: Nikolay Izhikov >Priority: Major > Labels: IEP-81, ise > > The status sub-command of control.sh --performance-statistics doesn't > print the actual status to the console. > Previously it was like (note the *Disabled.* word): > {noformat} > Control utility [ver. 15.0.0-SNAPSHOT#20230422-sha1:7f80003d] > 2023 Copyright(C) Apache Software Foundation > User: ducker > Time: 2023-04-23T22:17:12.489 > Command [PERFORMANCE-STATISTICS] started > Arguments: --host x.x.x.x --performance-statistics status --user admin > --password * > > Disabled. > Command [PERFORMANCE-STATISTICS] finished with code: 0 > Control utility has completed execution at: 2023-04-23T22:17:13.271 > Execution time: 782 ms > {noformat} > > Now it's like (note the absence of the *Disabled.* word): > {noformat} > Control utility [ver. 15.0.0-SNAPSHOT#20230613-sha1:cacee58d] > 2023 Copyright(C) Apache Software Foundation > User: ducker > Time: 2023-06-15T15:46:41.586 > Command [PERFORMANCE-STATISTICS] started > Arguments: --host x.x.x.x --performance-statistics status --user admin > --password * > > Command [PERFORMANCE-STATISTICS] finished with code: 0 > Control utility has completed execution at: 2023-06-15T15:46:42.523 > Execution time: 937 ms > {noformat} > > Outputs of other sub-commands also need to be checked. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-19989) Lease updater thread is not finished properly and has incorrect name
Denis Chudov created IGNITE-19989: - Summary: Lease updater thread is not finished properly and has incorrect name Key: IGNITE-19989 URL: https://issues.apache.org/jira/browse/IGNITE-19989 Project: Ignite Issue Type: Bug Reporter: Denis Chudov The name of LeaseUpdater#updaterTread consists of the prefix only. It should be renamed so that there is something after the '-' at the end. Also, it is not stopped on node stop, because LeaseUpdater#deactivate is not called when PlacementDriverManager stops. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-19986) Increase stability of inMemoryNodeRestartNotLeader in ItIgniteInMemoryNodeRestartTest
[ https://issues.apache.org/jira/browse/IGNITE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743715#comment-17743715 ] Roman Puchkovskiy commented on IGNITE-19986: Thanks! > Increase stability of inMemoryNodeRestartNotLeader in > ItIgniteInMemoryNodeRestartTest > - > > Key: IGNITE-19986 > URL: https://issues.apache.org/jira/browse/IGNITE-19986 > Project: Ignite > Issue Type: Bug >Reporter: Roman Puchkovskiy >Assignee: Roman Puchkovskiy >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 20m > Remaining Estimate: 0h > > The test sometimes fails because restart of a node happens before all nodes > get table data through replication. As a result, when the restarted node > starts, it does not see a 'majority with data' and fails. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-19988) Add index creation (population) status to index view
[ https://issues.apache.org/jira/browse/IGNITE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Daschinsky updated IGNITE-19988: - Description: Sometimes index creation can be quite long. A user might start queries without waiting for the index creation process to finish and see slow queries. It is necessary to provide index status information to users by exposing it in the index system view. (was: Sometimes index creation can be quite long. It is necessary to provide index status information to users by exposing it in the index system view.) > Add index creation (population) status to index view > > > Key: IGNITE-19988 > URL: https://issues.apache.org/jira/browse/IGNITE-19988 > Project: Ignite > Issue Type: Improvement >Affects Versions: 2.15 >Reporter: Ivan Daschinsky >Priority: Major > Labels: ise > Fix For: 2.16 > > > Sometimes index creation can be quite long. A user might start queries > without waiting for the index creation process to finish and see slow > queries. It is necessary to provide index status information to users by > exposing it in the index system view. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-19988) Add index creation (population) status to index view
[ https://issues.apache.org/jira/browse/IGNITE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Daschinsky updated IGNITE-19988: - Description: Sometimes index creation can be quite long. It is necessary to provide index status information to users by exposing it in the index system view. > Add index creation (population) status to index view > > > Key: IGNITE-19988 > URL: https://issues.apache.org/jira/browse/IGNITE-19988 > Project: Ignite > Issue Type: Improvement >Affects Versions: 2.15 >Reporter: Ivan Daschinsky >Priority: Major > Labels: ise > Fix For: 2.16 > > > Sometimes index creation can be quite long. It is necessary to provide index > status information to users by exposing it in the index system view. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-19988) Add index creation (population) status to index view
Ivan Daschinsky created IGNITE-19988: Summary: Add index creation (population) status to index view Key: IGNITE-19988 URL: https://issues.apache.org/jira/browse/IGNITE-19988 Project: Ignite Issue Type: Improvement Affects Versions: 2.15 Reporter: Ivan Daschinsky Fix For: 2.16 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-19988) Add index creation (population) status to index view
[ https://issues.apache.org/jira/browse/IGNITE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Daschinsky updated IGNITE-19988: - Labels: ise (was: ) > Add index creation (population) status to index view > > > Key: IGNITE-19988 > URL: https://issues.apache.org/jira/browse/IGNITE-19988 > Project: Ignite > Issue Type: Improvement >Affects Versions: 2.15 >Reporter: Ivan Daschinsky >Priority: Major > Labels: ise > Fix For: 2.16 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-19950) Snapshot of in-memory cache groups
[ https://issues.apache.org/jira/browse/IGNITE-19950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikolay Izhikov updated IGNITE-19950: - Labels: IEP-109 iep-43 ise (was: iep-43 ise) > Snapshot of in-memory cache groups > -- > > Key: IGNITE-19950 > URL: https://issues.apache.org/jira/browse/IGNITE-19950 > Project: Ignite > Issue Type: New Feature >Reporter: Nikolay Izhikov >Assignee: Nikolay Izhikov >Priority: Major > Labels: IEP-109, iep-43, ise > Time Spent: 20m > Remaining Estimate: 0h > > Ignite can create snapshots for persistent cache groups only. > It would be very useful for a user to have the ability to create and restore > snapshots of in-memory cache groups. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-19692) Design Resilient Distributed Operations mechanism
[ https://issues.apache.org/jira/browse/IGNITE-19692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Puchkovskiy updated IGNITE-19692: --- Description: We need a mechanism that would allow us to do the following: # Execute an operation on all (or some of) the partitions of a table # The whole operation is split into sub-operations (each of which operates on a single partition) # Each sub-operation must be resilient: that is, if the node that hosts it restarts or the partition moves to another node, the operation should proceed # When a sub-operation ends, it notifies the operation tracker/coordinator # When all sub-operations end, the tracker might take some action (like starting a subsequent operation) # The tracker is also resilient We need such a mechanism in a few places in the system: # Transaction cleanup? # Index build # Table data validation as a part of a schema change that requires validation (like a narrowing type change) Probably, more applications of the mechanism will emerge. On the possible implementation: the tracker could be colocated with the table's primary replica (that would guarantee that at most one tracker exists at all times). We could store the data needed to track the operation in the Meta-Storage under a prefix corresponding to the table, like 'ops...'. We could store the completion status for each of the partitions there, along with some operation-wide status. h1. Approved design h2. Definitions An *operation* is run on a table as a whole; it consists of {*}jobs{*}, one job per partition. A job might complete successfully or fail (that is, it finishes unsuccessfully; unexpected failures are a different matter). If all jobs of an operation complete successfully, the operation itself completes successfully. If any job fails, the operation fails. Some jobs might track their own progress and can be resumed. Others cannot and always start over when asked to start/resume. h2. 
Goals # The operation must be resilient to a reboot or a leave of any participating node # Also, it must be resilient to ‘primary node’ status changes (i.e. if the primary of any of the table’s partitions changes, this should not stall the operation forever or fail it completely) h2. Non-goals # It is not proposed that the fact that an operation is running prohibit changes to the cluster (like holding a node from a restart or from leaving the cluster) # It is considered ok that some work would have to be repeated (if, for example, a node ceases to be a primary while it executes an operation that can only be run on a primary; in such a situation the new primary might restart the job from scratch) h2. General properties An operation’s jobs are executed in parallel. If a job that is still not finished is reassigned to another node (due to primary change or due to logical topology change), the old executor stops executing the job (and cleans up if the job wrote something to disk); the new executor adopts the job and starts it from scratch. If a node restarts and finds that it has a non-finished resumable job to which it’s still assigned, it resumes the job. h2. Operation attributes * Type ID (a string identifying the operation type, like ‘validate’, ‘buildIndex’) * Whether more than one operation instance of the type may be run on a table (example: ‘build index’ might build a few indices at the same time) or not (in the first version, we might only support operations that allow many operation instances on a table at the same time) * Job logic run when a job is started/resumed * Job logic run when a job is canceled * Operation logic run when an operation is completed (success or failure) Some operations can only succeed normally (‘build index’ is an example), others can also finish unsuccessfully [that is, end up in the ‘failed’ state] (‘validate table data’ is an example). 
Some operations persist their progress on the disk (‘build index’ does this), others are ‘volatile’ and always start over (‘validate table data’ might work in this way, but it could also persist the position in the ‘cursor’ over the data being validated). h2. Provisioned operations # Build an index. Must be executed on a primary; a few operations might be run concurrently (each for its own index); cannot fail; persists its progress # Validate table data (for constraints where indices are not needed): checks whether every actual (i.e. not overwritten) tuple satisfies a restriction (NOT NULL/CHECK constraint/that it fits in a type range). May be executed on a secondary; a few operations might be run concurrently; can fail; might be volatile or persist its progress h2. Roles There are two roles: # Operation {*}coordinator{*}, there is at most one coordinator per operation instance (this is achieved by colocating a coordinator with the primary replica of partition 0 of the table); its responsibilities are: ##
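The coordinator's tracking responsibility described in the design can be sketched as a small state machine. This is an illustrative model only, assuming one job per partition and the rule "any failed job fails the operation; the operation completes when all jobs complete"; the class and method names are hypothetical, not Ignite code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative coordinator sketch for the design above: it tracks one job per
// partition and derives the operation outcome from the job states.
public class OperationTracker {
    enum JobState { RUNNING, COMPLETED, FAILED }

    private final Map<Integer, JobState> jobs = new HashMap<>();

    OperationTracker(int partitions) {
        for (int p = 0; p < partitions; p++)
            jobs.put(p, JobState.RUNNING);
    }

    /** Called when the job of the given partition reports its outcome. */
    void onJobFinished(int partition, boolean success) {
        jobs.put(partition, success ? JobState.COMPLETED : JobState.FAILED);
    }

    /** FAILED if any job failed, RUNNING while any job runs, else COMPLETED. */
    String operationState() {
        if (jobs.containsValue(JobState.FAILED))
            return "FAILED";
        if (jobs.containsValue(JobState.RUNNING))
            return "RUNNING";
        return "COMPLETED";
    }
}
```

In the real design this state would live in the Meta-Storage under the table's prefix, so a new coordinator (colocated with the new primary of partition 0) can adopt it after a primary change.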
[jira] [Updated] (IGNITE-19692) Design Resilient Distributed Operations mechanism
[ https://issues.apache.org/jira/browse/IGNITE-19692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Puchkovskiy updated IGNITE-19692: --- Description:

We need a mechanism that would allow us to do the following:
# Execute an operation on all (or some of the) partitions of a table
# The whole operation is split into sub-operations (each of which operates on a single partition)
# Each sub-operation must be resilient: that is, if the node that hosts it restarts or the partition moves to another node, the operation should proceed
# When a sub-operation ends, it notifies the operation tracker/coordinator
# When all sub-operations end, the tracker might take some action (like starting a subsequent operation)
# The tracker is also resilient

We need such a mechanism in a few places in the system:
# Transaction cleanup?
# Index build
# Table data validation as part of a schema change that requires validation (like a narrowing type change)

Probably, more applications of the mechanism will emerge.

On the possible implementation: the tracker could be colocated with the table's primary replica (that would guarantee that at most one tracker exists at all times). We could store the data needed to track the operation in the Meta-Storage under a prefix corresponding to the table, like 'ops...'. We could store the completion status for each of the partitions there, along with some operation-wide status.

h1. Approved design

h2. Definitions

An *operation* is run on a table as a whole; it consists of {*}jobs{*}, one job per partition. A job might complete successfully or fail (that is, finish unsuccessfully; unexpected failures are a different matter). If all jobs of an operation complete successfully, the operation itself completes successfully. If any job fails, the operation fails. Some jobs might track their own progress and can be resumed; others cannot and always start over when asked to start/resume.
h2. Goals

# The operation must be resilient to a reboot or departure of any participating node
# It must also be resilient to ‘primary node’ status changes (i.e. if the primary of any of the table’s partitions changes, this should not stall the operation forever or fail it completely)

h3. Non-goals

# It is not proposed that a running operation prohibit changes to the cluster (like holding a node back from restarting or from leaving the cluster)
# It is considered OK that some work may have to be repeated (if, for example, a node ceases to be a primary while executing an operation that can only run on a primary; in such a situation the new primary might restart the job from scratch)

h2. General properties

An operation’s jobs are executed in parallel. If a job that is not yet finished is reassigned to another node (due to a primary change or a logical topology change), the old executor stops executing the job (and cleans up if the job wrote something to disk); the new executor adopts the job and starts it from scratch. If a node restarts and finds that it has a non-finished resumable job to which it’s still assigned, it resumes the job.

h2. Operation attributes

* Type ID (a string identifying the operation type, like ‘validate’ or ‘buildIndex’)
* Whether more than one operation instance of the type may run on a table at the same time (example: ‘build index’ might build a few indices at the same time) or not (in the first version, we might only support operations that allow many operation instances on a table at the same time)
* Job logic run when a job is started/resumed
* Job logic run when a job is canceled
* Operation logic run when an operation is completed (success or failure)

Some operations can only succeed normally (‘build index’ is an example); others can also finish unsuccessfully [that is, end up in the ‘failed’ state] (‘validate table data’ is an example).
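The success/failure aggregation described in the Definitions above (an operation succeeds only when every per-partition job succeeds, and fails as soon as any job fails) can be sketched as follows. This is a minimal illustration, not actual Ignite 3 API; the `JobStatus`, `OperationStatus`, and `OperationTracker` names are hypothetical:

```java
import java.util.List;

// Hypothetical status enums; names are illustrative only.
enum JobStatus { RUNNING, SUCCEEDED, FAILED }

enum OperationStatus { RUNNING, SUCCEEDED, FAILED }

final class OperationTracker {
    /**
     * Aggregates per-partition job statuses into an operation-wide status:
     * any failed job fails the whole operation; the operation succeeds only
     * when every job has succeeded; otherwise it is still running.
     */
    static OperationStatus aggregate(List<JobStatus> jobStatuses) {
        if (jobStatuses.stream().anyMatch(s -> s == JobStatus.FAILED)) {
            return OperationStatus.FAILED;
        }
        if (jobStatuses.stream().allMatch(s -> s == JobStatus.SUCCEEDED)) {
            return OperationStatus.SUCCEEDED;
        }
        return OperationStatus.RUNNING;
    }
}
```

Note that under this rule a single failed job fails the operation even while other jobs are still running, which matches the Definitions section; whether running jobs are then actively canceled is a separate concern (the per-job cancel logic listed under Operation attributes).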
Some operations persist their progress on disk (‘build index’ does this); others are ‘volatile’ and always start over (‘validate table data’ might work this way, but it could also persist its position in the ‘cursor’ over the data being validated).

h2. Provisioned operations

# Build an index. Must be executed on a primary; a few operations might run concurrently (each for its own index); cannot fail; persists its progress
# Validate table data (for constraints where indices are not needed): checks whether every actual (i.e. not overwritten) tuple satisfies a restriction (NOT NULL/CHECK constraint/that it fits in a type range). May be executed on a secondary; a few operations might run concurrently; can fail; might be volatile or persist its progress

h2. Roles

There are two roles:
# Operation {*}coordinator{*}; there is at most one coordinator per operation instance (this is achieved by colocating the coordinator with the primary replica of partition 0 of the table); its responsibilities are:
##
[jira] [Created] (IGNITE-19987) There is a broken link on rest api page
Igor Gusev created IGNITE-19987: --- Summary: There is a broken link on rest api page Key: IGNITE-19987 URL: https://issues.apache.org/jira/browse/IGNITE-19987 Project: Ignite Issue Type: Task Components: documentation Reporter: Igor Gusev We need to fix the link to the OpenAPI spec on GitHub on this page: [https://ignite.apache.org/docs/3.0.0-beta/rest/rest-api] The correct link is https://github.com/apache/ignite-3/blob/main/modules/rest-api/openapi/openapi.yaml -- This message was sent by Atlassian Jira (v8.20.10#820010)