[jira] [Updated] (IGNITE-17513) [ducktests] Support non-ascii symbols in SSH command lines in docker
[ https://issues.apache.org/jira/browse/IGNITE-17513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Korotkov updated IGNITE-17513:
-------------------------------------
    Epic Link: IGNITE-13428

> [ducktests] Support non-ascii symbols in SSH command lines in docker
> --------------------------------------------------------------------
>
>                 Key: IGNITE-17513
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17513
>             Project: Ignite
>          Issue Type: Test
>            Reporter: Sergey Korotkov
>            Assignee: Sergey Korotkov
>            Priority: Minor
>              Labels: ducktests
>
> Currently (at least in a docker environment) it is not possible to pass non-ASCII symbols via the command line to programs started via SSH in ducktape.
> In particular, it is not possible to pass a password containing non-ASCII symbols to control.sh.
> The reason is that the POSIX locale is assigned to such programs by default.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (IGNITE-17513) [ducktests] Support non-ascii symbols in SSH command lines in docker
Sergey Korotkov created IGNITE-17513:
-------------------------------------

             Summary: [ducktests] Support non-ascii symbols in SSH command lines in docker
                 Key: IGNITE-17513
                 URL: https://issues.apache.org/jira/browse/IGNITE-17513
             Project: Ignite
          Issue Type: Test
            Reporter: Sergey Korotkov
            Assignee: Sergey Korotkov

Currently (at least in a docker environment) it is not possible to pass non-ASCII symbols via the command line to programs started via SSH in ducktape. In particular, it is not possible to pass a password containing non-ASCII symbols to control.sh. The reason is that the POSIX locale is assigned to such programs by default.
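[Editor's note] The failure mode described in this ticket can be illustrated outside of ducktape: when a program runs under an ASCII-only locale (the POSIX default), non-ASCII bytes in its arguments are lost in transcoding. A minimal Java sketch of the effect; the class and method names are ours, not part of Ignite or ducktape:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LocaleEncodingDemo {
    /**
     * Round-trips a string through a charset, as a rough model of what an
     * ASCII-only locale does to command-line arguments: characters that the
     * charset cannot represent are replaced (for US-ASCII, with '?').
     */
    public static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }
}
```

For example, `roundTrip("pässwörd", StandardCharsets.US_ASCII)` mangles the password, while the UTF-8 round trip preserves it, which is why assigning a UTF-8 locale to SSH-launched programs fixes the issue.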
[jira] [Updated] (IGNITE-17255) Implement ReplicaService
[ https://issues.apache.org/jira/browse/IGNITE-17255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Lapin updated IGNITE-17255:
-------------------------------------
    Reviewer: Alexander Lapin  (was: Vladislav Pyatkov)

> Implement ReplicaService
> ------------------------
>
>             Key: IGNITE-17255
>             URL: https://issues.apache.org/jira/browse/IGNITE-17255
>         Project: Ignite
>      Issue Type: Improvement
>        Reporter: Alexander Lapin
>        Assignee: Sergey Uttsel
>        Priority: Major
>          Labels: ignite-3, transaction3_rw
>
> For general context, please check IGNITE-17252. Within the given ticket it is required to:
> * Implement ReplicaService itself.
> * Substitute RaftGroupService with ReplicaService from within InternalTableImpl and others.
>
> Please pay attention that, according to the tx protocol, it is valid to fail the transaction in case of a primary replica change - support for a graceful primary replica switch will be introduced later. For now, within the scope of RW transactions it is enough to detect where the primary replica is and enlist it into the transaction with the corresponding partition, in order to reuse it for further in-partition communication.
> We should make it very clear that any replicaService.invoke(nodeId) might fail with primaryReplicaMiss or replicaUnavailable; it is up to the outer logic to remap such failed requests.
> However, it is still required to detect the proper primary replica initially and to check whether it is still primary during further queries. A proper lease-based primary replica stability engine will be introduced within [IGNITE-17256|https://issues.apache.org/jira/browse/IGNITE-17256]. As a starting point it is possible to reuse the sendWithRetry logic with true readIndex leader checks, meaning that the primary replica is the replica collocated with the current leader (not a node that thinks it is the leader, but a node that was proved to be the leader based on the readIndex logic).
> In addition to all the points mentioned above, we should be aware that lots of tests will become flaky, because the sendWithRetry logic will only be available within the initial primaryReplica detection method and not within common invokes. Generally speaking, this is a good chance to rework them and thus make them stable.
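[Editor's note] The contract stated in the ticket - any replicaService.invoke(nodeId) may fail with primaryReplicaMiss or replicaUnavailable, and the outer logic remaps such requests - can be sketched as follows. This is an editorial illustration only: all class, field, and status names are hypothetical, not the real Ignite 3 API.

```java
import java.util.function.Function;

public class RemapSketch {
    public static final int OK = 0;
    public static final int PRIMARY_REPLICA_MISS = 1;
    public static final int REPLICA_UNAVAILABLE = 2;

    /** Minimal stand-in for a replica response. */
    public static class Response {
        public final int status;
        public final String payload;
        public final String newPrimary; // hint at the actual primary on a miss

        public Response(int status, String payload, String newPrimary) {
            this.status = status;
            this.payload = payload;
            this.newPrimary = newPrimary;
        }
    }

    /**
     * Invokes the presumed primary; on primaryReplicaMiss, remaps once to the
     * node the response points at. Real code would loop with a retry budget.
     */
    public static String invokeWithRemap(String primary, Function<String, Response> svc) {
        Response r = svc.apply(primary);
        if (r.status == PRIMARY_REPLICA_MISS)
            r = svc.apply(r.newPrimary);
        if (r.status != OK)
            throw new IllegalStateException("replica unavailable");
        return r.payload;
    }
}
```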
[jira] [Commented] (IGNITE-17255) Implement ReplicaService
[ https://issues.apache.org/jira/browse/IGNITE-17255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578064#comment-17578064 ]

Alexander Lapin commented on IGNITE-17255:
------------------------------------------

[~Sergey Uttsel] LGTM to feature branch

> Implement ReplicaService
> ------------------------
>
>             Key: IGNITE-17255
>             URL: https://issues.apache.org/jira/browse/IGNITE-17255
>         Project: Ignite
>      Issue Type: Improvement
>        Reporter: Alexander Lapin
>        Assignee: Sergey Uttsel
>        Priority: Major
>          Labels: ignite-3, transaction3_rw
[jira] [Updated] (IGNITE-17347) Add port parameters to Ignite3 CLI node start command
[ https://issues.apache.org/jira/browse/IGNITE-17347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vyacheslav Koptilin updated IGNITE-17347:
-----------------------------------------
    Reviewer: Vyacheslav Koptilin

> Add port parameters to Ignite3 CLI node start command
> -----------------------------------------------------
>
>             Key: IGNITE-17347
>             URL: https://issues.apache.org/jira/browse/IGNITE-17347
>         Project: Ignite
>      Issue Type: Task
>      Components: cli
>        Reporter: Yury Yudin
>        Assignee: Vadim Pakhnushev
>        Priority: Major
>          Labels: ignite-3, ignite-3-cli-tool
>      Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Currently, the Ignite3 CLI node start command only provides a way to set different port parameters by supplying a different configuration file. This makes it somewhat cumbersome to start multiple nodes on the same machine.
> Let's provide --port and --rest-port parameters to the node start command to make it easier.
> In order to properly test a cluster on the same machine, a --join parameter is also needed, which will provide a list of seed nodes for joining the physical topology.
[jira] [Commented] (IGNITE-17258) Implement ReplicaListener
[ https://issues.apache.org/jira/browse/IGNITE-17258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578050#comment-17578050 ]

Vladislav Pyatkov commented on IGNITE-17258:
--------------------------------------------

Merged to the feature branch.

> Implement ReplicaListener
> -------------------------
>
>             Key: IGNITE-17258
>             URL: https://issues.apache.org/jira/browse/IGNITE-17258
>         Project: Ignite
>      Issue Type: Improvement
>        Reporter: Alexander Lapin
>        Assignee: Vladislav Pyatkov
>        Priority: Major
>          Labels: ignite-3, transaction3_rw
>
> For general context, please check IGNITE-17252. In order to specify request-specific handling logic that maps a particular actionRequest to the corresponding set of operations, it is required to introduce such mapping rules in a way similar to the one used within raft listeners; in other words, it is required to introduce a sort of state machine for the replica.
> As the tx design document notes, the common flow for major tx requests is the following:
> {code:java}
> On receiving OpRequest
> 1. Check primary replica lease. Return the failure if not valid.
> 2. Try to acquire a shared or exclusive lock, depending on the op type.
> 3. If failed to acquire the lock due to a conflict, return the failure.
> 4. When the lock is acquired, return an OpResponse to a coordinator with a value, if op type is read or read-write. The OpResponse structure is:
>    opCode:int // 0 - ok, !0 - error code
>    result:Array
>    readLeases:Map
>    timestamp:HLC
> 5. Replicate a write intent asynchronously if op type is write or read-write. As soon as the write intent is replicated, send WriteAckResponse to the coordinator. The WriteAckResponse structure is:
>    opCode:int // 0 - ok, !0 - error code
>    opId:int
>    timestamp:HLC
> 6. Return the replication ack response to a coordinator. {code}
> The given steps should be managed from within ReplicaListener. Why? Because the concrete set of locks to acquire depends on the operation type:
> {code:java}
> The required locks on the row store are the following:
> 1. Tuple get(RowId rowId, UUID txId)
>    IS_commit(table) S_commit(rowId)
> 2. Tuple get(RowId rowId, @Nullable Timestamp timestamp)
>    No locks. Null timestamp is used to read the latest committed value for a single get.
> 3. Tuple getForUpdate(RowId rowId, UUID txId)
>    IX_commit(table) X_commit(rowId)
> 4. RowId insert(Tuple row, UUID txId)
>    IX_commit(table)
> 5. boolean update(RowId rowId, Tuple newRow, UUID txId)
>    IX_commit(table) X_commit(rowId)
> 6. Tuple remove(RowId rowId, UUID txId)
>    IX_commit(table) X_commit(rowId)
> 7. void commitWrite(RowId rowId, Timestamp timestamp, UUID txId)
> 8. void abortWrite(RowId rowId, UUID txId)
> 9. Iterator scan(Predicate filter, UUID txId)
>    S_commit(table) - if a predicate can produce phantom reads,
>    IS_commit(table) - otherwise
> 10. Iterator scan(Predicate filter, Timestamp timestamp)
>    No locks
> 11. Iterator invoke(Predicate filter, InvokeClosure clo, UUID txId)
>    SIX_commit(table) - if a predicate can produce phantom reads,
>    IX_commit(table) otherwise; X_commit on each updated row. {code}
> Please check the tx design for the full set of required actions for lock management, e.g. index-based locks.
> Besides that, there are some actions, like commit/abort transaction (replicateTxnState), that have dedicated handling logic.
> *!* The given ticket should be validated with the SE team in order to check whether they are fine with the proposed index managing actors.
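[Editor's note] The lock table quoted above boils down to choosing a table-level intention lock per operation type (plus per-row locks for writes). A hedged sketch of just that table-level mapping; the enum and method names are illustrative, not Ignite code:

```java
public class LockTableSketch {
    /** Multi-granularity lock modes from the table above. */
    public enum LockMode { IS, IX, S, SIX, X }

    /** Operation kinds from the row-store lock table. */
    public enum Op { GET, GET_FOR_UPDATE, INSERT, UPDATE, REMOVE, SCAN, INVOKE }

    /**
     * Table-level lock for an operation. {@code phantomRisk} models "the
     * predicate can produce phantom reads" from the design note.
     */
    public static LockMode tableLock(Op op, boolean phantomRisk) {
        switch (op) {
            case GET:
                return LockMode.IS;                               // IS_commit(table)
            case GET_FOR_UPDATE:
            case INSERT:
            case UPDATE:
            case REMOVE:
                return LockMode.IX;                               // IX_commit(table)
            case SCAN:
                return phantomRisk ? LockMode.S : LockMode.IS;    // S vs IS
            case INVOKE:
                return phantomRisk ? LockMode.SIX : LockMode.IX;  // SIX vs IX
            default:
                throw new IllegalArgumentException(op.toString());
        }
    }
}
```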
[jira] [Commented] (IGNITE-17258) Implement ReplicaListener
[ https://issues.apache.org/jira/browse/IGNITE-17258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578031#comment-17578031 ]

Alexander Lapin commented on IGNITE-17258:
------------------------------------------

[~v.pyatkov] LGTM to feature branch.

> Implement ReplicaListener
> -------------------------
>
>             Key: IGNITE-17258
>             URL: https://issues.apache.org/jira/browse/IGNITE-17258
>         Project: Ignite
>      Issue Type: Improvement
>        Reporter: Alexander Lapin
>        Assignee: Vladislav Pyatkov
>        Priority: Major
>          Labels: ignite-3, transaction3_rw
[jira] [Updated] (IGNITE-17258) Implement ReplicaListener
[ https://issues.apache.org/jira/browse/IGNITE-17258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Lapin updated IGNITE-17258:
-------------------------------------
    Reviewer: Alexander Lapin

> Implement ReplicaListener
> -------------------------
>
>             Key: IGNITE-17258
>             URL: https://issues.apache.org/jira/browse/IGNITE-17258
>         Project: Ignite
>      Issue Type: Improvement
>        Reporter: Alexander Lapin
>        Assignee: Vladislav Pyatkov
>        Priority: Major
>          Labels: ignite-3, transaction3_rw
[jira] [Updated] (IGNITE-17512) Load logger and test configuration from the classpath.
[ https://issues.apache.org/jira/browse/IGNITE-17512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Pereslegin updated IGNITE-17512:
--------------------------------------
    Description: 
We currently only load log4j2-test.xml/test.properties from the ignite config directory. These files are not included in the test jar.
This is very inconvenient for modules that use the Ignite test framework (extensions, for the most part).
Suggestion:
1. include log4j2-test.xml and test.properties into the test jar.
2. if log4j2-test.xml/test.properties was not found in the ignite config directory - load them from the classpath.

  was:
Currently we loading log4j2-test.xml/test.properties from config directory only. These files does not include into test jar.
We currently only load log4j2-test.xml/test.properties from the ignite config directory. These files are not included in the test jar.
This is very inconvenient for modules that use the Ignite test framework (extensions, for the most part).
Suggestion:
1. include log4j2-test.xml and test.properties into test jar.
2. if log4j2-test.xml/test.properties was not found in ignite config directory - load them from the classpath.

> Load logger and test configuration from the classpath.
> ------------------------------------------------------
>
>             Key: IGNITE-17512
>             URL: https://issues.apache.org/jira/browse/IGNITE-17512
>         Project: Ignite
>      Issue Type: Improvement
>        Reporter: Pavel Pereslegin
>        Assignee: Pavel Pereslegin
>        Priority: Minor
>          Labels: test-framework
>
> We currently only load log4j2-test.xml/test.properties from the ignite config directory. These files are not included in the test jar.
> This is very inconvenient for modules that use the Ignite test framework (extensions, for the most part).
> Suggestion:
> 1. include log4j2-test.xml and test.properties into the test jar.
> 2. if log4j2-test.xml/test.properties was not found in the ignite config directory - load them from the classpath.
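[Editor's note] The proposed lookup order - config directory first, classpath as a fallback - is a standard pattern and can be sketched in a few lines. This is an illustration only; the class name, method name, and file layout are ours, not the Ignite test framework's API:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TestConfigLoader {
    /**
     * Opens a test configuration resource: the config directory wins if the
     * file exists there; otherwise falls back to the classpath. Returns null
     * if the resource is found in neither place.
     */
    public static InputStream open(Path cfgDir, String name) {
        try {
            Path f = cfgDir.resolve(name);
            if (Files.exists(f))
                return Files.newInputStream(f);        // 1. ignite config directory
            return TestConfigLoader.class.getClassLoader()
                .getResourceAsStream(name);            // 2. classpath fallback
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```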
[jira] [Updated] (IGNITE-17512) Load logger and test configuration from the classpath.
[ https://issues.apache.org/jira/browse/IGNITE-17512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Pereslegin updated IGNITE-17512:
--------------------------------------
    Priority: Minor  (was: Major)

> Load logger and test configuration from the classpath.
> ------------------------------------------------------
>
>             Key: IGNITE-17512
>             URL: https://issues.apache.org/jira/browse/IGNITE-17512
>         Project: Ignite
>      Issue Type: Improvement
>        Reporter: Pavel Pereslegin
>        Assignee: Pavel Pereslegin
>        Priority: Minor
>          Labels: test-framework
[jira] [Created] (IGNITE-17512) Load logger and test configuration from the classpath.
Pavel Pereslegin created IGNITE-17512:
--------------------------------------

             Summary: Load logger and test configuration from the classpath.
                 Key: IGNITE-17512
                 URL: https://issues.apache.org/jira/browse/IGNITE-17512
             Project: Ignite
          Issue Type: Improvement
            Reporter: Pavel Pereslegin
            Assignee: Pavel Pereslegin

We currently only load log4j2-test.xml/test.properties from the ignite config directory. These files are not included in the test jar.
This is very inconvenient for modules that use the Ignite test framework (extensions, for the most part).
Suggestion:
1. include log4j2-test.xml and test.properties into the test jar.
2. if log4j2-test.xml/test.properties was not found in the ignite config directory - load them from the classpath.
[jira] [Commented] (IGNITE-17455) IndexQuery should support setPartition
[ https://issues.apache.org/jira/browse/IGNITE-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578015#comment-17578015 ]

Maksim Timonin commented on IGNITE-17455:
-----------------------------------------

The timed-out PDS (Indexing) suite is a flaky one; all tests passed. [~ivandasch] thanks for the review, merged to master.

> IndexQuery should support setPartition
> --------------------------------------
>
>             Key: IGNITE-17455
>             URL: https://issues.apache.org/jira/browse/IGNITE-17455
>         Project: Ignite
>      Issue Type: New Feature
>        Reporter: Maksim Timonin
>        Assignee: Maksim Timonin
>        Priority: Major
>          Labels: IEP-71
>      Time Spent: 50m
> Remaining Estimate: 0h
>
> Currently IndexQuery doesn't support querying a specified partition, while other types of queries - ScanQuery, SqlFieldsQuery - provide this option.
> It's a useful option for working with affinity requests: IndexQuery should then work over a single partition.
> To make it possible to migrate to IndexQuery from the other queries, let's add this capability.
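[Editor's note] The idea behind setPartition is that keys map deterministically to partitions, so a partition-scoped query visits only one partition's data. A self-contained sketch of that mapping and filtering; everything here (names, the hash-mod partition function) is illustrative, not Ignite's actual affinity function or IndexQuery API:

```java
import java.util.List;
import java.util.stream.Collectors;

public class PartitionQuerySketch {
    /** Maps a key to a partition the way affinity functions typically do. */
    public static int partition(Object key, int parts) {
        return Math.floorMod(key.hashCode(), parts);
    }

    /**
     * Models a query with an explicit partition: only keys that map to the
     * requested partition are visited.
     */
    public static List<Integer> queryPartition(List<Integer> keys, int parts, int part) {
        return keys.stream()
            .filter(k -> partition(k, parts) == part)
            .collect(Collectors.toList());
    }
}
```

A caller doing an affinity-colocated computation would compute the partition of its key once and scope every query to it, avoiding a cluster-wide scan.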
[jira] [Commented] (IGNITE-17507) Failed to wait for partition map exchange on some clients
[ https://issues.apache.org/jira/browse/IGNITE-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578012#comment-17578012 ]

Vyacheslav Koptilin commented on IGNITE-17507:
----------------------------------------------

A test that demonstrates the issue can be found here: _CacheLateAffinityAssignmentTest.testDelayAssignmentAffinityChangedUnexpectedPME_

> Failed to wait for partition map exchange on some clients
> ---------------------------------------------------------
>
>             Key: IGNITE-17507
>             URL: https://issues.apache.org/jira/browse/IGNITE-17507
>         Project: Ignite
>      Issue Type: Bug
>        Reporter: Vyacheslav Koptilin
>        Assignee: Vyacheslav Koptilin
>        Priority: Major
>      Time Spent: 10m
> Remaining Estimate: 0h
>
> We have a scenario with several client and server nodes, which can get stuck on PME after start:
> * Start some server nodes
> * Trigger rebalance
> * Start some client and server nodes
> * Some of the client nodes get stuck with _Failed to wait for partition map exchange [topVer=AffinityTopologyVersion..._
> Deep investigation of the logs showed that the root cause of the stuck PME on a client is a race between a new client node joining and receiving a stale _CacheAffinityChangeMessage_ on that client, which causes a PME; but when the other, older nodes receive this _CacheAffinityChangeMessage_, they skip it because of an optimization.
> The optimization can be found in the method _CacheAffinitySharedManager#onDiscoveryEvent_: we save _lastAffVer = topVer_ on the old nodes, but because of the race _lastAffVer_ for the problem client node is null when we reach _CacheAffinitySharedManager#onCustomEvent_, so we schedule an invalid PME in _msg.exchangeNeeded(exchangeNeeded)_ while the other nodes skip this PME.
> A possible fix is to make the _CacheAffinityChangeMessage_ mutable (a mutable discovery custom message). This allows the message to be modified before it is sent across the ring. With this approach client nodes do not have to decide whether to apply or skip the message; the required flag is transferred from a server node.
> In the case of Zookeeper Discovery there is no ability to mutate discovery messages. However, it is possible to mutate the message on the coordinator node (this requires adding back the _stopProcess_ flag in _DiscoveryCustomMessage_, which was removed by IGNITE-12400). This is quite enough for our case.
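[Editor's note] The proposed fix - have the coordinator mutate a stale message so every node, including clients, skips it uniformly - can be sketched as follows. All names here are hypothetical stand-ins, not the real _DiscoveryCustomMessage_ API:

```java
public class MutableMessageSketch {
    /** Stand-in for a mutable discovery custom message. */
    public static class AffinityChangeMessage {
        public boolean exchangeNeeded = true; // whether receivers schedule a PME
        public boolean stopProcess;           // stop processing further down the ring
    }

    /**
     * Coordinator-side mutation: if the affinity-change message is stale,
     * mark it so that no node (server or client) schedules a PME for it.
     * The decision is made once, on the coordinator, instead of racily on
     * each client.
     */
    public static AffinityChangeMessage mutateOnCoordinator(AffinityChangeMessage m, boolean stale) {
        if (stale) {
            m.exchangeNeeded = false;
            m.stopProcess = true;
        }
        return m;
    }
}
```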
[jira] [Commented] (IGNITE-17455) IndexQuery should support setPartition
[ https://issues.apache.org/jira/browse/IGNITE-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578010#comment-17578010 ]

Ignite TC Bot commented on IGNITE-17455:
----------------------------------------

{panel:title=Branch: [pull/10182/head] Base: [master] : Possible Blockers (1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}PDS (Indexing){color} [[tests 0 TIMEOUT , Exit Code |https://ci.ignite.apache.org/viewLog.html?buildId=6722405]]
{panel}
{panel:title=Branch: [pull/10182/head] Base: [master] : New Tests (16)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}Index Query API{color} [[tests 16|https://ci.ignite.apache.org/viewLog.html?buildId=6722380]]
* {color:#013220}IndexQueryTestSuite: IndexQueryPartitionTest.testSinglePartition[mode=REPLICATED, client=true] - PASSED{color}
* {color:#013220}IndexQueryTestSuite: IndexQueryPartitionTest.testSetNullNotAffect[mode=REPLICATED, client=true] - PASSED{color}
* {color:#013220}IndexQueryTestSuite: IndexQueryPartitionTest.testLocalWithPartition[mode=REPLICATED, client=true] - PASSED{color}
* {color:#013220}IndexQueryTestSuite: IndexQueryPartitionTest.testNegativePartitionFails[mode=REPLICATED, client=true] - PASSED{color}
* {color:#013220}IndexQueryTestSuite: IndexQueryPartitionTest.testSinglePartition[mode=REPLICATED, client=false] - PASSED{color}
* {color:#013220}IndexQueryTestSuite: IndexQueryPartitionTest.testSetNullNotAffect[mode=REPLICATED, client=false] - PASSED{color}
* {color:#013220}IndexQueryTestSuite: IndexQueryPartitionTest.testLocalWithPartition[mode=REPLICATED, client=false] - PASSED{color}
* {color:#013220}IndexQueryTestSuite: IndexQueryPartitionTest.testNegativePartitionFails[mode=REPLICATED, client=false] - PASSED{color}
* {color:#013220}IndexQueryTestSuite: IndexQueryPartitionTest.testSinglePartition[mode=PARTITIONED, client=true] - PASSED{color}
* {color:#013220}IndexQueryTestSuite: IndexQueryPartitionTest.testSetNullNotAffect[mode=PARTITIONED, client=true] - PASSED{color}
* {color:#013220}IndexQueryTestSuite: IndexQueryPartitionTest.testLocalWithPartition[mode=PARTITIONED, client=true] - PASSED{color}
... and 5 new tests
{panel}
[TeamCity *-- Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=6722442&buildTypeId=IgniteTests24Java8_RunAll]

> IndexQuery should support setPartition
> --------------------------------------
>
>             Key: IGNITE-17455
>             URL: https://issues.apache.org/jira/browse/IGNITE-17455
>         Project: Ignite
>      Issue Type: New Feature
>        Reporter: Maksim Timonin
>        Assignee: Maksim Timonin
>        Priority: Major
>          Labels: IEP-71
>      Time Spent: 40m
> Remaining Estimate: 0h
[jira] [Updated] (IGNITE-17228) Get rid of the checkpoint markers
[ https://issues.apache.org/jira/browse/IGNITE-17228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kirill Tkalenko updated IGNITE-17228:
-------------------------------------
    Reviewer:   (was: Roman Puchkovskiy)

> Get rid of the checkpoint markers
> ---------------------------------
>
>             Key: IGNITE-17228
>             URL: https://issues.apache.org/jira/browse/IGNITE-17228
>         Project: Ignite
>      Issue Type: Improvement
>        Reporter: Kirill Tkalenko
>        Priority: Major
>          Labels: ignite-3
>             Fix For: 3.0.0-alpha6
>
> In 2.0, each checkpoint creates *start* and *end* markers on disk to perform a binary recovery if the node crashed in the middle of the checkpoint (there is no *end* marker), which allows us to preserve the consistency of the *index.bin* files (a special cache group file that stores all the indexes of the group).
> Since there is no WAL in 3.0 and indexes will be in each partition file, there is no need for checkpoint markers.
>
> What should be done:
> * Get rid of *org.apache.ignite.internal.pagememory.persistence.checkpoint.CheckpointMarkersStorage*;
> * Remove related tests;
> * If logical recovery is not yet supported, then we need to drop the node (throw exceptions at the start of the node) if we crashed in the middle of the checkpoint.
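[Editor's note] The 2.x-style marker scheme described above amounts to a simple invariant: a *start* marker without a matching *end* marker means the node crashed mid-checkpoint. A minimal sketch of that check; the directory layout and file names are ours, not Ignite's actual CheckpointMarkersStorage format:

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class CheckpointMarkerCheck {
    /**
     * Returns true if checkpoint {@code cpId} left a START marker on disk
     * without a matching END marker, i.e. the node died mid-checkpoint and
     * (absent binary recovery) should refuse to start.
     */
    public static boolean crashedMidCheckpoint(Path markerDir, String cpId) {
        boolean started = Files.exists(markerDir.resolve(cpId + "-START.bin"));
        boolean ended = Files.exists(markerDir.resolve(cpId + "-END.bin"));
        return started && !ended;
    }
}
```

The ticket's last bullet is exactly this check with the recovery path replaced by a thrown exception at node start.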
[jira] (IGNITE-17228) Get rid of the checkpoint markers
[ https://issues.apache.org/jira/browse/IGNITE-17228 ]

Kirill Tkalenko deleted comment on IGNITE-17228:
-------------------------------------------------

was (Author: ktkale...@gridgain.com): [~rpuch] Please make code review.

> Get rid of the checkpoint markers
> ---------------------------------
>
>             Key: IGNITE-17228
>             URL: https://issues.apache.org/jira/browse/IGNITE-17228
>         Project: Ignite
>      Issue Type: Improvement
>        Reporter: Kirill Tkalenko
>        Priority: Major
>          Labels: ignite-3
>             Fix For: 3.0.0-alpha6
[jira] [Updated] (IGNITE-17511) Support IndexQuery for Java ThinClient
[ https://issues.apache.org/jira/browse/IGNITE-17511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maksim Timonin updated IGNITE-17511:
------------------------------------
    Summary: Support IndexQuery for Java ThinClient  (was: Support IndexQuery for ThinClient)

> Support IndexQuery for Java ThinClient
> --------------------------------------
>
>             Key: IGNITE-17511
>             URL: https://issues.apache.org/jira/browse/IGNITE-17511
>         Project: Ignite
>      Issue Type: New Feature
>        Reporter: Maksim Timonin
>        Assignee: Maksim Timonin
>        Priority: Major
>          Labels: IEP-71
>             Fix For: 2.14
>
> ThinClient doesn't support IndexQuery. Let's fix it.
[jira] [Created] (IGNITE-17511) Support IndexQuery for ThinClient
Maksim Timonin created IGNITE-17511: --- Summary: Support IndexQuery for ThinClient Key: IGNITE-17511 URL: https://issues.apache.org/jira/browse/IGNITE-17511 Project: Ignite Issue Type: New Feature Reporter: Maksim Timonin Assignee: Maksim Timonin Fix For: 2.14 ThinClient doesn't support IndexQuery. Let's fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17455) IndexQuery should support setPartition
[ https://issues.apache.org/jira/browse/IGNITE-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maksim Timonin updated IGNITE-17455: Labels: IEP-71 (was: ) > IndexQuery should support setPartition > -- > > Key: IGNITE-17455 > URL: https://issues.apache.org/jira/browse/IGNITE-17455 > Project: Ignite > Issue Type: New Feature >Reporter: Maksim Timonin >Assignee: Maksim Timonin >Priority: Major > Labels: IEP-71 > Time Spent: 40m > Remaining Estimate: 0h > > Currently IndexQuery doesn't support querying a specified partition, while other > query types - ScanQuery, SqlFieldsQuery - provide this option. > It's a useful option for working with affinity requests, where IndexQuery should > work over a single partition. > To make it possible to migrate to IndexQuery from other queries, let's add > this capability. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17476) Implement configurations event handling by index manager
[ https://issues.apache.org/jira/browse/IGNITE-17476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577964#comment-17577964 ] Andrey Mashenkov commented on IGNITE-17476: --- [~korlov], LGTM! > Implement configurations event handling by index manager > > > Key: IGNITE-17476 > URL: https://issues.apache.org/jira/browse/IGNITE-17476 > Project: Ignite > Issue Type: Improvement > Components: sql >Reporter: Konstantin Orlov >Assignee: Konstantin Orlov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > Time Spent: 10m > Remaining Estimate: 0h > > Need to implement a listener for configuration events to reflect the state of > the configuration and create all necessary runtime structures. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-17428) Race between creating table and getting table, between creating schema and getting schema
[ https://issues.apache.org/jira/browse/IGNITE-17428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin reassigned IGNITE-17428: Assignee: Mirza Aliev > Race between creating table and getting table, between creating schema and > getting schema > - > > Key: IGNITE-17428 > URL: https://issues.apache.org/jira/browse/IGNITE-17428 > Project: Ignite > Issue Type: Bug >Reporter: Denis Chudov >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > > The current version of TableManager#tableAsyncInternal may fail to detect a > table that is being created while tableAsyncInternal is called. Scenario: > - tableAsyncInternal checks tablesByIdVv.latest() and there is no table > - the table creation starts, and the table metadata appears in the meta storage > - TableEvent.CREATE is fired > - tableAsyncInternal registers a listener for TableEvent.CREATE (after it has > already been fired for the corresponding table) > - tableAsyncInternal checks tablesByIdVv.latest() once again and there is still > no table, because the table creation is not completed > - the {{!isTableConfigured(id)}} condition returns *false* as the table is > present in the meta storage > - {{if (tbl != null && getTblFut.complete(tbl) || !isTableConfigured(id) && > getTblFut.complete(null))}} evaluates to *false* and the future created for > getTable never completes. > Possibly we should use VersionedValue#whenComplete instead of registering a > listener for the event. The table is present in the map wrapped in the versioned > value only when the table creation is completed, and whenComplete allows creating > a callback to check the table presence. 
> The same problem is present for {{SchemaManager}} when we get a schema in > {{SchemaManager#tableSchema}} > A possible fix for {{SchemaManager}} is to use this pattern > {code:java} > registriesVv.whenComplete((token, val, e) -> { > if (schemaVer <= val.get(tblId).lastSchemaVersion()) { > fut.complete(getSchemaDescriptorLocally(schemaVer, tblCfg)); > } > }); > {code} > instead of registering a listener for the CREATE event. The same approach can be > used for {{TableManager}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
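The whenComplete-style fix suggested above can be sketched with plain java.util.concurrent.CompletableFuture standing in for Ignite's VersionedValue. All class and method names below are illustrative, not the actual Ignite API; the point is only that a presence check expressed as a callback on the completed value cannot miss a CREATE event that fired before the listener was registered.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for VersionedValue: one future per causality token,
// completed only when the table map for that token is fully populated.
public class WhenCompleteSketch {
    private final Map<Long, CompletableFuture<Map<Integer, String>>> tablesByToken =
            new ConcurrentHashMap<>();

    private CompletableFuture<Map<Integer, String>> valueAt(long causalityToken) {
        return tablesByToken.computeIfAbsent(causalityToken, t -> new CompletableFuture<>());
    }

    // Analogue of registriesVv.whenComplete(...): the presence check runs strictly
    // after the value for the token is complete, so the "listener registered after
    // the event fired" window from the scenario above disappears.
    public CompletableFuture<String> tableAsync(long causalityToken, int tableId) {
        return valueAt(causalityToken).thenApply(tables -> tables.get(tableId)); // null => not configured
    }

    // Called by the "manager" side once table creation for the token is fully done.
    public void completeToken(long causalityToken, Map<Integer, String> tables) {
        valueAt(causalityToken).complete(tables);
    }
}
```

A caller that asks for a table before creation finishes simply gets a future that completes later, instead of a future that never completes.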
[jira] [Updated] (IGNITE-17510) NPE in cluster configuration REST calls
[ https://issues.apache.org/jira/browse/IGNITE-17510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr updated IGNITE-17510: --- Description: When calling {{/management/v1/configuration/cluster}} on a cluster that is not initialized, we get an NPE and, as a result, a 500 error code is returned. {{ItNotInitializedClusterRestTest#clusterConfiguration}} and {{ItNotInitializedClusterRestTest#clusterConfigurationUpdate}} reproduce the issue and have TODOs for that. I would suggest returning 404 in that case, as {{/management/v1/cluster/topology/logical}} does. So, there is no such resource on a cluster that is not initialized. The {{cluster config show/update}} error handling should be updated as well. was: When calling {{/management/v1/configuration/cluster}} on the cluster that is not initialized, than we got the NPE and as a result, 500 error code is returned. {{ItNotInitializedClusterRestTest#clusterConfiguration}} and {{ItNotInitializedClusterRestTest#clusterConfigurationUpdate}} reproduce the issue and have TODO for that. I would suggest to return 404 in that case as {{/management/v1/cluster/topology/logical}} does. So, there is no such resource on the cluster that is not initialized. The {{cluster config show/update}} error handling should be updated as well. > NPE in cluster configuration REST calls > --- > > Key: IGNITE-17510 > URL: https://issues.apache.org/jira/browse/IGNITE-17510 > Project: Ignite > Issue Type: Task > Components: cli, ignite-3, rest >Reporter: Aleksandr >Priority: Major > > When calling {{/management/v1/configuration/cluster}} on a cluster that is > not initialized, we get an NPE and, as a result, a 500 error code is > returned. > {{ItNotInitializedClusterRestTest#clusterConfiguration}} and > {{ItNotInitializedClusterRestTest#clusterConfigurationUpdate}} reproduce the > issue and have TODOs for that. > I would suggest returning 404 in that case, as > {{/management/v1/cluster/topology/logical}} does. 
So, there is no such > resource on the cluster that is not initialized. > The {{cluster config show/update}} error handling should be updated as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-17510) NPE in cluster configuration REST calls
Aleksandr created IGNITE-17510: -- Summary: NPE in cluster configuration REST calls Key: IGNITE-17510 URL: https://issues.apache.org/jira/browse/IGNITE-17510 Project: Ignite Issue Type: Task Components: cli, ignite-3, rest Reporter: Aleksandr When calling {{/management/v1/configuration/cluster}} on a cluster that is not initialized, we get an NPE and, as a result, a 500 error code is returned. {{ItNotInitializedClusterRestTest#clusterConfiguration}} and {{ItNotInitializedClusterRestTest#clusterConfigurationUpdate}} reproduce the issue and have TODOs for that. I would suggest returning 404 in that case, as {{/management/v1/cluster/topology/logical}} does. So, there is no such resource on a cluster that is not initialized. The {{cluster config show/update}} error handling should be updated as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17394) Implement API for getting partition mapping
[ https://issues.apache.org/jira/browse/IGNITE-17394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577944#comment-17577944 ] Igor Sapego commented on IGNITE-17394: -- [~alapin] The first approach when operation succeeds but returns a new distribution is suitable for our scenario, the second one with exception is not. > Implement API for getting partition mapping > --- > > Key: IGNITE-17394 > URL: https://issues.apache.org/jira/browse/IGNITE-17394 > Project: Ignite > Issue Type: New Feature >Affects Versions: 3.0.0-alpha5 >Reporter: Igor Sapego >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > > To implement Partition Awareness feature for clients, we need an internal or > public API that will provide us with the following mapping: [partition => > node id] (or [node id => partitions]). > We also need a lightweight mechanism that will allow us to discover that this > distribution has changed. In 2.x we used a topology version for this purpose, > assuming that if topology version has changed, partition distribution should > be refreshed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
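The 2.x-style heuristic described in the ticket (refresh the partition distribution whenever the observed topology version changes) can be sketched in a few lines of plain Java. All names here are hypothetical, not the actual client API:

```java
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical client-side cache of the [partition -> node id] mapping,
// invalidated when the topology version observed in server responses changes.
public class PartitionMappingCache {
    private long cachedTopVer = -1;
    private Map<Integer, String> mapping = Map.of();

    // Returns the cached mapping, refreshing it via the supplied loader only
    // when the observed topology version differs from the cached one.
    public synchronized Map<Integer, String> mapping(long observedTopVer,
            LongFunction<Map<Integer, String>> loader) {
        if (observedTopVer != cachedTopVer) {
            mapping = loader.apply(observedTopVer); // one remote fetch per topology change
            cachedTopVer = observedTopVer;
        }
        return mapping;
    }
}
```

The loader is invoked at most once per topology version, which is exactly the lightweight change-detection property the ticket asks for.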
[jira] [Updated] (IGNITE-17509) [Extensions] Spring Data pageable request result contains incorrect total value.
[ https://issues.apache.org/jira/browse/IGNITE-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Petrov updated IGNITE-17509: Description: Assume that a Spring Data repository contains the following method {code:java} public Page findByFirstNameContaining(String val, Pageable pageable); {code} In this case the following checks will fail {code:java} Page res = repo.findByFirstNameContaining("person", PageRequest.of(2, 100)); assertEquals(CACHE_SIZE, res.getTotalElements()); {code} where 'repo' is an instance of the previously mentioned repository. The full reproducer is attached. The main reason for this behaviour is that IgniteRepositoryQuery.java:614 does not make a separate request for the total row count and just sets the Page 'total' value to 0. See also the org.springframework.data.domain.PageImpl#PageImpl(java.util.List, org.springframework.data.domain.Pageable, long) logic to understand how the final result of 'getTotalElements()' is calculated. In brief, the result will contain the sum of the offset and the count of values in the last page. It seems that as a workaround, you can explicitly request the total number of rows with a separate query. was: Assume that a Spring Data repository contains the following method {code:java} public Page findByFirstNameContaining(String val, Pageable pageable); {code} In this case the following checks will fail {code:java} Page res = repo.findByFirstNameContaining("person", PageRequest.of(2, 100)); assertEquals(CACHE_SIZE, res.getTotalElements()); {code} where 'repo' is an instance of the previously mentioned repository. The full reproducer is attached. The main reason for this behaviour is that IgniteRepositoryQuery.java:614 does not make a separate request for the total row count and just sets the Page 'total' value to 0. 
See also the org.springframework.data.domain.PageImpl#PageImpl(java.util.List, org.springframework.data.domain.Pageable, long) logic to understand how the final result of 'getTotalElements()' is calculated. It seems that as a workaround, you can explicitly request the total number of rows with a separate query. > [Extensions] Spring Data pageable request result contains incorrect total > value. > > > Key: IGNITE-17509 > URL: https://issues.apache.org/jira/browse/IGNITE-17509 > Project: Ignite > Issue Type: Bug >Reporter: Mikhail Petrov >Priority: Major > Labels: ise > Attachments: Reproduces_incorrect_pageable_request_total_value_.patch > > > Assume that a Spring Data repository contains the following method > {code:java} > public Page findByFirstNameContaining(String val, Pageable > pageable); > {code} > In this case the following checks will fail > {code:java} > Page res = repo.findByFirstNameContaining("person", > PageRequest.of(2, 100)); > assertEquals(CACHE_SIZE, res.getTotalElements()); > {code} > where 'repo' is an instance of the previously mentioned repository. > The full reproducer is attached. > The main reason for this behaviour is that IgniteRepositoryQuery.java:614 > does not make a separate request for the total row count and just sets the Page > 'total' value to 0. > See also the org.springframework.data.domain.PageImpl#PageImpl(java.util.List, > org.springframework.data.domain.Pageable, long) logic to understand how > the final result of 'getTotalElements()' is calculated. In brief, the result > will contain the sum of the offset and the count of values in the last page. > It seems that as a workaround, you can explicitly request the total number of > rows with a separate query. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
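To see why the assertion above fails, here is a small stdlib-only sketch of the fallback that, as I read it, Spring's PageImpl constructor applies: when the page is non-empty and the supplied total is smaller than what the requested page window implies, getTotalElements() falls back to offset + number of elements actually in the page. The totalElements method below is illustrative, not Spring API:

```java
// Approximates the PageImpl(List, Pageable, long) total-fallback logic:
// a page at offset 200 containing 100 rows proves at least 300 rows exist,
// so a reported total of 0 is overridden with offset + contentSize.
public class PageTotalSketch {
    public static long totalElements(long offset, int pageSize, int contentSize, long reportedTotal) {
        if (contentSize > 0 && offset + pageSize > reportedTotal)
            return offset + contentSize; // fallback: what the page itself proves
        return reportedTotal;            // otherwise trust the reported count
    }
}
```

With PageRequest.of(2, 100) (offset 200) and a reported total of 0, this yields 300, not CACHE_SIZE, which matches the failing check in the report.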
[jira] [Updated] (IGNITE-17509) [Extensions] Spring Data pageable request result contains incorrect total value.
[ https://issues.apache.org/jira/browse/IGNITE-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Gidaspov updated IGNITE-17509: - Labels: ise (was: ) > [Extensions] Spring Data pageable request result contains incorrect total > value. > > > Key: IGNITE-17509 > URL: https://issues.apache.org/jira/browse/IGNITE-17509 > Project: Ignite > Issue Type: Bug >Reporter: Mikhail Petrov >Priority: Major > Labels: ise > Attachments: Reproduces_incorrect_pageable_request_total_value_.patch > > > Assume that a Spring Data repository contains the following method > {code:java} > public Page findByFirstNameContaining(String val, Pageable > pageable); > {code} > In this case the following checks will fail > {code:java} > Page res = repo.findByFirstNameContaining("person", > PageRequest.of(2, 100)); > assertEquals(CACHE_SIZE, res.getTotalElements()); > {code} > where 'repo' is an instance of the previously mentioned repository. > The full reproducer is attached. > The main reason for this behaviour is that IgniteRepositoryQuery.java:614 > does not make a separate request for the total row count and just sets the Page > 'total' value to 0. > See also the org.springframework.data.domain.PageImpl#PageImpl(java.util.List, > org.springframework.data.domain.Pageable, long) logic to understand how > the final result of 'getTotalElements()' is calculated. > It seems that as a workaround, you can explicitly request the total number of > rows with a separate query. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17499) Service method invocation exception is not propagated to thin client side.
[ https://issues.apache.org/jira/browse/IGNITE-17499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Gidaspov updated IGNITE-17499: - Labels: ise (was: ) > Service method invocation exception is not propagated to thin client side. > --- > > Key: IGNITE-17499 > URL: https://issues.apache.org/jira/browse/IGNITE-17499 > Project: Ignite > Issue Type: Bug >Reporter: Mikhail Petrov >Assignee: Mikhail Petrov >Priority: Minor > Labels: ise > > https://issues.apache.org/jira/browse/IGNITE-13389 introduced a dedicated flag > that makes it possible to propagate the server-side stacktrace to the thin client > side. The above-mentioned propagation does not work for exceptions that > arise during Ignite Service invocation. > Steps to reproduce: > 1. Start a .Net Ignite node > 2. Deploy a service whose invocation throws an arbitrary uncaught exception > 3. Invoke the previously deployed service via the Java thin client > As a result, information about the custom code exception is not present in > the exception stacktrace that is thrown after the service call. > The main reason for such behaviour is that > ClientServiceInvokeRequest.java:198 does not propagate the initial exception. So > ClientRequestHandler#handleException cannot handle the exception properly even > if ThinClientConfiguration#sendServerExceptionStackTraceToClient() is enabled. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17507) Failed to wait for partition map exchange on some clients
[ https://issues.apache.org/jira/browse/IGNITE-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin updated IGNITE-17507: - Description: We have a scenario with several client and server nodes, which can get stuck on PME after start: * Start some server nodes * Trigger rebalance * Start some client and server nodes * Some of the client nodes get stuck with _Failed to wait for partition map exchange [topVer=AffinityTopologyVersion…_ A deep investigation of the logs showed that the root cause of the stuck PME on a client is a race between a new client node joining and receiving a stale _CacheAffinityChangeMessage_ on the client, which causes a PME; but when other old nodes receive this _CacheAffinityChangeMessage_, they skip it because of an optimization. The optimization can be found in the method _CacheAffinitySharedManager#onDiscoveryEvent_: we save _lastAffVer = topVer_ for old nodes, but because of a race _lastAffVer_ for the problem client node is null when we reach _CacheAffinitySharedManager#onCustomEvent_, and we schedule an invalid PME in _msg.exchangeNeeded(exchangeNeeded)_, while other nodes skip this PME. A possible fix is to make the _CacheAffinityChangeMessage_ mutable (a mutable discovery custom message). That allows modifying the message before sending it across the ring. This approach does not require making a decision to apply or skip the message on client nodes; the required flag will be transferred from a server node. When Zookeeper Discovery is used, there is no ability to mutate discovery messages. However, it is possible to mutate the message on the coordinator node (this requires adding the _stopProcess_ flag to _DiscoveryCustomMessage_, which was removed by IGNITE-12400). This is quite enough for our case. 
was: We have scenario with several client and server nodes, which can stuck on PME after start: * Start some server nodes * Trigger rebalance * Start some client and server nodes * Some of the client nodes stuck with _Failed to wait for partition map exchange [topVer=AffinityTopologyVersion…_ Deep investigation of the logs showed, that the root cause of the stuck PME on client is the race between joining new client node and receiving stale _CacheAffinityChangeMessage_ on a client, which causes PME, but when other old nodes receive this _CacheAffinityChangeMessage_, they skip it because of some optimization. Optimization can be found in the method _CacheAffinitySharedManager#onDiscoveryEvent_, we save _lastAffVer = topVer_ for old nodes, but because of some race _lastAffVer_ for the problem client node is null when we reach _CacheAffinitySharedManager#onCustomEvent_ and we schedule invalid PME in _msg.exchangeNeeded(exchangeNeeded)_, but other nodes skip this PME The possible fix is that we can try to make the _CacheAffinityChangeMessage_ mutable (mutable discovery custom message). It allows to modify the message before sending it across the ring. This approach does not require to make a decision to apply or skip the message on client nodes, the required flag will be transferred from a server node. In case of using Zookeeper Discovery, there is no ability to mutate discovery messages. However is is possible to mutate the message on the coordinator node. This is quite enough for our case. 
> Failed to wait for partition map exchange on some clients > - > > Key: IGNITE-17507 > URL: https://issues.apache.org/jira/browse/IGNITE-17507 > Project: Ignite > Issue Type: Bug >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Major > > We have a scenario with several client and server nodes, which can get stuck on PME > after start: > * Start some server nodes > * Trigger rebalance > * Start some client and server nodes > * Some of the client nodes get stuck with _Failed to wait for partition map > exchange [topVer=AffinityTopologyVersion…_ > A deep investigation of the logs showed that the root cause of the stuck PME > on a client is a race between a new client node joining and receiving a stale > _CacheAffinityChangeMessage_ on the client, which causes a PME; but when other > old nodes receive this _CacheAffinityChangeMessage_, they skip it because of > an optimization. > The optimization can be found in the method > _CacheAffinitySharedManager#onDiscoveryEvent_: we save _lastAffVer = topVer_ > for old nodes, but because of a race _lastAffVer_ for the problem client > node is null when we reach _CacheAffinitySharedManager#onCustomEvent_, and we > schedule an invalid PME in _msg.exchangeNeeded(exchangeNeeded)_, while other > nodes skip this PME. > A possible fix is to make the _CacheAffinityChangeMessage_ > mutable (a mutable discovery custom message). It allows modifying the message >
[jira] [Created] (IGNITE-17509) [Extensions] Spring Data pageable request result contains incorrect total value.
Mikhail Petrov created IGNITE-17509: --- Summary: [Extensions] Spring Data pageable request result contains incorrect total value. Key: IGNITE-17509 URL: https://issues.apache.org/jira/browse/IGNITE-17509 Project: Ignite Issue Type: Bug Reporter: Mikhail Petrov Attachments: Reproduces_incorrect_pageable_request_total_value_.patch Assume that a Spring Data repository contains the following method {code:java} public Page findByFirstNameContaining(String val, Pageable pageable); {code} In this case the following checks will fail {code:java} Page res = repo.findByFirstNameContaining("person", PageRequest.of(2, 100)); assertEquals(CACHE_SIZE, res.getTotalElements()); {code} where 'repo' is an instance of the previously mentioned repository. The full reproducer is attached. The main reason for this behaviour is that IgniteRepositoryQuery.java:614 does not make a separate request for the total row count and just sets the Page 'total' value to 0. See also the org.springframework.data.domain.PageImpl#PageImpl(java.util.List, org.springframework.data.domain.Pageable, long) logic to understand how the final result of 'getTotalElements()' is calculated. It seems that as a workaround, you can explicitly request the total number of rows with a separate query. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (IGNITE-17394) Implement API for getting partition mapping
[ https://issues.apache.org/jira/browse/IGNITE-17394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577938#comment-17577938 ] Alexander Lapin edited comment on IGNITE-17394 at 8/10/22 11:16 AM: [~isapego] >Will I get an exception in this case? Why? It depends on the API you are going to use. Generally speaking, as far as I remember, partition pruning logic assumes that there's some extra data transferred between client and server in order to mark given request as partition aware. So that if partition aware request missed target primary replicas because they were changed it will succeed in processing the request itself (extra rerouting included) and return new distribution besides common result. However if you will use ReplicaService interface directly with expected replica as target cluster node you will actually get PrimaryReplicaMissException. was (Author: alapin): [~isapego] >Will I get an exception in this case? Why? It depends on the API you are going to use. Generally speaking, as far as I remember, partition pruning logic assumes that there's some extra data transferred between client and server in order to mark request as partition aware. So that if partition aware request missed target primary replicas because they were changed it will succeed in processing the request itself (extra rerouting included) and return new distribution besides common result. However if you will use ReplicaService interface directly with expected replica as target cluster node you will actually get PrimaryReplicaMissException. 
> Implement API for getting partition mapping > --- > > Key: IGNITE-17394 > URL: https://issues.apache.org/jira/browse/IGNITE-17394 > Project: Ignite > Issue Type: New Feature >Affects Versions: 3.0.0-alpha5 >Reporter: Igor Sapego >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > > To implement Partition Awareness feature for clients, we need an internal or > public API that will provide us with the following mapping: [partition => > node id] (or [node id => partitions]). > We also need a lightweight mechanism that will allow us to discover that this > distribution has changed. In 2.x we used a topology version for this purpose, > assuming that if topology version has changed, partition distribution should > be refreshed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17394) Implement API for getting partition mapping
[ https://issues.apache.org/jira/browse/IGNITE-17394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577938#comment-17577938 ] Alexander Lapin commented on IGNITE-17394: -- [~isapego] >Will I get an exception in this case? Why? It depends on the API you are going to use. Generally speaking, as far as I remember, partition pruning logic assumes that there's some extra data transferred between client and server in order to mark request as partition aware. So that if partition aware request missed target primary replicas because they were changed it will succeed in processing the request itself (extra rerouting included) and return new distribution besides common result. However if you will use ReplicaService interface directly with expected replica as target cluster node you will actually get PrimaryReplicaMissException. > Implement API for getting partition mapping > --- > > Key: IGNITE-17394 > URL: https://issues.apache.org/jira/browse/IGNITE-17394 > Project: Ignite > Issue Type: New Feature >Affects Versions: 3.0.0-alpha5 >Reporter: Igor Sapego >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > > To implement Partition Awareness feature for clients, we need an internal or > public API that will provide us with the following mapping: [partition => > node id] (or [node id => partitions]). > We also need a lightweight mechanism that will allow us to discover that this > distribution has changed. In 2.x we used a topology version for this purpose, > assuming that if topology version has changed, partition distribution should > be refreshed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17508) Exception handling in the partition replication listener for RAFT futures
[ https://issues.apache.org/jira/browse/IGNITE-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladislav Pyatkov updated IGNITE-17508: --- Description: In the replication listener ({_}PartitionReplicaListener{_}) where we have the pattern: {code:java} raftFut.thenApply(ignored -> result);{code} we should worry about handling RAFT exceptions, including analyzing the raftFut result. was: In the replication listener ({_}PartitionReplicaListener{_}) where we have the pattern: {code:java} raftFut.thenApply(ignored -> result);{code} we should worry about handling RAFT exceptions. > Exception handling in the partition replication listener for RAFT futures > - > > Key: IGNITE-17508 > URL: https://issues.apache.org/jira/browse/IGNITE-17508 > Project: Ignite > Issue Type: Improvement >Reporter: Vladislav Pyatkov >Priority: Major > Labels: ignite-3 > > In the replication listener ({_}PartitionReplicaListener{_}) where we have > the pattern: > {code:java} > raftFut.thenApply(ignored -> result);{code} > we should worry about handling RAFT exceptions, including analyzing the > raftFut result. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
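The concern can be illustrated with plain CompletableFuture: thenApply(ignored -> result) answers with the result even though it only runs when raftFut succeeded, and it inspects neither the RAFT value nor (on failure) surfaces the cause in a controlled way. handle(...) exposes both. ReplicationException is a hypothetical wrapper name used only for this sketch:

```java
import java.util.concurrent.CompletableFuture;

// Sketch: replace raftFut.thenApply(ignored -> result) with a handle(...)
// that surfaces the RAFT failure instead of leaving it to default propagation.
public class RaftFutureSketch {
    public static class ReplicationException extends RuntimeException {
        public ReplicationException(Throwable cause) { super(cause); }
    }

    public static <T> CompletableFuture<T> applyResult(CompletableFuture<?> raftFut, T result) {
        return raftFut.handle((raftRes, err) -> {
            if (err != null)
                throw new ReplicationException(err); // wrap and rethrow the RAFT failure
            // here raftRes could also be analyzed before answering the replica request
            return result;
        });
    }
}
```

Joining the returned future after a RAFT failure yields a CompletionException whose cause is the explicit wrapper, rather than an opaque upstream error.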
[jira] [Created] (IGNITE-17508) Exception handling in the partition replication listener for RAFT futures
Vladislav Pyatkov created IGNITE-17508: -- Summary: Exception handling in the partition replication listener for RAFT futures Key: IGNITE-17508 URL: https://issues.apache.org/jira/browse/IGNITE-17508 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov In the replication listener ({_}PartitionReplicaListener{_}) where we have the pattern: {code:java} raftFut.thenApply(ignored -> result);{code} we should worry about handling RAFT exceptions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-17431) Sql. Index support by optimizer
[ https://issues.apache.org/jira/browse/IGNITE-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Mashenkov reassigned IGNITE-17431: - Assignee: Andrey Mashenkov > Sql. Index support by optimizer > --- > > Key: IGNITE-17431 > URL: https://issues.apache.org/jira/browse/IGNITE-17431 > Project: Ignite > Issue Type: Improvement > Components: sql >Reporter: Konstantin Orlov >Assignee: Andrey Mashenkov >Priority: Major > Labels: ignite-3 > > We need to integrate indexes into the optimisation framework. This includes > the following parts: > - Integration of indexes into sql schema management: we need to provide a way > to discover indexes during the optimisation phase (see {{ExposeIndexRule}}) > - Integration of indexes into execution: we need to provide an execution node to > convert an IndexScan into (see {{LogicalRelImplementor#visit(IgniteIndexScan)}}) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-17499) Service method invocation exception is not propagated to thin client side.
[ https://issues.apache.org/jira/browse/IGNITE-17499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Petrov reassigned IGNITE-17499: --- Assignee: Mikhail Petrov > Service method invocation exception is not propagated to thin client side. > --- > > Key: IGNITE-17499 > URL: https://issues.apache.org/jira/browse/IGNITE-17499 > Project: Ignite > Issue Type: Bug >Reporter: Mikhail Petrov >Assignee: Mikhail Petrov >Priority: Minor > > https://issues.apache.org/jira/browse/IGNITE-13389 introduced a dedicated flag > that makes it possible to propagate the server-side stacktrace to the thin client > side. The above-mentioned propagation does not work for exceptions that > arise during Ignite Service invocation. > Steps to reproduce: > 1. Start a .Net Ignite node > 2. Deploy a service whose invocation throws an arbitrary uncaught exception > 3. Invoke the previously deployed service via the Java thin client > As a result, information about the custom code exception is not present in > the exception stacktrace that is thrown after the service call. > The main reason for such behaviour is that > ClientServiceInvokeRequest.java:198 does not propagate the initial exception. So > ClientRequestHandler#handleException cannot handle the exception properly even > if ThinClientConfiguration#sendServerExceptionStackTraceToClient() is enabled. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17507) Failed to wait for partition map exchange on some clients
[ https://issues.apache.org/jira/browse/IGNITE-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin updated IGNITE-17507: - Description: We have a scenario with several client and server nodes which can get stuck on PME after start: * Start some server nodes * Trigger rebalance * Start some client and server nodes * Some of the client nodes get stuck with _Failed to wait for partition map exchange [topVer=AffinityTopologyVersion…_ Deep investigation of the logs showed that the root cause of the stuck PME on a client is a race between a new client node joining and receiving a stale _CacheAffinityChangeMessage_ on that client, which causes a PME, while other old nodes that receive this _CacheAffinityChangeMessage_ skip it because of an optimization. The optimization can be found in the method _CacheAffinitySharedManager#onDiscoveryEvent_: we save _lastAffVer = topVer_ for old nodes, but because of a race _lastAffVer_ for the problem client node is still null when we reach _CacheAffinitySharedManager#onCustomEvent_, so we schedule an invalid PME in _msg.exchangeNeeded(exchangeNeeded)_ while other nodes skip this PME. A possible fix is to make the _CacheAffinityChangeMessage_ mutable (a mutable discovery custom message). This makes it possible to modify the message before sending it across the ring, so client nodes do not have to decide whether to apply or skip the message: the required flag is transferred from a server node. When Zookeeper Discovery is used, there is no ability to mutate discovery messages; however, it is possible to mutate the message on the coordinator node, which is quite enough for our case. 
> Failed to wait for partition map exchange on some clients > - > > Key: IGNITE-17507 > URL: https://issues.apache.org/jira/browse/IGNITE-17507 > Project: Ignite > Issue Type: Bug >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Major > > We have a scenario with several client and server nodes which can get stuck on PME > after start: > * Start some server nodes > * Trigger rebalance > * Start some client and server nodes > * Some of the client nodes get stuck with _Failed to wait for partition map > exchange [topVer=AffinityTopologyVersion…_ > Deep investigation of the logs showed that the root cause of the stuck PME > on a client is a race between a new client node joining and receiving a stale > _CacheAffinityChangeMessage_ on that client, which causes a PME, while other > old nodes that receive this _CacheAffinityChangeMessage_ skip it because of > an optimization. > The optimization can be found in the method > _CacheAffinitySharedManager#onDiscoveryEvent_: we save _lastAffVer = topVer_ > for old nodes, but because of a race _lastAffVer_ for the problem client > node is still null when we reach _CacheAffinitySharedManager#onCustomEvent_, so we > schedule an invalid PME in _msg.exchangeNeeded(exchangeNeeded)_ while other > nodes skip this PME. > A possible fix is to make the _CacheAffinityChangeMessage_ > mutable (a mutable discovery custom message). This makes it possible to modify the message > before sending it across the ring, so client nodes do not have to decide whether to apply or skip
[jira] [Updated] (IGNITE-17507) Failed to wait for partition map exchange on some clients
[ https://issues.apache.org/jira/browse/IGNITE-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin updated IGNITE-17507: - Ignite Flags: (was: Docs Required,Release Notes Required) > Failed to wait for partition map exchange on some clients > - > > Key: IGNITE-17507 > URL: https://issues.apache.org/jira/browse/IGNITE-17507 > Project: Ignite > Issue Type: Bug >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Major > > We have a scenario with several client and server nodes which can get stuck on PME > after start: > * Start some server nodes > * Trigger rebalance > * Start some client and server nodes > * Some of the client nodes get stuck with Failed to wait for partition map > exchange [topVer=AffinityTopologyVersion… > Deep investigation of the logs showed that the root cause of the stuck PME > on a client is a race between a new client node joining and receiving a stale > CacheAffinityChangeMessage on that client, which causes a PME, while other old > nodes that receive this CacheAffinityChangeMessage skip it because of an > optimization. > The optimization can be found in the method > CacheAffinitySharedManager#onDiscoveryEvent: we save lastAffVer = topVer; for > old nodes, but because of a race lastAffVer for the problem client node is > still null when we reach CacheAffinitySharedManager#onCustomEvent, so we schedule > an invalid PME in msg.exchangeNeeded(exchangeNeeded); while other nodes skip > this PME. > A possible fix is to make the CacheAffinityChangeMessage > mutable (a mutable discovery custom message). This makes it possible to modify the message > before sending it across the ring, so client nodes do not have to decide whether to apply or skip the message: the required flag will > be transferred from a server node. When Zookeeper Discovery is used, > there is no ability to mutate discovery messages; however, it is possible to > mutate the message on the coordinator node. 
This is quite enough for our > case. TeamCity does not demonstrate any issues with this approach. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-17507) Failed to wait for partition map exchange on some clients
Vyacheslav Koptilin created IGNITE-17507: Summary: Failed to wait for partition map exchange on some clients Key: IGNITE-17507 URL: https://issues.apache.org/jira/browse/IGNITE-17507 Project: Ignite Issue Type: Bug Reporter: Vyacheslav Koptilin Assignee: Vyacheslav Koptilin We have a scenario with several client and server nodes which can get stuck on PME after start: * Start some server nodes * Trigger rebalance * Start some client and server nodes * Some of the client nodes get stuck with Failed to wait for partition map exchange [topVer=AffinityTopologyVersion… Deep investigation of the logs showed that the root cause of the stuck PME on a client is a race between a new client node joining and receiving a stale CacheAffinityChangeMessage on that client, which causes a PME, while other old nodes that receive this CacheAffinityChangeMessage skip it because of an optimization. The optimization can be found in the method CacheAffinitySharedManager#onDiscoveryEvent: we save lastAffVer = topVer; for old nodes, but because of a race lastAffVer for the problem client node is still null when we reach CacheAffinitySharedManager#onCustomEvent, so we schedule an invalid PME in msg.exchangeNeeded(exchangeNeeded); while other nodes skip this PME. A possible fix is to make the CacheAffinityChangeMessage mutable (a mutable discovery custom message). This makes it possible to modify the message before sending it across the ring, so client nodes do not have to decide whether to apply or skip the message: the required flag will be transferred from a server node. When Zookeeper Discovery is used, there is no ability to mutate discovery messages; however, it is possible to mutate the message on the coordinator node. This is quite enough for our case. TeamCity does not demonstrate any issues with this approach. -- This message was sent by Atlassian Jira (v8.20.10#820010)
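The race described above can be reduced to a toy model: each node decides locally whether a CacheAffinityChangeMessage requires an exchange by comparing it against its own lastAffVer, so a node whose lastAffVer is still unset reaches a different conclusion than the rest of the ring. The class and field names below are illustrative only, not the real Ignite internals; the sketch just shows why a flag computed once (the mutable-message approach) stays consistent:

```java
// Toy model of the lastAffVer race; names are invented for the sketch.
public class AffinityMessageSketch {
    static class Node {
        Long lastAffVer; // null models the client that missed onDiscoveryEvent

        Node(Long lastAffVer) { this.lastAffVer = lastAffVer; }

        // Local decision: the exchange is skipped when the message is stale
        // relative to the affinity version this node has already seen.
        boolean exchangeNeededLocally(long msgTopVer) {
            return lastAffVer == null || msgTopVer >= lastAffVer;
        }
    }

    public static void main(String[] args) {
        long staleMsgTopVer = 5;
        Node server = new Node(7L);   // already saw a newer affinity version
        Node client = new Node(null); // lost the race, lastAffVer unset

        // Per-node decisions diverge: the client schedules a PME nobody else joins.
        assert !server.exchangeNeededLocally(staleMsgTopVer);
        assert client.exchangeNeededLocally(staleMsgTopVer);

        // Mutable message: the flag is computed once (e.g. on a server or the
        // coordinator) and every node, including the racing client, reads it.
        boolean flagInMessage = server.exchangeNeededLocally(staleMsgTopVer);
        System.out.println("consistent exchange decision for all nodes: " + flagInMessage);
    }
}
```

The mutable-message fix moves the decision from the receiving side to the sending side, which is exactly why the divergence disappears.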
[jira] [Assigned] (IGNITE-17430) Sql. Provide commands and handlers for index related operations
[ https://issues.apache.org/jira/browse/IGNITE-17430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Mashenkov reassigned IGNITE-17430: - Assignee: Andrey Mashenkov > Sql. Provide commands and handlers for index related operations > --- > > Key: IGNITE-17430 > URL: https://issues.apache.org/jira/browse/IGNITE-17430 > Project: Ignite > Issue Type: Improvement > Components: sql >Reporter: Konstantin Orlov >Assignee: Andrey Mashenkov >Priority: Major > Labels: ignite-3 > > After IGNITE-17429 is done and the backend for index management is implemented, we > need to connect both parts together. For this, we need to implement AST-to-Command > conversion as well as handlers for the new index-related commands, which > will delegate invocations to the index manager (similar to the table-related > commands). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-14985) Re-work error handling in affinity component in accordance with error scopes and prefixes
[ https://issues.apache.org/jira/browse/IGNITE-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin reassigned IGNITE-14985: Assignee: Vyacheslav Koptilin > Re-work error handling in affinity component in accordance with error scopes > and prefixes > -- > > Key: IGNITE-14985 > URL: https://issues.apache.org/jira/browse/IGNITE-14985 > Project: Ignite > Issue Type: Improvement >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Major > Labels: iep-84, ignite-3 > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (IGNITE-17394) Implement API for getting partition mapping
[ https://issues.apache.org/jira/browse/IGNITE-17394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577814#comment-17577814 ] Mirza Aliev edited comment on IGNITE-17394 at 8/10/22 9:29 AM: --- Hello [~isapego]! We have discussed these requirements and came up with the following: * We can add a method {{List assignments(UUID tableId)}} which returns a list whose i-th element is the id of the node considered the leader for the i-th partition at the moment of invocation for the provided tableId. * As for the second request, discovering distribution changes, we propose a slightly different solution: once the new transaction mechanism is introduced before the beta release, any API method will be able to throw a Transactional Exception. In case of a primary replica change (a new abstraction that encapsulates the concept of leaders), any API method will throw a Replica Miss Exception (or it will be the cause of a Transaction Exception). When you handle this kind of exception, you can simply re-call the previously introduced API method {{List assignments(UUID tableId)}} and refresh your partition mapping with the actual primary replicas. This is the ticket where the concept of a Replica Miss Exception will be introduced: https://issues.apache.org/jira/browse/IGNITE-17378 was (Author: maliev): Hello [~isapego]! We have discussed this requirements and come with the following: * We can add a method {{List assignments(UUID tableId)}}, which can return you a list where on the i-th place resides a node id that considered as a leader for the i-th partition on the moment of invocation for the provided tableId. * As for the second request with discovering distribution changes, we propose a bit different solution: it will be possible that any api method will throw Transactional Exception after new transaction mechanism will be introduced before beta release. 
In case of changing primary replica (this is new abstraction, that encapsulates the conception of leaders), any api method will throw Replica Miss Exception (or it will be a cause of a Transaction Exception). When you handle such kind of exception, you can just re-call previously introduced api method {{List assignments(UUID tableId)}} and refresh your partition mapping with actual primary replicas. This is a ticket when concept of Replica Miss Exception will be introduced https://issues.apache.org/jira/browse/IGNITE-17378 > Implement API for getting partition mapping > --- > > Key: IGNITE-17394 > URL: https://issues.apache.org/jira/browse/IGNITE-17394 > Project: Ignite > Issue Type: New Feature >Affects Versions: 3.0.0-alpha5 >Reporter: Igor Sapego >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > > To implement Partition Awareness feature for clients, we need an internal or > public API that will provide us with the following mapping: [partition => > node id] (or [node id => partitions]). > We also need a lightweight mechanism that will allow us to discover that this > distribution has changed. In 2.x we used a topology version for this purpose, > assuming that if topology version has changed, partition distribution should > be refreshed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
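The client-side pattern proposed in the comment above — catch the replica-miss error, refresh the assignments, retry — can be sketched as a small retry helper. The exception class, the refresh hook, and the helper itself are placeholders for illustration; the real Ignite 3 API for this is still being designed in the linked tickets:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Sketch of "refresh assignments on replica miss"; all names are invented.
public class PartitionAwarenessSketch {
    static class ReplicaMissException extends RuntimeException {}

    // Runs the operation; on a replica miss, refreshes the partition mapping
    // (stand-in for re-calling assignments(tableId)) and retries.
    static <T> T withAssignmentRefresh(Supplier<T> op, Runnable refreshAssignments, int maxRetries) {
        for (int attempt = 0; ; attempt++) {
            try {
                return op.get();
            } catch (ReplicaMissException e) {
                if (attempt >= maxRetries)
                    throw e;
                refreshAssignments.run();
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger refreshes = new AtomicInteger();
        AtomicInteger calls = new AtomicInteger();

        // First call hits a moved primary replica, the retry succeeds.
        String result = withAssignmentRefresh(
            () -> {
                if (calls.incrementAndGet() == 1)
                    throw new ReplicaMissException();
                return "ok";
            },
            refreshes::incrementAndGet,
            3);

        assert result.equals("ok") && refreshes.get() == 1;
        System.out.println(result);
    }
}
```

The point of the design is that the client never needs a push notification about distribution changes: the miss exception itself is the signal to refresh.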
[jira] [Assigned] (IGNITE-12519) Spark SQL not working with NON upper case column names
[ https://issues.apache.org/jira/browse/IGNITE-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Gagarkin reassigned IGNITE-12519: -- Assignee: Ivan Gagarkin > Spark SQL not working with NON upper case column names > --- > > Key: IGNITE-12519 > URL: https://issues.apache.org/jira/browse/IGNITE-12519 > Project: Ignite > Issue Type: Bug > Components: spark >Affects Versions: 2.7.6 > Environment: 1) Spark 2.3.0 (Tried on Mesos Master and Local Master) > 2) Ignite 2.7.6 (10 Nodes Cluster on Kubernetes) > 3) Spark Ignite 2.7.6 >Reporter: Praneeth Ramesh >Assignee: Ivan Gagarkin >Priority: Major > Labels: dataframe, spark, spark-shell > > I created a simple table as below. > {code:java} > CREATE TABLE acc ( > "accId" VARCHAR PRIMARY KEY, > "accCol1" VARCHAR, > "accCol2" INT, > "accCol3" VARCHAR, > "accCol4" BOOLEAN > );{code} > And trying to read the data from table from Ignite Spark as below. > > {code:java} > val igniteDF = spark.read > .format(FORMAT_IGNITE) > .option(OPTION_TABLE, "acc") > .option(OPTION_CONFIG_FILE, "example-config.xml") > .load() > igniteDF.show(100, false) > {code} > > But I see an exception as below. 
> {code:java} > Caused by: org.h2.jdbc.JdbcSQLException: Column "ACCCOL1" not found; SQL > statement: > SELECT accCol4, CAST(accCol1 AS VARCHAR) AS accCol1, accCol2, CAST(accCol3 AS > VARCHAR) AS accCol3, accId FROM ACC LIMIT 21 [42122-197] > at org.h2.message.DbException.getJdbcSQLException(DbException.java:357) > at org.h2.message.DbException.get(DbException.java:179) > at org.h2.message.DbException.get(DbException.java:155) > at org.h2.expression.ExpressionColumn.optimize(ExpressionColumn.java:150) > at org.h2.command.dml.Select.prepare(Select.java:858) > at org.h2.command.Parser.prepareCommand(Parser.java:283) > at org.h2.engine.Session.prepareLocal(Session.java:611) > at org.h2.engine.Session.prepareCommand(Session.java:549) > at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1247) > at org.h2.jdbc.JdbcPreparedStatement.(JdbcPreparedStatement.java:76) > at org.h2.jdbc.JdbcConnection.prepareStatement(JdbcConnection.java:694) > at > org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.prepare0(IgniteH2Indexing.java:539) > at > org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.prepareStatement(IgniteH2Indexing.java:509) > at > org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.prepareStatement(IgniteH2Indexing.java:476){code} > > When I try naming the TABLE cols with UPPER CASE everything works fine. But > when I use the quotes in the Column Names to preserve the case, then it > breaks with the exception. > From exception I can see query built is still having the UPPER case column > name ACCCOL1 instead of the camel case column names. > Is there any workaround for this. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
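The failure reported above is consistent with standard SQL identifier folding: unquoted identifiers are normalized (H2 folds them to upper case by default), while double-quoted identifiers keep their exact case. The generated `SELECT accCol4, ...` references the columns unquoted, so they fold to `ACCCOL1` and no longer match the case-preserved `"accCol1"`. The helper below only illustrates that folding rule; it is not Ignite or H2 code:

```java
import java.util.Locale;

// Illustrates SQL identifier folding: unquoted -> upper case (H2 default),
// double-quoted -> exact case preserved. Sketch only, not engine code.
public class IdentifierCaseSketch {
    static String normalize(String identifier) {
        boolean quoted = identifier.length() >= 2
            && identifier.startsWith("\"") && identifier.endsWith("\"");
        return quoted
            ? identifier.substring(1, identifier.length() - 1) // case preserved
            : identifier.toUpperCase(Locale.ROOT);             // folded to upper case
    }

    public static void main(String[] args) {
        // The unquoted reference in the generated query folds to ACCCOL1 and
        // fails to match the column created as "accCol1".
        assert normalize("accCol1").equals("ACCCOL1");
        assert normalize("\"accCol1\"").equals("accCol1");
        System.out.println(normalize("accCol1") + " vs " + normalize("\"accCol1\""));
    }
}
```

This suggests the fix direction: the query built by the Spark integration would need to quote the column names it reads from the schema instead of emitting them bare.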
[jira] [Created] (IGNITE-17506) Document changes in CREATE TABLE command
Igor Gusev created IGNITE-17506: --- Summary: Document changes in CREATE TABLE command Key: IGNITE-17506 URL: https://issues.apache.org/jira/browse/IGNITE-17506 Project: Ignite Issue Type: Task Components: documentation Reporter: Igor Gusev In https://issues.apache.org/jira/browse/IGNITE-16860, a new feature was added to the CREATE TABLE command. We need to document it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-17506) Document changes in CREATE TABLE command
[ https://issues.apache.org/jira/browse/IGNITE-17506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Gusev reassigned IGNITE-17506: --- Assignee: Igor Gusev > Document changes in CREATE TABLE command > > > Key: IGNITE-17506 > URL: https://issues.apache.org/jira/browse/IGNITE-17506 > Project: Ignite > Issue Type: Task > Components: documentation >Reporter: Igor Gusev >Assignee: Igor Gusev >Priority: Major > Labels: ignite-3 > > In the https://issues.apache.org/jira/browse/IGNITE-16860 a new feature was > added to CREATE TABLE command. We need to document it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17394) Implement API for getting partition mapping
[ https://issues.apache.org/jira/browse/IGNITE-17394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577887#comment-17577887 ] Igor Sapego commented on IGNITE-17394: -- [~maliev], The suggested API will do. But I'm not sure I correctly understood the proposed solution for detecting changes in data distribution. Let's consider the following scenario: 1. I make a table insertion, and it succeeds. 2. The data distribution changes. 3. I make a new table insertion. Will I get an exception in this case? Why? > Implement API for getting partition mapping > --- > > Key: IGNITE-17394 > URL: https://issues.apache.org/jira/browse/IGNITE-17394 > Project: Ignite > Issue Type: New Feature >Affects Versions: 3.0.0-alpha5 >Reporter: Igor Sapego >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > > To implement Partition Awareness feature for clients, we need an internal or > public API that will provide us with the following mapping: [partition => > node id] (or [node id => partitions]). > We also need a lightweight mechanism that will allow us to discover that this > distribution has changed. In 2.x we used a topology version for this purpose, > assuming that if topology version has changed, partition distribution should > be refreshed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17456) After node restart, there are duplicate messages about WAL segment compression.
[ https://issues.apache.org/jira/browse/IGNITE-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577885#comment-17577885 ] Ignite TC Bot commented on IGNITE-17456: {panel:title=Branch: [pull/10183/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel} {panel:title=Branch: [pull/10183/head] Base: [master] : New Tests (2)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1} {color:#8b}PDS 5{color} [[tests 2|https://ci2.ignite.apache.org/viewLog.html?buildId=6553323]] * {color:#013220}IgnitePdsTestSuite5: WalCompactionNotificationsTest.testNotificationsEmptyArchive - PASSED{color} * {color:#013220}IgnitePdsTestSuite5: WalCompactionNotificationsTest.testNotificationsUnlimitedWal - PASSED{color} {panel} [TeamCity *-- Run :: All* Results|https://ci2.ignite.apache.org/viewLog.html?buildId=6552900buildTypeId=IgniteTests24Java8_RunAll] > After node restart, there are duplicate messages about WAL segment > compression. > --- > > Key: IGNITE-17456 > URL: https://issues.apache.org/jira/browse/IGNITE-17456 > Project: Ignite > Issue Type: Bug >Reporter: Pavel Pereslegin >Assignee: Pavel Pereslegin >Priority: Minor > Labels: ise > Time Spent: 0.5h > Remaining Estimate: 0h > > If you enable compression of WAL segments and at the same time set the WAL > archive size too small, then compression will not actually occur, but after > each node restart, "fake" notifications about segment compression are > sequentially written to the log from the beginning (see log example below). 
> {noformat} > [2022-08-02T14:46:33,757][INFO > ][test-runner-#1%wal.WalCompactionAfterRestartTest%][FileWriteAheadLogManager] > Resolved write ahead log work directory: > /home/xtern/src/java/ignite/work/db/wal/node00-63fb6fa2-fcea-42aa-a3c8-b36cd330ba7c > [2022-08-02T14:46:33,757][INFO > ][test-runner-#1%wal.WalCompactionAfterRestartTest%][FileWriteAheadLogManager] > Resolved write ahead log archive directory: > /home/xtern/src/java/ignite/work/db/wal/archive/node00-63fb6fa2-fcea-42aa-a3c8-b36cd330ba7c > [2022-08-02T14:46:33,759][INFO > ][test-runner-#1%wal.WalCompactionAfterRestartTest%][FileWriteAheadLogManager] > Enqueuing segment for compression [idx=0] > [2022-08-02T14:46:33,759][INFO > ][test-runner-#1%wal.WalCompactionAfterRestartTest%][FileWriteAheadLogManager] > Enqueuing segment for compression [idx=1] > [2022-08-02T14:46:33,759][INFO > ][test-runner-#1%wal.WalCompactionAfterRestartTest%][FileWriteAheadLogManager] > Enqueuing segment for compression [idx=2] > [2022-08-02T14:46:33,759][INFO > ][test-runner-#1%wal.WalCompactionAfterRestartTest%][FileWriteAheadLogManager] > Enqueuing segment for compression [idx=3] > ... > [2022-08-02T14:46:33,761][INFO > ][test-runner-#1%wal.WalCompactionAfterRestartTest%][FileWriteAheadLogManager] > Enqueuing segment for compression [idx=49] > [2022-08-02T14:46:33,761][INFO > ][test-runner-#1%wal.WalCompactionAfterRestartTest%][FileHandleManagerImpl] > Initialized write-ahead log manager [mode=LOG_ONLY] > ... 
> [2022-08-02T14:46:34,084][INFO > ][wal-file-compressor-%wal.WalCompactionAfterRestartTest0%-1-#133%wal.WalCompactionAfterRestartTest0%][FileWriteAheadLogManager] > Segment compressed notification [idx=0] > [2022-08-02T14:46:34,084][INFO > ][wal-file-compressor-%wal.WalCompactionAfterRestartTest0%-0-#131%wal.WalCompactionAfterRestartTest0%][FileWriteAheadLogManager] > Segment compressed notification [idx=1] > [2022-08-02T14:46:34,084][INFO > ][wal-file-compressor-%wal.WalCompactionAfterRestartTest0%-0-#131%wal.WalCompactionAfterRestartTest0%][FileWriteAheadLogManager] > Segment compressed notification [idx=2] > [2022-08-02T14:46:34,084][INFO > ][wal-file-compressor-%wal.WalCompactionAfterRestartTest0%-0-#131%wal.WalCompactionAfterRestartTest0%][FileWriteAheadLogManager] > Segment compressed notification [idx=3] > ... > [2022-08-02T14:46:34,092][INFO > ][wal-file-compressor-%wal.WalCompactionAfterRestartTest0%-0-#131%wal.WalCompactionAfterRestartTest0%][FileWriteAheadLogManager] > Segment compressed notification [idx=49] > [2022-08-02T14:46:34,093][INFO > ][exchange-worker-#127%wal.WalCompactionAfterRestartTest0%][GridCacheProcessor] > Finished recovery for cache [cache=ignite-sys-cache, grp=ignite-sys-cache, > startVer=AffinityTopologyVersion [topVer=1, minorTopVer=1]] > {noformat} > Reproducer: > {code:java} > private final ListeningTestLogger logger = new ListeningTestLogger(log); > /** {@inheritDoc} */ > @Override protected IgniteConfiguration getConfiguration(String name) > throws Exception { > return super.getConfiguration(name) > .setGridLogger(logger) > .setDataStorageConfiguration(new
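The quoted reproducer is truncated, but the shape of the guard the fix implies can be sketched in isolation. This is an assumption on my part, not the actual patch: remember the last compacted segment index, so that a restart does not replay "Segment compressed notification" messages for segments that were already handled. All names below are hypothetical stand-ins, not Ignite internals.

```java
import java.util.*;

// Sketch of a duplicate-notification guard (assumed fix shape, not the real one):
// only notify for segment indexes past the last already-compacted index.
public class WalNotificationGuard {
    long lastCompactedIdx = -1;               // highest segment already reported
    List<String> log = new ArrayList<>();     // captured notifications

    void onSegmentCompressed(long idx) {
        if (idx <= lastCompactedIdx)
            return;                           // skip a replayed ("fake") notification
        lastCompactedIdx = idx;
        log.add("Segment compressed notification [idx=" + idx + "]");
    }

    public static void main(String[] args) {
        WalNotificationGuard g = new WalNotificationGuard();
        g.onSegmentCompressed(0);
        g.onSegmentCompressed(1);
        // A simulated restart replays segments 0..1 again; nothing new is logged.
        g.onSegmentCompressed(0);
        g.onSegmentCompressed(1);
        System.out.println(g.log.size()); // 2
    }
}
```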
[jira] [Created] (IGNITE-17505) Document CREATE INDEX command
Igor Gusev created IGNITE-17505: --- Summary: Document CREATE INDEX command Key: IGNITE-17505 URL: https://issues.apache.org/jira/browse/IGNITE-17505 Project: Ignite Issue Type: Task Components: documentation Reporter: Igor Gusev In the https://issues.apache.org/jira/browse/IGNITE-17429 ticket, a new CREATE INDEX command was added. We need to document it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17504) Replace ScanQueryFallbackIterator with GridCacheDistributedQueryFuture
[ https://issues.apache.org/jira/browse/IGNITE-17504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maksim Timonin updated IGNITE-17504: Ignite Flags: (was: Docs Required,Release Notes Required) > Replace ScanQueryFallbackIterator with GridCacheDistributedQueryFuture > -- > > Key: IGNITE-17504 > URL: https://issues.apache.org/jira/browse/IGNITE-17504 > Project: Ignite > Issue Type: New Feature >Reporter: Maksim Timonin >Assignee: Maksim Timonin >Priority: Major > > Currently, for ScanQuery we have separate branch of query processing - > ScanQueryFallbackIterator. That is useful for partitioned request > (ScanQuery.setPartition). Here it just sends CacheRequest to single node, and > others are used as backups in case of main node failure. > This branch can be re-used with IndexQuery, but firstly it's required to > merge this branch with main logic of processing, implemented in > GridCacheDistributedQueryFuture. > Looks like it pretty easy to implement with existing set of fields and > methods within this future. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17504) Replace ScanQueryFallbackIterator with GridCacheDistributedQueryFuture
[ https://issues.apache.org/jira/browse/IGNITE-17504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maksim Timonin updated IGNITE-17504: Description: Currently, for ScanQuery we have separate branch of query processing - ScanQueryFallbackIterator. That is useful for partitioned request (ScanQuery.setPartition). Here it just sends CacheRequest to single node, and others are used as backups in case of main node failure. This branch can be re-used with IndexQuery, but firstly it's required to merge this branch with main logic of processing, implemented in GridCacheDistributedQueryFuture. Looks like it pretty easy to implement with existing set of fields and methods within this future. was: Currently, for ScanQuery we have separate branch of query processing - ScanQueryFallbackIterator. That is useful for partitioned request (ScanQuery.setPartition). This branch can be re-used with IndexQuery, but firstly it's required to merge this branch with main logic of processing, implemented in GridCacheDistributedQueryFuture. Looks like it pretty easy to implement with existing set of fields and methods within this future. > Replace ScanQueryFallbackIterator with GridCacheDistributedQueryFuture > -- > > Key: IGNITE-17504 > URL: https://issues.apache.org/jira/browse/IGNITE-17504 > Project: Ignite > Issue Type: New Feature >Reporter: Maksim Timonin >Assignee: Maksim Timonin >Priority: Major > > Currently, for ScanQuery we have separate branch of query processing - > ScanQueryFallbackIterator. That is useful for partitioned request > (ScanQuery.setPartition). Here it just sends CacheRequest to single node, and > others are used as backups in case of main node failure. > This branch can be re-used with IndexQuery, but firstly it's required to > merge this branch with main logic of processing, implemented in > GridCacheDistributedQueryFuture. > Looks like it pretty easy to implement with existing set of fields and > methods within this future. 
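The fallback behaviour described above (send the request to the partition's primary node, and use the remaining owners as backups if it fails) can be sketched in isolation. This is an illustrative pattern only; the node names and the send function are hypothetical stand-ins, not Ignite's ScanQueryFallbackIterator code.

```java
import java.util.*;
import java.util.function.Function;

// Illustrative fallback pattern: try the primary owner of a partition first,
// then fall over to the backup owners in order on node failure.
public class FallbackSketch {
    static <T> T queryWithFallback(List<String> nodes, Function<String, T> send) {
        RuntimeException last = null;
        for (String node : nodes) {           // primary first, then backups
            try {
                return send.apply(node);
            } catch (RuntimeException e) {
                last = e;                     // remember the failure, try next owner
            }
        }
        throw last != null ? last : new IllegalStateException("no nodes");
    }

    static String thrown() { throw new RuntimeException("node failed"); }

    public static void main(String[] args) {
        List<String> nodes = List.of("primary", "backup1", "backup2");
        // "primary" fails; the query succeeds on the first backup.
        String res = queryWithFallback(nodes, n ->
            n.equals("primary") ? thrown() : "rows-from-" + n);
        System.out.println(res); // rows-from-backup1
    }
}
```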
[jira] [Created] (IGNITE-17504) Replace ScanFallbackIterator with GridCacheDistributedQueryFuture
Maksim Timonin created IGNITE-17504: --- Summary: Replace ScanFallbackIterator with GridCacheDistributedQueryFuture Key: IGNITE-17504 URL: https://issues.apache.org/jira/browse/IGNITE-17504 Project: Ignite Issue Type: New Feature Reporter: Maksim Timonin Assignee: Maksim Timonin Currently, for ScanQuery we have a separate branch of query processing - ScanQueryFallbackIterator. It is useful for partitioned requests (ScanQuery.setPartition). This branch can be re-used with IndexQuery, but first it's required to merge it with the main processing logic implemented in GridCacheDistributedQueryFuture. It looks pretty easy to implement with the existing set of fields and methods within this future. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17504) Replace ScanQueryFallbackIterator with GridCacheDistributedQueryFuture
[ https://issues.apache.org/jira/browse/IGNITE-17504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maksim Timonin updated IGNITE-17504: Summary: Replace ScanQueryFallbackIterator with GridCacheDistributedQueryFuture (was: Replace ScanFallbackIterator with GridCacheDistributedQueryFuture) > Replace ScanQueryFallbackIterator with GridCacheDistributedQueryFuture > -- > > Key: IGNITE-17504 > URL: https://issues.apache.org/jira/browse/IGNITE-17504 > Project: Ignite > Issue Type: New Feature >Reporter: Maksim Timonin >Assignee: Maksim Timonin >Priority: Major > > Currently, for ScanQuery we have separate branch of query processing - > ScanQueryFallbackIterator. That is useful for partitioned request > (ScanQuery.setPartition). > This branch can be re-used with IndexQuery, but firstly it's required to > merge this branch with main logic of processing, implemented in > GridCacheDistributedQueryFuture. > Looks like it pretty easy to implement with existing set of fields and > methods within this future. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-17503) Transmission sender fails if receiver's pool is busy.
Amelchev Nikita created IGNITE-17503: Summary: Transmission sender fails if receiver's pool is busy. Key: IGNITE-17503 URL: https://issues.apache.org/jira/browse/IGNITE-17503 Project: Ignite Issue Type: Bug Reporter: Amelchev Nikita Reproducer: {noformat} // do as much as a pool size. rcv.getExecutorService().submit(() -> doSleep(1)); try (TransmissionSender sender = openTransmissionSender(rcvNodeId)) { sender.send(file1); // throws SocketTimeoutException } {noformat} Exception: {noformat} java.net.SocketTimeoutException: null at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211) ~[?:1.8.0_201] at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) ~[?:1.8.0_201] at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2663) ~[?:1.8.0_201] at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2679) ~[?:1.8.0_201] at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156) ~[?:1.8.0_201] at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862) ~[?:1.8.0_201] at java.io.ObjectInputStream.(ObjectInputStream.java:358) ~[?:1.8.0_201] at org.apache.ignite.internal.managers.communication.GridIoManager$TransmissionSender.connect(GridIoManager.java:3262) ~[classes/:?] at org.apache.ignite.internal.managers.communication.GridIoManager$TransmissionSender.send(GridIoManager.java:3350) [classes/:?] at org.apache.ignite.internal.managers.communication.GridIoManager$TransmissionSender.send(GridIoManager.java:3288) [classes/:?] at org.apache.ignite.internal.managers.communication.GridIoManagerFileTransmissionSelfTest.testSendToBusy(GridIoManagerFileTransmissionSelfTest.java:967) [test-classes/:?] {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17502) Tasks to send the snapshot files are not ordered
[ https://issues.apache.org/jira/browse/IGNITE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amelchev Nikita updated IGNITE-17502: - Description: Tasks to sent the snapshot files are not ordered. This leads to socket timeout in a file sender while thread is busy by sending to other node: {noformat} sender.send(part1); ... otherSender.send(part3); ... // `sender` throws socket timeout exception. sender.send(part2); {noformat} {noformat} java.io.EOFException: null at java.io.ObjectInputStream$BlockDataInputStream.readBoolean(ObjectInputStream.java:3120) ~[?:1.8.0_201] at java.io.ObjectInputStream.readBoolean(ObjectInputStream.java:966) ~[?:1.8.0_201] at org.apache.ignite.internal.managers.communication.GridIoManager.receiveFromChannel(GridIoManager.java:2935) [classes/:?] at org.apache.ignite.internal.managers.communication.GridIoManager.processOpenedChannel(GridIoManager.java:2895) [classes/:?] at org.apache.ignite.internal.managers.communication.GridIoManager.access$4900(GridIoManager.java:244) [classes/:?] at org.apache.ignite.internal.managers.communication.GridIoManager$7.run(GridIoManager.java:1237) [classes/:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201] ... Caused by: org.apache.ignite.IgniteCheckedException: Requested topic is busy by another transmission. It's not allowed to process different sessions over the same topic simultaneously. Channel will be closed [initMsg=SessionChannelMessage [sesId=9c855b38281-d8dcd34f-916f-49d0-a453-cd1866acfce1], channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:47102 remote=/127.0.0.1:55621], nodeId=5ace7280-b08a-4cf9-b428-7f70ef70] at org.apache.ignite.internal.managers.communication.GridIoManager.processOpenedChannel(GridIoManager.java:2867) ~[classes/:?] 
at org.apache.ignite.internal.managers.communication.GridIoManager.access$4900(GridIoManager.java:244) ~[classes/:?] at org.apache.ignite.internal.managers.communication.GridIoManager$7.run(GridIoManager.java:1237) ~[classes/:?] ... 3 more {noformat} was: Tasks to sent the snapshot files are not ordered. This leads to socket timeout in a file sender while thread is busy by sending to other node: {noformat} sender.send(part1); ... otherSender.send(part3); ... // `sender` throws socket timeout exception. sender.send(part2); {noformat} > Tasks to sent the snapshot files are not ordered > > > Key: IGNITE-17502 > URL: https://issues.apache.org/jira/browse/IGNITE-17502 > Project: Ignite > Issue Type: Bug >Reporter: Amelchev Nikita >Assignee: Amelchev Nikita >Priority: Major > Labels: ise > Fix For: 2.14 > > > Tasks to sent the snapshot files are not ordered. This leads to socket > timeout in a file sender while thread is busy by sending to other node: > {noformat} > sender.send(part1); > ... > otherSender.send(part3); > ... > // `sender` throws socket timeout exception. > sender.send(part2); > {noformat} > {noformat} > java.io.EOFException: null > at > java.io.ObjectInputStream$BlockDataInputStream.readBoolean(ObjectInputStream.java:3120) > ~[?:1.8.0_201] > at java.io.ObjectInputStream.readBoolean(ObjectInputStream.java:966) > ~[?:1.8.0_201] > at > org.apache.ignite.internal.managers.communication.GridIoManager.receiveFromChannel(GridIoManager.java:2935) > [classes/:?] > at > org.apache.ignite.internal.managers.communication.GridIoManager.processOpenedChannel(GridIoManager.java:2895) > [classes/:?] > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4900(GridIoManager.java:244) > [classes/:?] > at > org.apache.ignite.internal.managers.communication.GridIoManager$7.run(GridIoManager.java:1237) > [classes/:?] 
> at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [?:1.8.0_201] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [?:1.8.0_201] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201] > ... > Caused by: org.apache.ignite.IgniteCheckedException: Requested topic is busy > by another transmission. It's not allowed to process different sessions over > the same topic simultaneously. Channel will be closed > [initMsg=SessionChannelMessage > [sesId=9c855b38281-d8dcd34f-916f-49d0-a453-cd1866acfce1], > channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:47102 > remote=/127.0.0.1:55621],
[jira] [Updated] (IGNITE-17502) Tasks to send the snapshot files are not ordered
[ https://issues.apache.org/jira/browse/IGNITE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amelchev Nikita updated IGNITE-17502: - Fix Version/s: 2.14 > Tasks to sent the snapshot files are not ordered > > > Key: IGNITE-17502 > URL: https://issues.apache.org/jira/browse/IGNITE-17502 > Project: Ignite > Issue Type: Bug >Reporter: Amelchev Nikita >Assignee: Amelchev Nikita >Priority: Major > Fix For: 2.14 > > > Tasks to sent the snapshot files are not ordered. This leads to socket > timeout in a file sender while thread is busy by sending to other node: > {noformat} > sender.send(part1); > ... > otherSender.send(part3); > ... > // `sender` throws socket timeout exception. > sender.send(part2); > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17502) Tasks to send the snapshot files are not ordered
[ https://issues.apache.org/jira/browse/IGNITE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amelchev Nikita updated IGNITE-17502: - Labels: ise (was: ) > Tasks to sent the snapshot files are not ordered > > > Key: IGNITE-17502 > URL: https://issues.apache.org/jira/browse/IGNITE-17502 > Project: Ignite > Issue Type: Bug >Reporter: Amelchev Nikita >Assignee: Amelchev Nikita >Priority: Major > Labels: ise > Fix For: 2.14 > > > Tasks to sent the snapshot files are not ordered. This leads to socket > timeout in a file sender while thread is busy by sending to other node: > {noformat} > sender.send(part1); > ... > otherSender.send(part3); > ... > // `sender` throws socket timeout exception. > sender.send(part2); > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-17502) Tasks to send the snapshot files are not ordered
Amelchev Nikita created IGNITE-17502: Summary: Tasks to send the snapshot files are not ordered Key: IGNITE-17502 URL: https://issues.apache.org/jira/browse/IGNITE-17502 Project: Ignite Issue Type: Bug Reporter: Amelchev Nikita Assignee: Amelchev Nikita Tasks to send the snapshot files are not ordered. This leads to a socket timeout in a file sender while the thread is busy sending to another node: {noformat} sender.send(part1); ... otherSender.send(part3); ... // `sender` throws socket timeout exception. sender.send(part2); {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
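One way to impose the missing ordering can be sketched with plain JDK executors. This is a hedged assumption about the shape of a fix, not the actual patch: funnel all transmissions to a given receiver through a per-node single-threaded executor, so part2 cannot be dispatched while part1 to the same node is still in flight.

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch (assumed approach, not the real fix): per-receiver single-threaded
// executors keep sends to the same node in submission order.
public class OrderedSenders {
    Map<String, ExecutorService> perNode = new ConcurrentHashMap<>();
    List<String> sent = Collections.synchronizedList(new ArrayList<>());

    void send(String node, String part) {
        perNode.computeIfAbsent(node, n -> Executors.newSingleThreadExecutor())
            .submit(() -> sent.add(node + ":" + part));  // stand-in for the real transmission
    }

    void shutdown() {
        for (ExecutorService e : perNode.values()) {
            e.shutdown();
            try {
                e.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        OrderedSenders s = new OrderedSenders();
        s.send("nodeA", "part1");
        s.send("nodeB", "part3");   // a send to another node cannot starve nodeA's queue
        s.send("nodeA", "part2");
        s.shutdown();
        // Sends to nodeA stay in submission order: part1 before part2.
        System.out.println(s.sent.indexOf("nodeA:part1") < s.sent.indexOf("nodeA:part2"));
    }
}
```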
[jira] [Assigned] (IGNITE-17470) Add initial support of Spark 3.2
[ https://issues.apache.org/jira/browse/IGNITE-17470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Gagarkin reassigned IGNITE-17470: -- Assignee: Ivan Gagarkin > Add initial support of Spark 3.2 > > > Key: IGNITE-17470 > URL: https://issues.apache.org/jira/browse/IGNITE-17470 > Project: Ignite > Issue Type: Task > Components: spark >Reporter: Ivan Gagarkin >Assignee: Ivan Gagarkin >Priority: Major > > Update the ignite-spark module to spark-3.2 and Scala 2.12 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-17471) Update Ignite optimization tests
[ https://issues.apache.org/jira/browse/IGNITE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Gagarkin reassigned IGNITE-17471: -- Assignee: Ivan Gagarkin > Update Ignite optimization tests > - > > Key: IGNITE-17471 > URL: https://issues.apache.org/jira/browse/IGNITE-17471 > Project: Ignite > Issue Type: Task > Components: spark >Reporter: Ivan Gagarkin >Assignee: Ivan Gagarkin >Priority: Major > > After upgrading to spark-3.2, some tests fail due to changes in Spark > Test sets: > * org.apache.ignite.spark.IgniteOptimizationMathFuncSpec > * org.apache.ignite.spark.IgniteOptimizationJoinSpec > * org.apache.ignite.spark.IgniteOptimizationSpec > * org.apache.ignite.spark.IgniteOptimizationStringFuncSpec > * org.apache.ignite.spark.IgniteDataFrameSuite -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17501) Integration between index and partition storages
[ https://issues.apache.org/jira/browse/IGNITE-17501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Polovtcev updated IGNITE-17501: - Ignite Flags: (was: Docs Required,Release Notes Required) > Integration between index and partition storages > > > Key: IGNITE-17501 > URL: https://issues.apache.org/jira/browse/IGNITE-17501 > Project: Ignite > Issue Type: Task >Reporter: Aleksandr Polovtcev >Assignee: Aleksandr Polovtcev >Priority: Major > Labels: ignite-3 > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-17501) Integration between index and partition storages
Aleksandr Polovtcev created IGNITE-17501: Summary: Integration between index and partition storages Key: IGNITE-17501 URL: https://issues.apache.org/jira/browse/IGNITE-17501 Project: Ignite Issue Type: Task Reporter: Aleksandr Polovtcev Assignee: Aleksandr Polovtcev -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17394) Implement API for getting partition mapping
[ https://issues.apache.org/jira/browse/IGNITE-17394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577814#comment-17577814 ] Mirza Aliev commented on IGNITE-17394: -- Hello [~isapego]! We have discussed these requirements and came up with the following: * We can add a method {{List assignments(UUID tableId)}} that returns a list where the i-th element is the node id considered the leader of the i-th partition of the provided tableId at the moment of invocation. * As for the second request, discovering distribution changes, we propose a slightly different solution: once the new transaction mechanism is introduced before the beta release, any API method will throw a Transactional Exception. In case of a primary replica change (a new abstraction that encapsulates the concept of leaders), any API method will throw a Replica Miss Exception (or it will be the cause of a Transaction Exception). When you handle such an exception, you can simply re-call the previously introduced API method {{List assignments(UUID tableId)}} and refresh your partition mapping with the actual primary replicas. The concept of the Replica Miss Exception will be introduced in https://issues.apache.org/jira/browse/IGNITE-17378 > Implement API for getting partition mapping > --- > > Key: IGNITE-17394 > URL: https://issues.apache.org/jira/browse/IGNITE-17394 > Project: Ignite > Issue Type: New Feature >Affects Versions: 3.0.0-alpha5 >Reporter: Igor Sapego >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > > To implement Partition Awareness feature for clients, we need an internal or > public API that will provide us with the following mapping: [partition => > node id] (or [node id => partitions]). > We also need a lightweight mechanism that will allow us to discover that this > distribution has changed. 
In 2.x we used a topology version for this purpose, > assuming that if topology version has changed, partition distribution should > be refreshed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
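The client-side handling pattern proposed in the comment above (catch the miss, re-call the assignments method, retry with the refreshed mapping) can be sketched with plain collections. Everything here is a hypothetical stand-in for the proposed Ignite 3 API: the ReplicaMissException type, the assignments() helper, and the mapping layout are assumptions for illustration.

```java
import java.util.*;

// Sketch of the proposed refresh-on-miss pattern (hypothetical names, not real Ignite 3 API).
public class PartitionAwarenessSketch {
    // Simulated cluster-side view; in the proposal this would come from assignments(tableId).
    static List<UUID> clusterAssignments = new ArrayList<>();

    static List<UUID> assignments() {         // stand-in for the proposed assignments(UUID tableId)
        return new ArrayList<>(clusterAssignments);
    }

    // Stand-in for the proposed Replica Miss Exception.
    static class ReplicaMissException extends RuntimeException {}

    List<UUID> cached;                        // client-side cached [partition -> node id] mapping

    PartitionAwarenessSketch() { cached = assignments(); }

    UUID sendToPartition(int partition, boolean simulateMiss) {
        try {
            if (simulateMiss)
                throw new ReplicaMissException();     // primary replica changed under us
            return cached.get(partition % cached.size());
        } catch (ReplicaMissException e) {
            cached = assignments();                   // refresh the mapping, then retry once
            return cached.get(partition % cached.size());
        }
    }

    public static void main(String[] args) {
        UUID n1 = UUID.randomUUID(), n2 = UUID.randomUUID();
        clusterAssignments = List.of(n1);
        PartitionAwarenessSketch client = new PartitionAwarenessSketch();
        clusterAssignments = List.of(n2);             // distribution changes after the client cached it
        System.out.println(client.sendToPartition(0, true).equals(n2)); // true
    }
}
```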
[jira] [Assigned] (IGNITE-17434) Extend implicit sql transactions functionality for RO|RW transactions support.
[ https://issues.apache.org/jira/browse/IGNITE-17434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Evgeny Stanilovsky reassigned IGNITE-17434: --- Assignee: Evgeny Stanilovsky > Extend implicit sql transactions functionality for RO|RW transactions support. > -- > > Key: IGNITE-17434 > URL: https://issues.apache.org/jira/browse/IGNITE-17434 > Project: Ignite > Issue Type: Improvement > Components: sql >Affects Versions: 3.0.0-alpha5 >Reporter: Evgeny Stanilovsky >Assignee: Evgeny Stanilovsky >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > After [1] was implemented, there are some improvements seems still required: > # If no explicit tx is specified it`s possible to obtain situation when keys > from one bulk enlisted under different tx [2] > # Implicit tx (not tx at all but tx meta - timestamp, first enlisted > partition and so on ..) need to be replicated into all calcite execution > fragments. > Thus in [3] seems we need to start implicit tx if no explicit defined and > correctly process it in [4] > [1] https://issues.apache.org/jira/browse/IGNITE-17328 > [2] ModifyNode#flushTuples > [3] ExecutionServiceImpl#executeQuery > [4] DistributedQueryManager#execute -- This message was sent by Atlassian Jira (v8.20.10#820010)