[jira] [Commented] (IGNITE-21051) Fix javadocs for IndexQuery

2023-12-14 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17797076#comment-17797076
 ] 

Ignite TC Bot commented on IGNITE-21051:


{panel:title=Branch: [pull/11091/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/11091/head] Base: [master] : No new tests 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7657730&buildTypeId=IgniteTests24Java8_RunAll]

> Fix javadocs for IndexQuery
> ---
>
> Key: IGNITE-21051
> URL: https://issues.apache.org/jira/browse/IGNITE-21051
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Maksim Timonin
>Assignee: Oleg Valuyskiy
>Priority: Major
>  Labels: ise, newbie
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The javadoc formatting in the `IndexQuery` class needs to be fixed. Currently it 
> renders the algorithm list on a single line. The "ul" and "li" tags should be 
> used for correct rendering.
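As an illustration, the fix amounts to wrapping the list items in HTML list tags. This is only a sketch: the item names below are illustrative and may not match the actual `IndexQuery` javadoc content.

```java
/**
 * A sketch of the fixed formatting: wrapping items in "ul"/"li" tags makes
 * the rendered javadoc show a proper bulleted list instead of a single line.
 * The item names below are illustrative, not the actual IndexQuery content.
 *
 * <ul>
 *   <li>{@code eq} - equality criterion;</li>
 *   <li>{@code lt}, {@code lte} - less-than (or equal) criteria;</li>
 *   <li>{@code gt}, {@code gte} - greater-than (or equal) criteria;</li>
 *   <li>{@code between} - range criterion.</li>
 * </ul>
 */
public class IndexQuery { }
```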



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-20662) Sql. Test performance of multi statement queries

2023-12-14 Thread Pavel Pereslegin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17797061#comment-17797061
 ] 

Pavel Pereslegin commented on IGNITE-20662:
---

[~korlov], [~zstan],
please review the proposed patch.

> Sql. Test performance of multi statement queries
> 
>
> Key: IGNITE-20662
> URL: https://issues.apache.org/jira/browse/IGNITE-20662
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Konstantin Orlov
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Let's add some benchmarks to measure performance of multi statement queries.
> At least two types of tests should be added:
> * overhead of the script processor: run the same single-statement query both 
> as a single statement and as a script
> * performance gain after implementing the more sophisticated parallelisation 
> rules (IGNITE-20673): run a sequence of queries both as a chain of single 
> statements and as a script





[jira] [Updated] (IGNITE-20412) Fix DistributionZoneCausalityDataNodesTest.java#checkDataNodesRepeated

2023-12-14 Thread Sergey Uttsel (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Uttsel updated IGNITE-20412:
---
Summary: Fix 
DistributionZoneCausalityDataNodesTest.java#checkDataNodesRepeated  (was: Fix 
ItIgniteDistributionZoneManagerNodeRestartTest# 
testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart)

> Fix DistributionZoneCausalityDataNodesTest.java#checkDataNodesRepeated
> --
>
> Key: IGNITE-20412
> URL: https://issues.apache.org/jira/browse/IGNITE-20412
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Sergey Uttsel
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Motivation
> org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart
>  started to fail in the catalog-feature branch and still fails in the main 
> branch after catalog-feature was merged
> [https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests=]
> {code:java}
> java.lang.AssertionError:
> Expected: is <[]>
>  but: was <[A]>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
> at 
> org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459)
> at 
> org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539)
> {code}
> h3. Implementation notes
> The root cause:
>  # This test changes metaStorageManager behavior so that it throws an expected 
> exception on ms.invoke.
>  # The test alters the zone with a new filter.
>  # DistributionZoneManager#onUpdateFilter returns a future from 
> saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken).
>  # The future is completed exceptionally, so 
> WatchProcessor#notificationFuture is completed exceptionally as well.
>  # Subsequent updates are not handled properly because notificationFuture is 
> completed exceptionally.
> We have already created tickets about exception handling:
>  * https://issues.apache.org/jira/browse/IGNITE-14693
>  * https://issues.apache.org/jira/browse/IGNITE-14611
>  
> The test scenario is incorrect because the node should be stopped (by the 
> failure handler) if ms.invoke fails. We need to rewrite the test when the DZM 
> restart logic is updated.
> UPD1:
> I tried to rewrite the test so that, instead of throwing an exception in the 
> metastorage handler, we simply force the thread to wait inside the invoke. 
> This does not work: we use a spy on the standalone metastorage, and Mockito 
> uses a synchronized block when we call ms.invoke, so blocking one invoke 
> blocks all other communication with the metastorage.
> Further investigation is needed into how to rewrite this test.
>  
> UPD2:
> The testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart 
> test was removed in another commit. However, another test that was disabled 
> by this ticket is fixed now.
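The poisoned-chain behavior described in the root cause can be reproduced with plain CompletableFuture. This is a minimal sketch, not Ignite code: once the shared notification future completes exceptionally, every later stage chained onto it is skipped.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

public class PoisonedNotificationChain {

    // Returns true if the second update's handler was skipped because the
    // shared chain was already completed exceptionally.
    public static boolean laterUpdateSkipped() {
        // Stand-in for WatchProcessor#notificationFuture: updates are
        // chained one after another onto this future.
        CompletableFuture<Void> notificationFuture =
                CompletableFuture.completedFuture(null);

        // First update fails, like the stubbed ms.invoke throwing.
        notificationFuture = notificationFuture.thenRun(() -> {
            throw new IllegalStateException("ms.invoke failed");
        });

        // Second update: its handler never runs, because thenRun skips the
        // action when the upstream future is completed exceptionally.
        AtomicBoolean secondHandlerRan = new AtomicBoolean(false);
        CompletableFuture<Void> second =
                notificationFuture.thenRun(() -> secondHandlerRan.set(true));

        return second.isCompletedExceptionally() && !secondHandlerRan.get();
    }

    public static void main(String[] args) {
        System.out.println("later update skipped: " + laterUpdateSkipped());
    }
}
```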





[jira] [Created] (IGNITE-21088) Impossible to restart node with json config

2023-12-14 Thread Igor (Jira)
Igor created IGNITE-21088:
-

 Summary: Impossible to restart node with json config
 Key: IGNITE-21088
 URL: https://issues.apache.org/jira/browse/IGNITE-21088
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 3.0.0-beta2
Reporter: Igor
 Fix For: 3.0.0-beta2


*Steps:*
1. Create ignite-config.json config instead of ignite-config.conf
{code:java}
{
  "network" : {
    "nodeFinder" : {
      "netClusterNodes" : [ "localhost:3110", "localhost:3111" ]
    },
    "port" : 3110
  },
  "rest" : {
    "port" : 10550
  },
  "clientConnector" : {
    "port" : 2080
  }
} {code}
2. Start node.
3. Stop node.
4. Restart node.
*Expected:*
Node restarted.
*Actual:* 
The config was rewritten into .conf format (but the filename wasn't changed), and 
the node didn't start because of the incorrectly formatted config.
{code:java}
aimem {
    defaultRegion {
        emptyPagesPoolSize=100
        evictionMode=DISABLED
        evictionThreshold=0.9
        initSize=13666140160
        maxSize=13666140160
        memoryAllocator {
            type=unsafe
        }
    }
    pageSize=16384
}
aipersist {
    checkpoint {
        checkpointDelayMillis=200
        checkpointThreads=4
        compactionThreads=4
        frequency=18
        frequencyDeviation=40
        logReadLockThresholdTimeout=0
        readLockTimeout=1
        useAsyncFileIoFactory=true
    }
    defaultRegion {
        memoryAllocator {
            type=unsafe
        }
        replacementMode=CLOCK
        size=13666140160
    }
    pageSize=16384
}
clientConnector {
    connectTimeout=5000
    idleTimeout=0
    metricsEnabled=false
    port=2080
    sendServerExceptionStackTraceToClient=false
    ssl {
        ciphers=""
        clientAuth=none
        enabled=false
        keyStore {
            password=""
            path=""
            type=PKCS12
        }
        trustStore {
            password=""
            path=""
            type=PKCS12
        }
    }
}
cluster {
    networkInvokeTimeout=500
}
compute {
    queueMaxSize=2147483647
    statesLifetimeMillis=6
    threadPoolSize=20
    threadPoolStopTimeoutMillis=1
}
deployment {
    deploymentLocation=deployment
}
network {
    fileTransfer {
        chunkSize=1048576
        maxConcurrentRequests=4
        responseTimeout=1
        threadPoolSize=8
    }
    inbound {
        soBacklog=128
        soKeepAlive=true
        soLinger=0
        soReuseAddr=true
        tcpNoDelay=true
    }
    membership {
        failurePingInterval=1000
        membershipSyncInterval=3
        scaleCube {
            failurePingRequestMembers=3
            gossipInterval=200
            gossipRepeatMult=3
            membershipSuspicionMultiplier=5
            metadataTimeout=3000
        }
    }
    nodeFinder {
        netClusterNodes=[
            "localhost:3110",
            "localhost:3111"
        ]
        type=STATIC
    }
    outbound {
        soKeepAlive=true
        soLinger=0
        tcpNoDelay=true
    }
    port=3110
    shutdownQuietPeriod=0
    shutdownTimeout=15000
    ssl {
        ciphers=""
        clientAuth=none
        enabled=false
        keyStore {
            password=""
            path=""
            type=PKCS12
        }
        trustStore {
            password=""
            path=""
            type=PKCS12
        }
    }
}
raft {
    fsync=true
    responseTimeout=3000
    retryDelay=200
    retryTimeout=1
    rpcInstallSnapshotTimeout=30
    volatileRaft {
        logStorage {
            name=unlimited
        }
    }
}
rest {
    dualProtocol=false
    httpToHttpsRedirection=false
    port=10550
    ssl {
        ciphers=""
        clientAuth=none
        enabled=false
        keyStore {
            password=""
            path=""
            type=PKCS12
        }
        port=10400
        trustStore {
            password=""
            path=""
            type=PKCS12
        }
    }
}
rocksDb {
    defaultRegion {
        cache=lru
        numShardBits=-1
        size=268435456
        writeBufferSize=67108864
    }
    flushDelayMillis=100
} {code}
The error during startup:
{code:java}
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:58e58a9a-e9a7-4d2e-bba6-9477d41d03b2 Unable to start [node=Cluster_0]
        at 
org.apache.ignite.internal.app.IgniteImpl.handleStartException(IgniteImpl.java:897)
        at org.apache.ignite.internal.app.IgniteImpl.start(IgniteImpl.java:886)
        at 
org.apache.ignite.internal.app.IgnitionImpl.doStart(IgnitionImpl.java:198)
        at 
org.apache.ignite.internal.app.IgnitionImpl.start(IgnitionImpl.java:99)
        at org.apache.ignite.IgnitionManager.start(IgnitionManager.java:72)
        at org.apache.ignite.IgnitionManager.start(IgnitionManager.java:51)
        at 
org.apache.ignite.internal.app.IgniteRunner.call(IgniteRunner.java:48)
        at 

[jira] [Commented] (IGNITE-21085) Fix the update versions script fail on ignite-calcite module.

2023-12-14 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796892#comment-17796892
 ] 

Ignite TC Bot commented on IGNITE-21085:


{panel:title=Branch: [pull/11099/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/11099/head] Base: [master] : No new tests 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7656865&buildTypeId=IgniteTests24Java8_RunAll]

> Fix the update versions script fail on ignite-calcite module.
> -
>
> Key: IGNITE-21085
> URL: https://issues.apache.org/jira/browse/IGNITE-21085
> Project: Ignite
>  Issue Type: Bug
>Reporter: Nikita Amelchev
>Assignee: Nikita Amelchev
>Priority: Critical
> Fix For: 2.16
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The ignite-calcite module requires the ignite-core dependency at the validate 
> phase. Example of the failure: 
> [CI|https://ci2.ignite.apache.org/buildConfiguration/Releases_ApacheIgniteMain_ReleaseBuild/7656645?hideProblemsFromDependencies=false=false=true=true=7656644_1948_535=debug=flowAware]





[jira] [Updated] (IGNITE-21085) Fix the update versions script fail on ignite-calcite module.

2023-12-14 Thread Nikita Amelchev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikita Amelchev updated IGNITE-21085:
-
Priority: Critical  (was: Major)

> Fix the update versions script fail on ignite-calcite module.
> -
>
> Key: IGNITE-21085
> URL: https://issues.apache.org/jira/browse/IGNITE-21085
> Project: Ignite
>  Issue Type: Bug
>Reporter: Nikita Amelchev
>Assignee: Nikita Amelchev
>Priority: Critical
> Fix For: 2.16
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The ignite-calcite module requires the ignite-core dependency at the validate 
> phase. Example of the failure: 
> [CI|https://ci2.ignite.apache.org/buildConfiguration/Releases_ApacheIgniteMain_ReleaseBuild/7656645?hideProblemsFromDependencies=false=false=true=true=7656644_1948_535=debug=flowAware]





[jira] [Created] (IGNITE-21087) Remove MvccCoordinator

2023-12-14 Thread Julia Bakulina (Jira)
Julia Bakulina created IGNITE-21087:
---

 Summary: Remove MvccCoordinator
 Key: IGNITE-21087
 URL: https://issues.apache.org/jira/browse/IGNITE-21087
 Project: Ignite
  Issue Type: Sub-task
Reporter: Julia Bakulina
Assignee: Julia Bakulina


Remove MvccCoordinator





[jira] [Resolved] (IGNITE-21086) Remove MvccCoordinator

2023-12-14 Thread Julia Bakulina (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julia Bakulina resolved IGNITE-21086.
-
Resolution: Duplicate

> Remove MvccCoordinator
> --
>
> Key: IGNITE-21086
> URL: https://issues.apache.org/jira/browse/IGNITE-21086
> Project: Ignite
>  Issue Type: Task
>Reporter: Julia Bakulina
>Assignee: Julia Bakulina
>Priority: Minor
>
> Remove MvccCoordinator
> The community has agreed that MVCC public API should be removed.
> Vote thread
> [http://apache-ignite-developers.2346864.n4.nabble.com/Removing-MVCC-public-API-tp50550.html]





[jira] [Created] (IGNITE-21086) Remove MvccCoordinator

2023-12-14 Thread Julia Bakulina (Jira)
Julia Bakulina created IGNITE-21086:
---

 Summary: Remove MvccCoordinator
 Key: IGNITE-21086
 URL: https://issues.apache.org/jira/browse/IGNITE-21086
 Project: Ignite
  Issue Type: Task
 Environment: Remove MvccCoordinator

The community has agreed that MVCC public API should be removed.

Vote thread
[http://apache-ignite-developers.2346864.n4.nabble.com/Removing-MVCC-public-API-tp50550.html]
Reporter: Julia Bakulina
Assignee: Julia Bakulina








[jira] [Updated] (IGNITE-21086) Remove MvccCoordinator

2023-12-14 Thread Julia Bakulina (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julia Bakulina updated IGNITE-21086:

Environment: (was: Remove MvccCoordinator

The community has agreed that MVCC public API should be removed.

Vote thread
[http://apache-ignite-developers.2346864.n4.nabble.com/Removing-MVCC-public-API-tp50550.html])

> Remove MvccCoordinator
> --
>
> Key: IGNITE-21086
> URL: https://issues.apache.org/jira/browse/IGNITE-21086
> Project: Ignite
>  Issue Type: Task
>Reporter: Julia Bakulina
>Assignee: Julia Bakulina
>Priority: Minor
>






[jira] [Updated] (IGNITE-21086) Remove MvccCoordinator

2023-12-14 Thread Julia Bakulina (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julia Bakulina updated IGNITE-21086:

Description: 
Remove MvccCoordinator

The community has agreed that MVCC public API should be removed.

Vote thread
[http://apache-ignite-developers.2346864.n4.nabble.com/Removing-MVCC-public-API-tp50550.html]

> Remove MvccCoordinator
> --
>
> Key: IGNITE-21086
> URL: https://issues.apache.org/jira/browse/IGNITE-21086
> Project: Ignite
>  Issue Type: Task
>Reporter: Julia Bakulina
>Assignee: Julia Bakulina
>Priority: Minor
>
> Remove MvccCoordinator
> The community has agreed that MVCC public API should be removed.
> Vote thread
> [http://apache-ignite-developers.2346864.n4.nabble.com/Removing-MVCC-public-API-tp50550.html]





[jira] [Created] (IGNITE-21085) Fix the update versions script fail on ignite-calcite module.

2023-12-14 Thread Nikita Amelchev (Jira)
Nikita Amelchev created IGNITE-21085:


 Summary: Fix the update versions script fail on ignite-calcite 
module.
 Key: IGNITE-21085
 URL: https://issues.apache.org/jira/browse/IGNITE-21085
 Project: Ignite
  Issue Type: Bug
Reporter: Nikita Amelchev
Assignee: Nikita Amelchev
 Fix For: 2.16


The ignite-calcite module requires the ignite-core dependency at the validate 
phase. Example of the failure: 
[CI|https://ci2.ignite.apache.org/buildConfiguration/Releases_ApacheIgniteMain_ReleaseBuild/7656645?hideProblemsFromDependencies=false=false=true=true=7656644_1948_535=debug=flowAware]






[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Vipul Thakur (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796799#comment-17796799
 ] 

Vipul Thakur commented on IGNITE-21059:
---

One of the JMS listeners was receiving more load than the rest of the listeners. 
What I can understand from the frequent logs is that WAL data being moved to disk 
is causing the issue: while the data is being moved, another write request 
arrives for the same entity, which is already busy being written to disk.

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> client-service.zip, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>
> We recently upgraded from 2.7.6 to 2.14 due to an issue observed in the 
> production environment where the cluster would hang because of partition map 
> exchange.
> Please find below the ticket which I created a while back for Ignite 2.7.6:
> https://issues.apache.org/jira/browse/IGNITE-13298
> We migrated to Apache Ignite 2.14 and the upgrade went smoothly, but on the 
> third day we saw the cluster traffic dip again.
> We have 5 nodes in a cluster, each with 400 GB of RAM and more than 1 TB of 
> SSD.
> Please find the attached config (added as an attachment for review).
> I have also added the server logs from the time the issue happened.
> We have set both the transaction timeout and the socket timeout, on both the 
> server and client side, for our write operations, but sometimes the cluster 
> hangs, all our get calls get stuck, and slowly everything starts to freeze: 
> our JMS listener threads reach a choked-up state after a while.
> As a result, our read services, which do not even use transactions to 
> retrieve data, also start to choke, ultimately leading to a dip in end-user 
> traffic.
> We were hoping the product upgrade would help, but that has not been the 
> case so far.
>  
>  
>  
>  
>  
>  





[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Evgeny Stanilovsky (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796797#comment-17796797
 ] 

Evgeny Stanilovsky commented on IGNITE-21059:
-

Is there a big load in such a case? Some anomaly, probably?

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> client-service.zip, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>
> We recently upgraded from 2.7.6 to 2.14 due to an issue observed in the 
> production environment where the cluster would hang because of partition map 
> exchange.
> Please find below the ticket which I created a while back for Ignite 2.7.6:
> https://issues.apache.org/jira/browse/IGNITE-13298
> We migrated to Apache Ignite 2.14 and the upgrade went smoothly, but on the 
> third day we saw the cluster traffic dip again.
> We have 5 nodes in a cluster, each with 400 GB of RAM and more than 1 TB of 
> SSD.
> Please find the attached config (added as an attachment for review).
> I have also added the server logs from the time the issue happened.
> We have set both the transaction timeout and the socket timeout, on both the 
> server and client side, for our write operations, but sometimes the cluster 
> hangs, all our get calls get stuck, and slowly everything starts to freeze: 
> our JMS listener threads reach a choked-up state after a while.
> As a result, our read services, which do not even use transactions to 
> retrieve data, also start to choke, ultimately leading to a dip in end-user 
> traffic.
> We were hoping the product upgrade would help, but that has not been the 
> case so far.
>  
>  
>  
>  
>  
>  





[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Evgeny Stanilovsky (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796795#comment-17796795
 ] 

Evgeny Stanilovsky commented on IGNITE-21059:
-

OK, I took a look a bit more, but:
1. As far as I can see, most of the problems come from 4 nodes, one of them 
being the x.244.6.80 node. You can grep for it: grep 'Failed to acquire lock 
within provided timeout' nohup_26.out and check nodeOrder=X, where X is the 
order of the node; after grepping for this order you can find the node that 
initiated the transaction.
2. Transactions can't take a lock (I still can't see the reason), but as far as 
I can see all transactions are rolled back.
Maybe you have some monitoring for these nodes?
You can reduce the transaction timeout and simply rerun after a rollback, or 
use optimistic transactions as I already wrote.

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> client-service.zip, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>
> We recently upgraded from 2.7.6 to 2.14 due to an issue observed in the 
> production environment where the cluster would hang because of partition map 
> exchange.
> Please find below the ticket which I created a while back for Ignite 2.7.6:
> https://issues.apache.org/jira/browse/IGNITE-13298
> We migrated to Apache Ignite 2.14 and the upgrade went smoothly, but on the 
> third day we saw the cluster traffic dip again.
> We have 5 nodes in a cluster, each with 400 GB of RAM and more than 1 TB of 
> SSD.
> Please find the attached config (added as an attachment for review).
> I have also added the server logs from the time the issue happened.
> We have set both the transaction timeout and the socket timeout, on both the 
> server and client side, for our write operations, but sometimes the cluster 
> hangs, all our get calls get stuck, and slowly everything starts to freeze: 
> our JMS listener threads reach a choked-up state after a while.
> As a result, our read services, which do not even use transactions to 
> retrieve data, also start to choke, ultimately leading to a dip in end-user 
> traffic.
> We were hoping the product upgrade would help, but that has not been the 
> case so far.
>  
>  
>  
>  
>  
>  





[jira] [Updated] (IGNITE-20412) Fix ItIgniteDistributionZoneManagerNodeRestartTest# testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart

2023-12-14 Thread Sergey Uttsel (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Uttsel updated IGNITE-20412:
---
Description: 
h3. Motivation

org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart
 started to fail in the catalog-feature branch and still fails in the main 
branch after catalog-feature was merged

[https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests=]
{code:java}
java.lang.AssertionError:
Expected: is <[]>
 but: was <[A]>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
at 
org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459)
at 
org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539)
{code}
h3. Implementation notes

The root cause:
 # This test changes metaStorageManager behavior so that it throws an expected 
exception on ms.invoke.
 # The test alters the zone with a new filter.
 # DistributionZoneManager#onUpdateFilter returns a future from 
saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken).
 # The future is completed exceptionally, so WatchProcessor#notificationFuture 
is completed exceptionally as well.
 # Subsequent updates are not handled properly because notificationFuture is 
completed exceptionally.

We have already created tickets about exception handling:
 * https://issues.apache.org/jira/browse/IGNITE-14693
 * https://issues.apache.org/jira/browse/IGNITE-14611

 

The test scenario is incorrect because the node should be stopped (by the 
failure handler) if ms.invoke fails. We need to rewrite the test when the DZM 
restart logic is updated.

UPD1:

I tried to rewrite the test so that, instead of throwing an exception in the 
metastorage handler, we simply force the thread to wait inside the invoke. This 
does not work: we use a spy on the standalone metastorage, and Mockito uses a 
synchronized block when we call ms.invoke, so blocking one invoke blocks all 
other communication with the metastorage.

Further investigation is needed into how to rewrite this test.

 

UPD2:

The testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart test 
was removed in another commit. However, another test that was disabled by this 
ticket is fixed now.

  was:
h3. Motivation

org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart
 started to fall in the catalog-feature branch and fails in the main branch 
after catalog-feature is merged

[https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests=]
{code:java}
java.lang.AssertionError:
Expected: is <[]>
 but: was <[A]>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
at 
org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459)
at 
org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539)
{code}
h3. Implementation notes

The root cause:
 # This test changes metaStorageManager behavior and it throws expected 
exception on ms.invoke.
 # The test alters zone with new filter.
 # DistributionZoneManager#onUpdateFilter return a future from 
saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken)
 # The future is completed exceptionally and WatchProcessor#notificationFuture 
will be completed exceptionally.
 # Next updates will not be handled properly because notificationFuture is 
completed exceptionally.

We have already created tickets obout exception handling:
 * https://issues.apache.org/jira/browse/IGNITE-14693
 * https://issues.apache.org/jira/browse/IGNITE-14611

 

The test scenario is incorrect because the node should be stopped (by failure 
handler) if the ms.invoke failed. We need to rewrite it when the DZM restart 
will be updated.

UPD1:

I've tried to rewrite test, so we could not throw exception in metastorage 
handler, but just force thread to wait in this invoke, but this lead the to the 
problem that because we use spy on Standalone Metastorage, and mockito use 
synchronised block when we call ms.invoke, so that leads to the problem that 
blocking of one invoke leads to blocking all other communication with ms.

Need further investigation how to rewrite this test

 


[jira] [Updated] (IGNITE-20412) Fix ItIgniteDistributionZoneManagerNodeRestartTest# testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart

2023-12-14 Thread Sergey Uttsel (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Uttsel updated IGNITE-20412:
---
Description: 
h3. Motivation

org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart
 started to fail in the catalog-feature branch and still fails in the main 
branch after catalog-feature was merged

[https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests=]
{code:java}
java.lang.AssertionError:
Expected: is <[]>
 but: was <[A]>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
at 
org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459)
at 
org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539)
{code}
h3. Implementation notes

The root cause:
 # This test changes the metaStorageManager behavior so that it throws an 
expected exception on ms.invoke.
 # The test alters a zone with a new filter.
 # DistributionZoneManager#onUpdateFilter returns a future from 
saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken).
 # The future is completed exceptionally, and WatchProcessor#notificationFuture 
will be completed exceptionally as well.
 # Subsequent updates will not be handled properly because notificationFuture 
is completed exceptionally.
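The failure mode in the last two steps can be sketched with plain CompletableFuture chaining. The class and variable names below are hypothetical stand-ins for WatchProcessor#notificationFuture, not the actual implementation:

```java
import java.util.concurrent.CompletableFuture;

public class NotificationChainSketch {

    // Once one notification in the chain fails, every future chained after it
    // completes exceptionally without ever running its body.
    static boolean laterUpdateFails() {
        CompletableFuture<Void> notificationFuture = CompletableFuture.completedFuture(null);

        // First update fails, like the ms.invoke that throws in the test.
        notificationFuture = notificationFuture.thenCompose(v ->
                CompletableFuture.failedFuture(new RuntimeException("ms.invoke failed")));

        // A later update chained onto the same future: its body never runs
        // and the result is exceptional again.
        CompletableFuture<Void> laterUpdate = notificationFuture.thenCompose(v ->
                CompletableFuture.completedFuture(null));

        return laterUpdate.isCompletedExceptionally();
    }

    public static void main(String[] args) {
        System.out.println("later update failed: " + laterUpdateFails()); // true
    }
}
```

This is why recovering from a single exceptional completion matters: the chain never heals on its own.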

We have already created tickets about exception handling:
 * https://issues.apache.org/jira/browse/IGNITE-14693
 * https://issues.apache.org/jira/browse/IGNITE-14611

 

The test scenario is incorrect because the node should be stopped (by the 
failure handler) if ms.invoke fails. We will need to rewrite it when the DZM 
restart logic is updated.

UPD1:

I've tried to rewrite the test so that, instead of throwing an exception in the 
metastorage handler, we just force the thread to wait inside the invoke. But 
this runs into the problem that we use a spy on the standalone metastorage, and 
Mockito uses a synchronized block when we call ms.invoke, so blocking one 
invoke blocks all other communication with the metastorage.

Further investigation is needed into how to rewrite this test.
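A minimal model of the blocking problem described in UPD1, with no Mockito involved: the spy's synchronized interception path is replaced here by a plain synchronized method, which is an assumption about where the contention comes from, but it reproduces the same effect.

```java
import java.util.concurrent.CountDownLatch;

public class SpyBlockingSketch {

    // Stand-in for the spied metastorage: every call goes through one monitor,
    // as the spy's synchronized interception effectively does.
    static class MetaStorageStub {
        synchronized void invoke(Runnable body) {
            body.run();
        }
    }

    // Park the first caller inside invoke() and check whether a second caller
    // can get in while the first one is waiting.
    static boolean secondCallerBlocks() throws InterruptedException {
        MetaStorageStub ms = new MetaStorageStub();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread first = new Thread(() -> ms.invoke(() -> {
            entered.countDown();
            try {
                release.await(); // simulate "force thread to wait in this invoke"
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
        }));
        first.start();
        entered.await();

        Thread second = new Thread(() -> ms.invoke(() -> { }));
        second.start();
        second.join(200); // give it a chance to finish; it cannot enter the monitor
        boolean blocked = second.isAlive();

        release.countDown();
        first.join();
        second.join();
        return blocked;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("second caller blocked: " + secondCallerBlocks());
    }
}
```

Any rewrite of the test that parks a thread inside a spied call will hit this, so the wait has to happen outside the spy's interception path.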

 

UPD2:

 

  was:
h3. Motivation

org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart
 started to fail in the catalog-feature branch and now fails in the main branch 
after catalog-feature was merged

[https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests=]
{code:java}
java.lang.AssertionError:
Expected: is <[]>
 but: was <[A]>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
at 
org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459)
at 
org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539)
{code}
h3. Implementation notes

The root cause:
 # This test changes the metaStorageManager behavior so that it throws an 
expected exception on ms.invoke.
 # The test alters a zone with a new filter.
 # DistributionZoneManager#onUpdateFilter returns a future from 
saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken).
 # The future is completed exceptionally, and WatchProcessor#notificationFuture 
will be completed exceptionally as well.
 # Subsequent updates will not be handled properly because notificationFuture 
is completed exceptionally.

We have already created tickets about exception handling:
 * https://issues.apache.org/jira/browse/IGNITE-14693
 * https://issues.apache.org/jira/browse/IGNITE-14611

 

The test scenario is incorrect because the node should be stopped (by the 
failure handler) if ms.invoke fails. We will need to rewrite it when the DZM 
restart logic is updated.


UPD1:

I've tried to rewrite the test so that, instead of throwing an exception in the 
metastorage handler, we just force the thread to wait inside the invoke. But 
this runs into the problem that we use a spy on the standalone metastorage, and 
Mockito uses a synchronized block when we call ms.invoke, so blocking one 
invoke blocks all other communication with the metastorage.

Further investigation is needed into how to rewrite this test.


> Fix ItIgniteDistributionZoneManagerNodeRestartTest# 
> testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart
> 

[jira] [Commented] (IGNITE-20995) Add more integration tests for tx recovery on unstable topology

2023-12-14 Thread Denis Chudov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796781#comment-17796781
 ] 

Denis Chudov commented on IGNITE-20995:
---

From my point of view, we should have the following scenarios:
 # Lock conflict between two RW transactions, coordinator is lost for lock 
holder, recovery starts, lock holder is aborted, lock waiter is not affected.
 # Lock conflict between two RW transactions, coordinator is alive, recovery is 
not started, lock waiter is not affected.
 # Resolution of write intent belonging to tx without coordinator, recovery 
happens successfully, write intent is switched.
 # Resolution of write intent belonging to abandoned tx, recovery happens 
successfully, write intent is switched.
 # Resolution of write intent belonging to abandoned tx, commit partition has 
restarted and lost its local volatile tx state map, recovery happens 
successfully, write intent is switched.
 # Resolution of write intent belonging to pending transaction, coordinator is 
alive, recovery is not started.
 # RO transaction tx0 resolves write intent belonging to the transaction tx1 
and marks it as abandoned and starts the recovery; after that RW transaction 
tx2 meets the lock belonging to tx1, sees that it's abandoned recently and 
doesn't start the recovery. Recovery is triggered just once.
 # The coordinator is lost, but it has sent the commit message to the commit 
partition; at the same time, a recovery-initiating request is received from 
some data node. The commit is successful, tx recovery is not able to change 
the transaction state, there are no assertions or other errors, and write 
intents on the data node are switched.
 # The coordinator is lost, but it has sent the commit message to the commit 
partition; at the same time, a recovery-initiating request is received from 
some data node. Recovery successfully aborts the transaction, there is a 
correct exception on the coordinator (it was not able to change the 
transaction state to commit), there are no assertions or other errors, and 
write intents on the data node are switched.
 # Parallel tx recoveries happen on two replicas of the commit partition; both 
processes were started at a moment when the corresponding replica was the 
primary one. This also shouldn't break anything.

> Add more integration tests for tx recovery on unstable topology
> ---
>
> Key: IGNITE-20995
> URL: https://issues.apache.org/jira/browse/IGNITE-20995
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee:  Kirill Sizov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Surprisingly it might be useful to check tx recovery implementation with some 
> tests.
> h3. Definition of Done
> 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Vipul Thakur (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796738#comment-17796738
 ] 

Vipul Thakur edited comment on IGNITE-21059 at 12/14/23 2:04 PM:
-

In the server logs we can't find the same; we will still look into it. As of 
now, no bulk operation is implemented.

 

[https://ignite.apache.org/docs/latest/key-value-api/transactions]

 

As per the docs, the cause of the timeout should be TransactionDeadlockException, 
but we can't find it anywhere, either at the client or the server end.


was (Author: vipul.thakur):
In the server logs we can't find the same; we will still look into it. As of 
now, no bulk operation is implemented.

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> client-service.zip, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>
> We recently upgraded from 2.7.6 to 2.14 due to an issue observed in the 
> production environment where the cluster would go into a hang state due to 
> partition map exchange.
> Please find below the ticket I created a while back for Ignite 2.7.6:
> https://issues.apache.org/jira/browse/IGNITE-13298
> So we migrated the Apache Ignite version to 2.14, and the upgrade went 
> smoothly, but on the third day we saw a cluster traffic dip again.
> We have 5 nodes in a cluster, where we provide 400 GB of RAM and more than 
> 1 TB of SSD.
> Please find the attached config below. [I have added it as an attachment 
> for review.]
> I have also added the server logs from the time when the issue happened.
> We have set a transaction timeout as well as a socket timeout, both at the 
> server and the client end, for our write operations, but it seems that 
> sometimes the cluster goes into a hang state: all our get calls get stuck, 
> slowly everything starts to freeze our JMS listener threads, and every 
> thread reaches a choked-up state after some time.
> Because of this, our read services, which do not even use transactions to 
> retrieve data, also start to choke, ultimately leading to an end-user 
> traffic dip.
> We were hoping the product upgrade would help, but that has not been the 
> case so far.
>  
>  
>  
>  
>  
>  





[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Vipul Thakur (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796738#comment-17796738
 ] 

Vipul Thakur commented on IGNITE-21059:
---

In the server logs we can't find the same; we will still look into it. As of 
now, no bulk operation is implemented.

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> client-service.zip, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>





[jira] [Assigned] (IGNITE-17780) Update Kubernetes Deployment examples to StatefulSet

2023-12-14 Thread Igor Gusev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-17780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Gusev reassigned IGNITE-17780:
---

Assignee: Igor Gusev

> Update Kubernetes Deployment examples to StatefulSet
> 
>
> Key: IGNITE-17780
> URL: https://issues.apache.org/jira/browse/IGNITE-17780
> Project: Ignite
>  Issue Type: Bug
>  Components: documentation
>Reporter: Jeremy McMillan
>Assignee: Igor Gusev
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> StatefulSet has been supported since IGNITE-6241. Experience supporting 
> Ignite in the field has led to best-practice recommendations to stop 
> recommending the use of Kubernetes Deployment objects, because of the 
> practical benefits that persistent volumes bring to common operations like 
> restarting pods.
> Persistent topology and node identity make re-joining a cluster much cheaper 
> than always creating new Ignite nodes as pods are started.
> This should be simple for the cloud provider specific examples: just change 
> the "Creating a Pod Configuration" sections to provide a valid StatefulSet 
> example.
>  * 
> [https://ignite.apache.org/docs/2.11.1/installation/kubernetes/amazon-eks-deployment#creating-pod-configuration]
>  * 
> [https://ignite.apache.org/docs/2.11.1/installation/kubernetes/azure-deployment#creating-pod-configuration]
>  * 
> [https://ignite.apache.org/docs/2.11.1/installation/kubernetes/gke-deployment#creating-pod-configuration]
>  *  





[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Evgeny Stanilovsky (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796732#comment-17796732
 ] 

Evgeny Stanilovsky commented on IGNITE-21059:
-

Probably you have deadlocks? Check for "Deadlock detection was timed out" in 
the attached logs.
You can search the Ignite documentation for how to avoid it.
Quick check: do you insert an unordered batch of keys? If so, you need to sort 
it or use optimistic transactions.
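The "sort the batch" advice comes down to acquiring key locks in a deterministic order, so two concurrent batches cannot deadlock on each other's keys. A minimal sketch: the real Ignite call would be cache.putAll(...), which is assumed here and not instantiated.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class SortedBatchSketch {

    // If every transaction acquires key locks in the same (sorted) order,
    // concurrent batches lock keys in a consistent sequence.
    static Map<Integer, String> sorted(Map<Integer, String> unordered) {
        return new TreeMap<>(unordered); // iterates keys in ascending order
    }

    public static void main(String[] args) {
        Map<Integer, String> batch = new LinkedHashMap<>();
        batch.put(42, "a");
        batch.put(7, "b");
        batch.put(19, "c");

        // Instead of cache.putAll(batch), call cache.putAll(sorted(batch)).
        System.out.println(sorted(batch).keySet()); // [7, 19, 42]
    }
}
```

The alternative mentioned above is optimistic transactions, e.g. ignite.transactions().txStart(OPTIMISTIC, SERIALIZABLE), which detect conflicts at commit time instead of holding pessimistic locks in acquisition order.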

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> client-service.zip, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>





[jira] [Comment Edited] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Vipul Thakur (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796724#comment-17796724
 ] 

Vipul Thakur edited comment on IGNITE-21059 at 12/14/23 12:50 PM:
--

Hi [~zstan] and [~cos], today we observed the same issue in our other data 
center, and restarting the apps helped. [This data center had been running for 
44 days.]

I am attaching the logs of all nodes from the cluster -> 
{*}Ignite_server_logs.zip{*} [in it you can find the logs from before the 
issue came]

I am also attaching the client service logs ---> *client-service.zip*

*We are still in the process of implementing your recommendation.*

Please help us with your observations.


was (Author: vipul.thakur):
Hi [~zstan] and [~cos], today we observed the same issue in our other data 
center, and restarting the apps helped. [This data center had been running for 
44 days.]

I am attaching the logs of all nodes from the cluster -> Ignite_server_logs.zip 
[in it you can find the logs from before the issue came]

I am also attaching the client service logs ---> client-service.zip

*We are still in the process of implementing your recommendation.*

Please help us with your observations.

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> client-service.zip, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>





[jira] [Comment Edited] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Vipul Thakur (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796724#comment-17796724
 ] 

Vipul Thakur edited comment on IGNITE-21059 at 12/14/23 12:50 PM:
--

Hi [~zstan] and [~cos], today we observed the same issue in our other data 
center, and restarting the apps helped. [This data center had been running for 
44 days.]

I am attaching the logs of all nodes from the cluster -> Ignite_server_logs.zip 
[in it you can find the logs from before the issue came]

I am also attaching the client service logs ---> client-service.zip

*We are still in the process of implementing your recommendation.*

Please help us with your observations.


was (Author: vipul.thakur):
Hi [~zstan], today we observed the same issue in our other data center, and 
restarting the apps helped.

I am attaching the logs of all nodes from the cluster -> Ignite_server_logs.zip

I am also attaching the client service logs ---> client-service.zip

*We are still in the process of implementing your recommendation.*

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> client-service.zip, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>





[jira] [Comment Edited] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Vipul Thakur (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796724#comment-17796724
 ] 

Vipul Thakur edited comment on IGNITE-21059 at 12/14/23 12:48 PM:
--

Hi [~zstan], today we observed the same issue in our other data center, and 
restarting the apps helped.

I am attaching the logs of all nodes from the cluster -> Ignite_server_logs.zip

I am also attaching the client service logs ---> client-service.zip

*We are still in the process of implementing your recommendation.*


was (Author: vipul.thakur):
Hi [~zstan], today we observed the same issue in our other data center, and 
restarting the apps helped.

I am attaching the logs of all nodes from the cluster -> Ignite_server_logs.zip

 

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> client-service.zip, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>





[jira] [Updated] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Vipul Thakur (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vipul Thakur updated IGNITE-21059:
--
Attachment: client-service.zip

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> client-service.zip, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>





[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Vipul Thakur (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796724#comment-17796724
 ] 

Vipul Thakur commented on IGNITE-21059:
---

Hi [~zstan], today we observed the same issue in our other data center, and 
restarting the apps helped.

I am attaching the logs of all nodes from the cluster -> Ignite_server_logs.zip

 

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>





[jira] [Updated] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations

2023-12-14 Thread Vipul Thakur (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vipul Thakur updated IGNITE-21059:
--
Attachment: Ignite_server_logs.zip

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running 
> cache operations
> 
>
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, clients
>Affects Versions: 2.14
>Reporter: Vipul Thakur
>Priority: Critical
> Attachments: Ignite_server_logs.zip, cache-config-1.xml, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, 
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, 
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, 
> ignite-server-nohup.out
>
>
> We recently upgraded from 2.7.6 to 2.14 due to an issue observed in our 
> production environment where the cluster would hang during partition map 
> exchange.
> Please find below the ticket I created a while back for Ignite 2.7.6:
> https://issues.apache.org/jira/browse/IGNITE-13298
> We migrated to Apache Ignite 2.14 and the upgrade went smoothly, but on the 
> third day we saw the cluster traffic dip again. 
> We have 5 nodes in the cluster, each provisioned with 400 GB of RAM and more 
> than 1 TB of SSD.
> The configuration is attached for review.
> I have also attached the server logs from the time the issue happened.
> We have set both transaction and socket timeouts on the server and client 
> side for our write operations, but the cluster still sometimes hangs: all our 
> get calls get stuck, our JMS listener threads slowly freeze, and eventually 
> every thread is blocked.
> As a result, our read services, which do not even use transactions to 
> retrieve data, also start to choke, ultimately leading to an end-user traffic 
> dip.
> We were hoping the product upgrade would help, but that has not been the case 
> so far. 
>  
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-21052) Sql. Test ExecutionServiceImplTest.testErrorIsPropagatedToPrefetchCallback is flaky due to planning timeout

2023-12-14 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin reassigned IGNITE-21052:
-

Assignee: Pavel Pereslegin

> Sql. Test ExecutionServiceImplTest.testErrorIsPropagatedToPrefetchCallback is 
> flaky due to planning timeout
> ---
>
> Key: IGNITE-21052
> URL: https://issues.apache.org/jira/browse/IGNITE-21052
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Pavel Pereslegin
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: ignite-3
>
> The ExecutionServiceImplTest.testErrorIsPropagatedToPrefetchCallback test 
> recently started failing on TeamCity.
> https://ci.ignite.apache.org/project.html?projectId=ApacheIgnite3xGradle_Test_RunUnitTests_virtual==testDetails=-7125216101951766635=TEST_STATUS_DESC_ApacheIgnite3xGradle_Test_RunUnitTests_virtual=__all_branches__=50
> We need to investigate why planning of a simple query takes so long. It looks 
> like the current timeout of 5 seconds should be enough.
> {noformat}
> org.apache.ignite.sql.SqlException: IGN-SQL-10 
> TraceId:97de28e5-3561-4c31-b126-e089296cec39 Planning of a query aborted due 
> to planner timeout threshold is reached
>   at 
> app//org.apache.ignite.internal.sql.engine.prepare.PrepareServiceImpl.lambda$prepareAsync$0(PrepareServiceImpl.java:202)
>   at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
>   at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
>   at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>   at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1705)
>   at 
> java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base@11.0.17/java.lang.Thread.run(Thread.java:834)
>   Suppressed: java.lang.RuntimeException: This is a trimmed root
>   at 
> org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:757)
>   at 
> org.apache.ignite.internal.testframework.IgniteTestUtils.await(IgniteTestUtils.java:777)
>   at 
> org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.prepare(ExecutionServiceImplTest.java:813)
>   at 
> org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest.testErrorIsPropagatedToPrefetchCallback(ExecutionServiceImplTest.java:694)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
>   at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
>   at 
> org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
>   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
>   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
>   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
>   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
>   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
>   at 
> 

[jira] [Updated] (IGNITE-21084) Account for differences of logical part of now() on different nodes when waiting after a DDL

2023-12-14 Thread Roman Puchkovskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-21084:
---
Description: 
Client must see the results of a DDL they just executed, no matter through 
which node of the cluster a subsequent request is made. To maintain this, we 
must make sure that ActivationTimestamp(DDL) becomes non-future on all nodes of 
the cluster, and only then can we complete the DDL future. As nodes' physical 
clocks might have some skew relative to each other, we add MaxClockSkew to 
the timestamp until which we wait to compensate for this difference.

Analogously to HLC.now() being different because its physical part differs on 
different nodes, it might be different because *logical* parts are different on 
different nodes.

Let's assume we don't have any physical clock skew, and MaxClockSkew is 0. 
ActivationTimestamp(DDL) is (100, 5); (100 is physical part, 5 is logical 
part). We wait on node 1 (through which the DDL was executed) till its 
HLC.now() reaches (100, 5), then we complete the DDL future. The user goes to 
node 2 at which HLC.now() is (100, 2). The user executes a query at 'now', and 
the query sees the state before the DDL was executed, which breaks our 
requirement.

We can fix this by 'rounding' the timestamp-to-wait-for up in the following way:
 # If logical part is not 0, increment the physical part and set the logical 
part to 0
 # If the logical part is 0, leave the timestamp as is

As a result, for (100, 0) we will wait for (100, 0), and at node 1 HLC is at 
least (100, 0), so it cannot see the previous schema. For (100, 5) we would 
wait till (101, 0), which would also guarantee that a query executed after 
waiting sees the new schema version.
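The rounding rule above can be sketched as follows. This is a minimal illustration only: the method name and the (physical, logical) array-pair representation are assumptions for the sketch, not the actual Ignite HybridTimestamp API.

```java
// Illustrative sketch of the proposed rounding rule; the method name and the
// (physical, logical) pair representation are assumed, not the real API.
final class TimestampRoundingSketch {
    /** Rounds a hybrid timestamp up to the nearest one whose logical part is 0. */
    static long[] roundUp(long physical, long logical) {
        // Non-zero logical part: advance to the next physical tick.
        // Zero logical part: the timestamp is already a whole tick; leave as is.
        return logical != 0 ? new long[] {physical + 1, 0} : new long[] {physical, 0};
    }

    public static void main(String[] args) {
        // Matches the examples above: (100, 5) rounds to (101, 0),
        // while (100, 0) stays (100, 0).
        System.out.println(java.util.Arrays.toString(roundUp(100, 5)));
        System.out.println(java.util.Arrays.toString(roundUp(100, 0)));
    }
}
```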

> Account for differences of logical part of now() on different nodes when 
> waiting after a DDL
> 
>
> Key: IGNITE-21084
> URL: https://issues.apache.org/jira/browse/IGNITE-21084
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Roman Puchkovskiy
>Assignee: Roman Puchkovskiy
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Client must see the results of a DDL they just executed, no matter through 
> which node of the cluster a subsequent request is made. To maintain this, we 
> must make sure that ActivationTimestamp(DDL) becomes non-future on all nodes 
> of the cluster and only then can we complete the DDL future. As nodes' 
> physical clocks might have some skew relative to each other, we add 
> MaxClockSkew to the timestamp until which we wait to compensate for this 
> difference.
> Analogously to HLC.now() being different because its physical part differs on 
> different nodes, it might be different because *logical* parts are different 
> on different nodes.
> Let's assume we don't have any physical clock skew, and MaxClockSkew is 0. 
> ActivationTimestamp(DDL) is (100, 5); (100 is physical part, 5 is logical 
> part). We wait on node 1 (through which the DDL was executed) till its 
> HLC.now() reaches (100, 5), then we complete the DDL future. The user goes to 
> node 2 at which HLC.now() is (100, 2). The user executes a query at 'now', 
> and the query sees the state before the DDL was executed, which breaks our 
> requirement.
> We can fix this by 'rounding' the timestamp-to-wait-for up in the following 
> way:
>  # If logical part is not 0, increment the physical part and set the logical 
> part to 0
>  # If the logical part is 0, leave the timestamp as is
> As a result, for (100, 0) we will wait for (100, 0), and at node 1 HLC is at 
> least (100, 0), so it cannot see the previous schema. For (100, 5) we would 
> wait till (101, 0), which would also guarantee that a query executed after 
> waiting sees the new schema version.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-21084) Account for differences of logical part of now() on different nodes when waiting after a DDL

2023-12-14 Thread Roman Puchkovskiy (Jira)
Roman Puchkovskiy created IGNITE-21084:
--

 Summary: Account for differences of logical part of now() on 
different nodes when waiting after a DDL
 Key: IGNITE-21084
 URL: https://issues.apache.org/jira/browse/IGNITE-21084
 Project: Ignite
  Issue Type: Improvement
Reporter: Roman Puchkovskiy
Assignee: Roman Puchkovskiy
 Fix For: 3.0.0-beta2






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21084) Account for differences of logical part of now() on different nodes when waiting after a DDL

2023-12-14 Thread Roman Puchkovskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-21084:
---
Epic Link: IGNITE-20800

> Account for differences of logical part of now() on different nodes when 
> waiting after a DDL
> 
>
> Key: IGNITE-21084
> URL: https://issues.apache.org/jira/browse/IGNITE-21084
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Roman Puchkovskiy
>Assignee: Roman Puchkovskiy
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-13342) DynamicMBean#getAttributes() should return a list of Attributes, not Objects

2023-12-14 Thread YuJue Li (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YuJue Li updated IGNITE-13342:
--
Fix Version/s: 2.16

> DynamicMBean#getAttributes() should return a list of Attributes, not Objects
> 
>
> Key: IGNITE-13342
> URL: https://issues.apache.org/jira/browse/IGNITE-13342
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Eiichi Sato
>Priority: Minor
> Fix For: 2.16
>
>
> Currently, calling #getAttributes() on MBeans such as 
> `org.apache:clsLdr=18b4aac2,group=views,name="sql.queries"` returns an 
> AttributeList directly containing raw attribute values, not 
> javax.management.Attribute.
>  
> [https://github.com/apache/ignite/blob/77d21eaa367ea293233078e85ed0c967dc2b6ee7/modules/core/src/main/java/org/apache/ignite/spi/metric/jmx/ReadOnlyDynamicMBean.java#L66]
>  
> According to the javadoc of AttributeList, this is "highly discouraged".
>  > For compatibility reasons, it is possible, though highly discouraged, to 
> add objects to an {{AttributeList}} that are not instances of {{Attribute}}. 
>  
> [https://docs.oracle.com/javase/8/docs/api/javax/management/AttributeList.html]
>  
> Also, this behavior seems to cause jmx_exporter to fail due to 
> AttributeList#asList() throwing an IllegalArgumentException when 
> non-Attribute element is found.
>  [https://github.com/prometheus/jmx_exporter/issues/483]
>  [https://github.com/prometheus/jmx_exporter/issues/501]
>  
> I wouldn't call this a bug, but it'd be better if Ignite could simply wrap raw 
> attribute values in the Attribute class. I'm going to write a patch and send a 
> PR on GitHub.
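The wrapping suggested in the description can be sketched as below. The helper name and the Map input are illustrative assumptions, not the actual Ignite patch; the point is only that `AttributeList#asList()` stops throwing once every element is a `javax.management.Attribute`.

```java
import javax.management.Attribute;
import javax.management.AttributeList;
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch (assumed helper, not the Ignite code): wrap each raw
// attribute value in javax.management.Attribute before returning it.
final class AttributeWrappingSketch {
    static AttributeList toAttributeList(Map<String, Object> rawValues) {
        AttributeList list = new AttributeList();
        // Adding Attribute instances keeps AttributeList#asList() from
        // throwing IllegalArgumentException on non-Attribute elements.
        rawValues.forEach((name, value) -> list.add(new Attribute(name, value)));
        return list;
    }

    public static void main(String[] args) {
        Map<String, Object> raw = new LinkedHashMap<>();
        raw.put("CacheGets", 42L);
        AttributeList list = toAttributeList(raw);
        // asList() succeeds because every element is an Attribute.
        System.out.println(list.asList().get(0).getName());
    }
}
```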



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (IGNITE-21032) ReadOnlyDynamicMBean.getAttributes may return a list of attribute values instead of Attribute instances

2023-12-14 Thread Nikita Amelchev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796636#comment-17796636
 ] 

Nikita Amelchev edited comment on IGNITE-21032 at 12/14/23 9:28 AM:


Cherry-picked to 2.16.

[~simon.greatrix], Thank you for the contribution.


was (Author: nsamelchev):
Cherry-picked to the 2.16.

> ReadOnlyDynamicMBean.getAttributes may return a list of attribute values 
> instead of Attribute instances
> ---
>
> Key: IGNITE-21032
> URL: https://issues.apache.org/jira/browse/IGNITE-21032
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vyacheslav Koptilin
>Assignee: Simon Greatrix
>Priority: Major
> Fix For: 2.16
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When supplying JMX information, the AttributeList class should contain 
> Attributes; however, the existing code returns attribute values. This can 
> cause ClassCastExceptions in code that attempts to read an AttributeList.
>  
> [GitHub Issue #11045|https://github.com/apache/ignite/issues/11045]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21032) ReadOnlyDynamicMBean.getAttributes may return a list of attribute values instead of Attribute instances

2023-12-14 Thread Nikita Amelchev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796636#comment-17796636
 ] 

Nikita Amelchev commented on IGNITE-21032:
--

Cherry-picked to 2.16.

> ReadOnlyDynamicMBean.getAttributes may return a list of attribute values 
> instead of Attribute instances
> ---
>
> Key: IGNITE-21032
> URL: https://issues.apache.org/jira/browse/IGNITE-21032
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vyacheslav Koptilin
>Assignee: Simon Greatrix
>Priority: Major
> Fix For: 2.16
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When supplying JMX information, the AttributeList class should contain 
> Attributes; however, the existing code returns attribute values. This can 
> cause ClassCastExceptions in code that attempts to read an AttributeList.
>  
> [GitHub Issue #11045|https://github.com/apache/ignite/issues/11045]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21083) Java thin 3.0: Sporadic IllegalReferenceCountException when reading jdbc messages

2023-12-14 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-21083:

Summary: Java thin 3.0: Sporadic IllegalReferenceCountException when 
reading jdbc messages  (was: Sporadic IllegalReferenceCountException when 
reading jdbc messages.)

> Java thin 3.0: Sporadic IllegalReferenceCountException when reading jdbc 
> messages
> -
>
> Key: IGNITE-21083
> URL: https://issues.apache.org/jira/browse/IGNITE-21083
> Project: Ignite
>  Issue Type: Bug
>  Components: thin client
>Reporter: Maksim Zhuravkov
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Some jdbc tests fail with IllegalReferenceCountException: refCnt 0 
> (`org.apache.ignite.jdbc.ItJdbcResultSetSelfTest.testFindColumn`).
> This issue may be caused by the following code in `TcpClientChannel`, because 
> a buffer is sent to another thread without calling `ByteBuf::retain`:
> {code:java}
> /** {@inheritDoc} */
> @Override
> public void onMessage(ByteBuf buf) {
>     asyncContinuationExecutor.execute(() -> {
>         try {
>             processNextMessage(buf);
>         } catch (Throwable t) {
>             close(t, false);
>         } finally {
>             buf.release();
>         }
>     });
> }
> {code}
> Error:
> {code:java}
> java.sql.SQLException: Unable to close result set.
>   at 
> org.apache.ignite.internal.jdbc.JdbcResultSet.close0(JdbcResultSet.java:296)
>   at 
> org.apache.ignite.internal.jdbc.JdbcStatement.closeResults(JdbcStatement.java:717)
>   at 
> org.apache.ignite.internal.jdbc.JdbcStatement.close(JdbcStatement.java:259)
>   at 
> org.apache.ignite.jdbc.AbstractJdbcSelfTest.tearDownBase(AbstractJdbcSelfTest.java:130)
>   at jdk.internal.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
>   at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
>   at 
> org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
>   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
>   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptLifecycleMethod(TimeoutExtension.java:128)
>   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptAfterEachMethod(TimeoutExtension.java:110)
>   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
>   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
>   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
>   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
>   at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.invokeMethodInExtensionContext(ClassBasedTestDescriptor.java:520)
>   at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.lambda$synthesizeAfterEachMethodAdapter$24(ClassBasedTestDescriptor.java:510)
>   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAfterEachMethods$10(TestMethodTestDescriptor.java:243)
>   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAllAfterMethodsOrCallbacks$13(TestMethodTestDescriptor.java:276)
>   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAllAfterMethodsOrCallbacks$14(TestMethodTestDescriptor.java:276)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>   at 
> 

[jira] [Assigned] (IGNITE-21083) Sporadic IllegalReferenceCountException when reading jdbc messages.

2023-12-14 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn reassigned IGNITE-21083:
---

Assignee: Pavel Tupitsyn

> Sporadic IllegalReferenceCountException when reading jdbc messages.
> ---
>
> Key: IGNITE-21083
> URL: https://issues.apache.org/jira/browse/IGNITE-21083
> Project: Ignite
>  Issue Type: Bug
>  Components: thin client
>Reporter: Maksim Zhuravkov
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Some jdbc tests fail with IllegalReferenceCountException: refCnt 0 
> (`org.apache.ignite.jdbc.ItJdbcResultSetSelfTest.testFindColumn`).
> This issue may be caused by the following code in `TcpClientChannel`, because 
> a buffer is sent to another thread without calling `ByteBuf::retain`:
> {code:java}
> /** {@inheritDoc} */
> @Override
> public void onMessage(ByteBuf buf) {
>     asyncContinuationExecutor.execute(() -> {
>         try {
>             processNextMessage(buf);
>         } catch (Throwable t) {
>             close(t, false);
>         } finally {
>             buf.release();
>         }
>     });
> }
> {code}
> Error:
> {code:java}
> java.sql.SQLException: Unable to close result set.
>   at 
> org.apache.ignite.internal.jdbc.JdbcResultSet.close0(JdbcResultSet.java:296)
>   at 
> org.apache.ignite.internal.jdbc.JdbcStatement.closeResults(JdbcStatement.java:717)
>   at 
> org.apache.ignite.internal.jdbc.JdbcStatement.close(JdbcStatement.java:259)
>   at 
> org.apache.ignite.jdbc.AbstractJdbcSelfTest.tearDownBase(AbstractJdbcSelfTest.java:130)
>   at jdk.internal.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
>   at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
>   at 
> org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
>   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
>   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptLifecycleMethod(TimeoutExtension.java:128)
>   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptAfterEachMethod(TimeoutExtension.java:110)
>   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
>   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
>   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
>   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
>   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
>   at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.invokeMethodInExtensionContext(ClassBasedTestDescriptor.java:520)
>   at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.lambda$synthesizeAfterEachMethodAdapter$24(ClassBasedTestDescriptor.java:510)
>   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAfterEachMethods$10(TestMethodTestDescriptor.java:243)
>   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAllAfterMethodsOrCallbacks$13(TestMethodTestDescriptor.java:276)
>   at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAllAfterMethodsOrCallbacks$14(TestMethodTestDescriptor.java:276)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>   at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeAllAfterMethodsOrCallbacks(TestMethodTestDescriptor.java:275)
>   at 
> 

[jira] [Updated] (IGNITE-21083) Sporadic IllegalReferenceCountException when reading jdbc messages.

2023-12-14 Thread Maksim Zhuravkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov updated IGNITE-21083:
--
Description: 
Some jdbc tests fail with IllegalReferenceCountException: refCnt 0 
(`org.apache.ignite.jdbc.ItJdbcResultSetSelfTest.testFindColumn`).
This issue may be caused by the following code in `TcpClientChannel`, because a 
buffer is sent to another thread without calling `ByteBuf::retain`:

{code:java}
/** {@inheritDoc} */
@Override
public void onMessage(ByteBuf buf) {
    asyncContinuationExecutor.execute(() -> {
        try {
            processNextMessage(buf);
        } catch (Throwable t) {
            close(t, false);
        } finally {
            buf.release();
        }
    });
}
{code}
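The hazard described can be modeled without Netty. The self-contained sketch below (class and method names are illustrative, not the Netty API) shows why an extra retain() is needed before handing the buffer to another thread: the I/O thread releases the buffer after onMessage returns, and without the extra retain() the continuation's release() drives the count below zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Self-contained model of reference counting (illustrative, not Netty):
// a buffer starts with refCnt 1 and must never be released below 0.
final class RefCountedBufModel {
    private final AtomicInteger refCnt = new AtomicInteger(1);

    void retain() { refCnt.incrementAndGet(); }

    void release() {
        if (refCnt.decrementAndGet() < 0)
            // Stands in for Netty's IllegalReferenceCountException.
            throw new IllegalStateException("refCnt underflow");
    }

    int refCnt() { return refCnt.get(); }

    public static void main(String[] args) {
        RefCountedBufModel buf = new RefCountedBufModel();
        buf.retain();   // pin before handing the buffer to the executor
        buf.release();  // the continuation's finally { buf.release(); }
        buf.release();  // the I/O thread's own release after onMessage returns
        System.out.println(buf.refCnt()); // balanced: 0, no underflow
    }
}
```

Without the initial retain(), the same two release() calls would underflow the count, which matches the refCnt 0 failures seen in the tests.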

Error:
{code:java}
java.sql.SQLException: Unable to close result set.
  at 
org.apache.ignite.internal.jdbc.JdbcResultSet.close0(JdbcResultSet.java:296)
  at 
org.apache.ignite.internal.jdbc.JdbcStatement.closeResults(JdbcStatement.java:717)
  at org.apache.ignite.internal.jdbc.JdbcStatement.close(JdbcStatement.java:259)
  at 
org.apache.ignite.jdbc.AbstractJdbcSelfTest.tearDownBase(AbstractJdbcSelfTest.java:130)
  at jdk.internal.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
  at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.base/java.lang.reflect.Method.invoke(Method.java:566)
  at 
org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
  at 
org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
  at 
org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
  at 
org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
  at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptLifecycleMethod(TimeoutExtension.java:128)
  at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptAfterEachMethod(TimeoutExtension.java:110)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
  at 
org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.invokeMethodInExtensionContext(ClassBasedTestDescriptor.java:520)
  at 
org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.lambda$synthesizeAfterEachMethodAdapter$24(ClassBasedTestDescriptor.java:510)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAfterEachMethods$10(TestMethodTestDescriptor.java:243)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAllAfterMethodsOrCallbacks$13(TestMethodTestDescriptor.java:276)
  at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAllAfterMethodsOrCallbacks$14(TestMethodTestDescriptor.java:276)
  at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeAllAfterMethodsOrCallbacks(TestMethodTestDescriptor.java:275)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeAfterEachMethods(TestMethodTestDescriptor.java:241)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:142)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
  at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
  at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
  at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
  at 

[jira] [Updated] (IGNITE-21083) Sporadic IllegalReferenceCountException when reading jdbc messages.

2023-12-14 Thread Maksim Zhuravkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov updated IGNITE-21083:
--
Description: 
Some jdbc tests fail with IllegalReferenceCountException: refCnt 0 
(org.apache.ignite.jdbc.ItJdbcResultSetSelfTest.testFindColumn).
This issue may be caused by the following code in TcpClientChannel, because a 
buffer is sent to another thread without calling ByteBuf::retain:

{code:java}
/** {@inheritDoc} */
@Override
public void onMessage(ByteBuf buf) {
    asyncContinuationExecutor.execute(() -> {
        try {
            processNextMessage(buf);
        } catch (Throwable t) {
            close(t, false);
        } finally {
            buf.release();
        }
    });
}
{code}
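The race can be illustrated without Netty. Below is a minimal, self-contained sketch; RefCountedBuf and the hand-off harness are hypothetical stand-ins for ByteBuf, not Ignite code. It shows the retain-before-handoff pattern that avoids the refCnt 0 failure: the I/O thread takes an extra reference before dispatching to the executor, and each side releases its own reference.

```java
// Hypothetical stand-in modeling Netty ByteBuf retain/release semantics.
final class RefCountedBuf {
    private int refCnt = 1;                     // created with one reference

    synchronized int refCnt() { return refCnt; }

    synchronized RefCountedBuf retain() {       // take an extra reference
        if (refCnt == 0) throw new IllegalStateException("refCnt: 0");
        refCnt++;
        return this;
    }

    synchronized boolean release() {            // drop one reference
        if (refCnt == 0) throw new IllegalStateException("refCnt: 0");
        return --refCnt == 0;
    }

    synchronized void read() {                  // any access at refCnt == 0 is illegal
        if (refCnt == 0) throw new IllegalStateException("refCnt: 0");
    }
}

public class RetainSketch {
    // Fixed pattern: retain() before handing the buffer to another thread,
    // and let the receiving thread release() when it is done. Without the
    // retain(), the I/O thread's release() could free the buffer before the
    // executor thread reads it.
    static int handOffWithRetain(RefCountedBuf buf) throws InterruptedException {
        buf.retain();                           // extra ref for the worker thread
        Thread worker = new Thread(() -> {
            try {
                buf.read();                     // safe: we hold our own reference
            } finally {
                buf.release();                  // drop the worker's reference
            }
        });
        worker.start();
        buf.release();                          // I/O thread drops its reference
        worker.join();
        return buf.refCnt();                    // 0 once both sides released
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(handOffWithRetain(new RefCountedBuf()));
    }
}
```

With the retain in place the final count is 0 regardless of which thread releases first, and neither thread ever observes a freed buffer.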

Error:
{code:java}
java.sql.SQLException: Unable to close result set.
  at 
org.apache.ignite.internal.jdbc.JdbcResultSet.close0(JdbcResultSet.java:296)
  at 
org.apache.ignite.internal.jdbc.JdbcStatement.closeResults(JdbcStatement.java:717)
  at org.apache.ignite.internal.jdbc.JdbcStatement.close(JdbcStatement.java:259)
  at 
org.apache.ignite.jdbc.AbstractJdbcSelfTest.tearDownBase(AbstractJdbcSelfTest.java:130)
  at jdk.internal.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
  at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.base/java.lang.reflect.Method.invoke(Method.java:566)
  at 
org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
  at 
org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
  at 
org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
  at 
org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
  at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptLifecycleMethod(TimeoutExtension.java:128)
  at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptAfterEachMethod(TimeoutExtension.java:110)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
  at 
org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.invokeMethodInExtensionContext(ClassBasedTestDescriptor.java:520)
  at 
org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.lambda$synthesizeAfterEachMethodAdapter$24(ClassBasedTestDescriptor.java:510)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAfterEachMethods$10(TestMethodTestDescriptor.java:243)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAllAfterMethodsOrCallbacks$13(TestMethodTestDescriptor.java:276)
  at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAllAfterMethodsOrCallbacks$14(TestMethodTestDescriptor.java:276)
  at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeAllAfterMethodsOrCallbacks(TestMethodTestDescriptor.java:275)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeAfterEachMethods(TestMethodTestDescriptor.java:241)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:142)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
  at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
  at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
  at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
  at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)


[jira] [Commented] (IGNITE-19955) Fix create zone on restart rewrites existing data nodes because of trigger key inconsistency

2023-12-14 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-19955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796631#comment-17796631
 ] 

Vladislav Pyatkov commented on IGNITE-19955:


Merged ba6cd4dfff0e4bc2eacb437a1331454e030bff84

> Fix create zone on restart rewrites existing data nodes because of trigger 
> key inconsistency
> 
>
> Key: IGNITE-19955
> URL: https://issues.apache.org/jira/browse/IGNITE-19955
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mirza Aliev
>Assignee: Mirza Aliev
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Outdated, see UPD
> Currently, the logic for initialising a newly created zone writes the keys
> {noformat}
> zoneDataNodesKey(zoneId), zoneScaleUpChangeTriggerKey(zoneId), 
> zoneScaleDownChangeTriggerKey(zoneId), zonesChangeTriggerKey(zoneId)
> {noformat}
> to metastorage, and condition is 
> {noformat}
> static CompoundCondition triggerKeyConditionForZonesChanges(long 
> revision, int zoneId) {
> return or(
> notExists(zonesChangeTriggerKey(zoneId)),
> 
> value(zonesChangeTriggerKey(zoneId)).lt(ByteUtils.longToBytes(revision))
> );
> {noformat}
> The recovery process implies that the create-zone event will be processed 
> again, but with a higher revision, so the data nodes will be rewritten.
> We need to handle this situation so that the data nodes remain consistent 
> after a restart.
> A possible solution is to change the condition to 
> {noformat}
> static SimpleCondition triggerKeyConditionForZonesCreation(long revision, 
> int zoneId) {
> return notExists(zonesChangeTriggerKey(zoneId));
> }
> static SimpleCondition triggerKeyConditionForZonesDelete(int zoneId) {
> return exists(zonesChangeTriggerKey(zoneId));
> }
> {noformat}
>  
> so that we do not rely on the revision and only check for the existence of 
> the key when we create or remove a zone. The problem with this solution is 
> that reordering of the create and remove operations on some node could lead 
> to an inconsistent state for the zone keys in the metastorage.
> *UPD*:
> This problem will be resolved once we implement 
> https://issues.apache.org/jira/browse/IGNITE-20561
> In this ticket we need to unmute all tickets that were muted by this one.
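The replay behaviour described above can be sketched as follows; the method names are hypothetical stand-ins for the metastorage conditions quoted in the description, not the actual Ignite API. The revision-based condition passes again when the create-zone event is replayed with a higher revision on restart, while an existence-only condition fires at most once per key.

```java
public class TriggerConditionSketch {
    // Stand-in for: notExists(zonesChangeTriggerKey) OR storedValue < revision.
    // Passes again when the event is replayed with a higher revision.
    static boolean revisionBased(Long storedTrigger, long eventRevision) {
        return storedTrigger == null || storedTrigger < eventRevision;
    }

    // Stand-in for: notExists(zonesChangeTriggerKey). Fires at most once.
    static boolean existenceOnly(Long storedTrigger) {
        return storedTrigger == null;
    }

    public static void main(String[] args) {
        Long trigger = null;                       // no trigger key written yet

        // First processing of the create-zone event at revision 10.
        if (revisionBased(trigger, 10)) {
            trigger = 10L;                         // data nodes written
        }

        // Restart: the same event is replayed at a higher revision 25.
        System.out.println(revisionBased(trigger, 25)); // true  -> data nodes rewritten
        System.out.println(existenceOnly(trigger));     // false -> no rewrite
    }
}
```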



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21032) ReadOnlyDynamicMBean.getAttributes may return a list of attribute values instead of Attribute instances

2023-12-14 Thread Vyacheslav Koptilin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796625#comment-17796625
 ] 

Vyacheslav Koptilin commented on IGNITE-21032:
--

Hi [~simon.greatrix],

The patch has been merged into the master branch. Thank you for your contribution!

> ReadOnlyDynamicMBean.getAttributes may return a list of attribute values 
> instead of Attribute instances
> ---
>
> Key: IGNITE-21032
> URL: https://issues.apache.org/jira/browse/IGNITE-21032
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vyacheslav Koptilin
>Assignee: Simon Greatrix
>Priority: Major
> Fix For: 2.16
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When supplying JMX information, the AttributeList class should contain 
> Attributes; however, the existing code returns raw attribute values. This can 
> cause a ClassCastException in code that attempts to read an AttributeList.
>  
> [GitHub Issue #11045|https://github.com/apache/ignite/issues/11045]
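A small self-contained sketch of the correct pattern; the attribute lookup here is a hypothetical placeholder, not Ignite's actual MBean code. getAttributes must add Attribute instances to the AttributeList rather than the raw values, so that callers can safely cast list entries to Attribute.

```java
import javax.management.Attribute;
import javax.management.AttributeList;

public class AttributeListSketch {
    // Correct shape for DynamicMBean.getAttributes: wrap each value in an
    // Attribute. Adding raw values compiles (AttributeList extends ArrayList),
    // but it breaks callers that cast entries to Attribute.
    static AttributeList getAttributes(String[] names) {
        AttributeList list = new AttributeList();
        for (String name : names) {
            Object value = "value-of-" + name;    // placeholder attribute lookup
            list.add(new Attribute(name, value)); // wrap; do not add(value)
        }
        return list;
    }

    public static void main(String[] args) {
        AttributeList list = getAttributes(new String[] {"Uptime"});
        Attribute attr = (Attribute) list.get(0); // safe: entries are Attributes
        System.out.println(attr.getName() + "=" + attr.getValue());
    }
}
```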



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (IGNITE-21032) ReadOnlyDynamicMBean.getAttributes may return a list of attribute values instead of Attribute instances

2023-12-14 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin resolved IGNITE-21032.
--
Resolution: Fixed

> ReadOnlyDynamicMBean.getAttributes may return a list of attribute values 
> instead of Attribute instances
> ---
>
> Key: IGNITE-21032
> URL: https://issues.apache.org/jira/browse/IGNITE-21032
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vyacheslav Koptilin
>Assignee: Simon Greatrix
>Priority: Major
> Fix For: 2.16
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When supplying JMX information, the AttributeList class should contain 
> Attributes; however, the existing code returns raw attribute values. This can 
> cause a ClassCastException in code that attempts to read an AttributeList.
>  
> [GitHub Issue #11045|https://github.com/apache/ignite/issues/11045]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21083) Sporadic IllegalReferenceCountException when reading jdbc messages.

2023-12-14 Thread Maksim Zhuravkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov updated IGNITE-21083:
--
Description: 
A JDBC test failed with IllegalReferenceCountException: refCnt: 0.


{code:java}
java.sql.SQLException: Unable to close result set.
  at 
org.apache.ignite.internal.jdbc.JdbcResultSet.close0(JdbcResultSet.java:296)
  at 
org.apache.ignite.internal.jdbc.JdbcStatement.closeResults(JdbcStatement.java:717)
  at org.apache.ignite.internal.jdbc.JdbcStatement.close(JdbcStatement.java:259)
  at 
org.apache.ignite.jdbc.AbstractJdbcSelfTest.tearDownBase(AbstractJdbcSelfTest.java:130)
  at jdk.internal.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
  at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.base/java.lang.reflect.Method.invoke(Method.java:566)
  at 
org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
  at 
org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
  at 
org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
  at 
org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
  at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptLifecycleMethod(TimeoutExtension.java:128)
  at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptAfterEachMethod(TimeoutExtension.java:110)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
  at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
  at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
  at 
org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.invokeMethodInExtensionContext(ClassBasedTestDescriptor.java:520)
  at 
org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.lambda$synthesizeAfterEachMethodAdapter$24(ClassBasedTestDescriptor.java:510)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAfterEachMethods$10(TestMethodTestDescriptor.java:243)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAllAfterMethodsOrCallbacks$13(TestMethodTestDescriptor.java:276)
  at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeAllAfterMethodsOrCallbacks$14(TestMethodTestDescriptor.java:276)
  at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeAllAfterMethodsOrCallbacks(TestMethodTestDescriptor.java:275)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeAfterEachMethods(TestMethodTestDescriptor.java:241)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:142)
  at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
  at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
  at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
  at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
  at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
  at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
  at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
  at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
  at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
  at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
  at 

[jira] [Updated] (IGNITE-21083) Sporadic IllegalReferenceCountException when reading jdbc messages.

2023-12-14 Thread Maksim Zhuravkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov updated IGNITE-21083:
--
Labels: ignite-3  (was: )

> Sporadic IllegalReferenceCountException when reading jdbc messages.
> ---
>
> Key: IGNITE-21083
> URL: https://issues.apache.org/jira/browse/IGNITE-21083
> Project: Ignite
>  Issue Type: Bug
>  Components: thin client
>Reporter: Maksim Zhuravkov
>Priority: Major
>  Labels: ignite-3
>
> A JDBC test failed with IllegalReferenceCountException: refCnt: 0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21083) Sporadic IllegalReferenceCountException when reading jdbc messages.

2023-12-14 Thread Maksim Zhuravkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov updated IGNITE-21083:
--
Fix Version/s: 3.0.0-beta2

> Sporadic IllegalReferenceCountException when reading jdbc messages.
> ---
>
> Key: IGNITE-21083
> URL: https://issues.apache.org/jira/browse/IGNITE-21083
> Project: Ignite
>  Issue Type: Bug
>  Components: thin client
>Reporter: Maksim Zhuravkov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> A JDBC test failed with IllegalReferenceCountException: refCnt: 0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-21083) Sporadic IllegalReferenceCountException when reading jdbc messages.

2023-12-14 Thread Maksim Zhuravkov (Jira)
Maksim Zhuravkov created IGNITE-21083:
-

 Summary: Sporadic IllegalReferenceCountException when reading jdbc 
messages.
 Key: IGNITE-21083
 URL: https://issues.apache.org/jira/browse/IGNITE-21083
 Project: Ignite
  Issue Type: Bug
  Components: thin client
Reporter: Maksim Zhuravkov


A JDBC test failed with IllegalReferenceCountException: refCnt: 0.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-21032) ReadOnlyDynamicMBean.getAttributes may return a list of attribute values instead of Attribute instances

2023-12-14 Thread Nikita Amelchev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikita Amelchev reassigned IGNITE-21032:


Assignee: Simon Greatrix  (was: Vyacheslav Koptilin)

> ReadOnlyDynamicMBean.getAttributes may return a list of attribute values 
> instead of Attribute instances
> ---
>
> Key: IGNITE-21032
> URL: https://issues.apache.org/jira/browse/IGNITE-21032
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vyacheslav Koptilin
>Assignee: Simon Greatrix
>Priority: Major
> Fix For: 2.16
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When supplying JMX information, the AttributeList class should contain 
> Attributes; however, the existing code returns raw attribute values. This can 
> cause a ClassCastException in code that attempts to read an AttributeList.
>  
> [GitHub Issue #11045|https://github.com/apache/ignite/issues/11045]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-19955) Fix create zone on restart rewrites existing data nodes because of trigger key inconsistency

2023-12-14 Thread Sergey Uttsel (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-19955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796615#comment-17796615
 ] 

Sergey Uttsel commented on IGNITE-19955:


LGTM

> Fix create zone on restart rewrites existing data nodes because of trigger 
> key inconsistency
> 
>
> Key: IGNITE-19955
> URL: https://issues.apache.org/jira/browse/IGNITE-19955
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mirza Aliev
>Assignee: Mirza Aliev
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Outdated, see UPD
> Currently, the logic for initialising a newly created zone writes the keys
> {noformat}
> zoneDataNodesKey(zoneId), zoneScaleUpChangeTriggerKey(zoneId), 
> zoneScaleDownChangeTriggerKey(zoneId), zonesChangeTriggerKey(zoneId)
> {noformat}
> to metastorage, and condition is 
> {noformat}
> static CompoundCondition triggerKeyConditionForZonesChanges(long 
> revision, int zoneId) {
> return or(
> notExists(zonesChangeTriggerKey(zoneId)),
> 
> value(zonesChangeTriggerKey(zoneId)).lt(ByteUtils.longToBytes(revision))
> );
> {noformat}
> The recovery process implies that the create-zone event will be processed 
> again, but with a higher revision, so the data nodes will be rewritten.
> We need to handle this situation so that the data nodes remain consistent 
> after a restart.
> A possible solution is to change the condition to 
> {noformat}
> static SimpleCondition triggerKeyConditionForZonesCreation(long revision, 
> int zoneId) {
> return notExists(zonesChangeTriggerKey(zoneId));
> }
> static SimpleCondition triggerKeyConditionForZonesDelete(int zoneId) {
> return exists(zonesChangeTriggerKey(zoneId));
> }
> {noformat}
>  
> so that we do not rely on the revision and only check for the existence of 
> the key when we create or remove a zone. The problem with this solution is 
> that reordering of the create and remove operations on some node could lead 
> to an inconsistent state for the zone keys in the metastorage.
> *UPD*:
> This problem will be resolved once we implement 
> https://issues.apache.org/jira/browse/IGNITE-20561
> In this ticket we need to unmute all tickets that were muted by this one.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21032) ReadOnlyDynamicMBean.getAttributes may return a list of attribute values instead of Attribute instances

2023-12-14 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796612#comment-17796612
 ] 

Ignite TC Bot commented on IGNITE-21032:


{panel:title=Branch: [pull/11046/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/11046/head] Base: [master] : No new tests 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7654961buildTypeId=IgniteTests24Java8_RunAll]

> ReadOnlyDynamicMBean.getAttributes may return a list of attribute values 
> instead of Attribute instances
> ---
>
> Key: IGNITE-21032
> URL: https://issues.apache.org/jira/browse/IGNITE-21032
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vyacheslav Koptilin
>Assignee: Vyacheslav Koptilin
>Priority: Major
> Fix For: 2.16
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When supplying JMX information, the AttributeList class should contain 
> Attributes; however, the existing code returns raw attribute values. This can 
> cause a ClassCastException in code that attempts to read an AttributeList.
>  
> [GitHub Issue #11045|https://github.com/apache/ignite/issues/11045]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20906) Attempt to change column type from INT to BIGINT fails

2023-12-14 Thread Roman Puchkovskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-20906:
---
Description: 
CREATE TABLE t (id INT PRIMARY KEY, val INT NOT NULL);

ALTER TABLE t ALTER COLUMN val SET DATA TYPE BIGINT;

The second query fails with the message 'Changing the precision for column of 
type 'INT32' is not allowed', although precision is not mentioned in either 
query.

Same thing happens when trying to change SMALLINT -> INT.

IEP: 
[https://cwiki.apache.org/confluence/display/IGNITE/IEP-108+Change+column+type#IEP108Changecolumntype-Datavalidation.]

  was:
CREATE TABLE t (id INT PRIMARY KEY, val INT NOT NULL);

ALTER TABLE t ALTER COLUMN val SET DATA TYPE BIGINT;

The second query fails with the message 'Changing the precision for column of 
type 'INT32' is not allowed', although precision is not mentioned in either 
query.

Same thing happens when trying to change SMALLINT -> INT.


> Attempt to change column type from INT to BIGINT fails
> --
>
> Key: IGNITE-20906
> URL: https://issues.apache.org/jira/browse/IGNITE-20906
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Roman Puchkovskiy
>Assignee: Evgeny Stanilovsky
>Priority: Major
>  Labels: ignite-3
>
> CREATE TABLE t (id INT PRIMARY KEY, val INT NOT NULL);
> ALTER TABLE t ALTER COLUMN val SET DATA TYPE BIGINT;
> The second query fails with the message 'Changing the precision for column 
> of type 'INT32' is not allowed', although precision is not mentioned in 
> either query.
> Same thing happens when trying to change SMALLINT -> INT.
> IEP: 
> [https://cwiki.apache.org/confluence/display/IGNITE/IEP-108+Change+column+type#IEP108Changecolumntype-Datavalidation.]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)