[jira] [Commented] (IGNITE-602) [Test] GridToStringBuilder is vulnerable to StackOverflowError caused by infinite recursion
[ https://issues.apache.org/jira/browse/IGNITE-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576901#comment-16576901 ] Ryabov Dmitrii commented on IGNITE-602: --- [~agoncharuk], yes, I noticed it too and created [IGNITE-9209|https://issues.apache.org/jira/browse/IGNITE-9209]. I didn't investigate the issue immediately when I found it, but it looks like I know where the trouble is. I've opened a PR and started tests. I hope we can merge the fix tomorrow. > [Test] GridToStringBuilder is vulnerable to StackOverflowError caused by > infinite recursion > > > Key: IGNITE-602 > URL: https://issues.apache.org/jira/browse/IGNITE-602 > Project: Ignite > Issue Type: Bug > Components: general >Reporter: Artem Shutak >Assignee: Ryabov Dmitrii >Priority: Major > Labels: MakeTeamcityGreenAgain, Muted_test > Fix For: 2.7 > > > See test > org.gridgain.grid.util.tostring.GridToStringBuilderSelfTest#_testToStringCheckAdvancedRecursionPrevention > and the related TODO in the same source file. > Also take a look at > http://stackoverflow.com/questions/11300203/most-efficient-way-to-prevent-an-infinite-recursion-in-tostring > The test should be unmuted on TC after the fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
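The usual cure for this class of bug is an identity-based guard around the rendering of each object. A minimal sketch of the idea (the `Node` class and marker format are invented for illustration, not the actual GridToStringBuilder internals):

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/**
 * Sketch of toString() recursion prevention: a per-thread identity set
 * tracks objects currently being rendered, so a cyclic reference prints
 * a short marker instead of recursing into a StackOverflowError.
 */
class Node {
    private static final ThreadLocal<Set<Object>> IN_PROGRESS =
        ThreadLocal.withInitial(() -> Collections.newSetFromMap(new IdentityHashMap<>()));

    final String name;
    Node next;

    Node(String name) { this.name = name; }

    @Override public String toString() {
        Set<Object> seen = IN_PROGRESS.get();

        // Already being rendered higher up the call stack: cut the cycle.
        if (!seen.add(this))
            return "Node@" + Integer.toHexString(System.identityHashCode(this));

        try {
            return "Node [name=" + name + ", next=" + next + "]";
        }
        finally {
            seen.remove(this); // Allow the object in later, unrelated calls.
        }
    }
}
```

With `a.next = b; b.next = a;` the call `a.toString()` terminates, rendering the back-reference as an identity marker instead of overflowing the stack.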
[jira] [Assigned] (IGNITE-9209) GridDistributedTxMapping.toString() returns broken string
[ https://issues.apache.org/jira/browse/IGNITE-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryabov Dmitrii reassigned IGNITE-9209: -- Assignee: Ryabov Dmitrii > GridDistributedTxMapping.toString() returns broken string > - > > Key: IGNITE-9209 > URL: https://issues.apache.org/jira/browse/IGNITE-9209 > Project: Ignite > Issue Type: Bug >Reporter: Ryabov Dmitrii >Assignee: Ryabov Dmitrii >Priority: Minor > > Something is wrong with `GridDistributedTxMapping` when we try to get its string > representation via `GridToStringBuilder`. > It should look like > {noformat} > GridDistributedTxMapping [entries=LinkedHashSet [/*values here*/], > explicitLock=false, dhtVer=null, last=false, nearEntries=0,/*more text*/] > {noformat} > But currently it looks like > {noformat} > KeyCacheObjectImpl [part=1, val=1, hasValBytes=false]KeyCacheObjectImpl > [part=1, val=1, hasValBytes=false],// more text > {noformat} > Reproducer: > {code:java} > public class GridToStringBuilderSelfTest extends GridCommonAbstractTest { > /** > * @throws Exception > */ > public void testGridDistributedTxMapping() throws Exception { > IgniteEx ignite = startGrid(0); > IgniteCache cache = > ignite.createCache(defaultCacheConfiguration()); > try (Transaction tx = ignite.transactions().txStart()) { > cache.put(1, 1); > GridDistributedTxMapping mapping = new > GridDistributedTxMapping(grid(0).localNode()); > assertTrue("Wrong string: " + mapping, > mapping.toString().startsWith("GridDistributedTxMapping [")); > > mapping.add(((TransactionProxyImpl)tx).tx().txState().allEntries().stream().findAny().get()); > assertTrue("Wrong string: " + mapping, > mapping.toString().startsWith("GridDistributedTxMapping [")); > } > stopAllGrids(); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9209) GridDistributedTxMapping.toString() returns broken string
[ https://issues.apache.org/jira/browse/IGNITE-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576891#comment-16576891 ] ASF GitHub Bot commented on IGNITE-9209: GitHub user SomeFire opened a pull request: https://github.com/apache/ignite/pull/4519 IGNITE-9209: GridDistributedTxMapping.toString() returns broken string You can merge this pull request into a Git repository by running: $ git pull https://github.com/SomeFire/ignite ignite-9209 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/4519.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4519 commit 9d0f4853e9b179553631866b57cba8d393328be8 Author: Dmitrii Ryabov Date: 2018-08-10T21:41:53Z IGNITE-9209: GridDistributedTxMapping.toString() returns broken string > GridDistributedTxMapping.toString() returns broken string > - > > Key: IGNITE-9209 > URL: https://issues.apache.org/jira/browse/IGNITE-9209 > Project: Ignite > Issue Type: Bug >Reporter: Ryabov Dmitrii >Priority: Minor > > Something is wrong with `GridDistributedTxMapping` when we try to get its string > representation via `GridToStringBuilder`. 
> It should look like > {noformat} > GridDistributedTxMapping [entries=LinkedHashSet [/*values here*/], > explicitLock=false, dhtVer=null, last=false, nearEntries=0,/*more text*/] > {noformat} > But currently it looks like > {noformat} > KeyCacheObjectImpl [part=1, val=1, hasValBytes=false]KeyCacheObjectImpl > [part=1, val=1, hasValBytes=false],// more text > {noformat} > Reproducer: > {code:java} > public class GridToStringBuilderSelfTest extends GridCommonAbstractTest { > /** > * @throws Exception > */ > public void testGridDistributedTxMapping() throws Exception { > IgniteEx ignite = startGrid(0); > IgniteCache cache = > ignite.createCache(defaultCacheConfiguration()); > try (Transaction tx = ignite.transactions().txStart()) { > cache.put(1, 1); > GridDistributedTxMapping mapping = new > GridDistributedTxMapping(grid(0).localNode()); > assertTrue("Wrong string: " + mapping, > mapping.toString().startsWith("GridDistributedTxMapping [")); > > mapping.add(((TransactionProxyImpl)tx).tx().txState().allEntries().stream().findAny().get()); > assertTrue("Wrong string: " + mapping, > mapping.toString().startsWith("GridDistributedTxMapping [")); > } > stopAllGrids(); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-8923) Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes)
[ https://issues.apache.org/jira/browse/IGNITE-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denis Magda resolved IGNITE-8923. - Resolution: Fixed > Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes) > > > Key: IGNITE-8923 > URL: https://issues.apache.org/jira/browse/IGNITE-8923 > Project: Ignite > Issue Type: Improvement > Components: documentation >Reporter: Roman Guseinov >Assignee: Denis Magda >Priority: Major > Fix For: 2.7 > > Attachments: config.zip, example-kube.xml, > google_cloud_engine_deployment.zip, yaml.zip > > > We have such documentation for Microsoft Azure > [https://apacheignite.readme.io/docs/microsoft-azure-deployment] > It would be great to publish the same for GCE. > Here are the steps I used to deploy a cluster (stateless, stateful) and the web > console: > {code:java} > ## Start Ignite Cluster > 1. Grant cluster-admin role to current google user (to allow create roles): > $ kubectl create clusterrolebinding myname2-cluster-admin-binding \ > --clusterrole=cluster-admin \ > --user= > 2. Create service account and grant permissions: > $ kubectl create -f sa.yaml > $ kubectl create -f role.yaml > $ kubectl create -f rolebind.yaml > 3. Create a grid service: > $ kubectl create -f service.yaml > 4. Deploy Ignite Cluster: > $ kubectl create -f grid.yaml > ## Enable Ignite Persistence > 5. Deploy Ignite StatefulSet with enabled Persistence (instead of step 4). > $ kubectl create -f grid-pds.yaml > 6. Connect to the Ignite node and activate cluster: > $ kubectl exec -it ignite-cluster-0 -- /bin/bash > $ cd /opt/ignite/apache-ignite-* > $ ./bin/control.sh --activate > ## Deploy Web Console: > 7. Create a volume to keep web console data: > $ kubectl create -f console-volume.yaml > 8. Create load balancer to expose HTTP port and make web console available by > service DNS-name (web-console.default.svc.cluster.local) inside Kubernetes > environment: > $ kubectl create -f console-service.yaml > 9. 
Deploy Web Console: > $ kubectl create -f console.yaml > 10. Check external IP: > $ kubectl get service web-console > 11. Open Web Console in a web browser and Sign Up. > 12. Move to User Profile page (Settings > Profile) and copy security token. > 13. Insert security token into web-agent.yaml (TOKENS environment variable). > 14. Deploy Web Agent: > $ kubectl create -f web-agent.yaml > {code} > YAML and configs are attached. > Creating a public Docker image for the Web Agent is in progress: > https://issues.apache.org/jira/browse/IGNITE-8526 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (IGNITE-8923) Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes)
[ https://issues.apache.org/jira/browse/IGNITE-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denis Magda closed IGNITE-8923. --- > Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes) > > > Key: IGNITE-8923 > URL: https://issues.apache.org/jira/browse/IGNITE-8923 > Project: Ignite > Issue Type: Improvement > Components: documentation >Reporter: Roman Guseinov >Assignee: Denis Magda >Priority: Major > Fix For: 2.7 > > Attachments: config.zip, example-kube.xml, > google_cloud_engine_deployment.zip, yaml.zip > > > We have such documentation for Microsoft Azure > [https://apacheignite.readme.io/docs/microsoft-azure-deployment] > It would be great to publish the same for GCE. > Here are the steps I used to deploy a cluster (stateless, stateful) and the web > console: > {code:java} > ## Start Ignite Cluster > 1. Grant cluster-admin role to current google user (to allow create roles): > $ kubectl create clusterrolebinding myname2-cluster-admin-binding \ > --clusterrole=cluster-admin \ > --user= > 2. Create service account and grant permissions: > $ kubectl create -f sa.yaml > $ kubectl create -f role.yaml > $ kubectl create -f rolebind.yaml > 3. Create a grid service: > $ kubectl create -f service.yaml > 4. Deploy Ignite Cluster: > $ kubectl create -f grid.yaml > ## Enable Ignite Persistence > 5. Deploy Ignite StatefulSet with enabled Persistence (instead of step 4). > $ kubectl create -f grid-pds.yaml > 6. Connect to the Ignite node and activate cluster: > $ kubectl exec -it ignite-cluster-0 -- /bin/bash > $ cd /opt/ignite/apache-ignite-* > $ ./bin/control.sh --activate > ## Deploy Web Console: > 7. Create a volume to keep web console data: > $ kubectl create -f console-volume.yaml > 8. Create load balancer to expose HTTP port and make web console available by > service DNS-name (web-console.default.svc.cluster.local) inside Kubernetes > environment: > $ kubectl create -f console-service.yaml > 9. 
Deploy Web Console: > $ kubectl create -f console.yaml > 10. Check external IP: > $ kubectl get service web-console > 11. Open Web Console in a web browser and Sign Up. > 12. Move to User Profile page (Settings > Profile) and copy security token. > 13. Insert security token into web-agent.yaml (TOKENS environment variable). > 14. Deploy Web Agent: > $ kubectl create -f web-agent.yaml > {code} > YAML and configs are attached. > Creating a public Docker image for the Web Agent is in progress: > https://issues.apache.org/jira/browse/IGNITE-8526 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-8923) Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes)
[ https://issues.apache.org/jira/browse/IGNITE-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576753#comment-16576753 ] Denis Magda commented on IGNITE-8923: - Looks good, thanks. > Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes) > > > Key: IGNITE-8923 > URL: https://issues.apache.org/jira/browse/IGNITE-8923 > Project: Ignite > Issue Type: Improvement > Components: documentation >Reporter: Roman Guseinov >Assignee: Denis Magda >Priority: Major > Fix For: 2.7 > > Attachments: config.zip, example-kube.xml, > google_cloud_engine_deployment.zip, yaml.zip > > > We have such documentation for Microsoft Azure > [https://apacheignite.readme.io/docs/microsoft-azure-deployment] > It would be great to publish the same for GCE. > Here are steps which I used to deploy cluster (stateless, stateful) and web > console: > {code:java} > ## Start Ignite Cluster > 1. Grant cluster-admin role to current google user (to allow create roles): > $ kubectl create clusterrolebinding myname2-cluster-admin-binding \ > --clusterrole=cluster-admin \ > --user= > 2. Create service account and grant permissions: > $ kubectl create -f sa.yaml > $ kubectl create -f role.yaml > $ kubectl create -f rolebind.yaml > 3. Create a grid service: > $ kubectl create -f service.yaml > 4. Deploy Ignite Cluster: > $ kubectl create -f grid.yaml > ## Enable Ignite Persistence > 5. Deploy Ignite StatefulSet with enabled Persistence (instead of step 4). > $ kubectl create -f grid-pds.yaml > 6. Connect to the Ignite node and activate cluster: > $ kubectl exec -it ignite-cluster-0 -- /bin/bash > $ cd /opt/ignite/apache-ignite-* > $ ./bin/control.sh --activate > ## Deploy Web Console: > 7. Create a volume to keep web console data: > $ kubectl create -f console-volume.yaml > 8. 
Create load balancer to expose HTTP port and make web console available by > service DNS-name (web-console.default.svc.cluster.local) inside Kubernetes > environment: > $ kubectl create -f console-service.yaml > 9. Deploy Web Console: > $ kubectl create -f console.yaml > 10. Check external IP: > $ kubectl get service web-console > 11. Open Web Console in a web browser and Sign Up. > 12. Move to User Profile page (Settings > Profile) and copy security token. > 13. Insert security token into web-agent.yaml (TOKENS environment variable). > 14. Deploy Web Agent: > $ kubectl create -f web-agent.yaml > {code} > YAML and configs are attached. > Creating a public Docker image for the Web Agent is in progress: > https://issues.apache.org/jira/browse/IGNITE-8526 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-8994) Configuring dedicated volumes for WAL and data with Kubernetes
[ https://issues.apache.org/jira/browse/IGNITE-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576750#comment-16576750 ] Denis Magda commented on IGNITE-8994: - [~abchaudhri], please do the following: * Review and edit all the pages and sub-pages of the Ignite Kubernetes doc. That's the root - https://apacheignite.readme.io/v2.6/docs/kubernetes-deployment * Make sure you can reproduce the steps documented for GCE: https://apacheignite.readme.io/v2.6/docs/google-cloud-deployment (we're interested in the StatefulSet deployment only when WAL and database files are stored separately; please rework that section on the GCE page if needed, you don't need to reproduce the stateless deployment) * Create a guideline for Amazon AWS similar to the ones we have for Microsoft Azure and GKE. > Configuring dedicated volumes for WAL and data with Kubernetes > --- > > Key: IGNITE-8994 > URL: https://issues.apache.org/jira/browse/IGNITE-8994 > Project: Ignite > Issue Type: Task > Components: documentation >Reporter: Denis Magda >Assignee: Akmal Chaudhri >Priority: Major > Fix For: 2.7 > > Attachments: yaml.zip > > > The current StatefulSet documentation requests only one persistent volume for > both WAL and data/index files: > https://apacheignite.readme.io/docs/stateful-deployment#section-statefulset-deployment > However, according to the Ignite performance guide the WAL has to be located on a > dedicated volume: > https://apacheignite.readme.io/docs/durable-memory-tuning#section-separate-disk-device-for-wal > Provide a StatefulSet configuration that shows how to request separate volumes > for the WAL and data/index files. If needed, provide YAML configs for > StorageClass and volume claims. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
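On the Ignite side, separating the WAL from data files comes down to pointing the data storage configuration at two different mount points; in a StatefulSet each would be backed by its own PersistentVolumeClaim. A sketch of that configuration (the mount paths are assumptions for illustration; no test output is claimed since it needs an Ignite deployment to run):

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class SeparateWalConfig {
    public static IgniteConfiguration config() {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Enable native persistence for the default data region.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Data/index files on one volume, WAL on a dedicated one.
        // Paths are illustrative; in a StatefulSet each would be a separate
        // volumeMount backed by its own PersistentVolumeClaim.
        storageCfg.setStoragePath("/ignite/persistence");
        storageCfg.setWalPath("/ignite/wal");
        storageCfg.setWalArchivePath("/ignite/wal/archive");

        return new IgniteConfiguration().setDataStorageConfiguration(storageCfg);
    }
}
```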
[jira] [Assigned] (IGNITE-8994) Configuring dedicated volumes for WAL and data with Kubernetes
[ https://issues.apache.org/jira/browse/IGNITE-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denis Magda reassigned IGNITE-8994: --- Assignee: Akmal Chaudhri (was: Denis Magda) > Configuring dedicated volumes for WAL and data with Kubernetes > --- > > Key: IGNITE-8994 > URL: https://issues.apache.org/jira/browse/IGNITE-8994 > Project: Ignite > Issue Type: Task > Components: documentation >Reporter: Denis Magda >Assignee: Akmal Chaudhri >Priority: Major > Fix For: 2.7 > > Attachments: yaml.zip > > > The current StatefulSet documentation requests only one persistent volume for > both WAL and data/index files: > https://apacheignite.readme.io/docs/stateful-deployment#section-statefulset-deployment > However, according to the Ignite performance guide the WAL has to be located on a > dedicated volume: > https://apacheignite.readme.io/docs/durable-memory-tuning#section-separate-disk-device-for-wal > Provide a StatefulSet configuration that shows how to request separate volumes > for the WAL and data/index files. If needed, provide YAML configs for > StorageClass and volume claims. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9178) Partition lost events are not triggered if multiple nodes left cluster
[ https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576488#comment-16576488 ] ASF GitHub Bot commented on IGNITE-9178: Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/4495 > Partition lost events are not triggered if multiple nodes left cluster > - > > Key: IGNITE-9178 > URL: https://issues.apache.org/jira/browse/IGNITE-9178 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Blocker > Fix For: 2.7 > > > If multiple nodes leave the cluster simultaneously, the left partitions are removed > from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part > in the GridDhtPartitionTopologyImpl#update method. > Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect lost > partitions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9231) Improve onMarkDirty throttle implementation.
[ https://issues.apache.org/jira/browse/IGNITE-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576487#comment-16576487 ] ASF GitHub Bot commented on IGNITE-9231: Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/4506 > Improve onMarkDirty throttle implementation. > > > Key: IGNITE-9231 > URL: https://issues.apache.org/jira/browse/IGNITE-9231 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: 2.6 >Reporter: Stanilovsky Evgeny >Assignee: Stanilovsky Evgeny >Priority: Major > Fix For: 2.7 > > > The PagesWriteThrottle#onMarkDirty implementation parks threads if the > checkpointBuffer is close to overflow, but a subsequent release of the checkpointBuffer > has no effect on the parked threads; they can stay parked. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
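The failure mode described here is a classic lost wakeup: a thread parked because the buffer looked full stays parked unless the code path that frees space explicitly unparks it. A toy model of the pattern (not Ignite's actual code; all names are invented):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

/**
 * Toy throttle illustrating the lost-wakeup hazard: acquire() parks when
 * the buffer is full, and release() MUST unpark a waiter -- without that,
 * a parked writer can stay parked even though space has been freed.
 */
class ToyThrottle {
    private final int capacity;
    private final AtomicInteger used = new AtomicInteger();
    private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();

    ToyThrottle(int capacity) { this.capacity = capacity; }

    void acquire() {
        for (;;) {
            int u = used.get();

            if (u < capacity && used.compareAndSet(u, u + 1))
                return;

            waiters.add(Thread.currentThread());

            // Re-check after registering, so a concurrent release() either
            // sees us in the queue or has already freed space for the retry.
            if (used.get() >= capacity)
                LockSupport.park(this);

            waiters.remove(Thread.currentThread());
        }
    }

    void release() {
        used.decrementAndGet();

        // The crucial part: wake a parked writer when space is released.
        Thread t = waiters.poll();
        if (t != null)
            LockSupport.unpark(t);
    }

    int inUse() { return used.get(); }
}
```

If `release()` dropped the unpark, a writer parked in `acquire()` would hang until a spurious wakeup, which is essentially the behavior the issue reports.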
[jira] [Commented] (IGNITE-9178) Partition lost events are not triggered if multiple nodes left cluster
[ https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576481#comment-16576481 ] Alexey Goncharuk commented on IGNITE-9178: -- Got it now. Thanks, merged to master! > Partition lost events are not triggered if multiple nodes left cluster > - > > Key: IGNITE-9178 > URL: https://issues.apache.org/jira/browse/IGNITE-9178 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Blocker > Fix For: 2.7 > > > If multiple nodes leave the cluster simultaneously, the left partitions are removed > from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part > in the GridDhtPartitionTopologyImpl#update method. > Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect lost > partitions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-8673) Reconcile isClient* methods
[ https://issues.apache.org/jira/browse/IGNITE-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576468#comment-16576468 ] Alexey Goncharuk commented on IGNITE-8673: -- Merged master and re-triggered the TC build: https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&branch_IgniteTests24Java8=pull%2F4104%2Fhead > Reconcile isClient* methods > --- > > Key: IGNITE-8673 > URL: https://issues.apache.org/jira/browse/IGNITE-8673 > Project: Ignite > Issue Type: Bug >Reporter: Eduard Shangareev >Assignee: Eduard Shangareev >Priority: Critical > Fix For: 2.7 > > > The semantics of the isClient (Mode, Cache, and so on) methods can currently mean different > things: > - the same as IgniteConfiguration#setClientMode; > - or the way a node is connected to the cluster (in the ring or not). > In almost all cases we need the first, but the methods may actually > return the second. > For example, ClusterNode.isClient means the second, but all of us use it as the first. > So, I propose to make all methods return the first. > And if there are places which require the second, replace them with the usage of > forceClientMode. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-7339) RENTING partition is not evicted after restore from storage
[ https://issues.apache.org/jira/browse/IGNITE-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576461#comment-16576461 ] Pavel Kovalenko commented on IGNITE-7339: - [~ascherbakov] Looks good to me. Ready to merge. > RENTING partition is not evicted after restore from storage > --- > > Key: IGNITE-7339 > URL: https://issues.apache.org/jira/browse/IGNITE-7339 > Project: Ignite > Issue Type: Bug >Reporter: Semen Boikov >Assignee: Alexei Scherbakov >Priority: Critical > > If a partition was in the RENTING state at the moment the node was stopped, then > after restart it is not evicted. > It seems to be an issue in GridDhtLocalPartition.rent: 'tryEvictAsync' is not > called if the partition was already in the RENTING state. > Also there is an error in GridDhtPartitionTopologyImpl.checkEvictions: the partition > state is always treated as changed after the part.rent call, even if part.rent > does not actually change the state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9244) Partition eviction may use all threads in sys pool, which leads to hangs when sending a message via sys pool
[ https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576456#comment-16576456 ] Pavel Kovalenko commented on IGNITE-9244: - [~agoncharuk] Looks good to me also. No other comments. > Partition eviction may use all threads in sys pool, which leads to hangs when > sending a message via sys pool > -- > > Key: IGNITE-9244 > URL: https://issues.apache.org/jira/browse/IGNITE-9244 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Assignee: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > In the current implementation, GridDhtPartitionsEvictor evicts partitions > one by one. > A GridDhtPartitionsEvictor is created for each cache group; if we try to evict > as many groups as the sys pool size, the group evictors will take all available > threads in the sys pool. This leads to hangs when sending a message via the sys pool. As a fix, > I suggest limiting concurrent execution in the sys pool or using another pool for > this purpose. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
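One way to read the suggested fix: gate eviction tasks behind a fixed number of permits, so they can never occupy every thread of the shared pool. A sketch under that assumption (class and method names are illustrative, not Ignite's):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Semaphore;

/**
 * Sketch of the proposed mitigation: eviction tasks must take one of a
 * fixed number of permits before entering the shared pool, so they can
 * never occupy all of its threads at once.
 */
class BoundedEvictor {
    private final ExecutorService sysPool;
    private final Semaphore permits;

    BoundedEvictor(ExecutorService sysPool, int maxConcurrentEvictions) {
        this.sysPool = sysPool;
        this.permits = new Semaphore(maxConcurrentEvictions);
    }

    /** Blocks the caller (not a pool thread) until a permit is free. */
    void evictAsync(Runnable evictTask) {
        permits.acquireUninterruptibly();

        sysPool.execute(() -> {
            try {
                evictTask.run();
            }
            finally {
                permits.release(); // Free the slot for the next eviction.
            }
        });
    }
}
```

With, say, a 4-thread pool and 2 permits, at most 2 eviction tasks run concurrently and the remaining threads stay free for message processing.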
[jira] [Commented] (IGNITE-9244) Partition eviction may use all threads in sys pool, which leads to hangs when sending a message via sys pool
[ https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576391#comment-16576391 ] Alexey Goncharuk commented on IGNITE-9244: -- Looks good to me now. [~Jokser], do you have any other comments? > Partition eviction may use all threads in sys pool, which leads to hangs when > sending a message via sys pool > -- > > Key: IGNITE-9244 > URL: https://issues.apache.org/jira/browse/IGNITE-9244 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Assignee: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > In the current implementation, GridDhtPartitionsEvictor evicts partitions > one by one. > A GridDhtPartitionsEvictor is created for each cache group; if we try to evict > as many groups as the sys pool size, the group evictors will take all available > threads in the sys pool. This leads to hangs when sending a message via the sys pool. As a fix, > I suggest limiting concurrent execution in the sys pool or using another pool for > this purpose. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9244) Partition eviction may use all threads in sys pool, which leads to hangs when sending a message via sys pool
[ https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576354#comment-16576354 ] Dmitriy Govorukhin commented on IGNITE-9244: [~agoncharuk] Thanks! All comments fixed. > Partition eviction may use all threads in sys pool, which leads to hangs when > sending a message via sys pool > -- > > Key: IGNITE-9244 > URL: https://issues.apache.org/jira/browse/IGNITE-9244 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Assignee: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > In the current implementation, GridDhtPartitionsEvictor evicts partitions > one by one. > A GridDhtPartitionsEvictor is created for each cache group; if we try to evict > as many groups as the sys pool size, the group evictors will take all available > threads in the sys pool. This leads to hangs when sending a message via the sys pool. As a fix, > I suggest limiting concurrent execution in the sys pool or using another pool for > this purpose. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9244) Partition eviction may use all threads in sys pool, which leads to hangs when sending a message via sys pool
[ https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576344#comment-16576344 ] Alexey Goncharuk commented on IGNITE-9244: -- A few comments: * Use {{Long.compare}} instead of {{(p1, p2) -> p1.part.fullSize() > p2.part.fullSize() ? -1 : 1}} * Method {{calculateBucket(PartitionEvictionTask)}} has an unused argument - either a bug, or the argument should be removed * For safety, make sure that {{threads}} does not get a {{0}} value after {{sysPoolSize / 4}} > Partition eviction may use all threads in sys pool, which leads to hangs when > sending a message via sys pool > -- > > Key: IGNITE-9244 > URL: https://issues.apache.org/jira/browse/IGNITE-9244 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Assignee: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > In the current implementation, GridDhtPartitionsEvictor evicts partitions > one by one. > A GridDhtPartitionsEvictor is created for each cache group; if we try to evict > as many groups as the sys pool size, the group evictors will take all available > threads in the sys pool. This leads to hangs when sending a message via the sys pool. As a fix, > I suggest limiting concurrent execution in the sys pool or using another pool for > this purpose. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
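The first and third review points can be shown concretely. The `Part` class below is a hypothetical stand-in for the partition behind `PartitionEvictionTask`, not real Ignite code:

```java
import java.util.Comparator;

/** Illustration of two of the review points above. */
class EvictionOrdering {
    static class Part {
        final long fullSize;
        Part(long fullSize) { this.fullSize = fullSize; }
    }

    /**
     * Largest-first ordering via Long.compare. Unlike
     * {@code (p1, p2) -> p1.fullSize > p2.fullSize ? -1 : 1}, this returns 0
     * for equal sizes, so it does not violate the Comparator contract
     * (the original never reports equality, which sort algorithms may reject).
     */
    static final Comparator<Part> BY_SIZE_DESC =
        (p1, p2) -> Long.compare(p2.fullSize, p1.fullSize);

    /**
     * Pool slice clamped to at least one thread: integer division
     * sysPoolSize / 4 yields 0 for pools smaller than 4 threads.
     */
    static int evictionThreads(int sysPoolSize) {
        return Math.max(1, sysPoolSize / 4);
    }
}
```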
[jira] [Commented] (IGNITE-8559) WAL rollOver can be blocked by WAL iterator reservation
[ https://issues.apache.org/jira/browse/IGNITE-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576327#comment-16576327 ] ASF GitHub Bot commented on IGNITE-8559: GitHub user akalash opened a pull request: https://github.com/apache/ignite/pull/4517 IGNITE-8559 Replace CacheAffinitySharedManager.CachesInfo by ClusterC… …achesInfo You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-9250 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/4517.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4517 commit ea4c1be40d5f621043994a04a0c1bd1f6e47914e Author: Anton Kalashnikov Date: 2018-08-10T14:16:08Z IGNITE-8559 Replace CacheAffinitySharedManager.CachesInfo by ClusterCachesInfo > WAL rollOver can be blocked by WAL iterator reservation > --- > > Key: IGNITE-8559 > URL: https://issues.apache.org/jira/browse/IGNITE-8559 > Project: Ignite > Issue Type: Improvement >Reporter: Alexey Goncharuk >Assignee: Anton Kalashnikov >Priority: Critical > Fix For: 2.7 > > > I've got the following thread dump from one of the Ignite nodes (only > meaningful threads are kept for simplicity) > WAL archiver is waiting for locked segment release > TX commit is waiting for WAL rollover > WAL rollover is blocked by the archiver > Exchange is blocked by TX commit > {code} > "sys-stripe-55-#56%GRID%GridNodeName%" #246 daemon prio=5 os_prio=0 > tid=0x7fdd1eeff000 nid=0x164252 waiting on condition [0x7fdb36eec000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x7fe0a5e96278> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitUninterruptibly(AbstractQueuedSynchronizer.java:1976) > at > org.apache.ignite.internal.util.IgniteUtils.awaitQuiet(IgniteUtils.java:7400) > at > org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileWriteHandle.awaitNext(FileWriteAheadLogManager.java:2819) > at > org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileWriteHandle.access$2900(FileWriteAheadLogManager.java:2390) > at > org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.rollOver(FileWriteAheadLogManager.java:1065) > at > org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:715) > at > org.gridgain.grid.internal.processors.cache.database.snapshot.GridCacheSnapshotManager.onChangeTrackerPage(GridCacheSnapshotManager.java:2436) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$9.applyx(GridCacheDatabaseSharedManager.java:942) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$9.applyx(GridCacheDatabaseSharedManager.java:935) > at > org.apache.ignite.internal.util.lang.GridInClosure3X.apply(GridInClosure3X.java:34) > at > org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlockPage(PageMemoryImpl.java:1341) > at > org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:415) > at > org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:409) > at > org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writeUnlock(PageHandler.java:377) > at > org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:287) > at > 
org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:282) > at > org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.insertDataRow(AbstractFreeList.java:509) > at > org.apache.ignite.internal.processors.cache.persistence.RowStore.addRow(RowStore.java:102) > at > org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.createRow(IgniteCacheOffheapManagerImpl.java:1252) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.createRow(GridCacheOffheapManager.java:1370) > at > org.apache.ignite.internal.
[jira] [Created] (IGNITE-9250) Replace CacheAffinitySharedManager.CachesInfo by ClusterCachesInfo
Anton Kalashnikov created IGNITE-9250: - Summary: Replace CacheAffinitySharedManager.CachesInfo by ClusterCachesInfo Key: IGNITE-9250 URL: https://issues.apache.org/jira/browse/IGNITE-9250 Project: Ignite Issue Type: Improvement Reporter: Anton Kalashnikov Assignee: Anton Kalashnikov We currently keep a duplicate of the registered caches (and groups). They are held in ClusterCachesInfo - the main storage - and also in CacheAffinitySharedManager.CachesInfo. This looks redundant and can lead to inconsistency of the cache info. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
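The direction proposed above can be sketched as follows. All class and method names below are simplified, hypothetical stand-ins for illustration, not the actual Ignite internals: the affinity manager delegates every lookup to the single store instead of keeping its own copy, so the two views can never diverge.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for ClusterCachesInfo: the one and only store of
// registered cache descriptors.
class CachesInfoStore {
    private final Map<String, String> registeredCaches = new HashMap<>();

    void register(String cacheName, String descriptor) {
        registeredCaches.put(cacheName, descriptor);
    }

    String descriptor(String cacheName) {
        return registeredCaches.get(cacheName);
    }
}

// Hypothetical stand-in for CacheAffinitySharedManager: it no longer holds
// its own CachesInfo copy and delegates to the single source of truth.
class AffinityManager {
    private final CachesInfoStore cachesInfo;

    AffinityManager(CachesInfoStore cachesInfo) {
        this.cachesInfo = cachesInfo;
    }

    String descriptor(String cacheName) {
        return cachesInfo.descriptor(cacheName);
    }
}

public class SingleSourceOfTruthSketch {
    public static void main(String[] args) {
        CachesInfoStore store = new CachesInfoStore();
        AffinityManager affinity = new AffinityManager(store);

        store.register("myCache", "descriptor-v1");
        // Both views agree because there is only one underlying map.
        System.out.println(store.descriptor("myCache").equals(affinity.descriptor("myCache")));
    }
}
```

With delegation, any update made through the store is immediately visible to the affinity manager, which is exactly the inconsistency the ticket wants to rule out.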
[jira] [Assigned] (IGNITE-7384) MVCC Mvcc versions may be lost in case of rebalance with persistence enabled
[ https://issues.apache.org/jira/browse/IGNITE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Kondakov reassigned IGNITE-7384: -- Assignee: Roman Kondakov > MVCC Mvcc versions may be lost in case of rebalance with persistence enabled > > > Key: IGNITE-7384 > URL: https://issues.apache.org/jira/browse/IGNITE-7384 > Project: Ignite > Issue Type: Bug >Reporter: Igor Seliverstov >Assignee: Roman Kondakov >Priority: Major > > When a node returns to the topology, it requests a delta instead of a full > partition; a WAL-based iterator is used there > ({{o.a.i.i.processors.cache.persistence.GridCacheOffheapManager#rebalanceIterator}}). > The WAL-based iterator doesn't contain MVCC versions, which causes issues. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (IGNITE-9178) Partition lost events are not triggered if multiple nodes leave the cluster
[ https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576302#comment-16576302 ] Pavel Vinokurov edited comment on IGNITE-9178 at 8/10/18 1:45 PM: -- [~agoncharuk] _leftNode2Part_ contains partitions for the left nodes. The partition lost event is raised if _leftNode2Part_ contains nodes that are missing from _node2Part_. The _node2Part_ map is cleaned up in two places: the _GridDhtPartitionTopologyImpl#update_ and _GridDhtPartitionTopologyImpl#removeNode_ methods. The current patch fixes the following situation. Two nodes have left the cluster simultaneously. During the exchange for the first left node, the coordinator sends the full map to the other nodes. _GridDhtPartitionTopologyImpl#update_ handles the full map and removes partitions for the second left node without adding them to _leftNode2Part_. On the next exchange for the second node, node2part no longer has partitions for the second node, so the partitions are not added to _leftNode2Part_ in _removeNode()_. This patch does not affect the _diffFromAffinity_ map in any way. There is another possible patch - clean up the node2Part map only in the detectLostPartitions() method for all left nodes. But I am not sure that it doesn't break the logic related to the _diffFromAffinity_ map in the _GridDhtPartitionTopologyImpl#update_ method. Please let me know if this patch would be more appropriate. was (Author: pvinokurov): [~agoncharuk] _leftNode2Part_ contains partitions for left nodes. The partition lost event raised if _leftNode2Part_ contains nodes missed in _node2Part_. _node2Part_ map is cleaned up in two places _GridDhtPartitionTopologyImpl#update_ and _GridDhtPartitionTopologyImpl#removeNode_ methods Current patch fixes the following situation. Two nodes have left cluster simultaneously. During exchange for the first left node, the coordinator sends full map to other nodes.
_GridDhtPartitionTopologyImpl#update _handles full map and removes partitions for the second left node without adding to _leftNode2Part_. On the next exchange for the second node, node2part already hasn't partitions for second node, so partitions are not added to _leftNode2Part_ in the _removeNode()_ . This patch does not affect to _diffFromAffinity_ map anyhow. There is an another possible patch - cleanup node2Part map only in detectLostPartitions() method for all left nodes. But I am not sure that it doesn't broke logic related to _diffFromAffinity_ map in _GridDhtPartitionTopologyImpl#update_ method. Please let me know if this patch would be more appropriate. > Partition lost events are not triggered if multiple nodes leave the cluster > - > > Key: IGNITE-9178 > URL: https://issues.apache.org/jira/browse/IGNITE-9178 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Blocker > Fix For: 2.7 > > > If multiple nodes leave the cluster simultaneously, left partitions are removed > from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part > in the GridDhtPartitionTopologyImpl#update method. > Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost > partitions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9178) Partition lost events are not triggered if multiple nodes leave the cluster
[ https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576302#comment-16576302 ] Pavel Vinokurov commented on IGNITE-9178: - [~agoncharuk] _leftNode2Part_ contains partitions for the left nodes. The partition lost event is raised if _leftNode2Part_ contains nodes that are missing from _node2Part_. The _node2Part_ map is cleaned up in two places: the _GridDhtPartitionTopologyImpl#update_ and _GridDhtPartitionTopologyImpl#removeNode_ methods. The current patch fixes the following situation. Two nodes have left the cluster simultaneously. During the exchange for the first left node, the coordinator sends the full map to the other nodes. _GridDhtPartitionTopologyImpl#update_ handles the full map and removes partitions for the second left node without adding them to _leftNode2Part_. On the next exchange for the second node, node2part no longer has partitions for the second node, so the partitions are not added to _leftNode2Part_ in _removeNode()_. This patch does not affect the _diffFromAffinity_ map in any way. There is another possible patch - clean up the node2Part map only in the detectLostPartitions() method for all left nodes. But I am not sure that it doesn't break the logic related to the _diffFromAffinity_ map in the _GridDhtPartitionTopologyImpl#update_ method. Please let me know if this patch would be more appropriate. > Partition lost events are not triggered if multiple nodes leave the cluster > - > > Key: IGNITE-9178 > URL: https://issues.apache.org/jira/browse/IGNITE-9178 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Blocker > Fix For: 2.7 > > > If multiple nodes leave the cluster simultaneously, left partitions are removed > from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part > in the GridDhtPartitionTopologyImpl#update method. > Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost > partitions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
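The bookkeeping described in the comment can be illustrated with a deliberately simplified sketch. The field and method names follow the comment, but this is not the actual Ignite code: the point is only that partitions of nodes dropped by a full-map update must be remembered in `leftNode2Part`, or a later loss-detection pass cannot see them.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class PartitionLossSketch {
    // nodeId -> partitions that the node owned
    static Map<String, Set<Integer>> node2part = new HashMap<>();
    // partitions of nodes that left, kept until loss detection runs
    static Map<String, Set<Integer>> leftNode2Part = new HashMap<>();

    /** The idea of the fix: when a full-map update drops a node, remember
     * its partitions in leftNode2Part instead of discarding them. */
    static void applyFullMap(Set<String> aliveNodes) {
        Iterator<Map.Entry<String, Set<Integer>>> it = node2part.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Set<Integer>> e = it.next();
            if (!aliveNodes.contains(e.getKey())) {
                leftNode2Part.put(e.getKey(), e.getValue());
                it.remove();
            }
        }
    }

    /** Loss detection now sees both left nodes, even though they were
     * removed in a single full-map update. */
    static Set<Integer> detectLostPartitions() {
        Set<Integer> lost = new TreeSet<>();
        for (Set<Integer> parts : leftNode2Part.values())
            lost.addAll(parts);
        return lost;
    }

    public static void main(String[] args) {
        node2part.put("A", new HashSet<>(Arrays.asList(1, 2)));
        node2part.put("B", new HashSet<>(Arrays.asList(3)));
        node2part.put("C", new HashSet<>(Arrays.asList(4)));
        // Nodes B and C leave simultaneously; the coordinator sends one full map.
        applyFullMap(new HashSet<>(Arrays.asList("A")));
        System.out.println(detectLostPartitions()); // [3, 4]
    }
}
```

Without the `leftNode2Part.put(...)` call, the same scenario would report no lost partitions at all, which is the bug described in the issue.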
[jira] [Updated] (IGNITE-9053) testReentrantLockConstantTopologyChangeNonFailoverSafe can hang in case of broken tx
[ https://issues.apache.org/jira/browse/IGNITE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anton Vinogradov updated IGNITE-9053: - Priority: Critical (was: Major) > testReentrantLockConstantTopologyChangeNonFailoverSafe can hang in case of > broken tx > > > Key: IGNITE-9053 > URL: https://issues.apache.org/jira/browse/IGNITE-9053 > Project: Ignite > Issue Type: Bug > Components: data structures >Affects Versions: 2.5 >Reporter: Anton Vinogradov >Assignee: Anton Vinogradov >Priority: Critical > Labels: MakeTeamcityGreenAgain > Fix For: 2.7 > > > -GridCachePartitionedDataStructuresFailoverSelfTest#testReentrantLockConstantTopologyChangeNonFailoverSafe > -GridCachePartitionedDataStructuresFailoverSelfTest#testCountDownLatchConstantTopologyChange > > can hang in case of broken tx > {noformat} > Pending transactions: > [2018-07-15 14:13:41,210][WARN > ][exchange-worker-#1596354%partitioned.GridCachePartitionedDataStructuresFailoverSelfTest1%][diagnostic] > >>> [txVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], exchWait=true, > tx=GridDhtTxLocal [nearNodeId=1392b1bd-c807-4479-9bfe-fc9f7050, > nearFutId=14ffca0a461-999e75d0-a333-4bd6-a2a2-7f143d0af773, nearMiniId=1, > nearFinFutId=null, nearFinMiniId=0, nearXidVer=GridCacheVersion > [topVer=143133203, order=1531653200153, nodeOrder=1], > super=GridDhtTxLocalAdapter [nearOnOriginatingNode=false, nearNodes=[], > dhtNodes=[], explicitLock=false, super=IgniteTxLocalAdapter > [completedBase=null, sndTransformedVals=false, depEnabled=false, > txState=IgniteTxStateImpl [activeCacheIds=[1968300681], recovery=false, > txMap=[IgniteTxEntry [key=KeyCacheObjectImpl [part=494, > val=GridCacheInternalKeyImpl [name=structure, > grpName=default-volatile-ds-group], hasValBytes=true], cacheId=1968300681, > txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=494, > val=GridCacheInternalKeyImpl [name=structure, > grpName=default-volatile-ds-group], hasValBytes=true], cacheId=1968300681], > val=[op=NOOP, 
val=null], prevVal=[op=NOOP, val=null], oldVal=[op=NOOP, > val=null], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, > conflictVer=null, explicitVer=null, dhtVer=null, filters=[], > filtersPassed=false, filtersSet=false, entry=GridDhtCacheEntry [rdrs=[], > part=494, super=GridDistributedCacheEntry [super=GridCacheMapEntry > [key=KeyCacheObjectImpl [part=494, val=GridCacheInternalKeyImpl > [name=structure, grpName=default-volatile-ds-group], hasValBytes=true], > val=CacheObjectImpl [val=null, hasValBytes=true], ver=GridCacheVersion > [topVer=143133201, order=1531653200154, nodeOrder=2], hash=2095426867, > extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc > [locs=[GridCacheMvccCandidate [nodeId=1bf28b00-feed-412b-a20b-ca9fc111, > ver=GridCacheVersion [topVer=143133203, order=1531653200157, nodeOrder=2], > threadId=1947290, id=31143709, topVer=AffinityTopologyVersion [topVer=7, > minorTopVer=0], reentry=null, > otherNodeId=1392b1bd-c807-4479-9bfe-fc9f7050, otherVer=GridCacheVersion > [topVer=143133203, order=1531653200153, nodeOrder=1], mappedDhtNodes=null, > mappedNearNodes=null, ownerVer=null, serOrder=null, key=KeyCacheObjectImpl > [part=494, val=GridCacheInternalKeyImpl [name=structure, > grpName=default-volatile-ds-group], hasValBytes=true], > masks=local=1|owner=1|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0, > prevVer=null, nextVer=null]], rmts=null]], flags=2]]], prepared=0, > locked=false, nodeId=null, locMapped=false, expiryPlc=null, > transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, > xidVer=GridCacheVersion [topVer=143133203, order=1531653200157, > nodeOrder=2, super=IgniteTxAdapter [xidVer=GridCacheVersion > [topVer=143133203, order=1531653200157, nodeOrder=2], writeVer=null, > implicit=false, loc=true, threadId=1947290, startTime=1531653200578, > nodeId=1bf28b00-feed-412b-a20b-ca9fc111, startVer=GridCacheVersion > [topVer=143133203, order=1531653200157, nodeOrder=2], endVer=null, 
> isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, > sysInvalidate=false, sys=true, plc=2, commitVer=null, finalizing=NONE, > invalidParts=null, state=ACTIVE, timedOut=false, > topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], duration=20632ms, > onePhaseCommit=false], size=1 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (IGNITE-9053) testReentrantLockConstantTopologyChangeNonFailoverSafe can hang in case of broken tx
[ https://issues.apache.org/jira/browse/IGNITE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576299#comment-16576299 ] Anton Vinogradov edited comment on IGNITE-9053 at 8/10/18 1:42 PM: --- Looks like we have deadlock here first thread waits for ack {noformat} "sys-stripe-4-#226264%partitioned.GridCachePartitionedDataStructuresFailoverSelfTest1%" #250984 prio=5 os_prio=0 tid=0x7f273c018000 nid=0x2de6 waiting on condition [0x7f274aeee000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.addNotification(GridContinuousProcessor.java:1168) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.onEntryUpdate(CacheContinuousQueryHandler.java:890) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.access$600(CacheContinuousQueryHandler.java:85) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler$2.onEntryUpdated(CacheContinuousQueryHandler.java:430) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryManager.onEntryUpdated(CacheContinuousQueryManager.java:400) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:1079) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:652) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter.localFinish(GridDhtTxLocalAdapter.java:795) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.localFinish(GridDhtTxLocal.java:583) 
at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:464) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:505) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:942) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:821) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:777) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:99) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:191) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:189) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101) at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091) at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:496) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:745) {noformat} It sent the CQ notification, and the node received it, but the node failed after that. The fut can be completed only in org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.DiscoveryListener on EVT_NODE_FAILED, but EVT_NODE_FAILED can't be handled since we're trying to removeExplicitNodeLocks in the previous listener :( {noformat} "disco-event-worker-#226410%partitioned.GridCachePartit
[jira] [Commented] (IGNITE-8724) Skip logging 3-rd parameter while calling U.warn with initialized logger.
[ https://issues.apache.org/jira/browse/IGNITE-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576298#comment-16576298 ] ASF GitHub Bot commented on IGNITE-8724: Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/4145 > Skip logging 3-rd parameter while calling U.warn with initialized logger. > - > > Key: IGNITE-8724 > URL: https://issues.apache.org/jira/browse/IGNITE-8724 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: 2.5 >Reporter: Stanilovsky Evgeny >Assignee: Stanilovsky Evgeny >Priority: Major > Fix For: 2.7 > > Attachments: tc.png > > > There are a lot of places where an exception needs to be logged, for example: > {code:java} > U.warn(log,"Unable to await partitions release future", e); > {code} > but the current U.warn implementation silently swallows it: > {code:java} > public static void warn(@Nullable IgniteLogger log, Object longMsg, > Object shortMsg) { > assert longMsg != null; > assert shortMsg != null; > if (log != null) > log.warning(compact(longMsg.toString())); > else > X.println("[" + SHORT_DATE_FMT.format(new java.util.Date()) + "] > (wrn) " + > compact(shortMsg.toString())); > } > {code} > The fix looks like a simple addition: > {code:java} > public static void warn(@Nullable IgniteLogger log, Object longMsg, > Throwable ex) { > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9053) testReentrantLockConstantTopologyChangeNonFailoverSafe can hang in case of broken tx
[ https://issues.apache.org/jira/browse/IGNITE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576299#comment-16576299 ] Anton Vinogradov commented on IGNITE-9053: -- Looks like we have deadlock here first thread waits for ack {noformat} "sys-stripe-4-#226264%partitioned.GridCachePartitionedDataStructuresFailoverSelfTest1%" #250984 prio=5 os_prio=0 tid=0x7f273c018000 nid=0x2de6 waiting on condition [0x7f274aeee000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140) at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.addNotification(GridContinuousProcessor.java:1168) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.onEntryUpdate(CacheContinuousQueryHandler.java:890) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.access$600(CacheContinuousQueryHandler.java:85) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler$2.onEntryUpdated(CacheContinuousQueryHandler.java:430) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryManager.onEntryUpdated(CacheContinuousQueryManager.java:400) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:1079) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:652) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter.localFinish(GridDhtTxLocalAdapter.java:795) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.localFinish(GridDhtTxLocal.java:583) at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:464) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:505) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:942) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:821) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:777) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:99) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:191) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:189) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101) at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091) at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:496) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:745) {noformat} It sent the event, and the node received it, but the node failed after that. The fut can be completed only in org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.DiscoveryListener on EVT_NODE_FAILED, but EVT_NODE_FAILED can't be handled since we're trying to removeExplicitNodeLocks in the previous listener :( {noformat} "disco-event-worker-#226410%partitioned.GridCachePartitionedDataStructuresFailoverSelfTest1%" #251148 prio=5 os_p
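The listener-ordering deadlock described above can be reproduced in miniature. A single-threaded event worker runs listeners strictly in order, so a listener that blocks on a future can never be unblocked by a later listener on the same worker. This is an illustrative sketch only, not the actual Ignite discovery code; the names are stand-ins.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ListenerOrderDeadlock {
    public static void main(String[] args) throws Exception {
        // Single-threaded "disco-event-worker": listeners run strictly in order.
        ExecutorService discoWorker = Executors.newSingleThreadExecutor();
        CompletableFuture<Void> ackFut = new CompletableFuture<>();

        // Listener 1 (a stand-in for removeExplicitNodeLocks) blocks until
        // the ack future completes...
        discoWorker.submit(() -> {
            try { ackFut.get(); } catch (Exception ignored) { /* interrupted on shutdown */ }
        });

        // ...but listener 2, which would complete the future on EVT_NODE_FAILED,
        // can never run because the same single worker thread is still busy.
        discoWorker.submit(() -> ackFut.complete(null));

        try {
            ackFut.get(500, TimeUnit.MILLISECONDS);
            System.out.println("completed");
        } catch (TimeoutException e) {
            System.out.println("deadlock: ack future never completed");
        }

        discoWorker.shutdownNow(); // interrupt listener 1 so the sketch terminates
    }
}
```

The fix for such a pattern is generally to either complete the future from a different thread than the one running the blocking listener, or to avoid blocking inside listeners on the single-threaded event worker at all.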
[jira] [Commented] (IGNITE-8724) Skip logging 3-rd parameter while calling U.warn with initialized logger.
[ https://issues.apache.org/jira/browse/IGNITE-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576295#comment-16576295 ] Alexey Goncharuk commented on IGNITE-8724: -- Thanks, merged to master. > Skip logging 3-rd parameter while calling U.warn with initialized logger. > - > > Key: IGNITE-8724 > URL: https://issues.apache.org/jira/browse/IGNITE-8724 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: 2.5 >Reporter: Stanilovsky Evgeny >Assignee: Stanilovsky Evgeny >Priority: Major > Fix For: 2.7 > > Attachments: tc.png > > > There are a lot of places where an exception needs to be logged, for example: > {code:java} > U.warn(log,"Unable to await partitions release future", e); > {code} > but the current U.warn implementation silently swallows it: > {code:java} > public static void warn(@Nullable IgniteLogger log, Object longMsg, > Object shortMsg) { > assert longMsg != null; > assert shortMsg != null; > if (log != null) > log.warning(compact(longMsg.toString())); > else > X.println("[" + SHORT_DATE_FMT.format(new java.util.Date()) + "] > (wrn) " + > compact(shortMsg.toString())); > } > {code} > The fix looks like a simple addition: > {code:java} > public static void warn(@Nullable IgniteLogger log, Object longMsg, > Throwable ex) { > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
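The overload proposed in the description can be fleshed out roughly as follows. This is a self-contained sketch, not the merged Ignite code: IgniteLogger and X are replaced with minimal stand-ins (a hypothetical `Logger` interface and `System.err`), and the key point is simply that the `Throwable` is passed to the logger instead of being silently dropped.

```java
import java.text.SimpleDateFormat;

// Hypothetical stand-in for IgniteLogger, reduced to the one method we need.
interface Logger {
    void warning(String msg, Throwable e);
}

public class WarnOverloadSketch {
    private static final SimpleDateFormat SHORT_DATE_FMT = new SimpleDateFormat("HH:mm:ss");

    /** Sketch of the proposed overload: the exception reaches the logger
     * (or stderr in the fallback path) instead of being swallowed. */
    public static void warn(Logger log, Object longMsg, Throwable ex) {
        assert longMsg != null;

        if (log != null)
            log.warning(longMsg.toString(), ex);
        else {
            System.err.println("[" + SHORT_DATE_FMT.format(new java.util.Date()) + "] (wrn) " + longMsg);
            ex.printStackTrace();
        }
    }

    public static void main(String[] args) {
        StringBuilder captured = new StringBuilder();
        Logger log = (msg, e) -> captured.append(msg).append(": ").append(e.getMessage());

        warn(log, "Unable to await partitions release future", new RuntimeException("boom"));
        System.out.println(captured);
    }
}
```

Compared with the two-message `warn(log, longMsg, shortMsg)` variant quoted above, the `Throwable` overload makes the call sites that already pass an exception (like the `U.warn(log, "...", e)` example) log it instead of printing only the message.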
[jira] [Commented] (IGNITE-9249) Tests hang when different threads try to start and stop nodes at the same time.
[ https://issues.apache.org/jira/browse/IGNITE-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576286#comment-16576286 ] Ilya Lantukh commented on IGNITE-9249: -- https://ci.ignite.apache.org/viewQueued.html?itemId=1628472 > Tests hang when different threads try to start and stop nodes at the same > time. > --- > > Key: IGNITE-9249 > URL: https://issues.apache.org/jira/browse/IGNITE-9249 > Project: Ignite > Issue Type: Bug >Reporter: Ilya Lantukh >Assignee: Ilya Lantukh >Priority: Major > > An example of such test is > GridCachePartitionedNearDisabledOptimisticTxNodeRestartTest.testRestartWithPutFourNodesOneBackupsOffheapEvict(). > Hanged threads: > {code} > "restart-worker-1@63424" prio=5 tid=0x7f5e nid=NA waiting > java.lang.Thread.State: WAITING > at java.lang.Object.wait(Object.java:-1) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:949) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:389) > at > org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2002) > at > org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:916) > at > org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1754) > at > org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1050) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725) > - locked <0xfc36> (a > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance) > at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153) > at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:651) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:920) 
> at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:858) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:846) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:812) > at > org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$1000(GridCacheAbstractNodeRestartSelfTest.java:64) > at > org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:665) > at java.lang.Thread.run(Thread.java:748) > "restart-worker-0@63423" prio=5 tid=0x7f5d nid=NA waiting > java.lang.Thread.State: WAITING > at sun.misc.Unsafe.park(Unsafe.java:-1) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) > at > org.apache.ignite.internal.util.IgniteUtils.awaitQuiet(IgniteUtils.java:7584) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.grid(IgnitionEx.java:1666) > at > org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1284) > at > org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1262) > at org.apache.ignite.Ignition.allGrids(Ignition.java:502) > at > org.apache.ignite.testframework.junits.GridAbstractTest.awaitTopologyChange(GridAbstractTest.java:2258) > at > org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1158) > at > 
org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1133) > at > org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1433) > at > org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$800(GridCacheAbstractNodeRestartSelfTest.java:64) > at > org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:661) > at java.lang.Thread.run(Threa
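The PR title mentions configuring a node join timeout for all tests, which would bound the wait shown in `ServerImpl.joinTopology` above. For reference, a join timeout can be set on `TcpDiscoverySpi`; the fragment below is an illustrative configuration sketch with an arbitrary 10-second value, not the exact change from the PR.

```java
// Bound the discovery join wait so a test node cannot block forever in
// ServerImpl.joinTopology. 0 (the default) means wait indefinitely.
TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();
discoSpi.setJoinTimeout(10_000); // milliseconds

IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setDiscoverySpi(discoSpi);
```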
[jira] [Commented] (IGNITE-9249) Tests hang when different threads try to start and stop nodes at the same time.
[ https://issues.apache.org/jira/browse/IGNITE-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576282#comment-16576282 ] ASF GitHub Bot commented on IGNITE-9249: GitHub user ilantukh opened a pull request: https://github.com/apache/ignite/pull/4515 IGNITE-9249 : Configured node join timeout for all tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-9249 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/4515.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4515 commit 3be4dcc6da6649fb04f99f61c31cebb29d03c0fe Author: Ilya Lantukh Date: 2018-08-10T13:23:28Z IGNITE-9249 : Configured node join timeout for all tests. > Tests hang when different threads try to start and stop nodes at the same > time. > --- > > Key: IGNITE-9249 > URL: https://issues.apache.org/jira/browse/IGNITE-9249 > Project: Ignite > Issue Type: Bug >Reporter: Ilya Lantukh >Assignee: Ilya Lantukh >Priority: Major > > An example of such test is > GridCachePartitionedNearDisabledOptimisticTxNodeRestartTest.testRestartWithPutFourNodesOneBackupsOffheapEvict(). 
> Hanged threads: > {code} > "restart-worker-1@63424" prio=5 tid=0x7f5e nid=NA waiting > java.lang.Thread.State: WAITING > at java.lang.Object.wait(Object.java:-1) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:949) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:389) > at > org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2002) > at > org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:916) > at > org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1754) > at > org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1050) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725) > - locked <0xfc36> (a > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance) > at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153) > at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:651) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:920) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:858) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:846) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:812) > at > org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$1000(GridCacheAbstractNodeRestartSelfTest.java:64) > at > org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:665) > at java.lang.Thread.run(Thread.java:748) > "restart-worker-0@63423" 
prio=5 tid=0x7f5d nid=NA waiting > java.lang.Thread.State: WAITING > at sun.misc.Unsafe.park(Unsafe.java:-1) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) > at > org.apache.ignite.internal.util.IgniteUtils.awaitQuiet(IgniteUtils.java:7584) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.grid(IgnitionEx.java:1666) > at > org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1284) > at > org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1262) > at org.apache.ignite.Ignition.allGrids(Ignition.java:502) > at > org.apache.ignite.testframework.junits.GridAbstractTest.awaitTopologyChange(GridAbstractTest.java:2258) >
[jira] [Commented] (IGNITE-9249) Tests hang when different threads try to start and stop nodes at the same time.
[ https://issues.apache.org/jira/browse/IGNITE-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576277#comment-16576277 ] Ilya Lantukh commented on IGNITE-9249: -- As a temporary solution, I suggest setting a join timeout in GridAbstractTest so that tests fail instead of hanging. > Tests hang when different threads try to start and stop nodes at the same > time. > --- > > Key: IGNITE-9249 > URL: https://issues.apache.org/jira/browse/IGNITE-9249 > Project: Ignite > Issue Type: Bug >Reporter: Ilya Lantukh >Assignee: Ilya Lantukh >Priority: Major > > An example of such test is > GridCachePartitionedNearDisabledOptimisticTxNodeRestartTest.testRestartWithPutFourNodesOneBackupsOffheapEvict(). > Hanged threads: > {code} > "restart-worker-1@63424" prio=5 tid=0x7f5e nid=NA waiting > java.lang.Thread.State: WAITING > at java.lang.Object.wait(Object.java:-1) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:949) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:389) > at > org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2002) > at > org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:916) > at > org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1754) > at > org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1050) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725) > - locked <0xfc36> (a > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance) > at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153) > at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:651) > at > 
org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:920) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:858) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:846) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:812) > at > org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$1000(GridCacheAbstractNodeRestartSelfTest.java:64) > at > org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:665) > at java.lang.Thread.run(Thread.java:748) > "restart-worker-0@63423" prio=5 tid=0x7f5d nid=NA waiting > java.lang.Thread.State: WAITING > at sun.misc.Unsafe.park(Unsafe.java:-1) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) > at > org.apache.ignite.internal.util.IgniteUtils.awaitQuiet(IgniteUtils.java:7584) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.grid(IgnitionEx.java:1666) > at > org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1284) > at > org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1262) > at org.apache.ignite.Ignition.allGrids(Ignition.java:502) > at > org.apache.ignite.testframework.junits.GridAbstractTest.awaitTopologyChange(GridAbstractTest.java:2258) > at > org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1158) 
> at > org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1133) > at > org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1433) > at > org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$800(GridCacheAbstractNodeRestartSelfTest.java:64) > at > org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:661) > at java.lang.Thread.run(Thread.java:748)
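[Editorial note] The idea behind Ilya's temporary fix — bound the wait so a test fails fast instead of hanging — can be illustrated with a minimal, self-contained sketch. The `CountDownLatch` below stands in for the node-started signal that the hung thread in the trace above is parked on (`IgnitionEx$IgniteNamedInstance.grid` awaits a latch); it is illustrative, not Ignite code. In real Ignite tests the knob would presumably be the discovery SPI's join timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch: replace an unbounded await (hangs forever if the node never
// joins) with a bounded one that surfaces a failure instead.
public class BoundedJoinAwait {
    static void awaitJoin(CountDownLatch nodeStarted, long timeoutMs) throws InterruptedException {
        if (!nodeStarted.await(timeoutMs, TimeUnit.MILLISECONDS))
            throw new IllegalStateException("Node failed to join within " + timeoutMs + " ms");
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch started = new CountDownLatch(1);
        try {
            // Nobody counts the latch down, so this fails fast instead of hanging.
            awaitJoin(started, 100);
            System.out.println("joined");
        }
        catch (IllegalStateException e) {
            System.out.println("timed out");
        }
    }
}
```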
[jira] [Commented] (IGNITE-602) [Test] GridToStringBuilder is vulnerable for StackOverflowError caused by infinite recursion
[ https://issues.apache.org/jira/browse/IGNITE-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576275#comment-16576275 ] Alexey Goncharuk commented on IGNITE-602: - [~SomeFire], this change breaks {{S.toString()}} output, this is the printout I see in logs: {code} [2018-08-10 16:14:15,671][WARN ][exchange-worker-#186%client%][diagnostic] >>> KeyCacheObjectImpl [part=6, val=6, hasValBytes=true]KeyCacheObjectImpl [part=6, val=6, hasValBytes=true], val=null, ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=6, extras=null, flags=0]GridDistributedCacheEntry [super=]GridDhtDetachedCacheEntry [super=], prepared=0, locked=false, nodeId=834e31ba-a000-46d5-bef3-45f28531, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion [topVer=145386834, order=1533906834364, nodeOrder=4, super=, size=1]GridDhtTxLocalAdapter [nearOnOriginatingNode=false, nearNodes=KeySetView [], dhtNodes=KeySetView [], explicitLock=false, super=]GridNearTxLocal [mappings=IgniteTxMappingsImpl [], nearLocallyMapped=false, colocatedLocallyMapped=false, needCheckBackup=null, hasRemoteLocks=false, trackTimeout=false, lb=null, thread=async-runnable-runner-1, mappings=IgniteTxMappingsImpl [], super=], super=GridCompoundFuture [rdc=o.a.i.i.processors.cache.distributed.near.GridNearTxPrepareFutureAdapter$1@4f11fa96, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, futs=TransformCollectionView [false]]] {code} Can you estimate how long the fix will take? If it takes too long, we will need to revert the change. 
> [Test] GridToStringBuilder is vulnerable for StackOverflowError caused by > infinite recursion > > > Key: IGNITE-602 > URL: https://issues.apache.org/jira/browse/IGNITE-602 > Project: Ignite > Issue Type: Bug > Components: general >Reporter: Artem Shutak >Assignee: Ryabov Dmitrii >Priority: Major > Labels: MakeTeamcityGreenAgain, Muted_test > Fix For: 2.7 > > > See test > org.gridgain.grid.util.tostring.GridToStringBuilderSelfTest#_testToStringCheckAdvancedRecursionPrevention > and related TODO in same source file. > Also take a look at > http://stackoverflow.com/questions/11300203/most-efficient-way-to-prevent-an-infinite-recursion-in-tostring > Test should be unmuted on TC after fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
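[Editorial note] The StackOverflow question linked in the issue describes the standard guard against infinite `toString()` recursion: a thread-local identity set of objects currently being printed, so a cycle prints a short marker instead of overflowing the stack. A self-contained sketch of that technique (not Ignite's actual GridToStringBuilder implementation):

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Thread-local identity set tracks objects whose toString() is in
// progress; on re-entry the cycle prints a marker instead of recursing.
public class RecursionSafeToString {
    private static final ThreadLocal<Set<Object>> IN_PROGRESS =
        ThreadLocal.withInitial(() -> Collections.newSetFromMap(new IdentityHashMap<>()));

    static class Node {
        Node next;

        @Override public String toString() {
            Set<Object> guard = IN_PROGRESS.get();
            if (!guard.add(this))
                return "Node[...recursion...]";
            try {
                return "Node[next=" + next + "]";
            }
            finally {
                guard.remove(this);
            }
        }
    }

    public static void main(String[] args) {
        Node a = new Node(), b = new Node();
        a.next = b;
        b.next = a; // cycle that would otherwise cause StackOverflowError
        System.out.println(a); // Node[next=Node[next=Node[...recursion...]]]
    }
}
```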
[jira] [Created] (IGNITE-9249) Tests hang when different threads try to start and stop nodes at the same time.
Ilya Lantukh created IGNITE-9249: Summary: Tests hang when different threads try to start and stop nodes at the same time. Key: IGNITE-9249 URL: https://issues.apache.org/jira/browse/IGNITE-9249 Project: Ignite Issue Type: Bug Reporter: Ilya Lantukh Assignee: Ilya Lantukh An example of such test is GridCachePartitionedNearDisabledOptimisticTxNodeRestartTest.testRestartWithPutFourNodesOneBackupsOffheapEvict(). Hanged threads: {code} "restart-worker-1@63424" prio=5 tid=0x7f5e nid=NA waiting java.lang.Thread.State: WAITING at java.lang.Object.wait(Object.java:-1) at org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:949) at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:389) at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2002) at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:916) at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1754) at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1050) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725) - locked <0xfc36> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance) at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:651) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:920) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:858) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:846) at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:812) at 
org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$1000(GridCacheAbstractNodeRestartSelfTest.java:64) at org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:665) at java.lang.Thread.run(Thread.java:748) "restart-worker-0@63423" prio=5 tid=0x7f5d nid=NA waiting java.lang.Thread.State: WAITING at sun.misc.Unsafe.park(Unsafe.java:-1) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at org.apache.ignite.internal.util.IgniteUtils.awaitQuiet(IgniteUtils.java:7584) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.grid(IgnitionEx.java:1666) at org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1284) at org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1262) at org.apache.ignite.Ignition.allGrids(Ignition.java:502) at org.apache.ignite.testframework.junits.GridAbstractTest.awaitTopologyChange(GridAbstractTest.java:2258) at org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1158) at org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1133) at org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1433) at org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$800(GridCacheAbstractNodeRestartSelfTest.java:64) at 
org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:661) at java.lang.Thread.run(Thread.java:748) {code} Full thread dump: {code} "test-runner-#26488%dht.GridCachePartitionedNearDisabledOptimisticTxNodeRestartTest%@63124" prio=5 tid=0x7e6a nid=NA waiting java.lang.Thread.State: WAITING at sun.misc.Unsafe.park(Unsafe.java:-1) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInte
[jira] [Commented] (IGNITE-9236) Handshake timeout never completes in some tests (GridCacheReplicatedFailoverSelfTest in particular)
[ https://issues.apache.org/jira/browse/IGNITE-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576226#comment-16576226 ] ASF GitHub Bot commented on IGNITE-9236: Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/4499 > Handshake timeout never completes in some tests > (GridCacheReplicatedFailoverSelfTest in particular) > --- > > Key: IGNITE-9236 > URL: https://issues.apache.org/jira/browse/IGNITE-9236 > Project: Ignite > Issue Type: Bug >Reporter: Ilya Lantukh >Assignee: Ilya Lantukh >Priority: Major > Labels: MakeTeamcityGreenAgain > Fix For: 2.7 > > > In GridCacheReplicatedFailoverSelfTest one thread tries to establish TCP > connection and hangs on handshake forever, holding lock on RebalanceFuture: > {code} > [11:51:55] : [Step 3/4] Locked synchronizers: > [11:51:55] : [Step 3/4] > java.util.concurrent.ThreadPoolExecutor$Worker@5b17b883 > [11:51:55] : [Step 3/4] Thread > [name="sys-#68921%new-node-topology-change-thread-1%", id=77410, > state=RUNNABLE, blockCnt=3, waitCnt=0] > [11:51:55] : [Step 3/4] at > sun.nio.ch.FileDispatcherImpl.read0(Native Method) > [11:51:55] : [Step 3/4] at > sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > [11:51:55] : [Step 3/4] at > sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > [11:51:55] : [Step 3/4] at sun.nio.ch.IOUtil.read(IOUtil.java:197) > [11:51:55] : [Step 3/4] at > sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) > [11:51:55] : [Step 3/4] - locked java.lang.Object@23aaa756 > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.safeTcpHandshake(TcpCommunicationSpi.java:3647) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3293) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2967) > [11:51:55] : [Step 3/4] at > 
o.a.i.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2850) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2693) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2652) > [11:51:55] : [Step 3/4] at > o.a.i.i.managers.communication.GridIoManager.send(GridIoManager.java:1643) > [11:51:55] : [Step 3/4] at > o.a.i.i.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1750) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.GridCacheIoManager.sendOrderedMessage(GridCacheIoManager.java:1231) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cleanupRemoteContexts(GridDhtPartitionDemander.java:) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cancel(GridDhtPartitionDemander.java:1041) > [11:51:55] : [Step 3/4] - locked > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150 > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.lambda$null$2(GridDhtPartitionDemander.java:534) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$$Lambda$41/603501511.run(Unknown > Source) > [11:51:55] : [Step 3/4] at > o.a.i.i.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6800) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827) > [11:51:55] : [Step 3/4] at > o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110) > [11:51:55] : [Step 3/4] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [11:51:55] : [Step 3/4] at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > 
[11:51:55] : [Step 3/4] at java.lang.Thread.run(Thread.java:748) > {code} > Because of that, exchange worker hangs forever while trying to acquire that > lock: > {code} > [11:51:55] : [Step 3/4] Thread > [name="exchange-worker-#68894%new-node-topology-change-thread-1%", id=77379, > state=BLOCKED, blockCnt=11, waitCnt=7] > [11:51:55] : [Step 3/4] Lock > [object=o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$Reb
[jira] [Commented] (IGNITE-9236) Handshake timeout never completes in some tests (GridCacheReplicatedFailoverSelfTest in particular)
[ https://issues.apache.org/jira/browse/IGNITE-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576225#comment-16576225 ] Alexey Goncharuk commented on IGNITE-9236: -- Thanks, Ilya, merged your changes to master. > Handshake timeout never completes in some tests > (GridCacheReplicatedFailoverSelfTest in particular) > --- > > Key: IGNITE-9236 > URL: https://issues.apache.org/jira/browse/IGNITE-9236 > Project: Ignite > Issue Type: Bug >Reporter: Ilya Lantukh >Assignee: Ilya Lantukh >Priority: Major > Labels: MakeTeamcityGreenAgain > Fix For: 2.7 > > > In GridCacheReplicatedFailoverSelfTest one thread tries to establish TCP > connection and hangs on handshake forever, holding lock on RebalanceFuture: > {code} > [11:51:55] : [Step 3/4] Locked synchronizers: > [11:51:55] : [Step 3/4] > java.util.concurrent.ThreadPoolExecutor$Worker@5b17b883 > [11:51:55] : [Step 3/4] Thread > [name="sys-#68921%new-node-topology-change-thread-1%", id=77410, > state=RUNNABLE, blockCnt=3, waitCnt=0] > [11:51:55] : [Step 3/4] at > sun.nio.ch.FileDispatcherImpl.read0(Native Method) > [11:51:55] : [Step 3/4] at > sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > [11:51:55] : [Step 3/4] at > sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > [11:51:55] : [Step 3/4] at sun.nio.ch.IOUtil.read(IOUtil.java:197) > [11:51:55] : [Step 3/4] at > sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) > [11:51:55] : [Step 3/4] - locked java.lang.Object@23aaa756 > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.safeTcpHandshake(TcpCommunicationSpi.java:3647) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3293) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2967) > [11:51:55] : [Step 3/4] at > 
o.a.i.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2850) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2693) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2652) > [11:51:55] : [Step 3/4] at > o.a.i.i.managers.communication.GridIoManager.send(GridIoManager.java:1643) > [11:51:55] : [Step 3/4] at > o.a.i.i.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1750) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.GridCacheIoManager.sendOrderedMessage(GridCacheIoManager.java:1231) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cleanupRemoteContexts(GridDhtPartitionDemander.java:) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cancel(GridDhtPartitionDemander.java:1041) > [11:51:55] : [Step 3/4] - locked > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150 > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.lambda$null$2(GridDhtPartitionDemander.java:534) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$$Lambda$41/603501511.run(Unknown > Source) > [11:51:55] : [Step 3/4] at > o.a.i.i.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6800) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827) > [11:51:55] : [Step 3/4] at > o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110) > [11:51:55] : [Step 3/4] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [11:51:55] : [Step 3/4] at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > 
[11:51:55] : [Step 3/4] at java.lang.Thread.run(Thread.java:748) > {code} > Because of that, exchange worker hangs forever while trying to acquire that > lock: > {code} > [11:51:55] : [Step 3/4] Thread > [name="exchange-worker-#68894%new-node-topology-change-thread-1%", id=77379, > state=BLOCKED, blockCnt=11, waitCnt=7] > [11:51:55] : [Step 3/4] Lock > [object=o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150, > ownerName=sys-#68921%
[jira] [Updated] (IGNITE-9236) Handshake timeout never completes in some tests (GridCacheReplicatedFailoverSelfTest in particular)
[ https://issues.apache.org/jira/browse/IGNITE-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Goncharuk updated IGNITE-9236: - Fix Version/s: 2.7 > Handshake timeout never completes in some tests > (GridCacheReplicatedFailoverSelfTest in particular) > --- > > Key: IGNITE-9236 > URL: https://issues.apache.org/jira/browse/IGNITE-9236 > Project: Ignite > Issue Type: Bug >Reporter: Ilya Lantukh >Assignee: Ilya Lantukh >Priority: Major > Labels: MakeTeamcityGreenAgain > Fix For: 2.7 > > > In GridCacheReplicatedFailoverSelfTest one thread tries to establish TCP > connection and hangs on handshake forever, holding lock on RebalanceFuture: > {code} > [11:51:55] : [Step 3/4] Locked synchronizers: > [11:51:55] : [Step 3/4] > java.util.concurrent.ThreadPoolExecutor$Worker@5b17b883 > [11:51:55] : [Step 3/4] Thread > [name="sys-#68921%new-node-topology-change-thread-1%", id=77410, > state=RUNNABLE, blockCnt=3, waitCnt=0] > [11:51:55] : [Step 3/4] at > sun.nio.ch.FileDispatcherImpl.read0(Native Method) > [11:51:55] : [Step 3/4] at > sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > [11:51:55] : [Step 3/4] at > sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > [11:51:55] : [Step 3/4] at sun.nio.ch.IOUtil.read(IOUtil.java:197) > [11:51:55] : [Step 3/4] at > sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) > [11:51:55] : [Step 3/4] - locked java.lang.Object@23aaa756 > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.safeTcpHandshake(TcpCommunicationSpi.java:3647) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3293) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2967) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2850) > [11:51:55] : [Step 3/4] at > 
o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2693) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2652) > [11:51:55] : [Step 3/4] at > o.a.i.i.managers.communication.GridIoManager.send(GridIoManager.java:1643) > [11:51:55] : [Step 3/4] at > o.a.i.i.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1750) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.GridCacheIoManager.sendOrderedMessage(GridCacheIoManager.java:1231) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cleanupRemoteContexts(GridDhtPartitionDemander.java:) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cancel(GridDhtPartitionDemander.java:1041) > [11:51:55] : [Step 3/4] - locked > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150 > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.lambda$null$2(GridDhtPartitionDemander.java:534) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$$Lambda$41/603501511.run(Unknown > Source) > [11:51:55] : [Step 3/4] at > o.a.i.i.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6800) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827) > [11:51:55] : [Step 3/4] at > o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110) > [11:51:55] : [Step 3/4] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [11:51:55] : [Step 3/4] at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [11:51:55] : [Step 3/4] at java.lang.Thread.run(Thread.java:748) > {code} > Because of that, exchange worker hangs forever 
while trying to acquire that > lock: > {code} > [11:51:55] : [Step 3/4] Thread > [name="exchange-worker-#68894%new-node-topology-change-thread-1%", id=77379, > state=BLOCKED, blockCnt=11, waitCnt=7] > [11:51:55] : [Step 3/4] Lock > [object=o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150, > ownerName=sys-#68921%new-node-topology-change-thread-1%, ownerId=77410] > [11:51:55] : [Step 3/4]
[jira] [Updated] (IGNITE-9236) Handshake timeout never completes in some tests (GridCacheReplicatedFailoverSelfTest in particular)
[ https://issues.apache.org/jira/browse/IGNITE-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Goncharuk updated IGNITE-9236: - Ignite Flags: (was: Docs Required) > Handshake timeout never completes in some tests > (GridCacheReplicatedFailoverSelfTest in particular) > --- > > Key: IGNITE-9236 > URL: https://issues.apache.org/jira/browse/IGNITE-9236 > Project: Ignite > Issue Type: Bug >Reporter: Ilya Lantukh >Assignee: Ilya Lantukh >Priority: Major > Labels: MakeTeamcityGreenAgain > Fix For: 2.7 > > > In GridCacheReplicatedFailoverSelfTest one thread tries to establish TCP > connection and hangs on handshake forever, holding lock on RebalanceFuture: > {code} > [11:51:55] : [Step 3/4] Locked synchronizers: > [11:51:55] : [Step 3/4] > java.util.concurrent.ThreadPoolExecutor$Worker@5b17b883 > [11:51:55] : [Step 3/4] Thread > [name="sys-#68921%new-node-topology-change-thread-1%", id=77410, > state=RUNNABLE, blockCnt=3, waitCnt=0] > [11:51:55] : [Step 3/4] at > sun.nio.ch.FileDispatcherImpl.read0(Native Method) > [11:51:55] : [Step 3/4] at > sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > [11:51:55] : [Step 3/4] at > sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > [11:51:55] : [Step 3/4] at sun.nio.ch.IOUtil.read(IOUtil.java:197) > [11:51:55] : [Step 3/4] at > sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) > [11:51:55] : [Step 3/4] - locked java.lang.Object@23aaa756 > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.safeTcpHandshake(TcpCommunicationSpi.java:3647) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3293) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2967) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2850) > [11:51:55] : [Step 3/4] at > 
o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2693) > [11:51:55] : [Step 3/4] at > o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2652) > [11:51:55] : [Step 3/4] at > o.a.i.i.managers.communication.GridIoManager.send(GridIoManager.java:1643) > [11:51:55] : [Step 3/4] at > o.a.i.i.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1750) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.GridCacheIoManager.sendOrderedMessage(GridCacheIoManager.java:1231) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cleanupRemoteContexts(GridDhtPartitionDemander.java:) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cancel(GridDhtPartitionDemander.java:1041) > [11:51:55] : [Step 3/4] - locked > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150 > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.lambda$null$2(GridDhtPartitionDemander.java:534) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$$Lambda$41/603501511.run(Unknown > Source) > [11:51:55] : [Step 3/4] at > o.a.i.i.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6800) > [11:51:55] : [Step 3/4] at > o.a.i.i.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827) > [11:51:55] : [Step 3/4] at > o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110) > [11:51:55] : [Step 3/4] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [11:51:55] : [Step 3/4] at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [11:51:55] : [Step 3/4] at java.lang.Thread.run(Thread.java:748) > {code} > Because of that, exchange worker hangs forever 
while trying to acquire that > lock: > {code} > [11:51:55] : [Step 3/4] Thread > [name="exchange-worker-#68894%new-node-topology-change-thread-1%", id=77379, > state=BLOCKED, blockCnt=11, waitCnt=7] > [11:51:55] : [Step 3/4] Lock > [object=o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150, > ownerName=sys-#68921%new-node-topology-change-thread-1%, ownerId=77410] > [11:51:55
[jira] [Commented] (IGNITE-9050) WALIterator should throw an exception if iterator stopped in the WAL archive but not in WAL work
[ https://issues.apache.org/jira/browse/IGNITE-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576221#comment-16576221 ] ASF GitHub Bot commented on IGNITE-9050: Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/4429 > WALIterator should throw an exception if iterator stopped in the WAL archive > but not in WAL work > > > Key: IGNITE-9050 > URL: https://issues.apache.org/jira/browse/IGNITE-9050 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Assignee: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > The iterator stops iteration if the next WAL record pointer is not equal to the > expected one (WalSegmentTailReachedException). If this happens during iteration > over segments in the WAL archive, the WAL is corrupted and we cannot > ignore this: the WAL log has not been fully read. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-8724) Skip logging 3-rd parameter while calling U.warn with initialized logger.
[ https://issues.apache.org/jira/browse/IGNITE-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576224#comment-16576224 ] Ilya Lantukh commented on IGNITE-8724: -- Thanks! Looks good now. > Skip logging 3-rd parameter while calling U.warn with initialized logger. > - > > Key: IGNITE-8724 > URL: https://issues.apache.org/jira/browse/IGNITE-8724 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: 2.5 >Reporter: Stanilovsky Evgeny >Assignee: Stanilovsky Evgeny >Priority: Major > Fix For: 2.7 > > Attachments: tc.png > > > There are a lot of places where an exception needs to be logged, for example: > {code:java} > U.warn(log,"Unable to await partitions release future", e); > {code} > but the current U.warn implementation silently swallows it. > {code:java} > public static void warn(@Nullable IgniteLogger log, Object longMsg, > Object shortMsg) { > assert longMsg != null; > assert shortMsg != null; > if (log != null) > log.warning(compact(longMsg.toString())); > else > X.println("[" + SHORT_DATE_FMT.format(new java.util.Date()) + "] > (wrn) " + > compact(shortMsg.toString())); > } > {code} > The fix looks like simply adding a new overload: > {code:java} > public static void warn(@Nullable IgniteLogger log, Object longMsg, > Throwable ex) { > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
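For illustration, the proposed overload could be sketched as below. This is a hedged sketch, not the actual IgniteUtils code: the nested `Logger` interface is a minimal stand-in for IgniteLogger so the snippet is self-contained, and the timestamp formatting only approximates the `SHORT_DATE_FMT` fallback branch.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class WarnOverloadSketch {
    /** Minimal stand-in for IgniteLogger (assumed shape, not the real interface). */
    interface Logger {
        void warning(String msg, Throwable e);
    }

    /**
     * Sketch of the proposed U.warn(log, msg, Throwable) overload: the Throwable
     * is handed to the logger instead of being silently dropped.
     */
    public static void warn(Logger log, Object msg, Throwable ex) {
        assert msg != null;

        if (log != null)
            log.warning(msg.toString(), ex); // exception is no longer swallowed
        else {
            // Fallback mirrors the existing no-logger branch: timestamped console line.
            System.err.println("[" + new SimpleDateFormat("HH:mm:ss").format(new Date()) + "] (wrn) " + msg);

            if (ex != null)
                ex.printStackTrace();
        }
    }

    public static void main(String[] args) {
        StringBuilder captured = new StringBuilder();

        warn((msg, e) -> captured.append(msg).append(" <- ").append(e.getMessage()),
            "Unable to await partitions release future", new RuntimeException("boom"));

        // The exception message now reaches the logger alongside the warning text.
        System.out.println(captured);
    }
}
```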
[jira] [Commented] (IGNITE-9050) WALIterator should throw an exception if iterator stopped in the WAL archive but not in WAL work
[ https://issues.apache.org/jira/browse/IGNITE-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576214#comment-16576214 ] Alexey Goncharuk commented on IGNITE-9050: -- Thanks for the fixes, merged to master. > WALIterator should throw an exception if iterator stopped in the WAL archive > but not in WAL work > > > Key: IGNITE-9050 > URL: https://issues.apache.org/jira/browse/IGNITE-9050 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Assignee: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > The iterator stops iteration if the next WAL record pointer is not equal to the > expected one (WalSegmentTailReachedException). If this happens during iteration > over segments in the WAL archive, the WAL is corrupted and we cannot > ignore this: the WAL log has not been fully read. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9171) Use lazy mode with results pre-fetch
[ https://issues.apache.org/jira/browse/IGNITE-9171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576212#comment-16576212 ] ASF GitHub Bot commented on IGNITE-9171: GitHub user tledkov-gridgain opened a pull request: https://github.com/apache/ignite/pull/4514 IGNITE-9171 Use lazy mode with results pre-fetch You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-9171 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/4514.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4514 commit de324a59df6dcba2ec906fb974baf95dabd19504 Author: tledkov-gridgain Date: 2018-08-03T11:37:26Z IGNITE-9171: save the progress commit 63181f02cc950c49f93101e79682939949396675 Author: tledkov-gridgain Date: 2018-08-03T13:51:22Z IGNITE-9171: save the progress commit 8fec73450804fd1f9080dce7d000c8eefb0c6749 Author: tledkov-gridgain Date: 2018-08-03T14:35:23Z Merge branch '_master' into ignite-9171 commit 260f3bf244fac0031fa4bb8d27aac365acfd43db Author: tledkov-gridgain Date: 2018-08-06T09:22:39Z IGNITE-9171: save the progress commit f9bfdf76a791c0fa588d1a3a2a91bb349d7affd6 Author: tledkov-gridgain Date: 2018-08-06T09:35:42Z Merge branch '_master' into ignite-9171 commit 59bf5ee665c460179aa961f9a3939b892f916dbc Author: tledkov-gridgain Date: 2018-08-06T11:14:47Z IGNITE-9171: save the progress commit 83cca801e7547b032e7f3436ef22c979e72e04f0 Author: tledkov-gridgain Date: 2018-08-06T12:43:00Z Merge branch '_master' into ignite-9171 commit bfb342cd0e35aad8c1c79044e0ebcde936c71806 Author: tledkov-gridgain Date: 2018-08-06T13:41:32Z IGNITE-9171: save the progress commit fe64dc2b22cf2f8e6b5d56068286dda1e6cc77fd Author: tledkov-gridgain Date: 2018-08-07T12:30:24Z IGNITE-9171: remove lazy worker commit 98e4c57b795bc89c5b4a1c27f7f06f2cdbfc4dd4 
Author: tledkov-gridgain Date: 2018-08-07T12:36:27Z Merge branch '_master' into ignite-9171 commit 3ace9838ce13e3a08e5fdbe88101d2b61c40c718 Author: tledkov-gridgain Date: 2018-08-08T11:10:51Z IGNITE-9171: benchmark commit a3ee0f5f70d452f43de84a61832866ecd02e92da Author: tledkov-gridgain Date: 2018-08-08T11:16:08Z Merge branch '_master' into ignite-9171 commit b9a2ecfc6ac9f6c324ff0ab832899bb99ae46473 Author: tledkov-gridgain Date: 2018-08-08T13:13:07Z IGNITE-9171: fix lazy mode commit de71737b20e1683ff9d2078f1ba33a5843ccc541 Author: tledkov-gridgain Date: 2018-08-09T09:15:54Z IGNITE-9171: save the progress commit 51c15c81496c9b5dd8beebdf66087ea61a3324d9 Author: tledkov-gridgain Date: 2018-08-09T12:42:19Z IGNITE-9171: save the progress commit 75dbcc9872388b29758b56593fdc04d497d8d0ed Author: tledkov-gridgain Date: 2018-08-10T08:04:25Z IGNITE-9171: modify table lock commit 18e198044d1806e2951b249977230d0dfaed7053 Author: tledkov-gridgain Date: 2018-08-10T08:14:28Z IGNITE-9171: modify table lock - minors commit f2188d118e9028a783bb2347d7ed15f7664828c0 Author: tledkov-gridgain Date: 2018-08-10T08:53:17Z Merge branch '_master' into ignite-9171 commit 57e77782ce24d4f36843e89d9e81ab5d39eef4c1 Author: tledkov-gridgain Date: 2018-08-10T10:38:11Z IGNITE-9171: minors commit 79b2292733b3fb510ce38053b44512e25f143fa6 Author: tledkov-gridgain Date: 2018-08-10T12:35:32Z Merge branch '_master' into ignite-9171 > Use lazy mode with results pre-fetch > > > Key: IGNITE-9171 > URL: https://issues.apache.org/jira/browse/IGNITE-9171 > Project: Ignite > Issue Type: Improvement > Components: sql >Affects Versions: 2.6 >Reporter: Taras Ledkov >Assignee: Taras Ledkov >Priority: Major > > Current implementation of the {{lazy}} mode always starts separate thread for > {{MapQueryLazyWorker}}. It causes excessive overhead for requests that > produces small results set. > We have to begin execute query at the {{QUERY_POOL}} thread pool and fetch > first page of the results. 
In case the result set is bigger than one page, > {{MapQueryLazyWorker}} is started and linked with {{MapNodeResults}} to handle > the next pages lazily. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
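The proposed flow can be sketched in isolation as follows. This is an illustrative sketch only (class and method names are hypothetical, not Ignite's actual MapQueryLazyWorker wiring): the query thread fetches one page eagerly, and a dedicated lazy worker is started only when more rows remain, so small result sets never pay for the extra thread.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class LazyPrefetchSketch {
    /** Result of the pre-fetch step: the first page plus a flag telling whether a lazy worker was needed. */
    static class FirstPage {
        final List<Integer> rows;
        final boolean lazyWorkerStarted;

        FirstPage(List<Integer> rows, boolean started) {
            this.rows = rows;
            this.lazyWorkerStarted = started;
        }
    }

    /**
     * The query runs in the caller's (query pool) thread and fetches one page eagerly;
     * a dedicated worker thread is started only when the result set has more rows.
     */
    static FirstPage fetchFirstPage(Iterator<Integer> res, int pageSize) {
        List<Integer> page = new ArrayList<>();

        while (page.size() < pageSize && res.hasNext())
            page.add(res.next());

        boolean more = res.hasNext();

        if (more) {
            // Stand-in for MapQueryLazyWorker: drains the remaining rows lazily.
            new Thread(() -> {
                while (res.hasNext())
                    res.next();
            }, "map-query-lazy-worker").start();
        }

        return new FirstPage(page, more);
    }

    public static void main(String[] args) {
        List<Integer> small = List.of(1, 2, 3);
        List<Integer> big = List.of(1, 2, 3, 4, 5, 6, 7);

        System.out.println(fetchFirstPage(small.iterator(), 4).lazyWorkerStarted); // small result: no worker
        System.out.println(fetchFirstPage(big.iterator(), 4).lazyWorkerStarted);   // big result: worker started
    }
}
```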
[jira] [Commented] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster
[ https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576209#comment-16576209 ] Ilya Lantukh commented on IGNITE-9178: -- [~agoncharuk], I've double-checked this PR, it looks correct to me. {{leftNode2Part}} in this case is just a temporary map that is used to fire part lost events. There is no need to update {{diffFromAffinity}} in that part of the code, because it will be re-calculated later. [~pvinokurov], Thanks for the contribution! > Partition lost event are not triggered if multiple nodes left cluster > - > > Key: IGNITE-9178 > URL: https://issues.apache.org/jira/browse/IGNITE-9178 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Blocker > Fix For: 2.7 > > > If multiple nodes leave the cluster simultaneously, left partitions are removed > from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part > in the GridDhtPartitionTopologyImpl#update method. > Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect lost > partitions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9196) SQL: Memory leak in MapNodeResults
[ https://issues.apache.org/jira/browse/IGNITE-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576208#comment-16576208 ] Taras Ledkov commented on IGNITE-9196: -- [~dmekhanikov], my comments: # No test suite contains {{CacheQueryMemoryLeakTest}}. # What do you think about using {{GridDebug#dumpHeap}} to test for memory leaks instead of checking JVM pauses with our own implementation? > SQL: Memory leak in MapNodeResults > -- > > Key: IGNITE-9196 > URL: https://issues.apache.org/jira/browse/IGNITE-9196 > Project: Ignite > Issue Type: Bug > Components: sql >Affects Versions: 2.6 >Reporter: Denis Mekhanikov >Assignee: Denis Mekhanikov >Priority: Blocker > Fix For: 2.7 > > > When the size of a SQL query result set is a multiple of {{Query#pageSize}}, > {{MapQueryResult}} is never closed and removed from the {{MapNodeResults#res}} > collection. > The following code leads to OOME when run with a 1Gb heap: > {code:java} > public class MemLeakRepro { > public static void main(String[] args) { > Ignition.start(getConfiguration("server")); > try (Ignite client = > Ignition.start(getConfiguration("client").setClientMode(true))) { > IgniteCache<Integer, Person> cache = startPeopleCache(client); > int pages = 10; > int pageSize = 1024; > for (int i = 0; i < pages * pageSize; i++) { > Person p = new Person("Person #" + i, 25); > cache.put(i, p); > } > for (int i = 0; i < 1_000_000; i++) { > if (i % 1000 == 0) > System.out.println("Select iteration #" + i); > Query<List<?>> qry = new SqlFieldsQuery("select * from > people"); > qry.setPageSize(pageSize); > QueryCursor<List<?>> cursor = cache.query(qry); > cursor.getAll(); > cursor.close(); > } > } > } > private static IgniteConfiguration getConfiguration(String instanceName) { > IgniteConfiguration igniteCfg = new IgniteConfiguration(); > igniteCfg.setIgniteInstanceName(instanceName); > TcpDiscoverySpi discoSpi = new TcpDiscoverySpi(); > discoSpi.setIpFinder(new TcpDiscoveryVmIpFinder(true)); > igniteCfg.setDiscoverySpi(discoSpi); > return igniteCfg; > } > private static IgniteCache<Integer, Person> startPeopleCache(Ignite node) > { > CacheConfiguration<Integer, Person> cacheCfg = new > CacheConfiguration<>("cache"); > QueryEntity qe = new QueryEntity(Integer.class, Person.class); > qe.setTableName("people"); > cacheCfg.setQueryEntities(Collections.singleton(qe)); > cacheCfg.setSqlSchema("PUBLIC"); > return node.getOrCreateCache(cacheCfg); > } > public static class Person { > @QuerySqlField > private String name; > @QuerySqlField > private int age; > public Person(String name, int age) { > this.name = name; > this.age = age; > } > } > } > {code} > > At the same time it works perfectly fine when there are, for example, > {{pages * pageSize - 1}} records in the cache instead. > The reason is that the {{MapQueryResult#fetchNextPage(...)}} method > doesn't return true when the result set size is a multiple of the page size. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
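The boundary condition can be reproduced in isolation. The sketch below is a simplified illustration of the fetch-page pattern (method names are hypothetical, not the actual {{MapQueryResult#fetchNextPage}} implementation): when the row count is an exact multiple of the page size, the buggy variant never reports the last page, so the server-side result would never be closed.

```java
import java.util.Iterator;
import java.util.stream.IntStream;

public class PageBoundaryBugSketch {
    /**
     * Buggy pattern behind the leak (simplified, not Ignite's actual code):
     * a page is reported as the last one only when it comes back short.
     * When the total row count is an exact multiple of the page size, every
     * page is full, so the result is never marked finished.
     */
    static boolean buggyIsLastPage(Iterator<Integer> rows, int pageSize) {
        int fetched = 0;
        while (fetched < pageSize && rows.hasNext()) {
            rows.next();
            fetched++;
        }
        return fetched < pageSize; // BUG: a full final page is never "last"
    }

    /** Fixed variant: a full page is still the last one if nothing remains after it. */
    static boolean fixedIsLastPage(Iterator<Integer> rows, int pageSize) {
        int fetched = 0;
        while (fetched < pageSize && rows.hasNext()) {
            rows.next();
            fetched++;
        }
        return fetched < pageSize || !rows.hasNext();
    }

    public static void main(String[] args) {
        // 8 rows, page size 4: the row count is an exact multiple of the page size.
        Iterator<Integer> rows = IntStream.range(0, 8).boxed().iterator();
        boolean last = false;
        while (rows.hasNext())
            last = buggyIsLastPage(rows, 4);
        System.out.println(last); // prints false: the result would stay open (the leak)

        Iterator<Integer> rows2 = IntStream.range(0, 8).boxed().iterator();
        boolean last2 = false;
        while (rows2.hasNext())
            last2 = fixedIsLastPage(rows2, 4);
        System.out.println(last2); // prints true: the result can be closed
    }
}
```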
[jira] [Created] (IGNITE-9248) CPP: Support Clang compiler
Igor Sapego created IGNITE-9248: --- Summary: CPP: Support Clang compiler Key: IGNITE-9248 URL: https://issues.apache.org/jira/browse/IGNITE-9248 Project: Ignite Issue Type: Improvement Components: platforms Reporter: Igor Sapego Assignee: Igor Sapego Currently Ignite C++ can not be compiled with the clang compiler. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9050) WALIterator should throw an exception if iterator stopped in the WAL archive but not in WAL work
[ https://issues.apache.org/jira/browse/IGNITE-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576147#comment-16576147 ] Alexey Goncharuk commented on IGNITE-9050: -- [~DmitriyGovorukhin], I have IgniteWALTailIsReachedDuringIterationOverArchiveTest#testStandAloneIterator failing at about a 5% rate locally with the error {code} junit.framework.AssertionFailedError: Last read ptr=FileWALPointer [idx=23, fileOff=9224675, len=59], corruptedPtr=FileWALPointer [idx=22, fileOff=2776095, len=1115] {code} Also, the test is not added to any test suite. Can you take a look? > WALIterator should throw an exception if iterator stopped in the WAL archive > but not in WAL work > > > Key: IGNITE-9050 > URL: https://issues.apache.org/jira/browse/IGNITE-9050 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Assignee: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > The iterator stops iteration if the next WAL record pointer is not equal to the > expected one (WalSegmentTailReachedException). If this happens during iteration > over segments in the WAL archive, the WAL is corrupted and we cannot > ignore this: the WAL log has not been fully read. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster
[ https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576140#comment-16576140 ] Alexey Goncharuk edited comment on IGNITE-9178 at 8/10/18 11:35 AM: [~pvinokurov] Can you please explain why you needed to update {{leftNode2Part}} inside the loop and how this fixes the missed event? Note that there is a separate method handling left nodes - {{removeNode()}}, and from the current code I see that not only does it add a new entry to {{leftNode2Part}}, but it also updates the {{diffFromAffinity}} map. With your change it looks like the {{diffFromAffinity}} map may be outdated. I see that {{removeNode()}} is called in {{beforeExchange}} for all left nodes, so it is not clear why those nodes did not get to {{leftNode2Part}} was (Author: agoncharuk): [~pvinokurov] Can you please explain why you needed to update {{leftNode2Part}} inside the loop and how this fixes the missed event? Note that there is a separate method handling left nodes - {{removeNode()}}, and from the current code I see that not only does it add new entry to {{leftNode2Part}}, but it also updates {{diffFromAffinity}} map. With your change it looks like the {{diffFromAffinity}} map may be outdated. > Partition lost event are not triggered if multiple nodes left cluster > - > > Key: IGNITE-9178 > URL: https://issues.apache.org/jira/browse/IGNITE-9178 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Blocker > Fix For: 2.7 > > > If multiple nodes leave the cluster simultaneously, left partitions are removed > from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part > in the GridDhtPartitionTopologyImpl#update method. > Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect lost > partitions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster
[ https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576140#comment-16576140 ] Alexey Goncharuk commented on IGNITE-9178: -- [~pvinokurov] Can you please explain why you needed to update {{leftNode2Part}} inside the loop and how this fixes the missed event? Note that there is a separate method handling left nodes - {{removeNode()}}, and from the current code I see that not only does it add a new entry to {{leftNode2Part}}, but it also updates the {{diffFromAffinity}} map. With your change it looks like the {{diffFromAffinity}} map may be outdated. > Partition lost event are not triggered if multiple nodes left cluster > - > > Key: IGNITE-9178 > URL: https://issues.apache.org/jira/browse/IGNITE-9178 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Blocker > Fix For: 2.7 > > > If multiple nodes leave the cluster simultaneously, left partitions are removed > from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part > in the GridDhtPartitionTopologyImpl#update method. > Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect lost > partitions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-5103) TcpDiscoverySpi ignores maxMissedClientHeartbeats property
[ https://issues.apache.org/jira/browse/IGNITE-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576132#comment-16576132 ] ASF GitHub Bot commented on IGNITE-5103: Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/4446 > TcpDiscoverySpi ignores maxMissedClientHeartbeats property > -- > > Key: IGNITE-5103 > URL: https://issues.apache.org/jira/browse/IGNITE-5103 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: 1.9 >Reporter: Valentin Kulichenko >Assignee: Evgenii Zhuravlev >Priority: Blocker > Fix For: 2.7 > > Attachments: TcpDiscoveryClientSuspensionSelfTest.java > > > Test scenario is the following: > * Start one or more servers. > * Start a client node. > * Suspend client process using {{-SIGSTOP}} signal. > * Wait for {{maxMissedClientHeartbeats*heartbeatFrequency}}. > * Client node is expected to be removed from topology, but server nodes don't > do that. > Attached is the unit test reproducing the same by stopping the heartbeat > sender thread on the client. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-5103) TcpDiscoverySpi ignores maxMissedClientHeartbeats property
[ https://issues.apache.org/jira/browse/IGNITE-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576125#comment-16576125 ] Alexey Goncharuk commented on IGNITE-5103: -- Thanks, [~ezhuravl], merged your changes to master. > TcpDiscoverySpi ignores maxMissedClientHeartbeats property > -- > > Key: IGNITE-5103 > URL: https://issues.apache.org/jira/browse/IGNITE-5103 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: 1.9 >Reporter: Valentin Kulichenko >Assignee: Evgenii Zhuravlev >Priority: Blocker > Fix For: 2.7 > > Attachments: TcpDiscoveryClientSuspensionSelfTest.java > > > Test scenario is the following: > * Start one or more servers. > * Start a client node. > * Suspend client process using {{-SIGSTOP}} signal. > * Wait for {{maxMissedClientHeartbeats*heartbeatFrequency}}. > * Client node is expected to be removed from topology, but server nodes don't > do that. > Attached is the unit test reproducing the same by stopping the heartbeat > sender thread on the client. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (IGNITE-596) Add missed scala examples and remove unnecessary scala examples.
[ https://issues.apache.org/jira/browse/IGNITE-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasiliy Sisko closed IGNITE-596. > Add missed scala examples and remove unnecessary scala examples. > - > > Key: IGNITE-596 > URL: https://issues.apache.org/jira/browse/IGNITE-596 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: sprint-3 >Reporter: Vasiliy Sisko >Assignee: Alexey Kuznetsov >Priority: Minor > Attachments: #_IGNITE-596_All_examples.patch, > #_IGNITE-596_Other_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (IGNITE-897) Add missed datagrid scala examples
[ https://issues.apache.org/jira/browse/IGNITE-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasiliy Sisko closed IGNITE-897. > Add missed datagrid scala examples > -- > > Key: IGNITE-897 > URL: https://issues.apache.org/jira/browse/IGNITE-897 > Project: Ignite > Issue Type: Sub-task > Components: general >Affects Versions: sprint-5 >Reporter: Vasiliy Sisko >Assignee: Alexey Kuznetsov >Priority: Minor > Attachments: #_IGNITE-897_Datagrid_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (IGNITE-871) Add missed datastructures scala examples
[ https://issues.apache.org/jira/browse/IGNITE-871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasiliy Sisko closed IGNITE-871. > Add missed datastructures scala examples > > > Key: IGNITE-871 > URL: https://issues.apache.org/jira/browse/IGNITE-871 > Project: Ignite > Issue Type: Sub-task > Components: general >Affects Versions: sprint-5 >Reporter: Vasiliy Sisko >Assignee: Alexey Kuznetsov >Priority: Minor > Attachments: #_IGNITE-871_Datastructures_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-897) Add missed datagrid scala examples
[ https://issues.apache.org/jira/browse/IGNITE-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasiliy Sisko resolved IGNITE-897. -- Resolution: Won't Fix Assignee: Alexey Kuznetsov (was: Vasiliy Sisko) > Add missed datagrid scala examples > -- > > Key: IGNITE-897 > URL: https://issues.apache.org/jira/browse/IGNITE-897 > Project: Ignite > Issue Type: Sub-task > Components: general >Affects Versions: sprint-5 >Reporter: Vasiliy Sisko >Assignee: Alexey Kuznetsov >Priority: Minor > Attachments: #_IGNITE-897_Datagrid_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-596) Add missed scala examples and remove unnecessary scala examples.
[ https://issues.apache.org/jira/browse/IGNITE-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasiliy Sisko resolved IGNITE-596. -- Resolution: Won't Fix Assignee: Alexey Kuznetsov (was: Vasiliy Sisko) > Add missed scala examples and remove unnecessary scala examples. > - > > Key: IGNITE-596 > URL: https://issues.apache.org/jira/browse/IGNITE-596 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: sprint-3 >Reporter: Vasiliy Sisko >Assignee: Alexey Kuznetsov >Priority: Minor > Attachments: #_IGNITE-596_All_examples.patch, > #_IGNITE-596_Other_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-871) Add missed datastructures scala examples
[ https://issues.apache.org/jira/browse/IGNITE-871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasiliy Sisko resolved IGNITE-871. -- Resolution: Won't Fix Assignee: Alexey Kuznetsov (was: Vasiliy Sisko) > Add missed datastructures scala examples > > > Key: IGNITE-871 > URL: https://issues.apache.org/jira/browse/IGNITE-871 > Project: Ignite > Issue Type: Sub-task > Components: general >Affects Versions: sprint-5 >Reporter: Vasiliy Sisko >Assignee: Alexey Kuznetsov >Priority: Minor > Attachments: #_IGNITE-871_Datastructures_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-846) Add missed computegrid scala examples.
[ https://issues.apache.org/jira/browse/IGNITE-846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasiliy Sisko resolved IGNITE-846. -- Resolution: Won't Fix Assignee: Alexey Kuznetsov (was: Vasiliy Sisko) > Add missed computegrid scala examples. > -- > > Key: IGNITE-846 > URL: https://issues.apache.org/jira/browse/IGNITE-846 > Project: Ignite > Issue Type: Sub-task > Components: general >Affects Versions: sprint-5 >Reporter: Vasiliy Sisko >Assignee: Alexey Kuznetsov >Priority: Minor > Attachments: #_IGNITE-846_Computegrid_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (IGNITE-846) Add missed computegrid scala examples.
[ https://issues.apache.org/jira/browse/IGNITE-846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasiliy Sisko closed IGNITE-846. > Add missed computegrid scala examples. > -- > > Key: IGNITE-846 > URL: https://issues.apache.org/jira/browse/IGNITE-846 > Project: Ignite > Issue Type: Sub-task > Components: general >Affects Versions: sprint-5 >Reporter: Vasiliy Sisko >Assignee: Alexey Kuznetsov >Priority: Minor > Attachments: #_IGNITE-846_Computegrid_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-596) Add missed scala examples and remove unnecessary scala examples.
[ https://issues.apache.org/jira/browse/IGNITE-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576114#comment-16576114 ] Vasiliy Sisko commented on IGNITE-596: -- The problem is no longer relevant. The examples have not been updated for 2 years and can be removed in Ignite 3.0. > Add missed scala examples and remove unnecessary scala examples. > - > > Key: IGNITE-596 > URL: https://issues.apache.org/jira/browse/IGNITE-596 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: sprint-3 >Reporter: Vasiliy Sisko >Assignee: Vasiliy Sisko >Priority: Minor > Attachments: #_IGNITE-596_All_examples.patch, > #_IGNITE-596_Other_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-897) Add missed datagrid scala examples
[ https://issues.apache.org/jira/browse/IGNITE-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576109#comment-16576109 ] Vasiliy Sisko commented on IGNITE-897: -- The problem is no longer relevant. The examples have not been updated for 2 years and can be removed in Ignite 3.0. > Add missed datagrid scala examples > -- > > Key: IGNITE-897 > URL: https://issues.apache.org/jira/browse/IGNITE-897 > Project: Ignite > Issue Type: Sub-task > Components: general >Affects Versions: sprint-5 >Reporter: Vasiliy Sisko >Assignee: Vasiliy Sisko >Priority: Minor > Attachments: #_IGNITE-897_Datagrid_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-846) Add missed computegrid scala examples.
[ https://issues.apache.org/jira/browse/IGNITE-846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576112#comment-16576112 ] Vasiliy Sisko commented on IGNITE-846: -- The problem is no longer relevant. The examples have not been updated for 2 years and can be removed in Ignite 3.0. > Add missed computegrid scala examples. > -- > > Key: IGNITE-846 > URL: https://issues.apache.org/jira/browse/IGNITE-846 > Project: Ignite > Issue Type: Sub-task > Components: general >Affects Versions: sprint-5 >Reporter: Vasiliy Sisko >Assignee: Vasiliy Sisko >Priority: Minor > Attachments: #_IGNITE-846_Computegrid_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-871) Add missed datastructures scala examples
[ https://issues.apache.org/jira/browse/IGNITE-871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576110#comment-16576110 ] Vasiliy Sisko commented on IGNITE-871: -- The problem is no longer relevant. The examples have not been updated for 2 years and can be removed in Ignite 3.0. > Add missed datastructures scala examples > > > Key: IGNITE-871 > URL: https://issues.apache.org/jira/browse/IGNITE-871 > Project: Ignite > Issue Type: Sub-task > Components: general >Affects Versions: sprint-5 >Reporter: Vasiliy Sisko >Assignee: Vasiliy Sisko >Priority: Minor > Attachments: #_IGNITE-871_Datastructures_examples.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-7251) Remove term "fabric" from Ignite deliverables
[ https://issues.apache.org/jira/browse/IGNITE-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576097#comment-16576097 ] Anton Vinogradov commented on IGNITE-7251: -- [~vveider] PR ready to be reviewed. > Remove term "fabric" from Ignite deliverables > - > > Key: IGNITE-7251 > URL: https://issues.apache.org/jira/browse/IGNITE-7251 > Project: Ignite > Issue Type: Task >Reporter: Denis Magda >Assignee: Anton Vinogradov >Priority: Blocker > Labels: important > Fix For: 2.7 > > > Apache Ignite binary releases still include “fabric” word in their names: > https://ignite.apache.org/download.cgi#binaries > For instance, this is a full name of the previous release - > apache-ignite-fabric-2.3.0-bin. > It’s a little oversight on our side because the project has not been > positioned as a fabric for a while. > Remove “fabric” from the name and have the binary releases named as - > {{apache-ignite-\{version}-bin}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-8842) Web console: Wrong start screen on start of demo mode
[ https://issues.apache.org/jira/browse/IGNITE-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasiliy Sisko reassigned IGNITE-8842: - Assignee: Pavel Konstantinov (was: Vasiliy Sisko) > Web console: Wrong start screen on start of demo mode > - > > Key: IGNITE-8842 > URL: https://issues.apache.org/jira/browse/IGNITE-8842 > Project: Ignite > Issue Type: Bug > Components: wizards >Reporter: Vasiliy Sisko >Assignee: Pavel Konstantinov >Priority: Minor > > On start of demo mode screen with "SQL demo" notebook should be opened. > Also on "Notebooks" screen "SQL demo" notebook should be available. > On demo start "SQL demo" should be recreated if needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-8842) Web console: Wrong start screen on start of demo mode
[ https://issues.apache.org/jira/browse/IGNITE-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576083#comment-16576083 ] Vasiliy Sisko commented on IGNITE-8842: --- Implemented opening of Demo queries page on demo run. > Web console: Wrong start screen on start of demo mode > - > > Key: IGNITE-8842 > URL: https://issues.apache.org/jira/browse/IGNITE-8842 > Project: Ignite > Issue Type: Bug > Components: wizards >Reporter: Vasiliy Sisko >Assignee: Vasiliy Sisko >Priority: Minor > > On start of demo mode screen with "SQL demo" notebook should be opened. > Also on "Notebooks" screen "SQL demo" notebook should be available. > On demo start "SQL demo" should be recreated if needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-8950) Need to have more informative output info while database files check operation.
[ https://issues.apache.org/jira/browse/IGNITE-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576079#comment-16576079 ] Stanilovsky Evgeny commented on IGNITE-8950: TC looks ok. > Need to have more informative output info while database files check > operation. > --- > > Key: IGNITE-8950 > URL: https://issues.apache.org/jira/browse/IGNITE-8950 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.5 >Reporter: Stanilovsky Evgeny >Assignee: Stanilovsky Evgeny >Priority: Minor > Fix For: 2.7 > > > "Failed to verify store file ..." messages have no file path info. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9247) CPP Thin: implement GetAll
Stanislav Lukyanov created IGNITE-9247: -- Summary: CPP Thin: implement GetAll Key: IGNITE-9247 URL: https://issues.apache.org/jira/browse/IGNITE-9247 Project: Ignite Issue Type: New Feature Components: thin client Reporter: Stanislav Lukyanov Need to implement GetAll in C++ Thin client. Currently, there is no way to extract values from cache via C++ Thin client without knowing the keys beforehand. GetAll would be the easiest way to do that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool
[ https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576055#comment-16576055 ] ASF GitHub Bot commented on IGNITE-9244: GitHub user DmitriyGovorukhin opened a pull request: https://github.com/apache/ignite/pull/4513 IGNITE-9244 Rework partition eviction. - add evict shared manager - concurrent evict partition from one group - balanced executors by partition size - limitation concurrent evict operation via permits counter You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-9244 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/4513.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4513 commit ab10ca99d7d7052414ef0927d52f17c81e5d7bde Author: Dmitriy Govorukhin Date: 2018-08-10T10:10:12Z IGNITE-9244 Rework partition eviction. - add evict shared manager - concurrent evict partition from one group - balanced executors by partition size - limitation concurrent evict operation via permits counter Signed-off-by: Dmitriy Govorukhin > Partition eviction may use all threads in sys pool, it leads to hangs send a > message via sys pool > -- > > Key: IGNITE-9244 > URL: https://issues.apache.org/jira/browse/IGNITE-9244 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Assignee: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > In the current implementation, GridDhtPartitionsEvictor resets partitions to > evict one by one. > A GridDhtPartitionsEvictor is created for each cache group; if we try to evict > as many groups as there are threads in the sys pool, the group evictors will take > all available sys pool threads, so sending a message via the sys pool hangs. As a fix, > I suggest limiting concurrent execution in the sys pool, or using another pool for > this purpose. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
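The "permits counter" idea from the pull request above can be sketched with java.util.concurrent.Semaphore. This is an illustrative model with hypothetical names, not the actual Ignite implementation: permits are acquired before a task is submitted, so at most a fixed number of eviction tasks occupy the shared pool at any time and the remaining pool threads stay free for messaging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class EvictionPermitsDemo {
    /** Cap on how many eviction tasks may occupy the shared pool at once. */
    public static final int PERMITS = 2;

    /** Highest number of eviction tasks observed running concurrently. */
    public static final AtomicInteger maxConcurrent = new AtomicInteger();

    private static final Semaphore evictPermits = new Semaphore(PERMITS);
    private static final AtomicInteger current = new AtomicInteger();

    /** Submits 16 "partition evictions" into an 8-thread pool; returns the observed peak concurrency. */
    public static int run() {
        ExecutorService sysPool = Executors.newFixedThreadPool(8); // stand-in for the sys pool

        try {
            for (int p = 0; p < 16; p++) {
                // Acquire BEFORE submitting, so queued evictions never block inside pool threads.
                evictPermits.acquire();

                sysPool.submit(() -> {
                    try {
                        int now = current.incrementAndGet();
                        maxConcurrent.accumulateAndGet(now, Math::max);
                        Thread.sleep(10); // simulated eviction work
                    }
                    catch (InterruptedException ignored) {
                        Thread.currentThread().interrupt();
                    }
                    finally {
                        current.decrementAndGet();
                        evictPermits.release();
                    }
                });
            }

            sysPool.shutdown();
            sysPool.awaitTermination(10, TimeUnit.SECONDS);
        }
        catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }

        return maxConcurrent.get(); // never exceeds PERMITS
    }

    public static void main(String[] args) {
        System.out.println("max concurrent evictions: " + run());
    }
}
```

The alternative mentioned in the ticket, a dedicated eviction pool, would achieve the same bound by sizing that pool instead of counting permits.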
[jira] [Commented] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster
[ https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576053#comment-16576053 ] Pavel Vinokurov commented on IGNITE-9178: - Test results look good. [~Jokser][~agoncharuk] Please review > Partition lost event are not triggered if multiple nodes left cluster > - > > Key: IGNITE-9178 > URL: https://issues.apache.org/jira/browse/IGNITE-9178 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Blocker > Fix For: 2.7 > > > If multiple nodes left cluster simultaneously, left partitions are removed > from GridDhtPartitionTopologyImpl#node2part without adding to leftNode2Part > in GridDhtPartitionTopologyImpl#update method. > Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect lost > partitions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9147) Race between tx rollback and prepare on near node can produce hanging primary tx
[ https://issues.apache.org/jira/browse/IGNITE-9147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexei Scherbakov updated IGNITE-9147: -- Summary: Race between tx rollback and prepare on near node can produce hanging primary tx (was: When server node left cluster on high load, cluster take hang on PartitionalExchange) > Race between tx rollback and prepare on near node can produce hanging primary > tx > > > Key: IGNITE-9147 > URL: https://issues.apache.org/jira/browse/IGNITE-9147 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: 2.5 >Reporter: ARomantsov >Assignee: Alexei Scherbakov >Priority: Critical > Fix For: 2.7 > > > I ran a simple test > 1) Start 15 server nodes > 2) Start a client with a long transaction > 3) Additionally, start 5 clients loading many caches (nearly 2 thousand) > 4) Stop 1 server node, wait 1 minute and start it back > Cluster froze for more than an hour, then the license expired -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9246) Optimistic transactions can wait for topology future on remap for a long time even if timeout is set.
[ https://issues.apache.org/jira/browse/IGNITE-9246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexei Scherbakov updated IGNITE-9246: -- Fix Version/s: 2.7 > Optimistic transactions can wait for topology future on remap for a long time > even if timeout is set. > - > > Key: IGNITE-9246 > URL: https://issues.apache.org/jira/browse/IGNITE-9246 > Project: Ignite > Issue Type: Improvement >Reporter: Alexei Scherbakov >Assignee: Alexei Scherbakov >Priority: Major > Fix For: 2.7 > > > This is possible if a long PME occurs during the tx remap phase. > Fix: on remap, wait for the new topology with a timeout, if one is set. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9246) Optimistic transactions can wait for topology future on remap for a long time even if timeout is set.
Alexei Scherbakov created IGNITE-9246: - Summary: Optimistic transactions can wait for topology future on remap for a long time even if timeout is set. Key: IGNITE-9246 URL: https://issues.apache.org/jira/browse/IGNITE-9246 Project: Ignite Issue Type: Improvement Reporter: Alexei Scherbakov Assignee: Alexei Scherbakov This is possible if a long PME occurs during the tx remap phase. Fix: on remap, wait for the new topology with a timeout, if one is set. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
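The fix described in IGNITE-9246, waiting for the new topology on remap with a timeout, amounts to bounding the wait on the topology future by the transaction timeout. A minimal sketch with hypothetical names, not the actual Ignite code:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RemapTimeoutDemo {
    /**
     * Waits for the (hypothetical) topology-ready future, but never longer
     * than the remaining tx timeout. Returns true if remap can proceed,
     * false if the tx should be rolled back instead of hanging.
     */
    public static boolean waitTopology(CompletableFuture<Long> topologyReady, long timeoutMs) {
        try {
            topologyReady.get(timeoutMs, TimeUnit.MILLISECONDS); // bounded wait
            return true;
        }
        catch (TimeoutException e) {
            return false; // PME outlasted the tx timeout: fail the tx, don't hang
        }
        catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Model a PME that never finishes within the tx timeout.
        CompletableFuture<Long> slowPme = new CompletableFuture<>();

        System.out.println("ready in time: " + waitTopology(slowPme, 100)); // prints "ready in time: false"
    }
}
```

Without the timeout argument, the same `get()` call would block indefinitely, which is exactly the reported behaviour.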
[jira] [Commented] (IGNITE-9196) SQL: Memory leak in MapNodeResults
[ https://issues.apache.org/jira/browse/IGNITE-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576011#comment-16576011 ] Ilya Kasnacheev commented on IGNITE-9196: - [~tledkov-gridgain] please review the proposed fix. > SQL: Memory leak in MapNodeResults > -- > > Key: IGNITE-9196 > URL: https://issues.apache.org/jira/browse/IGNITE-9196 > Project: Ignite > Issue Type: Bug > Components: sql >Affects Versions: 2.6 >Reporter: Denis Mekhanikov >Assignee: Denis Mekhanikov >Priority: Blocker > Fix For: 2.7 > > > When the size of a SQL query result set is a multiple of {{Query#pageSize}}, the > {{MapQueryResult}} is never closed and removed from the {{MapNodeResults#res}} > collection. > The following code leads to OOME when run with a 1Gb heap: > {code:java} > public class MemLeakRepro { > public static void main(String[] args) { > Ignition.start(getConfiguration("server")); > try (Ignite client = > Ignition.start(getConfiguration("client").setClientMode(true))) { > IgniteCache<Integer, Person> cache = startPeopleCache(client); > int pages = 10; > int pageSize = 1024; > for (int i = 0; i < pages * pageSize; i++) { > Person p = new Person("Person #" + i, 25); > cache.put(i, p); > } > for (int i = 0; i < 1_000_000; i++) { > if (i % 1000 == 0) > System.out.println("Select iteration #" + i); > Query<List<?>> qry = new SqlFieldsQuery("select * from > people"); > qry.setPageSize(pageSize); > QueryCursor<List<?>> cursor = cache.query(qry); > cursor.getAll(); > cursor.close(); > } > } > } > private static IgniteConfiguration getConfiguration(String instanceName) { > IgniteConfiguration igniteCfg = new IgniteConfiguration(); > igniteCfg.setIgniteInstanceName(instanceName); > TcpDiscoverySpi discoSpi = new TcpDiscoverySpi(); > discoSpi.setIpFinder(new TcpDiscoveryVmIpFinder(true)); > igniteCfg.setDiscoverySpi(discoSpi); > return igniteCfg; > } > private static IgniteCache<Integer, Person> startPeopleCache(Ignite node) > { > CacheConfiguration<Integer, Person> cacheCfg = new > CacheConfiguration<>("cache"); > QueryEntity qe = new QueryEntity(Integer.class, Person.class); > qe.setTableName("people"); > cacheCfg.setQueryEntities(Collections.singleton(qe)); > cacheCfg.setSqlSchema("PUBLIC"); > return node.getOrCreateCache(cacheCfg); > } > public static class Person { > @QuerySqlField > private String name; > @QuerySqlField > private int age; > public Person(String name, int age) { > this.name = name; > this.age = age; > } > } > } > {code} > > At the same time it works perfectly fine when there are, for example, > {{pages * pageSize - 1}} records in the cache instead. > The reason is that the {{MapQueryResult#fetchNextPage(...)}} method > doesn't return true when the result set size is a multiple of the page size. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
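The page-boundary condition described in IGNITE-9196 can be modeled in a few lines. This is an illustrative model of the reported behaviour, not the actual MapQueryResult code: if the "last page" is only recognized when a fetch returns fewer rows than a full page, a result set that ends exactly on a page boundary never triggers that branch, and the map-side result is never closed.

```java
public class PageBoundaryDemo {
    /**
     * Simplified model: the map-side result is treated as exhausted (and
     * closed) only when a fetched page is smaller than the page size.
     * Returns true if the cursor gets closed after consuming all rows.
     */
    public static boolean cursorClosedAfterGetAll(int totalRows, int pageSize) {
        int fetched = 0;

        while (fetched < totalRows) {
            int page = Math.min(pageSize, totalRows - fetched);
            fetched += page;

            if (page < pageSize)
                return true; // partial page => recognized as last page => closed
        }

        // All rows arrived in full pages: the consumer stops fetching,
        // so the "last page" branch above is never taken.
        return false;
    }

    public static void main(String[] args) {
        System.out.println(cursorClosedAfterGetAll(10 * 1024 - 1, 1024)); // true: final page is partial
        System.out.println(cursorClosedAfterGetAll(10 * 1024, 1024));     // false: every page is full, result leaks
    }
}
```

This matches the observation in the ticket that `pages * pageSize - 1` records work fine while `pages * pageSize` records leak.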
[jira] [Commented] (IGNITE-8950) Need to have more informative output info while database files check operation.
[ https://issues.apache.org/jira/browse/IGNITE-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576009#comment-16576009 ] Dmitriy Pavlov commented on IGNITE-8950: I understand that the fix is very simple. At the same time, I think it would be perfectly ok to run at least basic suite, https://ci.ignite.apache.org/viewQueued.html?itemId=1626834&tab=queuedBuildOverviewTab > Need to have more informative output info while database files check > operation. > --- > > Key: IGNITE-8950 > URL: https://issues.apache.org/jira/browse/IGNITE-8950 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.5 >Reporter: Stanilovsky Evgeny >Assignee: Stanilovsky Evgeny >Priority: Minor > Fix For: 2.7 > > > "Failed to verify store file ..." messages have no file path info. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9235) Transitivity violation in GridMergeIndex Comparator
[ https://issues.apache.org/jira/browse/IGNITE-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Evgenii Zagumennov updated IGNITE-9235: --- Fix Version/s: 2.5 > Transitivity violation in GridMergeIndex Comparator > --- > > Key: IGNITE-9235 > URL: https://issues.apache.org/jira/browse/IGNITE-9235 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.5 >Reporter: Andrew Medvedev >Assignee: Andrew Medvedev >Priority: Major > Fix For: 2.5 > > > Currently the comparator in > org.apache.ignite.internal.processors.query.h2.twostep.GridMergeIndex is: > > private final Comparator<RowStream> streamCmp = new Comparator<RowStream>() { > @Override public int compare(RowStream o1, RowStream o2) { > // Nulls at the beginning. > if (o1 == null) > return -1; > if (o2 == null) > return 1; > return compareRows(o1.get(), o2.get()); > } > }; > -- > > This comparator violates its contract when both o1 and o2 are null: compare(null, null) returns -1 instead of 0. Thus we get an exception on JDK 1.8: > > > Caused by: java.lang.IllegalArgumentException: Comparison method violates its general contract! > at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeCollapse(TimSort.java:441) > at java.util.TimSort.sort(TimSort.java:245) > at java.util.Arrays.sort(Arrays.java:1438) > at > org.apache.ignite.internal.processors.query.h2.twostep.GridMergeIndexSorted$MergeStreamIterator.goFirst(GridMergeIndexSorted.java:248) > at > org.apache.ignite.internal.processors.query.h2.twostep.GridMergeIndexSorted$MergeStreamIterator.hasNext(GridMergeIndexSorted.java:270) > at > org.apache.ignite.internal.processors.query.h2.twostep.GridMergeIndex$FetchingCursor.fetchRows(GridMergeIndex.java:614) > at > org.apache.ignite.internal.processors.query.h2.twostep.GridMergeIndex$FetchingCursor.next(GridMergeIndex.java:658) > at org.h2.index.IndexCursor.next(IndexCursor.java:305) > at org.h2.table.TableFilter.next(TableFilter.java:499) > at > org.h2.command.dml.Select$LazyResultQueryFlat.fetchNextRow(Select.java:1452) > at > org.h2.result.LazyResult.hasNext(LazyResult.java:79) > at org.h2.result.LazyResult.next(LazyResult.java:59) > at > org.h2.command.dml.Select.queryFlat(Select.java:519) > at > org.h2.command.dml.Select.queryWithoutCache(Select.java:625) > at > org.h2.command.dml.Query.queryWithoutCacheLazyCheck(Query.java:114) > at org.h2.command.dml.Query.query(Query.java:352) > at org.h2.command.dml.Query.query(Query.java:333) > at > org.h2.command.CommandContainer.query(CommandContainer.java:113) > at > org.h2.command.Command.executeQuery(Command.java:201) > ... 44 more > > Workaround: use -Djava.util.Arrays.useLegacyMergeSort=true > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
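A comparator that keeps the "nulls at the beginning" intent while satisfying the Comparator contract must return 0 when both arguments are null. The sketch below is illustrative (over Integer rather than RowStream, and not the actual Ignite patch):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NullFirstComparatorDemo {
    // Unlike the comparator in the report, which returns -1 whenever o1 is
    // null (so compare(null, null) == -1, breaking the contract TimSort
    // relies on), this variant treats two nulls as equal.
    public static final Comparator<Integer> NULLS_FIRST = (o1, o2) -> {
        if (o1 == null)
            return o2 == null ? 0 : -1; // nulls equal to each other, smallest overall

        if (o2 == null)
            return 1;

        return Integer.compare(o1, o2);
    };

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(3, null, 1, null, 2));

        // With the broken comparator, TimSort can throw
        // "Comparison method violates its general contract!" on larger inputs.
        list.sort(NULLS_FIRST);

        System.out.println(list); // prints [null, null, 1, 2, 3]
    }
}
```

JDK 8's `Comparator.nullsFirst(Comparator.naturalOrder())` encodes the same rule and could serve as a drop-in where the element comparison has no side effects.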
[jira] [Commented] (IGNITE-9238) Test GridTaskFailoverAffinityRunTest.testNodeRestartClient hangs when coordinator forces client to reconnect on grid startup.
[ https://issues.apache.org/jira/browse/IGNITE-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16575932#comment-16575932 ] Pavel Pereslegin commented on IGNITE-9238: -- Hello [~Jokser], review this fix, please. When coordinator checks exchange history, it can see updated affinity version, but the exchange future on which the affinity version was updated is not fully completed. > Test GridTaskFailoverAffinityRunTest.testNodeRestartClient hangs when > coordinator forces client to reconnect on grid startup. > - > > Key: IGNITE-9238 > URL: https://issues.apache.org/jira/browse/IGNITE-9238 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.6 >Reporter: Pavel Pereslegin >Assignee: Pavel Pereslegin >Priority: Major > Fix For: 2.7 > > > Example of such hang on TC: > https://ci.ignite.apache.org/viewLog.html?buildId=1605243&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_ComputeGrid > Log output: > {noformat} > ... > [2018-08-07 12:20:09,804][WARN > ][sys-#12799%internal.GridTaskFailoverAffinityRunTest1%][GridCachePartitionExchangeManager] > Client node tries to connect but its exchange info is cleaned up from > exchange history. Consider increasing 'IGNITE_EXCHANGE_HISTORY_SIZE' property > or start clients in smaller batches. Current settings and versions: > [IGNITE_EXCHANGE_HISTORY_SIZE=1000, initVer=AffinityTopologyVersion > [topVer=3, minorTopVer=0], readyVer=AffinityTopologyVersion [topVer=4, > minorTopVer=0]]. 
> [2018-08-07 12:20:09,804][INFO > ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][GridDhtPartitionsExchangeFuture] > Completed partition exchange > [localNode=511d5932-5f22-4919-807d-575c7f61, > exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion > [topVer=3, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode > [id=6b9a7a1d-07bf-4d20-882a-8462ada3, addrs=ArrayList [127.0.0.1], > sockAddrs=HashSet [/127.0.0.1:47502], discPort=47502, order=3, intOrder=3, > lastExchangeTime=1533644409739, loc=false, ver=2.7.0#20180807-sha1:e96616f5, > isClient=false], done=true], topVer=AffinityTopologyVersion [topVer=4, > minorTopVer=0], durationFromInit=21] > [2018-08-07 12:20:09,806][INFO > ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][time] > Finished exchange init [topVer=AffinityTopologyVersion [topVer=3, > minorTopVer=0], crd=true] > [2018-08-07 12:20:09,807][INFO > ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][GridCachePartitionExchangeManager] > Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion > [topVer=4, minorTopVer=0], force=false, evt=NODE_JOINED, > node=6b9a7a1d-07bf-4d20-882a-8462ada3] > [2018-08-07 12:20:09,811][INFO > ][sys-#12798%internal.GridTaskFailoverAffinityRunTest2%][GridDhtPartitionsExchangeFuture] > Finish exchange future [startVer=AffinityTopologyVersion [topVer=4, > minorTopVer=0], resVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], > err=null] > [2018-08-07 12:20:09,813][INFO > ][sys-#12798%internal.GridTaskFailoverAffinityRunTest2%][GridDhtPartitionsExchangeFuture] > Completed partition exchange > [localNode=a3206c1f-6d57-4fd6-8aa5-e22f3b42, > exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion > [topVer=4, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode > [id=a3206c1f-6d57-4fd6-8aa5-e22f3b42, addrs=ArrayList [127.0.0.1], > sockAddrs=HashSet [/127.0.0.1:47503], discPort=47503, order=4, intOrder=4, > 
lastExchangeTime=1533644409779, loc=true, ver=2.7.0#20180807-sha1:e96616f5, > isClient=false], done=true], topVer=AffinityTopologyVersion [topVer=4, > minorTopVer=0], durationFromInit=41] > [2018-08-07 12:20:09,814][INFO > ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] To > start Console Management & Monitoring run ignitevisorcmd.{sh|bat} > [2018-08-07 12:20:09,815][INFO > ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] > [2018-08-07 12:20:09,815][INFO > ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] > >>> +---+ > >>> Ignite ver. > >>> 2.7.0-SNAPSHOT#20180807-sha1:e96616f580930f267eab44f75d410fa29a876bcb > >>> +---+ > >>> OS name: Linux 4.4.0-128-generic amd64 > >>> CPU(s): 5 > >>> Heap: 2.0GB > >>> VM name: 20126@8790182f15a5 > >>> Ignite instance name: internal.GridTaskFailoverAffinityRunTest1 > >>> Local node [ID=511D5932-5F22-4919-807D-575C7F61, order=2, > >>> clientMode=false] > >>> Local node addresses: [127.0.0
[jira] [Updated] (IGNITE-9245) Document how to monitor Ignite with Zabbix
[ https://issues.apache.org/jira/browse/IGNITE-9245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Budnikov updated IGNITE-9245: --- Issue Type: Task (was: Test) > Document how to monitor Ignite with Zabbix > --- > > Key: IGNITE-9245 > URL: https://issues.apache.org/jira/browse/IGNITE-9245 > Project: Ignite > Issue Type: Task > Components: documentation >Reporter: Artem Budnikov >Assignee: Artem Budnikov >Priority: Major > > Create a how-to page with an instruction on how to use Zabbix templates to > monitor Ignite metrics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9245) Document how to monitor Ignite with Zabbix
Artem Budnikov created IGNITE-9245: -- Summary: Document how to monitor Ignite with Zabbix Key: IGNITE-9245 URL: https://issues.apache.org/jira/browse/IGNITE-9245 Project: Ignite Issue Type: Test Components: documentation Reporter: Artem Budnikov Assignee: Artem Budnikov Create a how-to page with an instruction on how to use Zabbix templates to monitor Ignite metrics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool
[ https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy Govorukhin reassigned IGNITE-9244: -- Assignee: Dmitriy Govorukhin > Partition eviction may use all threads in sys pool, it leads to hangs send a > message via sys pool > -- > > Key: IGNITE-9244 > URL: https://issues.apache.org/jira/browse/IGNITE-9244 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Assignee: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > In the current implementation, GridDhtPartitionsEvictor reset partition to > evict one by one. > GridDhtPartitionsEvictor is created for each cache group, if we try to evict > too many groups as sys pool size, group evictors will take all available > threads in sys pool. It leads to hangs send a message via sys pool. As a fix, > I suggest to limit concurrent execution via sys pool or use another pool for > this purpose. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool
[ https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy Govorukhin updated IGNITE-9244: --- Ignite Flags: (was: Docs Required) > Partition eviction may use all threads in sys pool, it leads to hangs send a > message via sys pool > -- > > Key: IGNITE-9244 > URL: https://issues.apache.org/jira/browse/IGNITE-9244 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > In the current implementation, GridDhtPartitionsEvictor reset partition to > evict one by one. > GridDhtPartitionsEvictor is created for each cache group, if we try to evict > too many groups as sys pool size, group evictors will take all available > threads in sys pool. It leads to hangs send a message via sys pool. As a fix, > I suggest to limit concurrent execution via sys pool or use another pool for > this purpose. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool
[ https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy Govorukhin updated IGNITE-9244: --- Environment: (was: In the current implementation, GridDhtPartitionsEvictor reset partition to evict one by one. GridDhtPartitionsEvictor is created for each cache group, if we try to evict too many groups as sys pool size, group evictors will take all available threads in sys pool. It leads to hangs send a message via sys pool. As a fix, I suggest to limit concurrent execution via sys pool or use another pool for this purpose.) > Partition eviction may use all threads in sys pool, it leads to hangs send a > message via sys pool > -- > > Key: IGNITE-9244 > URL: https://issues.apache.org/jira/browse/IGNITE-9244 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > In the current implementation, GridDhtPartitionsEvictor reset partition to > evict one by one. > GridDhtPartitionsEvictor is created for each cache group, if we try to evict > too many groups as sys pool size, group evictors will take all available > threads in sys pool. It leads to hangs send a message via sys pool. As a fix, > I suggest to limit concurrent execution via sys pool or use another pool for > this purpose. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool
[ https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy Govorukhin updated IGNITE-9244: --- Description: In the current implementation, GridDhtPartitionsEvictor reset partition to evict one by one. GridDhtPartitionsEvictor is created for each cache group, if we try to evict too many groups as sys pool size, group evictors will take all available threads in sys pool. It leads to hangs send a message via sys pool. As a fix, I suggest to limit concurrent execution via sys pool or use another pool for this purpose. > Partition eviction may use all threads in sys pool, it leads to hangs send a > message via sys pool > -- > > Key: IGNITE-9244 > URL: https://issues.apache.org/jira/browse/IGNITE-9244 > Project: Ignite > Issue Type: Bug > Environment: In the current implementation, GridDhtPartitionsEvictor > reset partition to evict one by one. > GridDhtPartitionsEvictor is created for each cache group, if we try to evict > too many groups as sys pool size, group evictors will take all available > threads in sys pool. It leads to hangs send a message via sys pool. As a fix, > I suggest to limit concurrent execution via sys pool or use another pool for > this purpose. >Reporter: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > > In the current implementation, GridDhtPartitionsEvictor reset partition to > evict one by one. > GridDhtPartitionsEvictor is created for each cache group, if we try to evict > too many groups as sys pool size, group evictors will take all available > threads in sys pool. It leads to hangs send a message via sys pool. As a fix, > I suggest to limit concurrent execution via sys pool or use another pool for > this purpose. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool
Dmitriy Govorukhin created IGNITE-9244: -- Summary: Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool Key: IGNITE-9244 URL: https://issues.apache.org/jira/browse/IGNITE-9244 Project: Ignite Issue Type: Bug Environment: In the current implementation, GridDhtPartitionsEvictor reset partition to evict one by one. GridDhtPartitionsEvictor is created for each cache group, if we try to evict too many groups as sys pool size, group evictors will take all available threads in sys pool. It leads to hangs send a message via sys pool. As a fix, I suggest to limit concurrent execution via sys pool or use another pool for this purpose. Reporter: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool
[ https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy Govorukhin updated IGNITE-9244: --- Fix Version/s: 2.7 > Partition eviction may use all threads in sys pool, it leads to hangs send a > message via sys pool > -- > > Key: IGNITE-9244 > URL: https://issues.apache.org/jira/browse/IGNITE-9244 > Project: Ignite > Issue Type: Bug > Environment: In the current implementation, GridDhtPartitionsEvictor > reset partition to evict one by one. > GridDhtPartitionsEvictor is created for each cache group, if we try to evict > too many groups as sys pool size, group evictors will take all available > threads in sys pool. It leads to hangs send a message via sys pool. As a fix, > I suggest to limit concurrent execution via sys pool or use another pool for > this purpose. >Reporter: Dmitriy Govorukhin >Priority: Major > Fix For: 2.7 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)