[jira] [Commented] (IGNITE-602) [Test] GridToStringBuilder is vulnerable for StackOverflowError caused by infinite recursion

2018-08-10 Thread Ryabov Dmitrii (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576901#comment-16576901
 ] 

Ryabov Dmitrii commented on IGNITE-602:
---

[~agoncharuk], yes, I noticed it too and created 
[IGNITE-9209|https://issues.apache.org/jira/browse/IGNITE-9209]. I didn't 
investigate the issue immediately when I found it, but it looks like I know where the 
trouble is. I've made a PR and started tests. I hope we can merge the fix tomorrow.

> [Test] GridToStringBuilder is vulnerable for StackOverflowError caused by 
> infinite recursion
> 
>
> Key: IGNITE-602
> URL: https://issues.apache.org/jira/browse/IGNITE-602
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Reporter: Artem Shutak
>Assignee: Ryabov Dmitrii
>Priority: Major
>  Labels: MakeTeamcityGreenAgain, Muted_test
> Fix For: 2.7
>
>
> See test 
> org.gridgain.grid.util.tostring.GridToStringBuilderSelfTest#_testToStringCheckAdvancedRecursionPrevention
>  and related TODO in same source file.
> Also take a look at 
> http://stackoverflow.com/questions/11300203/most-efficient-way-to-prevent-an-infinite-recursion-in-tostring
> Test should be unmuted on TC after fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-9209) GridDistributedTxMapping.toString() returns broken string

2018-08-10 Thread Ryabov Dmitrii (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryabov Dmitrii reassigned IGNITE-9209:
--

Assignee: Ryabov Dmitrii

> GridDistributedTxMapping.toString() returns broken string
> -
>
> Key: IGNITE-9209
> URL: https://issues.apache.org/jira/browse/IGNITE-9209
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ryabov Dmitrii
>Assignee: Ryabov Dmitrii
>Priority: Minor
>
> Something is wrong with `GridDistributedTxMapping` when we try to get its string 
> representation via `GridToStringBuilder`.
> It should look like
> {noformat}
> GridDistributedTxMapping [entries=LinkedHashSet [/*values here*/], 
> explicitLock=false, dhtVer=null, last=false, nearEntries=0,/*more text*/]
> {noformat}
> But currently it looks like
> {noformat}
> KeyCacheObjectImpl [part=1, val=1, hasValBytes=false]KeyCacheObjectImpl 
> [part=1, val=1, hasValBytes=false],// more text
> {noformat}
> Reproducer:
> {code:java}
> public class GridToStringBuilderSelfTest extends GridCommonAbstractTest {
> /**
>  * @throws Exception
>  */
> public void testGridDistributedTxMapping() throws Exception {
> IgniteEx ignite = startGrid(0);
> IgniteCache cache = 
> ignite.createCache(defaultCacheConfiguration());
> try (Transaction tx = ignite.transactions().txStart()) {
> cache.put(1, 1);
> GridDistributedTxMapping mapping = new 
> GridDistributedTxMapping(grid(0).localNode());
> assertTrue("Wrong string: " + mapping, 
> mapping.toString().startsWith("GridDistributedTxMapping ["));
> 
> mapping.add(((TransactionProxyImpl)tx).tx().txState().allEntries().stream().findAny().get());
> assertTrue("Wrong string: " + mapping, 
> mapping.toString().startsWith("GridDistributedTxMapping ["));
> }
> stopAllGrids();
> }
> {code}
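> For reference, classes that use GridToStringBuilder typically delegate their toString() like 
> this (an illustrative sketch, not the actual GridDistributedTxMapping code):
> {code:java}
> @Override public String toString() {
>     return GridToStringBuilder.toString(GridDistributedTxMapping.class, this);
> }
> {code}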



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9209) GridDistributedTxMapping.toString() returns broken string

2018-08-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576891#comment-16576891
 ] 

ASF GitHub Bot commented on IGNITE-9209:


GitHub user SomeFire opened a pull request:

https://github.com/apache/ignite/pull/4519

IGNITE-9209: GridDistributedTxMapping.toString() returns broken string



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/SomeFire/ignite ignite-9209

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/4519.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4519


commit 9d0f4853e9b179553631866b57cba8d393328be8
Author: Dmitrii Ryabov 
Date:   2018-08-10T21:41:53Z

IGNITE-9209: GridDistributedTxMapping.toString() returns broken string




> GridDistributedTxMapping.toString() returns broken string
> -
>
> Key: IGNITE-9209
> URL: https://issues.apache.org/jira/browse/IGNITE-9209
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ryabov Dmitrii
>Priority: Minor
>
> Something is wrong with `GridDistributedTxMapping` when we try to get its string 
> representation via `GridToStringBuilder`.
> It should look like
> {noformat}
> GridDistributedTxMapping [entries=LinkedHashSet [/*values here*/], 
> explicitLock=false, dhtVer=null, last=false, nearEntries=0,/*more text*/]
> {noformat}
> But currently it looks like
> {noformat}
> KeyCacheObjectImpl [part=1, val=1, hasValBytes=false]KeyCacheObjectImpl 
> [part=1, val=1, hasValBytes=false],// more text
> {noformat}
> Reproducer:
> {code:java}
> public class GridToStringBuilderSelfTest extends GridCommonAbstractTest {
> /**
>  * @throws Exception
>  */
> public void testGridDistributedTxMapping() throws Exception {
> IgniteEx ignite = startGrid(0);
> IgniteCache cache = 
> ignite.createCache(defaultCacheConfiguration());
> try (Transaction tx = ignite.transactions().txStart()) {
> cache.put(1, 1);
> GridDistributedTxMapping mapping = new 
> GridDistributedTxMapping(grid(0).localNode());
> assertTrue("Wrong string: " + mapping, 
> mapping.toString().startsWith("GridDistributedTxMapping ["));
> 
> mapping.add(((TransactionProxyImpl)tx).tx().txState().allEntries().stream().findAny().get());
> assertTrue("Wrong string: " + mapping, 
> mapping.toString().startsWith("GridDistributedTxMapping ["));
> }
> stopAllGrids();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IGNITE-8923) Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes)

2018-08-10 Thread Denis Magda (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Magda resolved IGNITE-8923.
-
Resolution: Fixed

> Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes)
> 
>
> Key: IGNITE-8923
> URL: https://issues.apache.org/jira/browse/IGNITE-8923
> Project: Ignite
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Roman Guseinov
>Assignee: Denis Magda
>Priority: Major
> Fix For: 2.7
>
> Attachments: config.zip, example-kube.xml, 
> google_cloud_engine_deployment.zip, yaml.zip
>
>
> We have such documentation for Microsoft Azure 
> [https://apacheignite.readme.io/docs/microsoft-azure-deployment]
> It would be great to publish the same for GCE.
> Here are the steps I used to deploy the cluster (stateless, stateful) and the web 
> console:
> {code:java}
> ## Start Ignite Cluster
> 1. Grant the cluster-admin role to the current Google user (to allow creating roles):
> $ kubectl create clusterrolebinding myname2-cluster-admin-binding \
> --clusterrole=cluster-admin \
> --user=
> 2. Create service account and grant permissions:
> $ kubectl create -f sa.yaml
> $ kubectl create -f role.yaml
> $ kubectl create -f rolebind.yaml
> 3. Create a grid service:
> $ kubectl create -f service.yaml
> 4. Deploy Ignite Cluster:
> $ kubectl create -f grid.yaml
> ## Enable Ignite Persistence
> 5. Deploy the Ignite StatefulSet with Persistence enabled (instead of step 4).
> $ kubectl create -f grid-pds.yaml
> 6. Connect to the Ignite node and activate cluster:
> $ kubectl exec -it ignite-cluster-0 -- /bin/bash
> $ cd /opt/ignite/apache-ignite-*
> $ ./bin/control.sh --activate
> ## Deploy Web Console:
> 7. Create a volume to keep web console data:
> $ kubectl create -f console-volume.yaml
> 8. Create a load balancer to expose the HTTP port and make the web console available by 
> its service DNS name (web-console.default.svc.cluster.local) inside the Kubernetes 
> environment:
> $ kubectl create -f console-service.yaml
> 9. Deploy Web Console:
> $ kubectl create -f console.yaml
> 10. Check external IP:
> $ kubectl get service web-console
> 11. Open Web Console in a web browser and Sign Up.
> 12. Move to User Profile page (Settings > Profile) and copy security token.
> 13. Insert security token into web-agent.yaml (TOKENS environment variable).
> 14. Deploy Web Agent:
> $ kubectl create -f web-agent.yaml
> {code}
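> The attached example-kube.xml presumably wires node discovery through the Kubernetes IP 
> finder; a rough Java equivalent of that discovery configuration could look like this 
> (an illustrative sketch only - the service name and namespace below are assumptions):
> {code:java}
> TcpDiscoveryKubernetesIpFinder ipFinder = new TcpDiscoveryKubernetesIpFinder();
>
> // Must match the Kubernetes service created in step 3 (names are assumed here).
> ipFinder.setServiceName("ignite");
> ipFinder.setNamespace("default");
>
> TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();
> discoSpi.setIpFinder(ipFinder);
>
> IgniteConfiguration cfg = new IgniteConfiguration();
> cfg.setDiscoverySpi(discoSpi);
> {code}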
> YAML and configs are attached.
> Creating a public Docker image for the Web Agent is in progress: 
> https://issues.apache.org/jira/browse/IGNITE-8526



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (IGNITE-8923) Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes)

2018-08-10 Thread Denis Magda (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Magda closed IGNITE-8923.
---

> Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes)
> 
>
> Key: IGNITE-8923
> URL: https://issues.apache.org/jira/browse/IGNITE-8923
> Project: Ignite
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Roman Guseinov
>Assignee: Denis Magda
>Priority: Major
> Fix For: 2.7
>
> Attachments: config.zip, example-kube.xml, 
> google_cloud_engine_deployment.zip, yaml.zip
>
>
> We have such documentation for Microsoft Azure 
> [https://apacheignite.readme.io/docs/microsoft-azure-deployment]
> It would be great to publish the same for GCE.
> Here are the steps I used to deploy the cluster (stateless, stateful) and the web 
> console:
> {code:java}
> ## Start Ignite Cluster
> 1. Grant the cluster-admin role to the current Google user (to allow creating roles):
> $ kubectl create clusterrolebinding myname2-cluster-admin-binding \
> --clusterrole=cluster-admin \
> --user=
> 2. Create service account and grant permissions:
> $ kubectl create -f sa.yaml
> $ kubectl create -f role.yaml
> $ kubectl create -f rolebind.yaml
> 3. Create a grid service:
> $ kubectl create -f service.yaml
> 4. Deploy Ignite Cluster:
> $ kubectl create -f grid.yaml
> ## Enable Ignite Persistence
> 5. Deploy the Ignite StatefulSet with Persistence enabled (instead of step 4).
> $ kubectl create -f grid-pds.yaml
> 6. Connect to the Ignite node and activate cluster:
> $ kubectl exec -it ignite-cluster-0 -- /bin/bash
> $ cd /opt/ignite/apache-ignite-*
> $ ./bin/control.sh --activate
> ## Deploy Web Console:
> 7. Create a volume to keep web console data:
> $ kubectl create -f console-volume.yaml
> 8. Create a load balancer to expose the HTTP port and make the web console available by 
> its service DNS name (web-console.default.svc.cluster.local) inside the Kubernetes 
> environment:
> $ kubectl create -f console-service.yaml
> 9. Deploy Web Console:
> $ kubectl create -f console.yaml
> 10. Check external IP:
> $ kubectl get service web-console
> 11. Open Web Console in a web browser and Sign Up.
> 12. Move to User Profile page (Settings > Profile) and copy security token.
> 13. Insert security token into web-agent.yaml (TOKENS environment variable).
> 14. Deploy Web Agent:
> $ kubectl create -f web-agent.yaml
> {code}
> YAML and configs are attached.
> Creating a public Docker image for the Web Agent is in progress: 
> https://issues.apache.org/jira/browse/IGNITE-8526



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8923) Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes)

2018-08-10 Thread Denis Magda (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576753#comment-16576753
 ] 

Denis Magda commented on IGNITE-8923:
-

Looks good, thanks.

> Add step-by-step guide - Google Cloud Engine Deployment (Kubernetes)
> 
>
> Key: IGNITE-8923
> URL: https://issues.apache.org/jira/browse/IGNITE-8923
> Project: Ignite
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Roman Guseinov
>Assignee: Denis Magda
>Priority: Major
> Fix For: 2.7
>
> Attachments: config.zip, example-kube.xml, 
> google_cloud_engine_deployment.zip, yaml.zip
>
>
> We have such documentation for Microsoft Azure 
> [https://apacheignite.readme.io/docs/microsoft-azure-deployment]
> It would be great to publish the same for GCE.
> Here are the steps I used to deploy the cluster (stateless, stateful) and the web 
> console:
> {code:java}
> ## Start Ignite Cluster
> 1. Grant the cluster-admin role to the current Google user (to allow creating roles):
> $ kubectl create clusterrolebinding myname2-cluster-admin-binding \
> --clusterrole=cluster-admin \
> --user=
> 2. Create service account and grant permissions:
> $ kubectl create -f sa.yaml
> $ kubectl create -f role.yaml
> $ kubectl create -f rolebind.yaml
> 3. Create a grid service:
> $ kubectl create -f service.yaml
> 4. Deploy Ignite Cluster:
> $ kubectl create -f grid.yaml
> ## Enable Ignite Persistence
> 5. Deploy the Ignite StatefulSet with Persistence enabled (instead of step 4).
> $ kubectl create -f grid-pds.yaml
> 6. Connect to the Ignite node and activate cluster:
> $ kubectl exec -it ignite-cluster-0 -- /bin/bash
> $ cd /opt/ignite/apache-ignite-*
> $ ./bin/control.sh --activate
> ## Deploy Web Console:
> 7. Create a volume to keep web console data:
> $ kubectl create -f console-volume.yaml
> 8. Create a load balancer to expose the HTTP port and make the web console available by 
> its service DNS name (web-console.default.svc.cluster.local) inside the Kubernetes 
> environment:
> $ kubectl create -f console-service.yaml
> 9. Deploy Web Console:
> $ kubectl create -f console.yaml
> 10. Check external IP:
> $ kubectl get service web-console
> 11. Open Web Console in a web browser and Sign Up.
> 12. Move to User Profile page (Settings > Profile) and copy security token.
> 13. Insert security token into web-agent.yaml (TOKENS environment variable).
> 14. Deploy Web Agent:
> $ kubectl create -f web-agent.yaml
> {code}
> YAML and configs are attached.
> Creating a public Docker image for the Web Agent is in progress: 
> https://issues.apache.org/jira/browse/IGNITE-8526



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8994) Configuring dedicated volumes for WAL and data with Kuberenetes

2018-08-10 Thread Denis Magda (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576750#comment-16576750
 ] 

Denis Magda commented on IGNITE-8994:
-

[~abchaudhri], please do the following:
* Review and edit all the pages and sub-pages of the Ignite Kubernetes docs. This is 
the root: https://apacheignite.readme.io/v2.6/docs/kubernetes-deployment
* Make sure you can reproduce the steps documented for GCE: 
https://apacheignite.readme.io/v2.6/docs/google-cloud-deployment (we're 
interested only in the StatefulSet deployment where the WAL and database files are 
stored separately; please rework that section of the GCE page if needed; you don't 
need to reproduce the stateless deployment)
* Create a guideline for Amazon AWS similar to the ones we have for Microsoft 
Azure and GKE. 

> Configuring dedicated volumes for WAL and data with Kuberenetes
> ---
>
> Key: IGNITE-8994
> URL: https://issues.apache.org/jira/browse/IGNITE-8994
> Project: Ignite
>  Issue Type: Task
>  Components: documentation
>Reporter: Denis Magda
>Assignee: Akmal Chaudhri
>Priority: Major
> Fix For: 2.7
>
> Attachments: yaml.zip
>
>
> The current StatefulSet documentation requests only one persistent volume for 
> both the WAL and data/index files:
> https://apacheignite.readme.io/docs/stateful-deployment#section-statefulset-deployment
> However, according to the Ignite performance guide, the WAL has to be located on a 
> dedicated volume:
> https://apacheignite.readme.io/docs/durable-memory-tuning#section-separate-disk-device-for-wal
> Provide a StatefulSet configuration that shows how to request separate volumes 
> for the WAL and data/index files. If needed, provide YAML configs for the 
> StorageClass and volume claims.
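> A minimal Java sketch of the node-side storage configuration that such a StatefulSet would 
> pair with, assuming the data and WAL volumes are mounted at /data and /wal inside the pod 
> (paths are illustrative, not taken from the attached configs):
> {code:java}
> DataStorageConfiguration storageCfg = new DataStorageConfiguration();
>
> // Data/index files go to the volume mounted at /data.
> storageCfg.setStoragePath("/data");
>
> // WAL and WAL archive go to a dedicated volume mounted at /wal.
> storageCfg.setWalPath("/wal/wal");
> storageCfg.setWalArchivePath("/wal/archive");
>
> // Enable native persistence for the default data region.
> DataRegionConfiguration regionCfg = new DataRegionConfiguration();
> regionCfg.setPersistenceEnabled(true);
> storageCfg.setDefaultDataRegionConfiguration(regionCfg);
>
> IgniteConfiguration cfg = new IgniteConfiguration();
> cfg.setDataStorageConfiguration(storageCfg);
> {code}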



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-8994) Configuring dedicated volumes for WAL and data with Kuberenetes

2018-08-10 Thread Denis Magda (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Magda reassigned IGNITE-8994:
---

Assignee: Akmal Chaudhri  (was: Denis Magda)

> Configuring dedicated volumes for WAL and data with Kuberenetes
> ---
>
> Key: IGNITE-8994
> URL: https://issues.apache.org/jira/browse/IGNITE-8994
> Project: Ignite
>  Issue Type: Task
>  Components: documentation
>Reporter: Denis Magda
>Assignee: Akmal Chaudhri
>Priority: Major
> Fix For: 2.7
>
> Attachments: yaml.zip
>
>
> The current StatefulSet documentation requests only one persistent volume for 
> both the WAL and data/index files:
> https://apacheignite.readme.io/docs/stateful-deployment#section-statefulset-deployment
> However, according to the Ignite performance guide, the WAL has to be located on a 
> dedicated volume:
> https://apacheignite.readme.io/docs/durable-memory-tuning#section-separate-disk-device-for-wal
> Provide a StatefulSet configuration that shows how to request separate volumes 
> for the WAL and data/index files. If needed, provide YAML configs for the 
> StorageClass and volume claims.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster

2018-08-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576488#comment-16576488
 ] 

ASF GitHub Bot commented on IGNITE-9178:


Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/4495


> Partition lost event are not triggered if multiple nodes left cluster
> -
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Pavel Vinokurov
>Assignee: Pavel Vinokurov
>Priority: Blocker
> Fix For: 2.7
>
>
> If multiple nodes leave the cluster simultaneously, their partitions are removed 
> from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part 
> in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost 
> partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9231) onMarkDirty improvement throttle implementation.

2018-08-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576487#comment-16576487
 ] 

ASF GitHub Bot commented on IGNITE-9231:


Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/4506


> onMarkDirty improvement throttle implementation.
> 
>
> Key: IGNITE-9231
> URL: https://issues.apache.org/jira/browse/IGNITE-9231
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 2.6
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7
>
>
> The PagesWriteThrottle#onMarkDirty implementation parks threads if the 
> checkpointBuffer is close to overflow, but a subsequent release of checkpointBuffer 
> space has no effect on the parked threads; they can stay parked (see the sketch below).
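> An illustrative sketch of the wait/notify pattern such a fix would need (not the actual 
> PagesWriteThrottle code; all names here are assumptions):
> {code:java}
> class CheckpointBufferThrottleSketch {
>     /** Monitor used to park and wake writer threads. */
>     private final Object mux = new Object();
>
>     /** Set by the checkpointer when the checkpoint buffer is close to overflow. */
>     private volatile boolean cpBufNearOverflow;
>
>     /** Called when the checkpoint buffer approaches overflow. */
>     void onCheckpointBufferNearOverflow() {
>         synchronized (mux) {
>             cpBufNearOverflow = true;
>         }
>     }
>
>     /** Called on page markDirty: wait while the checkpoint buffer is close to overflow. */
>     void onMarkDirty() throws InterruptedException {
>         synchronized (mux) {
>             while (cpBufNearOverflow)
>                 mux.wait();
>         }
>     }
>
>     /** Called when checkpoint buffer space is released: wake all waiting writers. */
>     void onCheckpointBufferRelease() {
>         synchronized (mux) {
>             cpBufNearOverflow = false;
>
>             mux.notifyAll();
>         }
>     }
> }
> {code}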



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576481#comment-16576481
 ] 

Alexey Goncharuk commented on IGNITE-9178:
--

Got it now. Thanks, merged to master!

> Partition lost event are not triggered if multiple nodes left cluster
> -
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Pavel Vinokurov
>Assignee: Pavel Vinokurov
>Priority: Blocker
> Fix For: 2.7
>
>
> If multiple nodes leave the cluster simultaneously, their partitions are removed 
> from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part 
> in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost 
> partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8673) Reconcile isClient* methods

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576468#comment-16576468
 ] 

Alexey Goncharuk commented on IGNITE-8673:
--

Merged master and re-triggered TC build: 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&branch_IgniteTests24Java8=pull%2F4104%2Fhead

> Reconcile isClient* methods
> ---
>
> Key: IGNITE-8673
> URL: https://issues.apache.org/jira/browse/IGNITE-8673
> Project: Ignite
>  Issue Type: Bug
>Reporter: Eduard Shangareev
>Assignee: Eduard Shangareev
>Priority: Critical
> Fix For: 2.7
>
>
> Currently the semantics of the isClient* methods (Mode, Cache and so on) can mean different 
> things:
> - the same as IgniteConfiguration#setClientMode;
> - or the way a node is connected to the cluster (in the ring or not).
> In almost all cases we need the first meaning, but the methods may actually 
> return the second.
> For example, ClusterNode.isClient means the second, but all of us use it as the first.
> So, I propose to make all such methods return the first meaning.
> And if there are places that require the second, replace them with the usage of 
> forceClientMode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-7339) RENTING partition is not evicted after restore from storage

2018-08-10 Thread Pavel Kovalenko (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576461#comment-16576461
 ] 

Pavel Kovalenko commented on IGNITE-7339:
-

[~ascherbakov] Looks good to me. Ready to merge.

> RENTING partition is not evicted after restore from storage
> ---
>
> Key: IGNITE-7339
> URL: https://issues.apache.org/jira/browse/IGNITE-7339
> Project: Ignite
>  Issue Type: Bug
>Reporter: Semen Boikov
>Assignee: Alexei Scherbakov
>Priority: Critical
>
> If a partition was in the RENTING state at the moment the node was stopped, then 
> after restart it is not evicted.
> It seems to be an issue in GridDhtLocalPartition.rent: 'tryEvictAsync' is not 
> called if the partition was already in the RENTING state.
> There is also an error in GridDhtPartitionTopologyImpl.checkEvictions: the partition 
> state is always treated as changed after the part.rent call, even if part.rent 
> does not actually change the state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Pavel Kovalenko (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576456#comment-16576456
 ] 

Pavel Kovalenko commented on IGNITE-9244:
-

[~agoncharuk] Looks good to me also. No other comments.

> Partition eviction may use all threads in sys pool, it leads to hangs send a 
> message via sys pool 
> --
>
> Key: IGNITE-9244
> URL: https://issues.apache.org/jira/browse/IGNITE-9244
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> In the current implementation, GridDhtPartitionsEvictor evicts partitions one by one.
> A GridDhtPartitionsEvictor is created for each cache group; if we try to evict 
> as many groups as the sys pool size, the group evictors will take all available 
> threads in the sys pool. This leads to hangs when sending a message via the sys pool. As a fix, 
> I suggest limiting concurrent execution in the sys pool or using another pool for 
> this purpose.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576391#comment-16576391
 ] 

Alexey Goncharuk commented on IGNITE-9244:
--

Looks good to me now. [~Jokser], do you have any other comments?

> Partition eviction may use all threads in sys pool, it leads to hangs send a 
> message via sys pool 
> --
>
> Key: IGNITE-9244
> URL: https://issues.apache.org/jira/browse/IGNITE-9244
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> In the current implementation, GridDhtPartitionsEvictor evicts partitions one by one.
> A GridDhtPartitionsEvictor is created for each cache group; if we try to evict 
> as many groups as the sys pool size, the group evictors will take all available 
> threads in the sys pool. This leads to hangs when sending a message via the sys pool. As a fix, 
> I suggest limiting concurrent execution in the sys pool or using another pool for 
> this purpose.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Dmitriy Govorukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576354#comment-16576354
 ] 

Dmitriy Govorukhin commented on IGNITE-9244:


[~agoncharuk] Thanks! All comments fixed.

> Partition eviction may use all threads in sys pool, it leads to hangs send a 
> message via sys pool 
> --
>
> Key: IGNITE-9244
> URL: https://issues.apache.org/jira/browse/IGNITE-9244
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> In the current implementation, GridDhtPartitionsEvictor evicts partitions one by one.
> A GridDhtPartitionsEvictor is created for each cache group; if we try to evict 
> as many groups as the sys pool size, the group evictors will take all available 
> threads in the sys pool. This leads to hangs when sending a message via the sys pool. As a fix, 
> I suggest limiting concurrent execution in the sys pool or using another pool for 
> this purpose.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576344#comment-16576344
 ] 

Alexey Goncharuk commented on IGNITE-9244:
--

A few comments (see the sketch below):
 * Use {{Long.compare}} instead of {{(p1, p2) -> p1.part.fullSize() > 
p2.part.fullSize() ? -1 : 1}}
 * Method {{calculateBucket(PartitionEvictionTask)}} has an unused argument - 
either a bug, or the argument should be removed
 * For safety, make sure that {{threads}} does not become {{0}} after 
{{sysPoolSize / 4}}
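
An illustrative sketch of the first and third suggestions above (PartitionEvictionTask here is a 
stand-in for the type from the PR, not the real class):
{code:java}
class EvictionSketch {
    /** Simplified stand-in for the eviction task reviewed in the PR. */
    static class PartitionEvictionTask {
        private final long size;

        PartitionEvictionTask(long size) { this.size = size; }

        long fullSize() { return size; }
    }

    /** Descending-by-size ordering via Long.compare instead of a hand-written ternary. */
    static final java.util.Comparator<PartitionEvictionTask> BY_SIZE_DESC =
        (p1, p2) -> Long.compare(p2.fullSize(), p1.fullSize());

    /** Permit count derived from the sys pool size, never allowed to drop to zero. */
    static int evictionPermits(int sysPoolSize) {
        return Math.max(1, sysPoolSize / 4);
    }
}
{code}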

> Partition eviction may use all threads in sys pool, it leads to hangs send a 
> message via sys pool 
> --
>
> Key: IGNITE-9244
> URL: https://issues.apache.org/jira/browse/IGNITE-9244
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> In the current implementation, GridDhtPartitionsEvictor evicts partitions one by one.
> A GridDhtPartitionsEvictor is created for each cache group; if we try to evict 
> as many groups as the sys pool size, the group evictors will take all available 
> threads in the sys pool. This leads to hangs when sending a message via the sys pool. As a fix, 
> I suggest limiting concurrent execution in the sys pool or using another pool for 
> this purpose.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8559) WAL rollOver can be blocked by WAL iterator reservation

2018-08-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576327#comment-16576327
 ] 

ASF GitHub Bot commented on IGNITE-8559:


GitHub user akalash opened a pull request:

https://github.com/apache/ignite/pull/4517

IGNITE-8559 Replace CacheAffinitySharedManager.CachesInfo by ClusterCachesInfo

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-9250

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/4517.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4517


commit ea4c1be40d5f621043994a04a0c1bd1f6e47914e
Author: Anton Kalashnikov 
Date:   2018-08-10T14:16:08Z

IGNITE-8559 Replace CacheAffinitySharedManager.CachesInfo by 
ClusterCachesInfo




> WAL rollOver can be blocked by WAL iterator reservation
> ---
>
> Key: IGNITE-8559
> URL: https://issues.apache.org/jira/browse/IGNITE-8559
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Assignee: Anton Kalashnikov
>Priority: Critical
> Fix For: 2.7
>
>
> I've got the following thread dump from one of the Ignite nodes (only 
> meaningful threads are kept for simplicity):
> - The WAL archiver is waiting for the locked segment to be released
> - The TX commit is waiting for a WAL rollover
> - The WAL rollover is blocked by the archiver
> - The exchange is blocked by the TX commit
> {code}
> "sys-stripe-55-#56%GRID%GridNodeName%" #246 daemon prio=5 os_prio=0 
> tid=0x7fdd1eeff000 nid=0x164252 waiting on condition [0x7fdb36eec000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x7fe0a5e96278> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitUninterruptibly(AbstractQueuedSynchronizer.java:1976)
>   at 
> org.apache.ignite.internal.util.IgniteUtils.awaitQuiet(IgniteUtils.java:7400)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileWriteHandle.awaitNext(FileWriteAheadLogManager.java:2819)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileWriteHandle.access$2900(FileWriteAheadLogManager.java:2390)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.rollOver(FileWriteAheadLogManager.java:1065)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:715)
>   at 
> org.gridgain.grid.internal.processors.cache.database.snapshot.GridCacheSnapshotManager.onChangeTrackerPage(GridCacheSnapshotManager.java:2436)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$9.applyx(GridCacheDatabaseSharedManager.java:942)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$9.applyx(GridCacheDatabaseSharedManager.java:935)
>   at 
> org.apache.ignite.internal.util.lang.GridInClosure3X.apply(GridInClosure3X.java:34)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlockPage(PageMemoryImpl.java:1341)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:415)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:409)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writeUnlock(PageHandler.java:377)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:287)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:282)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.insertDataRow(AbstractFreeList.java:509)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.RowStore.addRow(RowStore.java:102)
>   at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.createRow(IgniteCacheOffheapManagerImpl.java:1252)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.createRow(GridCacheOffheapManager.java:1370)
>   at 
> org.apache.ignite.internal.

[jira] [Created] (IGNITE-9250) Replace CacheAffinitySharedManager.CachesInfo by ClusterCachesInfo

2018-08-10 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9250:
-

 Summary: Replace CacheAffinitySharedManager.CachesInfo by 
ClusterCachesInfo
 Key: IGNITE-9250
 URL: https://issues.apache.org/jira/browse/IGNITE-9250
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Now we have a duplicate of the registered caches (and groups). They are held in 
ClusterCachesInfo - the main storage - and also in 
CacheAffinitySharedManager.CachesInfo. This looks redundant and can lead 
to inconsistency of the caches info.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-7384) MVCC Mvcc versions may be lost in case of rebalanse with persistence enabled

2018-08-10 Thread Roman Kondakov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Kondakov reassigned IGNITE-7384:
--

Assignee: Roman Kondakov

> MVCC Mvcc versions may be lost in case of rebalanse with persistence enabled
> 
>
> Key: IGNITE-7384
> URL: https://issues.apache.org/jira/browse/IGNITE-7384
> Project: Ignite
>  Issue Type: Bug
>Reporter: Igor Seliverstov
>Assignee: Roman Kondakov
>Priority: Major
>
> In case a node returns to the topology, it requests a delta instead of the full 
> partition; a WAL-based iterator is used there 
> ({{o.a.i.i.processors.cache.persistence.GridCacheOffheapManager#rebalanceIterator}}).
> The WAL-based iterator doesn't contain MVCC versions, which causes issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster

2018-08-10 Thread Pavel Vinokurov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576302#comment-16576302
 ] 

Pavel Vinokurov edited comment on IGNITE-9178 at 8/10/18 1:45 PM:
--

[~agoncharuk]
_leftNode2Part_ contains partitions of the left nodes. The partition lost event is 
raised if _leftNode2Part_ contains nodes missing in _node2Part_.
The _node2Part_ map is cleaned up in two places: the 
_GridDhtPartitionTopologyImpl#update_ and 
_GridDhtPartitionTopologyImpl#removeNode_ methods.
The current patch fixes the following situation. Two nodes have left the cluster 
simultaneously. During the exchange for the first left node, the coordinator sends 
the full map to the other nodes.
_GridDhtPartitionTopologyImpl#update_ handles the full map and removes the partitions 
of the second left node without adding them to _leftNode2Part_.
On the next exchange, for the second node, node2part no longer has partitions 
for the second node, so its partitions are not added to _leftNode2Part_ in 
_removeNode()_.
This patch does not affect the _diffFromAffinity_ map in any way.
There is another possible patch - clean up the node2Part map only in the 
detectLostPartitions() method for all left nodes. But I am not sure that it 
doesn't break the logic related to the _diffFromAffinity_ map in the 
_GridDhtPartitionTopologyImpl#update_ method. Please let me know if that patch 
would be more appropriate.



was (Author: pvinokurov):
[~agoncharuk]
_leftNode2Part_ contains partitions of the left nodes. The partition lost event is 
raised if _leftNode2Part_ contains nodes missing in _node2Part_.
The _node2Part_ map is cleaned up in two places: the 
_GridDhtPartitionTopologyImpl#update_ and 
_GridDhtPartitionTopologyImpl#removeNode_ methods.
The current patch fixes the following situation. Two nodes have left the cluster 
simultaneously. During the exchange for the first left node, the coordinator sends 
the full map to the other nodes.
_GridDhtPartitionTopologyImpl#update_ handles the full map and removes the partitions 
of the second left node without adding them to _leftNode2Part_.
On the next exchange, for the second node, node2part no longer has partitions 
for the second node, so its partitions are not added to _leftNode2Part_ in 
_removeNode()_.
This patch does not affect the _diffFromAffinity_ map in any way.
There is another possible patch - clean up the node2Part map only in the 
detectLostPartitions() method for all left nodes. But I am not sure that it 
doesn't break the logic related to the _diffFromAffinity_ map in the 
_GridDhtPartitionTopologyImpl#update_ method. Please let me know if that patch 
would be more appropriate.


> Partition lost event are not triggered if multiple nodes left cluster
> -
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Pavel Vinokurov
>Assignee: Pavel Vinokurov
>Priority: Blocker
> Fix For: 2.7
>
>
> If multiple nodes leave the cluster simultaneously, their partitions are removed 
> from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part 
> in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost 
> partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster

2018-08-10 Thread Pavel Vinokurov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576302#comment-16576302
 ] 

Pavel Vinokurov commented on IGNITE-9178:
-

[~agoncharuk]
_leftNode2Part_ contains partitions of the left nodes. The partition lost event is 
raised if _leftNode2Part_ contains nodes missing in _node2Part_.
The _node2Part_ map is cleaned up in two places: the 
_GridDhtPartitionTopologyImpl#update_ and 
_GridDhtPartitionTopologyImpl#removeNode_ methods.
The current patch fixes the following situation. Two nodes have left the cluster 
simultaneously. During the exchange for the first left node, the coordinator sends 
the full map to the other nodes.
_GridDhtPartitionTopologyImpl#update_ handles the full map and removes the partitions 
of the second left node without adding them to _leftNode2Part_.
On the next exchange, for the second node, node2part no longer has partitions 
for the second node, so its partitions are not added to _leftNode2Part_ in 
_removeNode()_.
This patch does not affect the _diffFromAffinity_ map in any way.
There is another possible patch - clean up the node2Part map only in the 
detectLostPartitions() method for all left nodes. But I am not sure that it 
doesn't break the logic related to the _diffFromAffinity_ map in the 
_GridDhtPartitionTopologyImpl#update_ method. Please let me know if that patch 
would be more appropriate.
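
A simplified model of the detection rule described above (illustrative only, not the actual 
GridDhtPartitionTopologyImpl code):
{code:java}
/** Partitions of nodes recorded in leftNode2Part but no longer present in node2Part. */
static Set<Integer> lostCandidates(Map<UUID, Set<Integer>> leftNode2Part,
    Map<UUID, Set<Integer>> node2Part) {
    Set<Integer> lost = new HashSet<>();

    for (Map.Entry<UUID, Set<Integer>> e : leftNode2Part.entrySet()) {
        if (!node2Part.containsKey(e.getKey()))
            lost.addAll(e.getValue());
    }

    return lost;
}
{code}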


> Partition lost event are not triggered if multiple nodes left cluster
> -
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Pavel Vinokurov
>Assignee: Pavel Vinokurov
>Priority: Blocker
> Fix For: 2.7
>
>
> If multiple nodes leave the cluster simultaneously, their partitions are removed 
> from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part 
> in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost 
> partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster

2018-08-10 Thread Pavel Vinokurov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576302#comment-16576302
 ] 

Pavel Vinokurov edited comment on IGNITE-9178 at 8/10/18 1:44 PM:
--

[~agoncharuk]
_leftNode2Part_ contains partitions of the left nodes. The partition lost event is 
raised if _leftNode2Part_ contains nodes missing in _node2Part_.
The _node2Part_ map is cleaned up in two places: the 
_GridDhtPartitionTopologyImpl#update_ and 
_GridDhtPartitionTopologyImpl#removeNode_ methods.
The current patch fixes the following situation. Two nodes have left the cluster 
simultaneously. During the exchange for the first left node, the coordinator sends 
the full map to the other nodes.
_GridDhtPartitionTopologyImpl#update_ handles the full map and removes the partitions 
of the second left node without adding them to _leftNode2Part_.
On the next exchange, for the second node, node2part no longer has partitions 
for the second node, so its partitions are not added to _leftNode2Part_ in 
_removeNode()_.
This patch does not affect the _diffFromAffinity_ map in any way.
There is another possible patch - clean up the node2Part map only in the 
detectLostPartitions() method for all left nodes. But I am not sure that it 
doesn't break the logic related to the _diffFromAffinity_ map in the 
_GridDhtPartitionTopologyImpl#update_ method. Please let me know if that patch 
would be more appropriate.



was (Author: pvinokurov):
[~agoncharuk]
_leftNode2Part_ contains partitions of the left nodes. The partition lost event is 
raised if _leftNode2Part_ contains nodes missing in _node2Part_.
The _node2Part_ map is cleaned up in two places: the 
_GridDhtPartitionTopologyImpl#update_ and 
_GridDhtPartitionTopologyImpl#removeNode_ methods.
The current patch fixes the following situation. Two nodes have left the cluster 
simultaneously. During the exchange for the first left node, the coordinator sends 
the full map to the other nodes.
_GridDhtPartitionTopologyImpl#update_ handles the full map and removes the partitions 
of the second left node without adding them to _leftNode2Part_.
On the next exchange, for the second node, node2part no longer has partitions 
for the second node, so its partitions are not added to _leftNode2Part_ in 
_removeNode()_.
This patch does not affect the _diffFromAffinity_ map in any way.
There is another possible patch - clean up the node2Part map only in the 
detectLostPartitions() method for all left nodes. But I am not sure that it 
doesn't break the logic related to the _diffFromAffinity_ map in the 
_GridDhtPartitionTopologyImpl#update_ method. Please let me know if that patch 
would be more appropriate.


> Partition lost event are not triggered if multiple nodes left cluster
> -
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Pavel Vinokurov
>Assignee: Pavel Vinokurov
>Priority: Blocker
> Fix For: 2.7
>
>
> If multiple nodes leave the cluster simultaneously, their partitions are removed 
> from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part 
> in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost 
> partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9053) testReentrantLockConstantTopologyChangeNonFailoverSafe can hang in case of broken tx

2018-08-10 Thread Anton Vinogradov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-9053:
-
Priority: Critical  (was: Major)

> testReentrantLockConstantTopologyChangeNonFailoverSafe can hang in case of 
> broken tx
> 
>
> Key: IGNITE-9053
> URL: https://issues.apache.org/jira/browse/IGNITE-9053
> Project: Ignite
>  Issue Type: Bug
>  Components: data structures
>Affects Versions: 2.5
>Reporter: Anton Vinogradov
>Assignee: Anton Vinogradov
>Priority: Critical
>  Labels: MakeTeamcityGreenAgain
> Fix For: 2.7
>
>
> -GridCachePartitionedDataStructuresFailoverSelfTest#testReentrantLockConstantTopologyChangeNonFailoverSafe
> -GridCachePartitionedDataStructuresFailoverSelfTest#testCountDownLatchConstantTopologyChange
>  
> can hang in case of broken tx
> {noformat}
>  Pending transactions:
> [2018-07-15 14:13:41,210][WARN 
> ][exchange-worker-#1596354%partitioned.GridCachePartitionedDataStructuresFailoverSelfTest1%][diagnostic]
>  >>> [txVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], exchWait=true, 
> tx=GridDhtTxLocal [nearNodeId=1392b1bd-c807-4479-9bfe-fc9f7050, 
> nearFutId=14ffca0a461-999e75d0-a333-4bd6-a2a2-7f143d0af773, nearMiniId=1, 
> nearFinFutId=null, nearFinMiniId=0, nearXidVer=GridCacheVersion 
> [topVer=143133203, order=1531653200153, nodeOrder=1], 
> super=GridDhtTxLocalAdapter [nearOnOriginatingNode=false, nearNodes=[], 
> dhtNodes=[], explicitLock=false, super=IgniteTxLocalAdapter 
> [completedBase=null, sndTransformedVals=false, depEnabled=false, 
> txState=IgniteTxStateImpl [activeCacheIds=[1968300681], recovery=false, 
> txMap=[IgniteTxEntry [key=KeyCacheObjectImpl [part=494, 
> val=GridCacheInternalKeyImpl [name=structure, 
> grpName=default-volatile-ds-group], hasValBytes=true], cacheId=1968300681, 
> txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=494, 
> val=GridCacheInternalKeyImpl [name=structure, 
> grpName=default-volatile-ds-group], hasValBytes=true], cacheId=1968300681], 
> val=[op=NOOP, val=null], prevVal=[op=NOOP, val=null], oldVal=[op=NOOP, 
> val=null], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, 
> conflictVer=null, explicitVer=null, dhtVer=null, filters=[], 
> filtersPassed=false, filtersSet=false, entry=GridDhtCacheEntry [rdrs=[], 
> part=494, super=GridDistributedCacheEntry [super=GridCacheMapEntry 
> [key=KeyCacheObjectImpl [part=494, val=GridCacheInternalKeyImpl 
> [name=structure, grpName=default-volatile-ds-group], hasValBytes=true], 
> val=CacheObjectImpl [val=null, hasValBytes=true], ver=GridCacheVersion 
> [topVer=143133201, order=1531653200154, nodeOrder=2], hash=2095426867, 
> extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc 
> [locs=[GridCacheMvccCandidate [nodeId=1bf28b00-feed-412b-a20b-ca9fc111, 
> ver=GridCacheVersion [topVer=143133203, order=1531653200157, nodeOrder=2], 
> threadId=1947290, id=31143709, topVer=AffinityTopologyVersion [topVer=7, 
> minorTopVer=0], reentry=null, 
> otherNodeId=1392b1bd-c807-4479-9bfe-fc9f7050, otherVer=GridCacheVersion 
> [topVer=143133203, order=1531653200153, nodeOrder=1], mappedDhtNodes=null, 
> mappedNearNodes=null, ownerVer=null, serOrder=null, key=KeyCacheObjectImpl 
> [part=494, val=GridCacheInternalKeyImpl [name=structure, 
> grpName=default-volatile-ds-group], hasValBytes=true], 
> masks=local=1|owner=1|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
>  prevVer=null, nextVer=null]], rmts=null]], flags=2]]], prepared=0, 
> locked=false, nodeId=null, locMapped=false, expiryPlc=null, 
> transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, 
> xidVer=GridCacheVersion [topVer=143133203, order=1531653200157, 
> nodeOrder=2, super=IgniteTxAdapter [xidVer=GridCacheVersion 
> [topVer=143133203, order=1531653200157, nodeOrder=2], writeVer=null, 
> implicit=false, loc=true, threadId=1947290, startTime=1531653200578, 
> nodeId=1bf28b00-feed-412b-a20b-ca9fc111, startVer=GridCacheVersion 
> [topVer=143133203, order=1531653200157, nodeOrder=2], endVer=null, 
> isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, 
> sysInvalidate=false, sys=true, plc=2, commitVer=null, finalizing=NONE, 
> invalidParts=null, state=ACTIVE, timedOut=false, 
> topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], duration=20632ms, 
> onePhaseCommit=false], size=1
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (IGNITE-9053) testReentrantLockConstantTopologyChangeNonFailoverSafe can hang in case of broken tx

2018-08-10 Thread Anton Vinogradov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576299#comment-16576299
 ] 

Anton Vinogradov edited comment on IGNITE-9053 at 8/10/18 1:42 PM:
---

Looks like we have a deadlock here.

The first thread waits for an ack:

{noformat}
"sys-stripe-4-#226264%partitioned.GridCachePartitionedDataStructuresFailoverSelfTest1%"
 #250984 prio=5 os_prio=0 tid=0x7f273c018000 nid=0x2de6 waiting on 
condition [0x7f274aeee000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
at 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.addNotification(GridContinuousProcessor.java:1168)
at 
org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.onEntryUpdate(CacheContinuousQueryHandler.java:890)
at 
org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.access$600(CacheContinuousQueryHandler.java:85)
at 
org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler$2.onEntryUpdated(CacheContinuousQueryHandler.java:430)
at 
org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryManager.onEntryUpdated(CacheContinuousQueryManager.java:400)
at 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:1079)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:652)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter.localFinish(GridDhtTxLocalAdapter.java:795)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.localFinish(GridDhtTxLocal.java:583)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:464)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:505)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:942)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:821)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:777)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:99)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:191)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:189)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
at 
org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:496)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:745)
{noformat}

It sent the CQ notification, and the node received it, but failed after that.

The fut can be completed only in 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.DiscoveryListener
 on EVT_NODE_FAILED,

but EVT_NODE_FAILED can't be handled since we're trying to 
removeExplicitNodeLocks in the previous listener :(

{noformat}
"disco-event-worker-#226410%partitioned.GridCachePartit

[jira] [Commented] (IGNITE-8724) Skip logging 3-rd parameter while calling U.warn with initialized logger.

2018-08-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576298#comment-16576298
 ] 

ASF GitHub Bot commented on IGNITE-8724:


Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/4145


> Skip logging 3-rd parameter while calling U.warn with initialized logger.
> -
>
> Key: IGNITE-8724
> URL: https://issues.apache.org/jira/browse/IGNITE-8724
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 2.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7
>
> Attachments: tc.png
>
>
> There are a lot of places where an exception needs to be logged, for example:
> {code:java}
> U.warn(log,"Unable to await partitions release future", e);
> {code}
> but the current U.warn implementation silently swallows it:
> {code:java}
> public static void warn(@Nullable IgniteLogger log, Object longMsg, Object shortMsg) {
>     assert longMsg != null;
>     assert shortMsg != null;
>
>     if (log != null)
>         log.warning(compact(longMsg.toString()));
>     else
>         X.println("[" + SHORT_DATE_FMT.format(new java.util.Date()) + "] (wrn) " +
>             compact(shortMsg.toString()));
> }
> {code}
> The fix looks like simply adding an overload:
> {code:java}
> public static void warn(@Nullable IgniteLogger log, Object longMsg, Throwable ex) {
> {code}
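> A minimal sketch of how the overload could be completed, reusing the helpers from the 
> existing method above (illustrative, not the committed fix):
> {code:java}
> public static void warn(@Nullable IgniteLogger log, Object longMsg, Throwable ex) {
>     assert longMsg != null;
>
>     if (log != null)
>         log.warning(compact(longMsg.toString()), ex);
>     else {
>         X.println("[" + SHORT_DATE_FMT.format(new java.util.Date()) + "] (wrn) " +
>             compact(longMsg.toString()));
>
>         ex.printStackTrace();
>     }
> }
> {code}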



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9053) testReentrantLockConstantTopologyChangeNonFailoverSafe can hang in case of broken tx

2018-08-10 Thread Anton Vinogradov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576299#comment-16576299
 ] 

Anton Vinogradov commented on IGNITE-9053:
--

Looks like we have a deadlock here.

The first thread waits for an ack:

{noformat}
"sys-stripe-4-#226264%partitioned.GridCachePartitionedDataStructuresFailoverSelfTest1%"
 #250984 prio=5 os_prio=0 tid=0x7f273c018000 nid=0x2de6 waiting on 
condition [0x7f274aeee000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
at 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.addNotification(GridContinuousProcessor.java:1168)
at 
org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.onEntryUpdate(CacheContinuousQueryHandler.java:890)
at 
org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.access$600(CacheContinuousQueryHandler.java:85)
at 
org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler$2.onEntryUpdated(CacheContinuousQueryHandler.java:430)
at 
org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryManager.onEntryUpdated(CacheContinuousQueryManager.java:400)
at 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:1079)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:652)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter.localFinish(GridDhtTxLocalAdapter.java:795)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.localFinish(GridDhtTxLocal.java:583)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:464)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:505)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:942)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:821)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:777)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:99)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:191)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:189)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
at 
org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:496)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:745)
{noformat}

It sent the event, and the node received it, but the node failed after that.

The fut can be completed only by 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.DiscoveryListener
 on EVT_NODE_FAILED,

but EVT_NODE_FAILED can't be handled because we're still trying to 
removeExplicitNodeLocks in the previous listener :( 

{noformat}
"disco-event-worker-#226410%partitioned.GridCachePartitionedDataStructuresFailoverSelfTest1%"
 #251148 prio=5 os_p
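
To make the ordering problem easier to see, here is a stand-alone sketch (plain JDK classes, not Ignite code) of the pattern described above: a single-threaded event worker is occupied by an earlier listener, so the handler that would complete the awaited future is queued behind it and never runs, while another thread blocks on that future. The names (discoWorker, stripe, ackFut) are illustrative only.

{code:java}
import java.util.concurrent.*;

public class ListenerOrderingDeadlockDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService discoWorker = Executors.newSingleThreadExecutor(); // disco-event-worker analogue
        ExecutorService stripe = Executors.newSingleThreadExecutor();      // sys-stripe thread analogue

        CompletableFuture<Void> ackFut = new CompletableFuture<>();
        CountDownLatch prevListenerDone = new CountDownLatch(1);

        // Previous listener (removeExplicitNodeLocks analogue) occupies the only event-worker thread.
        // In the real scenario it is stuck because it in turn depends on the blocked stripe thread.
        discoWorker.submit(() -> {
            try {
                prevListenerDone.await();
            }
            catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
        });

        // EVT_NODE_FAILED handler analogue: the only code that completes ackFut,
        // but it is queued behind the previous listener and cannot start.
        discoWorker.submit(() -> ackFut.complete(null));

        // Commit path analogue (addNotification): waits for the ack that never arrives.
        Future<?> commit = stripe.submit(() -> {
            try {
                ackFut.get(2, TimeUnit.SECONDS);
                System.out.println("ack received");
            }
            catch (TimeoutException e) {
                System.out.println("stuck: EVT_NODE_FAILED handler never got a chance to run");
            }
            catch (Exception e) {
                throw new RuntimeException(e);
            }
        });

        commit.get();
        prevListenerDone.countDown(); // unblock only so the demo can shut down cleanly

        discoWorker.shutdown();
        stripe.shutdown();
    }
}
{code}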

[jira] [Commented] (IGNITE-8724) Skip logging 3-rd parameter while calling U.warn with initialized logger.

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576295#comment-16576295
 ] 

Alexey Goncharuk commented on IGNITE-8724:
--

Thanks, merged to master.

> Skip logging 3-rd parameter while calling U.warn with initialized logger.
> -
>
> Key: IGNITE-8724
> URL: https://issues.apache.org/jira/browse/IGNITE-8724
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 2.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7
>
> Attachments: tc.png
>
>
> There are a lot of places where an exception needs to be logged, for example:
> {code:java}
> U.warn(log, "Unable to await partitions release future", e);
> {code}
> but the current U.warn implementation silently swallows it:
> {code:java}
> public static void warn(@Nullable IgniteLogger log, Object longMsg, Object shortMsg) {
>     assert longMsg != null;
>     assert shortMsg != null;
>     if (log != null)
>         log.warning(compact(longMsg.toString()));
>     else
>         X.println("[" + SHORT_DATE_FMT.format(new java.util.Date()) + "] (wrn) " +
>             compact(shortMsg.toString()));
> }
> {code}
> The fix looks like a simple addition of an overload:
> {code:java}
> public static void warn(@Nullable IgniteLogger log, Object longMsg, Throwable ex) {
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9249) Tests hang when different threads try to start and stop nodes at the same time.

2018-08-10 Thread Ilya Lantukh (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576286#comment-16576286
 ] 

Ilya Lantukh commented on IGNITE-9249:
--

https://ci.ignite.apache.org/viewQueued.html?itemId=1628472

> Tests hang when different threads try to start and stop nodes at the same 
> time.
> ---
>
> Key: IGNITE-9249
> URL: https://issues.apache.org/jira/browse/IGNITE-9249
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Lantukh
>Assignee: Ilya Lantukh
>Priority: Major
>
> An example of such test is 
> GridCachePartitionedNearDisabledOptimisticTxNodeRestartTest.testRestartWithPutFourNodesOneBackupsOffheapEvict().
> Hanged threads:
> {code}
> "restart-worker-1@63424" prio=5 tid=0x7f5e nid=NA waiting
>   java.lang.Thread.State: WAITING
> at java.lang.Object.wait(Object.java:-1)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:949)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:389)
> at 
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2002)
> at 
> org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
> at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:916)
> at 
> org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1754)
> at 
> org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1050)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725)
> - locked <0xfc36> (a 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
> at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:651)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:920)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:858)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:846)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:812)
> at 
> org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$1000(GridCacheAbstractNodeRestartSelfTest.java:64)
> at 
> org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:665)
> at java.lang.Thread.run(Thread.java:748)
> "restart-worker-0@63423" prio=5 tid=0x7f5d nid=NA waiting
>   java.lang.Thread.State: WAITING
> at sun.misc.Unsafe.park(Unsafe.java:-1)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at 
> org.apache.ignite.internal.util.IgniteUtils.awaitQuiet(IgniteUtils.java:7584)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.grid(IgnitionEx.java:1666)
> at 
> org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1284)
> at 
> org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1262)
> at org.apache.ignite.Ignition.allGrids(Ignition.java:502)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.awaitTopologyChange(GridAbstractTest.java:2258)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1158)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1133)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1433)
> at 
> org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$800(GridCacheAbstractNodeRestartSelfTest.java:64)
> at 
> org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:661)
> at java.lang.Thread.run(Threa

[jira] [Commented] (IGNITE-9249) Tests hang when different threads try to start and stop nodes at the same time.

2018-08-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576282#comment-16576282
 ] 

ASF GitHub Bot commented on IGNITE-9249:


GitHub user ilantukh opened a pull request:

https://github.com/apache/ignite/pull/4515

IGNITE-9249 : Configured node join timeout for all tests.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-9249

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/4515.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4515


commit 3be4dcc6da6649fb04f99f61c31cebb29d03c0fe
Author: Ilya Lantukh 
Date:   2018-08-10T13:23:28Z

IGNITE-9249 : Configured node join timeout for all tests.




> Tests hang when different threads try to start and stop nodes at the same 
> time.
> ---
>
> Key: IGNITE-9249
> URL: https://issues.apache.org/jira/browse/IGNITE-9249
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Lantukh
>Assignee: Ilya Lantukh
>Priority: Major
>
> An example of such test is 
> GridCachePartitionedNearDisabledOptimisticTxNodeRestartTest.testRestartWithPutFourNodesOneBackupsOffheapEvict().
> Hanged threads:
> {code}
> "restart-worker-1@63424" prio=5 tid=0x7f5e nid=NA waiting
>   java.lang.Thread.State: WAITING
> at java.lang.Object.wait(Object.java:-1)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:949)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:389)
> at 
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2002)
> at 
> org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
> at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:916)
> at 
> org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1754)
> at 
> org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1050)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725)
> - locked <0xfc36> (a 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
> at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:651)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:920)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:858)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:846)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:812)
> at 
> org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$1000(GridCacheAbstractNodeRestartSelfTest.java:64)
> at 
> org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:665)
> at java.lang.Thread.run(Thread.java:748)
> "restart-worker-0@63423" prio=5 tid=0x7f5d nid=NA waiting
>   java.lang.Thread.State: WAITING
> at sun.misc.Unsafe.park(Unsafe.java:-1)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at 
> org.apache.ignite.internal.util.IgniteUtils.awaitQuiet(IgniteUtils.java:7584)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.grid(IgnitionEx.java:1666)
> at 
> org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1284)
> at 
> org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1262)
> at org.apache.ignite.Ignition.allGrids(Ignition.java:502)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.awaitTopologyChange(GridAbstractTest.java:2258)
>

[jira] [Commented] (IGNITE-9249) Tests hang when different threads try to start and stop nodes at the same time.

2018-08-10 Thread Ilya Lantukh (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576277#comment-16576277
 ] 

Ilya Lantukh commented on IGNITE-9249:
--

As a temporary solution I suggest setting a join timeout in GridAbstractTest, so 
tests will fail instead of hanging.
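
For reference, a sketch of what that could look like in a test that overrides getConfiguration(), assuming the base test class wires up {{TcpDiscoverySpi}} (the concrete timeout value here is arbitrary):

{code:java}
/** Sketch only: bound the join attempt so a stuck discovery join fails the test instead of hanging it. */
@Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
    IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);

    TcpDiscoverySpi disco = (TcpDiscoverySpi)cfg.getDiscoverySpi();

    // Default joinTimeout is 0, i.e. wait forever - exactly what allows the hang above.
    disco.setJoinTimeout(2 * 60 * 1000);

    return cfg;
}
{code}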

> Tests hang when different threads try to start and stop nodes at the same 
> time.
> ---
>
> Key: IGNITE-9249
> URL: https://issues.apache.org/jira/browse/IGNITE-9249
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Lantukh
>Assignee: Ilya Lantukh
>Priority: Major
>
> An example of such test is 
> GridCachePartitionedNearDisabledOptimisticTxNodeRestartTest.testRestartWithPutFourNodesOneBackupsOffheapEvict().
> Hanged threads:
> {code}
> "restart-worker-1@63424" prio=5 tid=0x7f5e nid=NA waiting
>   java.lang.Thread.State: WAITING
> at java.lang.Object.wait(Object.java:-1)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:949)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:389)
> at 
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2002)
> at 
> org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
> at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:916)
> at 
> org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1754)
> at 
> org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1050)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725)
> - locked <0xfc36> (a 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
> at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:651)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:920)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:858)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:846)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:812)
> at 
> org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$1000(GridCacheAbstractNodeRestartSelfTest.java:64)
> at 
> org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:665)
> at java.lang.Thread.run(Thread.java:748)
> "restart-worker-0@63423" prio=5 tid=0x7f5d nid=NA waiting
>   java.lang.Thread.State: WAITING
> at sun.misc.Unsafe.park(Unsafe.java:-1)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at 
> org.apache.ignite.internal.util.IgniteUtils.awaitQuiet(IgniteUtils.java:7584)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.grid(IgnitionEx.java:1666)
> at 
> org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1284)
> at 
> org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1262)
> at org.apache.ignite.Ignition.allGrids(Ignition.java:502)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.awaitTopologyChange(GridAbstractTest.java:2258)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1158)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1133)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1433)
> at 
> org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$800(GridCacheAbstractNodeRestartSelfTest.java:64)
> at 
> org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestart

[jira] [Commented] (IGNITE-602) [Test] GridToStringBuilder is vulnerable for StackOverflowError caused by infinite recursion

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576275#comment-16576275
 ] 

Alexey Goncharuk commented on IGNITE-602:
-

[~SomeFire], this change breaks the {{S.toString()}} output; this is the printout I 
see in the logs:
{code}
[2018-08-10 16:14:15,671][WARN ][exchange-worker-#186%client%][diagnostic] >>> 
KeyCacheObjectImpl [part=6, val=6, hasValBytes=true]KeyCacheObjectImpl [part=6, 
val=6, hasValBytes=true], val=null, ver=GridCacheVersion [topVer=0, order=0, 
nodeOrder=0], hash=6, extras=null, flags=0]GridDistributedCacheEntry 
[super=]GridDhtDetachedCacheEntry [super=], prepared=0, locked=false, 
nodeId=834e31ba-a000-46d5-bef3-45f28531, locMapped=false, expiryPlc=null, 
transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, 
xidVer=GridCacheVersion [topVer=145386834, order=1533906834364, 
nodeOrder=4, super=, size=1]GridDhtTxLocalAdapter 
[nearOnOriginatingNode=false, nearNodes=KeySetView [], dhtNodes=KeySetView [], 
explicitLock=false, super=]GridNearTxLocal [mappings=IgniteTxMappingsImpl [], 
nearLocallyMapped=false, colocatedLocallyMapped=false, needCheckBackup=null, 
hasRemoteLocks=false, trackTimeout=false, lb=null, 
thread=async-runnable-runner-1, mappings=IgniteTxMappingsImpl [], super=], 
super=GridCompoundFuture 
[rdc=o.a.i.i.processors.cache.distributed.near.GridNearTxPrepareFutureAdapter$1@4f11fa96,
 initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, 
futs=TransformCollectionView [false]]]
{code}

Can you estimate how long the fix will take? If it takes too long, we will need 
to revert the change.

> [Test] GridToStringBuilder is vulnerable for StackOverflowError caused by 
> infinite recursion
> 
>
> Key: IGNITE-602
> URL: https://issues.apache.org/jira/browse/IGNITE-602
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Reporter: Artem Shutak
>Assignee: Ryabov Dmitrii
>Priority: Major
>  Labels: MakeTeamcityGreenAgain, Muted_test
> Fix For: 2.7
>
>
> See test 
> org.gridgain.grid.util.tostring.GridToStringBuilderSelfTest#_testToStringCheckAdvancedRecursionPrevention
>  and related TODO in same source file.
> Also take a look at 
> http://stackoverflow.com/questions/11300203/most-efficient-way-to-prevent-an-infinite-recursion-in-tostring
> Test should be unmuted on TC after fix.
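
For context, the usual way to make a toString() builder safe against cycles is a thread-local identity set of objects currently being printed. The sketch below is a generic illustration of that technique only; it is not GridToStringBuilder, which, judging by the printout above, appears to also share a string buffer across nested calls, and mishandling that shared buffer is what produces concatenated garbage like the one shown.

{code:java}
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/** Generic sketch of recursion prevention in toString(); names and formatting are illustrative. */
public class RecursionSafeToString {
    /** Objects currently being printed by this thread. */
    private static final ThreadLocal<Set<Object>> IN_PROGRESS =
        ThreadLocal.withInitial(() -> Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>()));

    public static String toString(Object obj, Object... fieldVals) {
        Set<Object> seen = IN_PROGRESS.get();

        // Cycle detected (e.g. A -> B -> A): cut it off instead of recursing forever.
        if (!seen.add(obj))
            return obj.getClass().getSimpleName() + " [recursive]";

        try {
            StringBuilder sb = new StringBuilder(obj.getClass().getSimpleName()).append(" [");

            for (int i = 0; i < fieldVals.length; i++) {
                if (i > 0)
                    sb.append(", ");

                sb.append(fieldVals[i]); // may recurse back into this method via a field's toString()
            }

            return sb.append(']').toString();
        }
        finally {
            // Always unwind, otherwise unrelated later calls would be treated as recursive.
            seen.remove(obj);
        }
    }
}
{code}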



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-9249) Tests hang when different threads try to start and stop nodes at the same time.

2018-08-10 Thread Ilya Lantukh (JIRA)
Ilya Lantukh created IGNITE-9249:


 Summary: Tests hang when different threads try to start and stop 
nodes at the same time.
 Key: IGNITE-9249
 URL: https://issues.apache.org/jira/browse/IGNITE-9249
 Project: Ignite
  Issue Type: Bug
Reporter: Ilya Lantukh
Assignee: Ilya Lantukh


An example of such test is 
GridCachePartitionedNearDisabledOptimisticTxNodeRestartTest.testRestartWithPutFourNodesOneBackupsOffheapEvict().

Hanged threads:
{code}
"restart-worker-1@63424" prio=5 tid=0x7f5e nid=NA waiting
  java.lang.Thread.State: WAITING
  at java.lang.Object.wait(Object.java:-1)
  at 
org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:949)
  at 
org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:389)
  at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2002)
  at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:916)
  at 
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1754)
  at 
org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1050)
  at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020)
  at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725)
  - locked <0xfc36> (a 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
  at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153)
  at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:651)
  at 
org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:920)
  at 
org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:858)
  at 
org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:846)
  at 
org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:812)
  at 
org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$1000(GridCacheAbstractNodeRestartSelfTest.java:64)
  at 
org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:665)
  at java.lang.Thread.run(Thread.java:748)

"restart-worker-0@63423" prio=5 tid=0x7f5d nid=NA waiting
  java.lang.Thread.State: WAITING
  at sun.misc.Unsafe.park(Unsafe.java:-1)
  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
  at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
  at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
  at 
org.apache.ignite.internal.util.IgniteUtils.awaitQuiet(IgniteUtils.java:7584)
  at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.grid(IgnitionEx.java:1666)
  at 
org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1284)
  at 
org.apache.ignite.internal.IgnitionEx.allGrids(IgnitionEx.java:1262)
  at org.apache.ignite.Ignition.allGrids(Ignition.java:502)
  at 
org.apache.ignite.testframework.junits.GridAbstractTest.awaitTopologyChange(GridAbstractTest.java:2258)
  at 
org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1158)
  at 
org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1133)
  at 
org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1433)
  at 
org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.access$800(GridCacheAbstractNodeRestartSelfTest.java:64)
  at 
org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$2.run(GridCacheAbstractNodeRestartSelfTest.java:661)
  at java.lang.Thread.run(Thread.java:748)
{code}

Full thread dump:
{code}
"test-runner-#26488%dht.GridCachePartitionedNearDisabledOptimisticTxNodeRestartTest%@63124"
 prio=5 tid=0x7e6a nid=NA waiting
  java.lang.Thread.State: WAITING
  at sun.misc.Unsafe.park(Unsafe.java:-1)
  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInte

[jira] [Commented] (IGNITE-9236) Handshake timeout never completes in some tests (GridCacheReplicatedFailoverSelfTest in particular)

2018-08-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576226#comment-16576226
 ] 

ASF GitHub Bot commented on IGNITE-9236:


Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/4499


> Handshake timeout never completes in some tests 
> (GridCacheReplicatedFailoverSelfTest in particular)
> ---
>
> Key: IGNITE-9236
> URL: https://issues.apache.org/jira/browse/IGNITE-9236
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Lantukh
>Assignee: Ilya Lantukh
>Priority: Major
>  Labels: MakeTeamcityGreenAgain
> Fix For: 2.7
>
>
> In GridCacheReplicatedFailoverSelfTest one thread tries to establish TCP 
> connection and hangs on handshake forever, holding lock on RebalanceFuture:
> {code}
> [11:51:55] :   [Step 3/4] Locked synchronizers:
> [11:51:55] :   [Step 3/4] 
> java.util.concurrent.ThreadPoolExecutor$Worker@5b17b883
> [11:51:55] :   [Step 3/4] Thread 
> [name="sys-#68921%new-node-topology-change-thread-1%", id=77410, 
> state=RUNNABLE, blockCnt=3, waitCnt=0]
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> [11:51:55] :   [Step 3/4] at sun.nio.ch.IOUtil.read(IOUtil.java:197)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
> [11:51:55] :   [Step 3/4] - locked java.lang.Object@23aaa756
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.safeTcpHandshake(TcpCommunicationSpi.java:3647)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3293)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2967)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2850)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2693)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2652)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.managers.communication.GridIoManager.send(GridIoManager.java:1643)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1750)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.GridCacheIoManager.sendOrderedMessage(GridCacheIoManager.java:1231)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cleanupRemoteContexts(GridDhtPartitionDemander.java:)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cancel(GridDhtPartitionDemander.java:1041)
> [11:51:55] :   [Step 3/4] - locked 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.lambda$null$2(GridDhtPartitionDemander.java:534)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$$Lambda$41/603501511.run(Unknown
>  Source)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6800)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110)
> [11:51:55] :   [Step 3/4] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [11:51:55] :   [Step 3/4] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [11:51:55] :   [Step 3/4] at java.lang.Thread.run(Thread.java:748)
> {code}
> Because of that, exchange worker hangs forever while trying to acquire that 
> lock:
> {code}
> [11:51:55] :   [Step 3/4] Thread 
> [name="exchange-worker-#68894%new-node-topology-change-thread-1%", id=77379, 
> state=BLOCKED, blockCnt=11, waitCnt=7]
> [11:51:55] :   [Step 3/4] Lock 
> [object=o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$Reb

[jira] [Commented] (IGNITE-9236) Handshake timeout never completes in some tests (GridCacheReplicatedFailoverSelfTest in particular)

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576225#comment-16576225
 ] 

Alexey Goncharuk commented on IGNITE-9236:
--

Thanks, Ilya, merged your changes to master.

> Handshake timeout never completes in some tests 
> (GridCacheReplicatedFailoverSelfTest in particular)
> ---
>
> Key: IGNITE-9236
> URL: https://issues.apache.org/jira/browse/IGNITE-9236
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Lantukh
>Assignee: Ilya Lantukh
>Priority: Major
>  Labels: MakeTeamcityGreenAgain
> Fix For: 2.7
>
>
> In GridCacheReplicatedFailoverSelfTest one thread tries to establish TCP 
> connection and hangs on handshake forever, holding lock on RebalanceFuture:
> {code}
> [11:51:55] :   [Step 3/4] Locked synchronizers:
> [11:51:55] :   [Step 3/4] 
> java.util.concurrent.ThreadPoolExecutor$Worker@5b17b883
> [11:51:55] :   [Step 3/4] Thread 
> [name="sys-#68921%new-node-topology-change-thread-1%", id=77410, 
> state=RUNNABLE, blockCnt=3, waitCnt=0]
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> [11:51:55] :   [Step 3/4] at sun.nio.ch.IOUtil.read(IOUtil.java:197)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
> [11:51:55] :   [Step 3/4] - locked java.lang.Object@23aaa756
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.safeTcpHandshake(TcpCommunicationSpi.java:3647)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3293)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2967)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2850)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2693)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2652)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.managers.communication.GridIoManager.send(GridIoManager.java:1643)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1750)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.GridCacheIoManager.sendOrderedMessage(GridCacheIoManager.java:1231)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cleanupRemoteContexts(GridDhtPartitionDemander.java:)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cancel(GridDhtPartitionDemander.java:1041)
> [11:51:55] :   [Step 3/4] - locked 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.lambda$null$2(GridDhtPartitionDemander.java:534)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$$Lambda$41/603501511.run(Unknown
>  Source)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6800)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110)
> [11:51:55] :   [Step 3/4] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [11:51:55] :   [Step 3/4] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [11:51:55] :   [Step 3/4] at java.lang.Thread.run(Thread.java:748)
> {code}
> Because of that, exchange worker hangs forever while trying to acquire that 
> lock:
> {code}
> [11:51:55] :   [Step 3/4] Thread 
> [name="exchange-worker-#68894%new-node-topology-change-thread-1%", id=77379, 
> state=BLOCKED, blockCnt=11, waitCnt=7]
> [11:51:55] :   [Step 3/4] Lock 
> [object=o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150,
>  ownerName=sys-#68921%

[jira] [Updated] (IGNITE-9236) Handshake timeout never completes in some tests (GridCacheReplicatedFailoverSelfTest in particular)

2018-08-10 Thread Alexey Goncharuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Goncharuk updated IGNITE-9236:
-
Fix Version/s: 2.7

> Handshake timeout never completes in some tests 
> (GridCacheReplicatedFailoverSelfTest in particular)
> ---
>
> Key: IGNITE-9236
> URL: https://issues.apache.org/jira/browse/IGNITE-9236
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Lantukh
>Assignee: Ilya Lantukh
>Priority: Major
>  Labels: MakeTeamcityGreenAgain
> Fix For: 2.7
>
>
> In GridCacheReplicatedFailoverSelfTest one thread tries to establish TCP 
> connection and hangs on handshake forever, holding lock on RebalanceFuture:
> {code}
> [11:51:55] :   [Step 3/4] Locked synchronizers:
> [11:51:55] :   [Step 3/4] 
> java.util.concurrent.ThreadPoolExecutor$Worker@5b17b883
> [11:51:55] :   [Step 3/4] Thread 
> [name="sys-#68921%new-node-topology-change-thread-1%", id=77410, 
> state=RUNNABLE, blockCnt=3, waitCnt=0]
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> [11:51:55] :   [Step 3/4] at sun.nio.ch.IOUtil.read(IOUtil.java:197)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
> [11:51:55] :   [Step 3/4] - locked java.lang.Object@23aaa756
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.safeTcpHandshake(TcpCommunicationSpi.java:3647)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3293)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2967)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2850)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2693)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2652)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.managers.communication.GridIoManager.send(GridIoManager.java:1643)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1750)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.GridCacheIoManager.sendOrderedMessage(GridCacheIoManager.java:1231)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cleanupRemoteContexts(GridDhtPartitionDemander.java:)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cancel(GridDhtPartitionDemander.java:1041)
> [11:51:55] :   [Step 3/4] - locked 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.lambda$null$2(GridDhtPartitionDemander.java:534)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$$Lambda$41/603501511.run(Unknown
>  Source)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6800)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110)
> [11:51:55] :   [Step 3/4] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [11:51:55] :   [Step 3/4] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [11:51:55] :   [Step 3/4] at java.lang.Thread.run(Thread.java:748)
> {code}
> Because of that, exchange worker hangs forever while trying to acquire that 
> lock:
> {code}
> [11:51:55] :   [Step 3/4] Thread 
> [name="exchange-worker-#68894%new-node-topology-change-thread-1%", id=77379, 
> state=BLOCKED, blockCnt=11, waitCnt=7]
> [11:51:55] :   [Step 3/4] Lock 
> [object=o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150,
>  ownerName=sys-#68921%new-node-topology-change-thread-1%, ownerId=77410]
> [11:51:55] :   [Step 3/4]  

[jira] [Updated] (IGNITE-9236) Handshake timeout never completes in some tests (GridCacheReplicatedFailoverSelfTest in particular)

2018-08-10 Thread Alexey Goncharuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Goncharuk updated IGNITE-9236:
-
Ignite Flags:   (was: Docs Required)

> Handshake timeout never completes in some tests 
> (GridCacheReplicatedFailoverSelfTest in particular)
> ---
>
> Key: IGNITE-9236
> URL: https://issues.apache.org/jira/browse/IGNITE-9236
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ilya Lantukh
>Assignee: Ilya Lantukh
>Priority: Major
>  Labels: MakeTeamcityGreenAgain
> Fix For: 2.7
>
>
> In GridCacheReplicatedFailoverSelfTest one thread tries to establish TCP 
> connection and hangs on handshake forever, holding lock on RebalanceFuture:
> {code}
> [11:51:55] :   [Step 3/4] Locked synchronizers:
> [11:51:55] :   [Step 3/4] 
> java.util.concurrent.ThreadPoolExecutor$Worker@5b17b883
> [11:51:55] :   [Step 3/4] Thread 
> [name="sys-#68921%new-node-topology-change-thread-1%", id=77410, 
> state=RUNNABLE, blockCnt=3, waitCnt=0]
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> [11:51:55] :   [Step 3/4] at sun.nio.ch.IOUtil.read(IOUtil.java:197)
> [11:51:55] :   [Step 3/4] at 
> sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
> [11:51:55] :   [Step 3/4] - locked java.lang.Object@23aaa756
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.safeTcpHandshake(TcpCommunicationSpi.java:3647)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3293)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2967)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2850)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2693)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2652)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.managers.communication.GridIoManager.send(GridIoManager.java:1643)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1750)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.GridCacheIoManager.sendOrderedMessage(GridCacheIoManager.java:1231)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cleanupRemoteContexts(GridDhtPartitionDemander.java:)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture.cancel(GridDhtPartitionDemander.java:1041)
> [11:51:55] :   [Step 3/4] - locked 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.lambda$null$2(GridDhtPartitionDemander.java:534)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$$Lambda$41/603501511.run(Unknown
>  Source)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6800)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
> [11:51:55] :   [Step 3/4] at 
> o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110)
> [11:51:55] :   [Step 3/4] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [11:51:55] :   [Step 3/4] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [11:51:55] :   [Step 3/4] at java.lang.Thread.run(Thread.java:748)
> {code}
> Because of that, exchange worker hangs forever while trying to acquire that 
> lock:
> {code}
> [11:51:55] :   [Step 3/4] Thread 
> [name="exchange-worker-#68894%new-node-topology-change-thread-1%", id=77379, 
> state=BLOCKED, blockCnt=11, waitCnt=7]
> [11:51:55] :   [Step 3/4] Lock 
> [object=o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander$RebalanceFuture@7e28f150,
>  ownerName=sys-#68921%new-node-topology-change-thread-1%, ownerId=77410]
> [11:51:55
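
As a generic illustration of the kind of fix the hang above calls for (plain JDK sockets, not TcpCommunicationSpi): giving the handshake read a socket timeout turns an unresponsive peer into an exception instead of an indefinite block, so the thread cannot keep a lock such as the RebalanceFuture one forever.

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.Arrays;

/** Sketch only: a handshake read bounded by a timeout. */
public class BoundedHandshake {
    public static byte[] readHandshake(String host, int port, int timeoutMs) throws IOException {
        try (Socket sock = new Socket()) {
            sock.connect(new InetSocketAddress(host, port), timeoutMs);
            sock.setSoTimeout(timeoutMs); // read() now throws instead of blocking forever

            InputStream in = sock.getInputStream();
            byte[] buf = new byte[64];

            try {
                int read = in.read(buf);

                return read < 0 ? new byte[0] : Arrays.copyOf(buf, read);
            }
            catch (SocketTimeoutException e) {
                throw new IOException("Handshake timed out after " + timeoutMs + " ms", e);
            }
        }
    }
}
{code}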

[jira] [Commented] (IGNITE-9050) WALIterator should throw an exception if iterator stopped in the WAL archive but not in WAL work

2018-08-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576221#comment-16576221
 ] 

ASF GitHub Bot commented on IGNITE-9050:


Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/4429


> WALIterator should throw an exception if iterator stopped in the WAL archive 
> but not in WAL work
> 
>
> Key: IGNITE-9050
> URL: https://issues.apache.org/jira/browse/IGNITE-9050
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> The iterator will stop iteration if the next WAL record pointer is not equal to the 
> expected one (WalSegmentTailReachedException). If this happens during iteration 
> over segments in the WAL archive, it means the WAL is corrupted and we cannot 
> ignore it: the WAL has not been fully read.
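
A stand-in sketch of that policy follows (the types below are illustrative, not the Ignite internals): reaching a segment tail is acceptable only in the last, still-active work segment; inside the archived range it must surface as corruption.

{code:java}
import java.util.Iterator;

/** Sketch only: distinguish "end of the active work segment" from "truncated archive segment". */
public class WalReplaySketch {
    /** Stand-in for a tail-reached exception, carrying the segment index where the tail was hit. */
    static class TailReached extends RuntimeException {
        final long segmentIdx;

        TailReached(long segmentIdx) {
            super("WAL tail reached in segment " + segmentIdx);
            this.segmentIdx = segmentIdx;
        }
    }

    static void replay(Iterator<String> records, long lastArchivedSegmentIdx) {
        try {
            while (records.hasNext())
                System.out.println("apply " + records.next()); // stand-in for applying a WAL record
        }
        catch (TailReached e) {
            // Tail inside the archived range: the archive is corrupted, do not swallow it.
            if (e.segmentIdx <= lastArchivedSegmentIdx)
                throw new IllegalStateException("Corrupted WAL archive segment: " + e.segmentIdx, e);

            // Tail of the active work segment: normal end of iteration, nothing to do.
        }
    }
}
{code}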



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8724) Skip logging 3-rd parameter while calling U.warn with initialized logger.

2018-08-10 Thread Ilya Lantukh (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576224#comment-16576224
 ] 

Ilya Lantukh commented on IGNITE-8724:
--

Thanks! Looks good now.

> Skip logging 3-rd parameter while calling U.warn with initialized logger.
> -
>
> Key: IGNITE-8724
> URL: https://issues.apache.org/jira/browse/IGNITE-8724
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 2.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7
>
> Attachments: tc.png
>
>
> There are a lot of places where an exception needs to be logged, for example:
> {code:java}
> U.warn(log, "Unable to await partitions release future", e);
> {code}
> but the current U.warn implementation silently swallows it:
> {code:java}
> public static void warn(@Nullable IgniteLogger log, Object longMsg, Object shortMsg) {
>     assert longMsg != null;
>     assert shortMsg != null;
>     if (log != null)
>         log.warning(compact(longMsg.toString()));
>     else
>         X.println("[" + SHORT_DATE_FMT.format(new java.util.Date()) + "] (wrn) " +
>             compact(shortMsg.toString()));
> }
> {code}
> The fix looks like a simple addition of an overload:
> {code:java}
> public static void warn(@Nullable IgniteLogger log, Object longMsg, Throwable ex) {
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9050) WALIterator should throw an exception if iterator stopped in the WAL archive but not in WAL work

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576214#comment-16576214
 ] 

Alexey Goncharuk commented on IGNITE-9050:
--

Thanks for the fixes, merged to master.

> WALIterator should throw an exception if iterator stopped in the WAL archive 
> but not in WAL work
> 
>
> Key: IGNITE-9050
> URL: https://issues.apache.org/jira/browse/IGNITE-9050
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> The iterator will stop iteration if the next WAL record pointer is not equal to the 
> expected one (WalSegmentTailReachedException). If this happens during iteration 
> over segments in the WAL archive, it means the WAL is corrupted and we cannot 
> ignore it: the WAL has not been fully read.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9171) Use lazy mode with results pre-fetch

2018-08-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576212#comment-16576212
 ] 

ASF GitHub Bot commented on IGNITE-9171:


GitHub user tledkov-gridgain opened a pull request:

https://github.com/apache/ignite/pull/4514

IGNITE-9171 Use lazy mode with results pre-fetch



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-9171

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/4514.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4514


commit de324a59df6dcba2ec906fb974baf95dabd19504
Author: tledkov-gridgain 
Date:   2018-08-03T11:37:26Z

IGNITE-9171: save the progress

commit 63181f02cc950c49f93101e79682939949396675
Author: tledkov-gridgain 
Date:   2018-08-03T13:51:22Z

IGNITE-9171: save the progress

commit 8fec73450804fd1f9080dce7d000c8eefb0c6749
Author: tledkov-gridgain 
Date:   2018-08-03T14:35:23Z

Merge branch '_master' into ignite-9171

commit 260f3bf244fac0031fa4bb8d27aac365acfd43db
Author: tledkov-gridgain 
Date:   2018-08-06T09:22:39Z

IGNITE-9171: save the progress

commit f9bfdf76a791c0fa588d1a3a2a91bb349d7affd6
Author: tledkov-gridgain 
Date:   2018-08-06T09:35:42Z

Merge branch '_master' into ignite-9171

commit 59bf5ee665c460179aa961f9a3939b892f916dbc
Author: tledkov-gridgain 
Date:   2018-08-06T11:14:47Z

IGNITE-9171: save the progress

commit 83cca801e7547b032e7f3436ef22c979e72e04f0
Author: tledkov-gridgain 
Date:   2018-08-06T12:43:00Z

Merge branch '_master' into ignite-9171

commit bfb342cd0e35aad8c1c79044e0ebcde936c71806
Author: tledkov-gridgain 
Date:   2018-08-06T13:41:32Z

IGNITE-9171: save the progress

commit fe64dc2b22cf2f8e6b5d56068286dda1e6cc77fd
Author: tledkov-gridgain 
Date:   2018-08-07T12:30:24Z

IGNITE-9171: remove lazy worker

commit 98e4c57b795bc89c5b4a1c27f7f06f2cdbfc4dd4
Author: tledkov-gridgain 
Date:   2018-08-07T12:36:27Z

Merge branch '_master' into ignite-9171

commit 3ace9838ce13e3a08e5fdbe88101d2b61c40c718
Author: tledkov-gridgain 
Date:   2018-08-08T11:10:51Z

IGNITE-9171: benchmark

commit a3ee0f5f70d452f43de84a61832866ecd02e92da
Author: tledkov-gridgain 
Date:   2018-08-08T11:16:08Z

Merge branch '_master' into ignite-9171

commit b9a2ecfc6ac9f6c324ff0ab832899bb99ae46473
Author: tledkov-gridgain 
Date:   2018-08-08T13:13:07Z

IGNITE-9171: fix lazy mode

commit de71737b20e1683ff9d2078f1ba33a5843ccc541
Author: tledkov-gridgain 
Date:   2018-08-09T09:15:54Z

IGNITE-9171: save the progress

commit 51c15c81496c9b5dd8beebdf66087ea61a3324d9
Author: tledkov-gridgain 
Date:   2018-08-09T12:42:19Z

IGNITE-9171: save the progress

commit 75dbcc9872388b29758b56593fdc04d497d8d0ed
Author: tledkov-gridgain 
Date:   2018-08-10T08:04:25Z

IGNITE-9171: modify table lock

commit 18e198044d1806e2951b249977230d0dfaed7053
Author: tledkov-gridgain 
Date:   2018-08-10T08:14:28Z

IGNITE-9171: modify table lock - minors

commit f2188d118e9028a783bb2347d7ed15f7664828c0
Author: tledkov-gridgain 
Date:   2018-08-10T08:53:17Z

Merge branch '_master' into ignite-9171

commit 57e77782ce24d4f36843e89d9e81ab5d39eef4c1
Author: tledkov-gridgain 
Date:   2018-08-10T10:38:11Z

IGNITE-9171: minors

commit 79b2292733b3fb510ce38053b44512e25f143fa6
Author: tledkov-gridgain 
Date:   2018-08-10T12:35:32Z

Merge branch '_master' into ignite-9171




> Use lazy mode with results pre-fetch
> 
>
> Key: IGNITE-9171
> URL: https://issues.apache.org/jira/browse/IGNITE-9171
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Affects Versions: 2.6
>Reporter: Taras Ledkov
>Assignee: Taras Ledkov
>Priority: Major
>
> The current implementation of the {{lazy}} mode always starts a separate thread for 
> {{MapQueryLazyWorker}}. This causes excessive overhead for requests that 
> produce a small result set.
> We should begin executing the query in the {{QUERY_POOL}} thread pool and fetch 
> the first page of the results. If the result set is bigger than one page, 
> {{MapQueryLazyWorker}} is started and linked with {{MapNodeResults}} to handle 
> the next pages lazily.
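
A stand-in sketch of that flow (plain JDK code, not the actual MapQueryLazyWorker wiring): the first page is filled on the calling thread, and a lazy worker is started only when something is left to stream.

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Sketch only: pre-fetch one page, pay for a lazy worker only if more pages remain. */
public class LazyPrefetchDemo {
    public static <T> List<T> firstPage(Iterator<T> rows, int pageSize, ExecutorService lazyWorkers) {
        List<T> page = new ArrayList<>(pageSize);

        while (rows.hasNext() && page.size() < pageSize)
            page.add(rows.next());

        if (!rows.hasNext())
            return page; // small result set: no dedicated worker, no extra thread

        // More rows remain: only now start a worker that serves the next pages on demand.
        lazyWorkers.submit(() -> {
            while (rows.hasNext())
                System.out.println("next-page row: " + rows.next()); // stand-in for sending further pages
        });

        return page;
    }

    public static void main(String[] args) {
        ExecutorService lazyWorkers = Executors.newCachedThreadPool();

        List<Integer> page = firstPage(Arrays.asList(1, 2, 3).iterator(), 1024, lazyWorkers);
        System.out.println("first page: " + page); // fits in one page, so no worker was started

        lazyWorkers.shutdown();
    }
}
{code}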



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster

2018-08-10 Thread Ilya Lantukh (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576209#comment-16576209
 ] 

Ilya Lantukh commented on IGNITE-9178:
--

[~agoncharuk],

I've double-checked this PR, and it looks correct to me. {{leftNode2Part}} in this 
case is just a temporary map that is used to fire partition lost events. There is no 
need to update {{diffFromAffinity}} in that part of the code, because it will be 
re-calculated later.

[~pvinokurov],

Thanks for contribution!

> Partition lost event are not triggered if multiple nodes left cluster
> -
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Pavel Vinokurov
>Assignee: Pavel Vinokurov
>Priority: Blocker
> Fix For: 2.7
>
>
> If multiple nodes leave the cluster simultaneously, the partitions of the left nodes are removed 
> from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part 
> in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost 
> partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9196) SQL: Memory leak in MapNodeResults

2018-08-10 Thread Taras Ledkov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576208#comment-16576208
 ] 

Taras Ledkov commented on IGNITE-9196:
--

[~dmekhanikov], my comments:
# No test suite contains {{CacheQueryMemoryLeakTest}}.
# What do you think about using {{GridDebug#dumpHeap}} to test for memory leaks 
instead of checking JVM pauses with our own implementation?

> SQL: Memory leak in MapNodeResults
> --
>
> Key: IGNITE-9196
> URL: https://issues.apache.org/jira/browse/IGNITE-9196
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.6
>Reporter: Denis Mekhanikov
>Assignee: Denis Mekhanikov
>Priority: Blocker
> Fix For: 2.7
>
>
> When size of a SQL query result set is a multiple of {{Query#pageSize}}, then 
> {{MapQueryResult}} is never closed and removed from {{MapNodeResults#res}} 
> collection.
> The following code leads to OOME when run with 1Gb heap:
> {code:java}
> public class MemLeakRepro {
>     public static void main(String[] args) {
>         Ignition.start(getConfiguration("server"));
> 
>         try (Ignite client = Ignition.start(getConfiguration("client").setClientMode(true))) {
>             IgniteCache<Integer, Person> cache = startPeopleCache(client);
> 
>             int pages = 10;
>             int pageSize = 1024;
> 
>             for (int i = 0; i < pages * pageSize; i++) {
>                 Person p = new Person("Person #" + i, 25);
> 
>                 cache.put(i, p);
>             }
> 
>             for (int i = 0; i < 1_000_000; i++) {
>                 if (i % 1000 == 0)
>                     System.out.println("Select iteration #" + i);
> 
>                 Query<List<?>> qry = new SqlFieldsQuery("select * from people");
> 
>                 qry.setPageSize(pageSize);
> 
>                 QueryCursor<List<?>> cursor = cache.query(qry);
> 
>                 cursor.getAll();
>                 cursor.close();
>             }
>         }
>     }
> 
>     private static IgniteConfiguration getConfiguration(String instanceName) {
>         IgniteConfiguration igniteCfg = new IgniteConfiguration();
> 
>         igniteCfg.setIgniteInstanceName(instanceName);
> 
>         TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();
> 
>         discoSpi.setIpFinder(new TcpDiscoveryVmIpFinder(true));
> 
>         igniteCfg.setDiscoverySpi(discoSpi);
> 
>         return igniteCfg;
>     }
> 
>     private static IgniteCache<Integer, Person> startPeopleCache(Ignite node) {
>         CacheConfiguration<Integer, Person> cacheCfg = new CacheConfiguration<>("cache");
> 
>         QueryEntity qe = new QueryEntity(Integer.class, Person.class);
> 
>         qe.setTableName("people");
> 
>         cacheCfg.setQueryEntities(Collections.singleton(qe));
>         cacheCfg.setSqlSchema("PUBLIC");
> 
>         return node.getOrCreateCache(cacheCfg);
>     }
> 
>     public static class Person {
>         @QuerySqlField
>         private String name;
> 
>         @QuerySqlField
>         private int age;
> 
>         public Person(String name, int age) {
>             this.name = name;
>             this.age = age;
>         }
>     }
> }
> {code}
>  
> At the same time it works perfectly fine when there are, for example, 
> {{pages * pageSize - 1}} records in the cache instead.
> The reason is that the {{MapQueryResult#fetchNextPage(...)}} method doesn't 
> return true when the result set size is a multiple of the page size.
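
A hypothetical sketch (not Ignite's actual code) of the boundary condition 
described above: if a "short page" is the only end-of-result signal, a result 
set whose size is an exact multiple of the page size never looks finished, so 
the per-query state is never released:
{code:java}
import java.util.List;

/** Illustrates the page-boundary pattern behind the leak. */
class PageBoundarySketch {
    /** Buggy variant: relies solely on a short page to detect the end of the result set. */
    static boolean lastPageFetchedBuggy(List<?> page, int pageSize) {
        // When the total row count is an exact multiple of pageSize, the final page is
        // full, this check never fires, and the query result is never closed.
        return page.size() < pageSize;
    }

    /** Fixed variant: also consult an explicit "no more rows" flag from the row source. */
    static boolean lastPageFetchedFixed(List<?> page, int pageSize, boolean sourceExhausted) {
        return page.size() < pageSize || sourceExhausted;
    }
}
{code}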



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-9248) CPP: Support Clang compiler

2018-08-10 Thread Igor Sapego (JIRA)
Igor Sapego created IGNITE-9248:
---

 Summary: CPP: Support Clang compiler
 Key: IGNITE-9248
 URL: https://issues.apache.org/jira/browse/IGNITE-9248
 Project: Ignite
  Issue Type: Improvement
  Components: platforms
Reporter: Igor Sapego
Assignee: Igor Sapego


Currently, Ignite C++ cannot be compiled with the Clang compiler.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9050) WALIterator should throw an exception if iterator stopped in the WAL archive but not in WAL work

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576147#comment-16576147
 ] 

Alexey Goncharuk commented on IGNITE-9050:
--

[~DmitriyGovorukhin], I have 
IgniteWALTailIsReachedDuringIterationOverArchiveTest#testStandAloneIterator 
failing locally at a rate of about 5% with the error
{code}
junit.framework.AssertionFailedError: Last read ptr=FileWALPointer [idx=23, 
fileOff=9224675, len=59], corruptedPtr=FileWALPointer [idx=22, fileOff=2776095, 
len=1115]
{code}
Also, the test is not added to any test suite. Can you take a look?

> WALIterator should throw an exception if iterator stopped in the WAL archive 
> but not in WAL work
> 
>
> Key: IGNITE-9050
> URL: https://issues.apache.org/jira/browse/IGNITE-9050
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> The iterator stops iteration if the next WAL record pointer is not equal to 
> the expected one (WalSegmentTailReachedException). If this happens during 
> iteration over segments in the WAL archive, it means the WAL is corrupted and 
> we cannot ignore it: the WAL log has not been fully read.
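
A hypothetical sketch of the intended behavior with invented names 
({{WalSegment}}, {{TailReachedException}}), since the real iterator classes are 
not shown here: reaching the tail is acceptable only in the WAL work directory, 
while in the archive it must be reported as corruption:
{code:java}
/** Illustrative only: invented types standing in for the real WAL iteration machinery. */
class WalReplaySketch {
    interface WalSegment {
        boolean isArchived();

        void replay() throws TailReachedException;
    }

    static class TailReachedException extends Exception {}

    static class WalCorruptedException extends RuntimeException {
        WalCorruptedException(String msg, Throwable cause) { super(msg, cause); }
    }

    static void replayAll(Iterable<WalSegment> segments) {
        for (WalSegment seg : segments) {
            try {
                seg.replay();
            }
            catch (TailReachedException e) {
                if (seg.isArchived())
                    throw new WalCorruptedException("WAL tail reached in an archived segment: " +
                        "the archive is corrupted and the log was not fully read", e);

                // Tail in the work directory just means everything written so far was consumed.
                return;
            }
        }
    }
}
{code}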



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576140#comment-16576140
 ] 

Alexey Goncharuk edited comment on IGNITE-9178 at 8/10/18 11:35 AM:


[~pvinokurov] Can you please explain why you needed to update {{leftNode2Part}} 
inside the loop and how this fixes the missed event? 

Note that there is a separate method handling left nodes - {{removeNode()}} - 
and from the current code I see that it not only adds a new entry to 
{{leftNode2Part}}, but also updates the {{diffFromAffinity}} map. With your 
change it looks like the {{diffFromAffinity}} map may be outdated. I see that 
{{removeNode()}} is called in {{beforeExchange}} for all left nodes, so it is 
not clear why those nodes did not get into {{leftNode2Part}}.


was (Author: agoncharuk):
[~pvinokurov] Can you please explain why you needed to update {{leftNode2Part}} 
inside the loop and how this fixes the missed event? 

Note that there is a separate method handling left nodes - {{removeNode()}} - 
and from the current code I see that it not only adds a new entry to 
{{leftNode2Part}}, but also updates the {{diffFromAffinity}} map. With your 
change it looks like the {{diffFromAffinity}} map may be outdated.

> Partition lost event are not triggered if multiple nodes left cluster
> -
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Pavel Vinokurov
>Assignee: Pavel Vinokurov
>Priority: Blocker
> Fix For: 2.7
>
>
> If multiple nodes leave the cluster simultaneously, the partitions of the left 
> nodes are removed from GridDhtPartitionTopologyImpl#node2part without being 
> added to leftNode2Part in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost 
> partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576140#comment-16576140
 ] 

Alexey Goncharuk commented on IGNITE-9178:
--

[~pvinokurov] Can you please explain why you needed to update {{leftNode2Part}} 
inside the loop and how this fixes the missed event? 

Note that there is a separate method handling left nodes - {{removeNode()}} - 
and from the current code I see that it not only adds a new entry to 
{{leftNode2Part}}, but also updates the {{diffFromAffinity}} map. With your 
change it looks like the {{diffFromAffinity}} map may be outdated.

> Partition lost event are not triggered if multiple nodes left cluster
> -
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Pavel Vinokurov
>Assignee: Pavel Vinokurov
>Priority: Blocker
> Fix For: 2.7
>
>
> If multiple nodes leave the cluster simultaneously, the partitions of the left 
> nodes are removed from GridDhtPartitionTopologyImpl#node2part without being 
> added to leftNode2Part in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost 
> partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-5103) TcpDiscoverySpi ignores maxMissedClientHeartbeats property

2018-08-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576132#comment-16576132
 ] 

ASF GitHub Bot commented on IGNITE-5103:


Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/4446


> TcpDiscoverySpi ignores maxMissedClientHeartbeats property
> --
>
> Key: IGNITE-5103
> URL: https://issues.apache.org/jira/browse/IGNITE-5103
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 1.9
>Reporter: Valentin Kulichenko
>Assignee: Evgenii Zhuravlev
>Priority: Blocker
> Fix For: 2.7
>
> Attachments: TcpDiscoveryClientSuspensionSelfTest.java
>
>
> Test scenario is the following:
> * Start one or more servers.
> * Start a client node.
> * Suspend client process using {{-SIGSTOP}} signal.
> * Wait for {{maxMissedClientHeartbeats*heartbeatFrequency}}.
> The client node is expected to be removed from the topology, but the server 
> nodes don't remove it.
> Attached is the unit test reproducing the same by stopping the heartbeat 
> sender thread on the client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-5103) TcpDiscoverySpi ignores maxMissedClientHeartbeats property

2018-08-10 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576125#comment-16576125
 ] 

Alexey Goncharuk commented on IGNITE-5103:
--

Thanks, [~ezhuravl], merged your changes to master.

> TcpDiscoverySpi ignores maxMissedClientHeartbeats property
> --
>
> Key: IGNITE-5103
> URL: https://issues.apache.org/jira/browse/IGNITE-5103
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 1.9
>Reporter: Valentin Kulichenko
>Assignee: Evgenii Zhuravlev
>Priority: Blocker
> Fix For: 2.7
>
> Attachments: TcpDiscoveryClientSuspensionSelfTest.java
>
>
> Test scenario is the following:
> * Start one or more servers.
> * Start a client node.
> * Suspend client process using {{-SIGSTOP}} signal.
> * Wait for {{maxMissedClientHeartbeats*heartbeatFrequency}}.
> The client node is expected to be removed from the topology, but the server 
> nodes don't remove it.
> Attached is the unit test reproducing the same by stopping the heartbeat 
> sender thread on the client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (IGNITE-596) Add missed scala examples and remove unnecessary scala examples.

2018-08-10 Thread Vasiliy Sisko (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasiliy Sisko closed IGNITE-596.


> Add missed  scala examples and remove unnecessary scala examples.
> -
>
> Key: IGNITE-596
> URL: https://issues.apache.org/jira/browse/IGNITE-596
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: sprint-3
>Reporter: Vasiliy Sisko
>Assignee: Alexey Kuznetsov
>Priority: Minor
> Attachments: #_IGNITE-596_All_examples.patch, 
> #_IGNITE-596_Other_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (IGNITE-897) Add missed datagrid scala examples

2018-08-10 Thread Vasiliy Sisko (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasiliy Sisko closed IGNITE-897.


> Add missed datagrid scala examples
> --
>
> Key: IGNITE-897
> URL: https://issues.apache.org/jira/browse/IGNITE-897
> Project: Ignite
>  Issue Type: Sub-task
>  Components: general
>Affects Versions: sprint-5
>Reporter: Vasiliy Sisko
>Assignee: Alexey Kuznetsov
>Priority: Minor
> Attachments: #_IGNITE-897_Datagrid_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (IGNITE-871) Add missed datastructures scala examples

2018-08-10 Thread Vasiliy Sisko (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasiliy Sisko closed IGNITE-871.


> Add missed datastructures scala examples
> 
>
> Key: IGNITE-871
> URL: https://issues.apache.org/jira/browse/IGNITE-871
> Project: Ignite
>  Issue Type: Sub-task
>  Components: general
>Affects Versions: sprint-5
>Reporter: Vasiliy Sisko
>Assignee: Alexey Kuznetsov
>Priority: Minor
> Attachments: #_IGNITE-871_Datastructures_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IGNITE-897) Add missed datagrid scala examples

2018-08-10 Thread Vasiliy Sisko (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasiliy Sisko resolved IGNITE-897.
--
Resolution: Won't Fix
  Assignee: Alexey Kuznetsov  (was: Vasiliy Sisko)

> Add missed datagrid scala examples
> --
>
> Key: IGNITE-897
> URL: https://issues.apache.org/jira/browse/IGNITE-897
> Project: Ignite
>  Issue Type: Sub-task
>  Components: general
>Affects Versions: sprint-5
>Reporter: Vasiliy Sisko
>Assignee: Alexey Kuznetsov
>Priority: Minor
> Attachments: #_IGNITE-897_Datagrid_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IGNITE-596) Add missed scala examples and remove unnecessary scala examples.

2018-08-10 Thread Vasiliy Sisko (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasiliy Sisko resolved IGNITE-596.
--
Resolution: Won't Fix
  Assignee: Alexey Kuznetsov  (was: Vasiliy Sisko)

> Add missed  scala examples and remove unnecessary scala examples.
> -
>
> Key: IGNITE-596
> URL: https://issues.apache.org/jira/browse/IGNITE-596
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: sprint-3
>Reporter: Vasiliy Sisko
>Assignee: Alexey Kuznetsov
>Priority: Minor
> Attachments: #_IGNITE-596_All_examples.patch, 
> #_IGNITE-596_Other_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IGNITE-871) Add missed datastructures scala examples

2018-08-10 Thread Vasiliy Sisko (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasiliy Sisko resolved IGNITE-871.
--
Resolution: Won't Fix
  Assignee: Alexey Kuznetsov  (was: Vasiliy Sisko)

> Add missed datastructures scala examples
> 
>
> Key: IGNITE-871
> URL: https://issues.apache.org/jira/browse/IGNITE-871
> Project: Ignite
>  Issue Type: Sub-task
>  Components: general
>Affects Versions: sprint-5
>Reporter: Vasiliy Sisko
>Assignee: Alexey Kuznetsov
>Priority: Minor
> Attachments: #_IGNITE-871_Datastructures_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IGNITE-846) Add missed computegrid scala examples.

2018-08-10 Thread Vasiliy Sisko (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasiliy Sisko resolved IGNITE-846.
--
Resolution: Won't Fix
  Assignee: Alexey Kuznetsov  (was: Vasiliy Sisko)

> Add missed computegrid scala examples.
> --
>
> Key: IGNITE-846
> URL: https://issues.apache.org/jira/browse/IGNITE-846
> Project: Ignite
>  Issue Type: Sub-task
>  Components: general
>Affects Versions: sprint-5
>Reporter: Vasiliy Sisko
>Assignee: Alexey Kuznetsov
>Priority: Minor
> Attachments: #_IGNITE-846_Computegrid_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (IGNITE-846) Add missed computegrid scala examples.

2018-08-10 Thread Vasiliy Sisko (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasiliy Sisko closed IGNITE-846.


> Add missed computegrid scala examples.
> --
>
> Key: IGNITE-846
> URL: https://issues.apache.org/jira/browse/IGNITE-846
> Project: Ignite
>  Issue Type: Sub-task
>  Components: general
>Affects Versions: sprint-5
>Reporter: Vasiliy Sisko
>Assignee: Alexey Kuznetsov
>Priority: Minor
> Attachments: #_IGNITE-846_Computegrid_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-596) Add missed scala examples and remove unnecessary scala examples.

2018-08-10 Thread Vasiliy Sisko (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576114#comment-16576114
 ] 

Vasiliy Sisko commented on IGNITE-596:
--

The problem is no longer relevant. The examples have not been updated for 2 
years and can be removed in Ignite 3.0.

> Add missed  scala examples and remove unnecessary scala examples.
> -
>
> Key: IGNITE-596
> URL: https://issues.apache.org/jira/browse/IGNITE-596
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: sprint-3
>Reporter: Vasiliy Sisko
>Assignee: Vasiliy Sisko
>Priority: Minor
> Attachments: #_IGNITE-596_All_examples.patch, 
> #_IGNITE-596_Other_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-897) Add missed datagrid scala examples

2018-08-10 Thread Vasiliy Sisko (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576109#comment-16576109
 ] 

Vasiliy Sisko commented on IGNITE-897:
--

The problem is no longer relevant. The examples have not been updated for 2 
years and can be removed in Ignite 3.0.

> Add missed datagrid scala examples
> --
>
> Key: IGNITE-897
> URL: https://issues.apache.org/jira/browse/IGNITE-897
> Project: Ignite
>  Issue Type: Sub-task
>  Components: general
>Affects Versions: sprint-5
>Reporter: Vasiliy Sisko
>Assignee: Vasiliy Sisko
>Priority: Minor
> Attachments: #_IGNITE-897_Datagrid_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-846) Add missed computegrid scala examples.

2018-08-10 Thread Vasiliy Sisko (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576112#comment-16576112
 ] 

Vasiliy Sisko commented on IGNITE-846:
--

The problem is no longer relevant. The examples have not been updated for 2 
years and can be removed in Ignite 3.0.

> Add missed computegrid scala examples.
> --
>
> Key: IGNITE-846
> URL: https://issues.apache.org/jira/browse/IGNITE-846
> Project: Ignite
>  Issue Type: Sub-task
>  Components: general
>Affects Versions: sprint-5
>Reporter: Vasiliy Sisko
>Assignee: Vasiliy Sisko
>Priority: Minor
> Attachments: #_IGNITE-846_Computegrid_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-871) Add missed datastructures scala examples

2018-08-10 Thread Vasiliy Sisko (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576110#comment-16576110
 ] 

Vasiliy Sisko commented on IGNITE-871:
--

The problem is no longer relevant. The examples have not been updated for 2 
years and can be removed in Ignite 3.0.

> Add missed datastructures scala examples
> 
>
> Key: IGNITE-871
> URL: https://issues.apache.org/jira/browse/IGNITE-871
> Project: Ignite
>  Issue Type: Sub-task
>  Components: general
>Affects Versions: sprint-5
>Reporter: Vasiliy Sisko
>Assignee: Vasiliy Sisko
>Priority: Minor
> Attachments: #_IGNITE-871_Datastructures_examples.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-7251) Remove term "fabric" from Ignite deliverables

2018-08-10 Thread Anton Vinogradov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576097#comment-16576097
 ] 

Anton Vinogradov commented on IGNITE-7251:
--

[~vveider]
The PR is ready to be reviewed.

> Remove term "fabric" from Ignite deliverables
> -
>
> Key: IGNITE-7251
> URL: https://issues.apache.org/jira/browse/IGNITE-7251
> Project: Ignite
>  Issue Type: Task
>Reporter: Denis Magda
>Assignee: Anton Vinogradov
>Priority: Blocker
>  Labels: important
> Fix For: 2.7
>
>
> Apache Ignite binary releases still include the word “fabric” in their names:
> https://ignite.apache.org/download.cgi#binaries
> For instance, the full name of the previous release is 
> apache-ignite-fabric-2.3.0-bin.
> It’s a little oversight on our side because the project has not been 
> positioned as a fabric for a while.
> Remove “fabric” from the name and have the binary releases named 
> {{apache-ignite-\{version}-bin}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-8842) Web console: Wrong start screen on start of demo mode

2018-08-10 Thread Vasiliy Sisko (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasiliy Sisko reassigned IGNITE-8842:
-

Assignee: Pavel Konstantinov  (was: Vasiliy Sisko)

> Web console: Wrong start screen on start of demo mode
> -
>
> Key: IGNITE-8842
> URL: https://issues.apache.org/jira/browse/IGNITE-8842
> Project: Ignite
>  Issue Type: Bug
>  Components: wizards
>Reporter: Vasiliy Sisko
>Assignee: Pavel Konstantinov
>Priority: Minor
>
> On start of demo mode, the screen with the "SQL demo" notebook should be opened.
> Also, the "SQL demo" notebook should be available on the "Notebooks" screen.
> On demo start, the "SQL demo" notebook should be recreated if needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8842) Web console: Wrong start screen on start of demo mode

2018-08-10 Thread Vasiliy Sisko (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576083#comment-16576083
 ] 

Vasiliy Sisko commented on IGNITE-8842:
---

Implemented opening of the Demo queries page on demo run.

> Web console: Wrong start screen on start of demo mode
> -
>
> Key: IGNITE-8842
> URL: https://issues.apache.org/jira/browse/IGNITE-8842
> Project: Ignite
>  Issue Type: Bug
>  Components: wizards
>Reporter: Vasiliy Sisko
>Assignee: Vasiliy Sisko
>Priority: Minor
>
> On start of demo mode, the screen with the "SQL demo" notebook should be opened.
> Also, the "SQL demo" notebook should be available on the "Notebooks" screen.
> On demo start, the "SQL demo" notebook should be recreated if needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8950) Need to have more informative output info while database files check operation.

2018-08-10 Thread Stanilovsky Evgeny (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576079#comment-16576079
 ] 

Stanilovsky Evgeny commented on IGNITE-8950:


TC looks ok.

> Need to have more informative output info while database files check 
> operation.
> ---
>
> Key: IGNITE-8950
> URL: https://issues.apache.org/jira/browse/IGNITE-8950
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Affects Versions: 2.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Minor
> Fix For: 2.7
>
>
> "Failed to verify store file ..." messages have no file path info.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-9247) CPP Thin: implement GetAll

2018-08-10 Thread Stanislav Lukyanov (JIRA)
Stanislav Lukyanov created IGNITE-9247:
--

 Summary: CPP Thin: implement GetAll
 Key: IGNITE-9247
 URL: https://issues.apache.org/jira/browse/IGNITE-9247
 Project: Ignite
  Issue Type: New Feature
  Components: thin client
Reporter: Stanislav Lukyanov


Need to implement GetAll in C++ Thin client.

Currently, there is no way to extract values from a cache via the C++ thin 
client without knowing the keys beforehand. GetAll would be the easiest way to 
do that.
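
For reference, a minimal sketch of the corresponding call in the Java API; the 
C++ thin client would be expected to expose an analogous GetAll taking a set of 
keys (the cache name, keys and values below are illustrative):
{code:java}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class GetAllSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache("myCache");

            cache.put(1, "one");
            cache.put(2, "two");

            Set<Integer> keys = new HashSet<>(Arrays.asList(1, 2, 3));

            // The returned map contains only the keys that are present in the cache.
            Map<Integer, String> vals = cache.getAll(keys);

            System.out.println(vals); // {1=one, 2=two}
        }
    }
}
{code}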



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576055#comment-16576055
 ] 

ASF GitHub Bot commented on IGNITE-9244:


GitHub user DmitriyGovorukhin opened a pull request:

https://github.com/apache/ignite/pull/4513

IGNITE-9244 Rework partition eviction.

- add evict shared manager
- concurrent evict partition from one group
- balanced executors by partition size
- limitation concurrent evict operation via permits counter

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-9244

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/4513.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4513


commit ab10ca99d7d7052414ef0927d52f17c81e5d7bde
Author: Dmitriy Govorukhin 
Date:   2018-08-10T10:10:12Z

IGNITE-9244 Rework partition eviction.
- add evict shared manager
- concurrent evict partition from one group
- balanced executors by partition size
- limitation concurrent evict operation via permits counter

Signed-off-by: Dmitriy Govorukhin 




> Partition eviction may use all threads in sys pool, it leads to hangs send a 
> message via sys pool 
> --
>
> Key: IGNITE-9244
> URL: https://issues.apache.org/jira/browse/IGNITE-9244
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> In the current implementation, GridDhtPartitionsEvictor evicts partitions one 
> by one.
> A GridDhtPartitionsEvictor is created for each cache group; if we try to evict 
> as many groups as there are threads in the sys pool, the group evictors will 
> take all available threads in the sys pool. This leads to hangs when sending a 
> message via the sys pool. As a fix, I suggest limiting concurrent eviction in 
> the sys pool or using another pool for this purpose.
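
A minimal sketch (an assumption about the approach, not the actual patch in the 
linked PR) of the "permits counter" idea: a shared semaphore caps how many 
eviction tasks may occupy pool threads at once, so other system messages can 
still be processed:
{code:java}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

/** Illustrative only: caps how many eviction tasks may run in the shared pool at once. */
class EvictionPermitsSketch {
    private final ExecutorService sysPool = Executors.newFixedThreadPool(8);

    /** At most this many eviction tasks occupy pool threads; the rest wait in the queue. */
    private final Semaphore permits = new Semaphore(4);

    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();

    void scheduleEviction(Runnable evictPartitionTask) {
        pending.add(evictPartitionTask);

        trySubmit();
    }

    private void trySubmit() {
        // Submit queued tasks only while permits are available; never block a pool thread.
        while (permits.tryAcquire()) {
            Runnable task = pending.poll();

            if (task == null) {
                permits.release();

                return;
            }

            sysPool.submit(() -> {
                try {
                    task.run();
                }
                finally {
                    permits.release();

                    trySubmit(); // Pick up the next pending eviction, if any.
                }
            });
        }
    }
}
{code}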



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster

2018-08-10 Thread Pavel Vinokurov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576053#comment-16576053
 ] 

Pavel Vinokurov commented on IGNITE-9178:
-

Test results look good.
[~Jokser][~agoncharuk] Please review

> Partition lost event are not triggered if multiple nodes left cluster
> -
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Pavel Vinokurov
>Assignee: Pavel Vinokurov
>Priority: Blocker
> Fix For: 2.7
>
>
> If multiple nodes leave the cluster simultaneously, the partitions of the left 
> nodes are removed from GridDhtPartitionTopologyImpl#node2part without being 
> added to leftNode2Part in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost 
> partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9147) Race between tx rollback and prepare on near node can produce hanging primary tx

2018-08-10 Thread Alexei Scherbakov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexei Scherbakov updated IGNITE-9147:
--
Summary: Race between tx rollback and prepare on near node can produce 
hanging primary tx  (was: When server node left cluster on high load, cluster 
take hang on PartitionalExchange)

> Race between tx rollback and prepare on near node can produce hanging primary 
> tx
> 
>
> Key: IGNITE-9147
> URL: https://issues.apache.org/jira/browse/IGNITE-9147
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 2.5
>Reporter: ARomantsov
>Assignee: Alexei Scherbakov
>Priority: Critical
> Fix For: 2.7
>
>
> I ran a simple test:
> 1) Start 15 server nodes
> 2) Start a client with a long transaction
> 3) Additionally start 5 clients loading many caches (nearly 2 thousand)
> 4) Stop 1 server node, wait 1 minute and start it back
> The cluster froze for more than an hour, until the license expired



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9246) Optimistic transactions can wait for topology future on remap for a long time even if timeout is set.

2018-08-10 Thread Alexei Scherbakov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexei Scherbakov updated IGNITE-9246:
--
Fix Version/s: 2.7

> Optimistic transactions can wait for topology future on remap for a long time 
> even if timeout is set.
> -
>
> Key: IGNITE-9246
> URL: https://issues.apache.org/jira/browse/IGNITE-9246
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexei Scherbakov
>Assignee: Alexei Scherbakov
>Priority: Major
> Fix For: 2.7
>
>
> This is possible if a long PME occurs during the tx remap phase.
> Fix: wait for the new topology on remap with a timeout, if one is set.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-9246) Optimistic transactions can wait for topology future on remap for a long time even if timeout is set.

2018-08-10 Thread Alexei Scherbakov (JIRA)
Alexei Scherbakov created IGNITE-9246:
-

 Summary: Optimistic transactions can wait for topology future on 
remap for a long time even if timeout is set.
 Key: IGNITE-9246
 URL: https://issues.apache.org/jira/browse/IGNITE-9246
 Project: Ignite
  Issue Type: Improvement
Reporter: Alexei Scherbakov
Assignee: Alexei Scherbakov


This is possible if a long PME occurs during the tx remap phase.

Fix: wait for the new topology on remap with a timeout, if one is set.
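
A hypothetical sketch of the idea with an invented "topology ready" future (the 
real Ignite internals are not shown here): bound the wait during remap by the 
remaining transaction timeout instead of waiting indefinitely for a long PME:
{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustrative only: bounded wait for a new topology version during tx remap. */
class RemapTimeoutSketch {
    static long waitForNewTopology(CompletableFuture<Long> topologyReadyFut, long remainingTimeoutMs)
        throws Exception {
        // No timeout configured: keep the old behavior and wait as long as needed.
        if (remainingTimeoutMs <= 0)
            return topologyReadyFut.get();

        try {
            return topologyReadyFut.get(remainingTimeoutMs, TimeUnit.MILLISECONDS);
        }
        catch (TimeoutException e) {
            // Fail the transaction with a timeout instead of hanging on a long PME.
            throw new IllegalStateException("Transaction timed out while waiting for a new topology on remap", e);
        }
    }
}
{code}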



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9196) SQL: Memory leak in MapNodeResults

2018-08-10 Thread Ilya Kasnacheev (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576011#comment-16576011
 ] 

Ilya Kasnacheev commented on IGNITE-9196:
-

[~tledkov-gridgain] please review the proposed fix.

> SQL: Memory leak in MapNodeResults
> --
>
> Key: IGNITE-9196
> URL: https://issues.apache.org/jira/browse/IGNITE-9196
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.6
>Reporter: Denis Mekhanikov
>Assignee: Denis Mekhanikov
>Priority: Blocker
> Fix For: 2.7
>
>
> When the size of a SQL query result set is a multiple of {{Query#pageSize}}, 
> the {{MapQueryResult}} is never closed and removed from the 
> {{MapNodeResults#res}} collection.
> The following code leads to an OOME when run with a 1 GB heap:
> {code:java}
> public class MemLeakRepro {
>     public static void main(String[] args) {
>         Ignition.start(getConfiguration("server"));
> 
>         try (Ignite client = Ignition.start(getConfiguration("client").setClientMode(true))) {
>             IgniteCache<Integer, Person> cache = startPeopleCache(client);
> 
>             int pages = 10;
>             int pageSize = 1024;
> 
>             for (int i = 0; i < pages * pageSize; i++) {
>                 Person p = new Person("Person #" + i, 25);
> 
>                 cache.put(i, p);
>             }
> 
>             for (int i = 0; i < 1_000_000; i++) {
>                 if (i % 1000 == 0)
>                     System.out.println("Select iteration #" + i);
> 
>                 Query<List<?>> qry = new SqlFieldsQuery("select * from people");
> 
>                 qry.setPageSize(pageSize);
> 
>                 QueryCursor<List<?>> cursor = cache.query(qry);
> 
>                 cursor.getAll();
>                 cursor.close();
>             }
>         }
>     }
> 
>     private static IgniteConfiguration getConfiguration(String instanceName) {
>         IgniteConfiguration igniteCfg = new IgniteConfiguration();
> 
>         igniteCfg.setIgniteInstanceName(instanceName);
> 
>         TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();
> 
>         discoSpi.setIpFinder(new TcpDiscoveryVmIpFinder(true));
> 
>         igniteCfg.setDiscoverySpi(discoSpi);
> 
>         return igniteCfg;
>     }
> 
>     private static IgniteCache<Integer, Person> startPeopleCache(Ignite node) {
>         CacheConfiguration<Integer, Person> cacheCfg = new CacheConfiguration<>("cache");
> 
>         QueryEntity qe = new QueryEntity(Integer.class, Person.class);
> 
>         qe.setTableName("people");
> 
>         cacheCfg.setQueryEntities(Collections.singleton(qe));
>         cacheCfg.setSqlSchema("PUBLIC");
> 
>         return node.getOrCreateCache(cacheCfg);
>     }
> 
>     public static class Person {
>         @QuerySqlField
>         private String name;
> 
>         @QuerySqlField
>         private int age;
> 
>         public Person(String name, int age) {
>             this.name = name;
>             this.age = age;
>         }
>     }
> }
> {code}
>  
> At the same time it works perfectly fine when there are, for example, 
> {{pages * pageSize - 1}} records in the cache instead.
> The reason is that the {{MapQueryResult#fetchNextPage(...)}} method doesn't 
> return true when the result set size is a multiple of the page size.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8950) Need to have more informative output info while database files check operation.

2018-08-10 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576009#comment-16576009
 ] 

Dmitriy Pavlov commented on IGNITE-8950:


I understand that the fix is very simple. At the same time, I think it would be 
perfectly OK to run at least the basic suite:

https://ci.ignite.apache.org/viewQueued.html?itemId=1626834&tab=queuedBuildOverviewTab

> Need to have more informative output info while database files check 
> operation.
> ---
>
> Key: IGNITE-8950
> URL: https://issues.apache.org/jira/browse/IGNITE-8950
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Affects Versions: 2.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Minor
> Fix For: 2.7
>
>
> "Failed to verify store file ..." messages have no file path info.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9235) Transitivity violation in GridMergeIndex Comparator

2018-08-10 Thread Evgenii Zagumennov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evgenii Zagumennov updated IGNITE-9235:
---
Fix Version/s: 2.5

> Transitivity violation in GridMergeIndex Comparator
> ---
>
> Key: IGNITE-9235
> URL: https://issues.apache.org/jira/browse/IGNITE-9235
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.5
>Reporter: Andrew Medvedev
>Assignee: Andrew Medvedev
>Priority: Major
> Fix For: 2.5
>
>
> Currently the comparator in 
> org.apache.ignite.internal.processors.query.h2.twostep.GridMergeIndex is:
> {code:java}
> private final Comparator<RowStream> streamCmp = new Comparator<RowStream>() {
>     @Override public int compare(RowStream o1, RowStream o2) {
>         // Nulls at the beginning.
>         if (o1 == null)
>             return -1;
> 
>         if (o2 == null)
>             return 1;
> 
>         return compareRows(o1.get(), o2.get());
>     }
> };
> {code}
> --
>  
> This comparator violates transitivity when both o1 and o2 are null. Thus we 
> get an exception on JDK 1.8:
>  
>  
> {color:#d04437}Caused by: java.lang.IllegalArgumentException: Comparison 
> method violates its general contract!{color}
> {color:#d04437}  at java.util.TimSort.mergeHi(TimSort.java:899){color}
> {color:#d04437}  at java.util.TimSort.mergeAt(TimSort.java:516){color}
> {color:#d04437}  at java.util.TimSort.mergeCollapse(TimSort.java:441){color}
> {color:#d04437}  at java.util.TimSort.sort(TimSort.java:245){color}
> {color:#d04437}  at java.util.Arrays.sort(Arrays.java:1438){color}
> {color:#d04437}  at 
> org.apache.ignite.internal.processors.query.h2.twostep.GridMergeIndexSorted$MergeStreamIterator.goFirst(GridMergeIndexSorted.java:248){color}
> {color:#d04437}  at 
> org.apache.ignite.internal.processors.query.h2.twostep.GridMergeIndexSorted$MergeStreamIterator.hasNext(GridMergeIndexSorted.java:270){color}
> {color:#d04437}  at 
> org.apache.ignite.internal.processors.query.h2.twostep.GridMergeIndex$FetchingCursor.fetchRows(GridMergeIndex.java:614){color}
> {color:#d04437}  at 
> org.apache.ignite.internal.processors.query.h2.twostep.GridMergeIndex$FetchingCursor.next(GridMergeIndex.java:658){color}
> {color:#d04437}  at org.h2.index.IndexCursor.next(IndexCursor.java:305){color}
> {color:#d04437}  at org.h2.table.TableFilter.next(TableFilter.java:499){color}
> {color:#d04437}  at 
> org.h2.command.dml.Select$LazyResultQueryFlat.fetchNextRow(Select.java:1452){color}
> {color:#d04437}  at 
> org.h2.result.LazyResult.hasNext(LazyResult.java:79){color}
> {color:#d04437}  at org.h2.result.LazyResult.next(LazyResult.java:59){color}
> {color:#d04437}  at 
> org.h2.command.dml.Select.queryFlat(Select.java:519){color}
> {color:#d04437}  at 
> org.h2.command.dml.Select.queryWithoutCache(Select.java:625){color}
> {color:#d04437}  at 
> org.h2.command.dml.Query.queryWithoutCacheLazyCheck(Query.java:114){color}
> {color:#d04437}  at org.h2.command.dml.Query.query(Query.java:352){color}
> {color:#d04437}  at org.h2.command.dml.Query.query(Query.java:333){color}
> {color:#d04437}  at 
> org.h2.command.CommandContainer.query(CommandContainer.java:113){color}
> {color:#d04437}  at 
> org.h2.command.Command.executeQuery(Command.java:201){color}
> {color:#d04437} ... 44 more{color}
>   
> WA: use -Djava.util.Arrays.useLegacyMergeSort=true
>  
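
A minimal sketch of a contract-respecting variant (an assumption about the fix, 
not the committed patch): two nulls must compare as equal so that 
sgn(compare(x, y)) == -sgn(compare(y, x)) holds for all inputs:
{code:java}
import java.util.Comparator;

/** Illustrative RowStream stand-in; the real type is an inner class of the merge index. */
interface RowStream {
    Object get();
}

class NullSafeStreamComparator implements Comparator<RowStream> {
    @Override public int compare(RowStream o1, RowStream o2) {
        // Nulls at the beginning, but two nulls (or the same reference) compare as equal.
        if (o1 == o2)
            return 0;

        if (o1 == null)
            return -1;

        if (o2 == null)
            return 1;

        return compareRows(o1.get(), o2.get());
    }

    /** Placeholder for the actual row comparison; invented here for the sketch. */
    private int compareRows(Object r1, Object r2) {
        return 0;
    }
}
{code}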



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-9238) Test GridTaskFailoverAffinityRunTest.testNodeRestartClient hangs when coordinator forces client to reconnect on grid startup.

2018-08-10 Thread Pavel Pereslegin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16575932#comment-16575932
 ] 

Pavel Pereslegin commented on IGNITE-9238:
--

Hello [~Jokser],
review this fix, please.

When the coordinator checks the exchange history, it can see an updated 
affinity version while the exchange future on which that affinity version was 
updated is not yet fully completed.

> Test GridTaskFailoverAffinityRunTest.testNodeRestartClient hangs when 
> coordinator forces client to reconnect on grid startup.
> -
>
> Key: IGNITE-9238
> URL: https://issues.apache.org/jira/browse/IGNITE-9238
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.6
>Reporter: Pavel Pereslegin
>Assignee: Pavel Pereslegin
>Priority: Major
> Fix For: 2.7
>
>
> Example of such hang on TC: 
> https://ci.ignite.apache.org/viewLog.html?buildId=1605243&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_ComputeGrid
> Log output:
> {noformat}
> ...
> [2018-08-07 12:20:09,804][WARN 
> ][sys-#12799%internal.GridTaskFailoverAffinityRunTest1%][GridCachePartitionExchangeManager]
>  Client node tries to connect but its exchange info is cleaned up from 
> exchange history. Consider increasing 'IGNITE_EXCHANGE_HISTORY_SIZE' property 
> or start clients in  smaller batches. Current settings and versions: 
> [IGNITE_EXCHANGE_HISTORY_SIZE=1000, initVer=AffinityTopologyVersion 
> [topVer=3, minorTopVer=0], readyVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0]].
> [2018-08-07 12:20:09,804][INFO 
> ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][GridDhtPartitionsExchangeFuture]
>  Completed partition exchange 
> [localNode=511d5932-5f22-4919-807d-575c7f61, 
> exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion 
> [topVer=3, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode 
> [id=6b9a7a1d-07bf-4d20-882a-8462ada3, addrs=ArrayList [127.0.0.1], 
> sockAddrs=HashSet [/127.0.0.1:47502], discPort=47502, order=3, intOrder=3, 
> lastExchangeTime=1533644409739, loc=false, ver=2.7.0#20180807-sha1:e96616f5, 
> isClient=false], done=true], topVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0], durationFromInit=21]
> [2018-08-07 12:20:09,806][INFO 
> ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][time] 
> Finished exchange init [topVer=AffinityTopologyVersion [topVer=3, 
> minorTopVer=0], crd=true]
> [2018-08-07 12:20:09,807][INFO 
> ][exchange-worker-#12782%internal.GridTaskFailoverAffinityRunTest1%][GridCachePartitionExchangeManager]
>  Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion 
> [topVer=4, minorTopVer=0], force=false, evt=NODE_JOINED, 
> node=6b9a7a1d-07bf-4d20-882a-8462ada3]
> [2018-08-07 12:20:09,811][INFO 
> ][sys-#12798%internal.GridTaskFailoverAffinityRunTest2%][GridDhtPartitionsExchangeFuture]
>  Finish exchange future [startVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0], resVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], 
> err=null]
> [2018-08-07 12:20:09,813][INFO 
> ][sys-#12798%internal.GridTaskFailoverAffinityRunTest2%][GridDhtPartitionsExchangeFuture]
>  Completed partition exchange 
> [localNode=a3206c1f-6d57-4fd6-8aa5-e22f3b42, 
> exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion 
> [topVer=4, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode 
> [id=a3206c1f-6d57-4fd6-8aa5-e22f3b42, addrs=ArrayList [127.0.0.1], 
> sockAddrs=HashSet [/127.0.0.1:47503], discPort=47503, order=4, intOrder=4, 
> lastExchangeTime=1533644409779, loc=true, ver=2.7.0#20180807-sha1:e96616f5, 
> isClient=false], done=true], topVer=AffinityTopologyVersion [topVer=4, 
> minorTopVer=0], durationFromInit=41]
> [2018-08-07 12:20:09,814][INFO 
> ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] To 
> start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
> [2018-08-07 12:20:09,815][INFO 
> ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] 
> [2018-08-07 12:20:09,815][INFO 
> ][grid-starter-testNodeRestartClient-1][GridTaskFailoverAffinityRunTest1] 
> >>> +---+
> >>> Ignite ver. 
> >>> 2.7.0-SNAPSHOT#20180807-sha1:e96616f580930f267eab44f75d410fa29a876bcb
> >>> +---+
> >>> OS name: Linux 4.4.0-128-generic amd64
> >>> CPU(s): 5
> >>> Heap: 2.0GB
> >>> VM name: 20126@8790182f15a5
> >>> Ignite instance name: internal.GridTaskFailoverAffinityRunTest1
> >>> Local node [ID=511D5932-5F22-4919-807D-575C7F61, order=2, 
> >>> clientMode=false]
> >>> Local node addresses: [127.0.0

[jira] [Updated] (IGNITE-9245) Document how to monitor Ignite with Zabbix

2018-08-10 Thread Artem Budnikov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Budnikov updated IGNITE-9245:
---
Issue Type: Task  (was: Test)

> Document how to monitor Ignite with Zabbix 
> ---
>
> Key: IGNITE-9245
> URL: https://issues.apache.org/jira/browse/IGNITE-9245
> Project: Ignite
>  Issue Type: Task
>  Components: documentation
>Reporter: Artem Budnikov
>Assignee: Artem Budnikov
>Priority: Major
>
> Create a how-to page with instructions on how to use Zabbix templates to 
> monitor Ignite metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-9245) Document how to monitor Ignite with Zabbix

2018-08-10 Thread Artem Budnikov (JIRA)
Artem Budnikov created IGNITE-9245:
--

 Summary: Document how to monitor Ignite with Zabbix 
 Key: IGNITE-9245
 URL: https://issues.apache.org/jira/browse/IGNITE-9245
 Project: Ignite
  Issue Type: Test
  Components: documentation
Reporter: Artem Budnikov
Assignee: Artem Budnikov


Create a how-to page with instructions on how to use Zabbix templates to 
monitor Ignite metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin reassigned IGNITE-9244:
--

Assignee: Dmitriy Govorukhin

> Partition eviction may use all threads in sys pool, it leads to hangs send a 
> message via sys pool 
> --
>
> Key: IGNITE-9244
> URL: https://issues.apache.org/jira/browse/IGNITE-9244
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> In the current implementation, GridDhtPartitionsEvictor evicts partitions one 
> by one.
> A GridDhtPartitionsEvictor is created for each cache group; if we try to evict 
> as many groups as there are threads in the sys pool, the group evictors will 
> take all available threads in the sys pool. This leads to hangs when sending a 
> message via the sys pool. As a fix, I suggest limiting concurrent eviction in 
> the sys pool or using another pool for this purpose.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-9244:
---
Ignite Flags:   (was: Docs Required)

> Partition eviction may use all threads in sys pool, it leads to hangs send a 
> message via sys pool 
> --
>
> Key: IGNITE-9244
> URL: https://issues.apache.org/jira/browse/IGNITE-9244
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> In the current implementation, GridDhtPartitionsEvictor evicts partitions one 
> by one.
> A GridDhtPartitionsEvictor is created for each cache group; if we try to evict 
> as many groups as there are threads in the sys pool, the group evictors will 
> take all available threads in the sys pool. This leads to hangs when sending a 
> message via the sys pool. As a fix, I suggest limiting concurrent eviction in 
> the sys pool or using another pool for this purpose.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-9244:
---
Environment: (was: In the current implementation, 
GridDhtPartitionsEvictor reset partition to evict one by one.
GridDhtPartitionsEvictor is created for each cache group, if we try to evict 
too many groups as sys pool size, group evictors will take all available 
threads in sys pool. It leads to hangs send a message via sys pool. As a fix, I 
suggest to limit concurrent execution via sys pool or use another pool for this 
purpose.)

> Partition eviction may use all threads in sys pool, it leads to hangs send a 
> message via sys pool 
> --
>
> Key: IGNITE-9244
> URL: https://issues.apache.org/jira/browse/IGNITE-9244
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> In the current implementation, GridDhtPartitionsEvictor evicts partitions one 
> by one.
> A GridDhtPartitionsEvictor is created for each cache group; if we try to evict 
> as many groups as there are threads in the sys pool, the group evictors will 
> take all available threads in the sys pool. This leads to hangs when sending a 
> message via the sys pool. As a fix, I suggest limiting concurrent eviction in 
> the sys pool or using another pool for this purpose.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-9244:
---
Description: 

In the current implementation, GridDhtPartitionsEvictor evicts partitions one 
by one.
A GridDhtPartitionsEvictor is created for each cache group; if we try to evict 
as many groups as there are threads in the sys pool, the group evictors will 
take all available threads in the sys pool. This leads to hangs when sending a 
message via the sys pool. As a fix, I suggest limiting concurrent eviction in 
the sys pool or using another pool for this purpose.

> Partition eviction may use all threads in sys pool, it leads to hangs send a 
> message via sys pool 
> --
>
> Key: IGNITE-9244
> URL: https://issues.apache.org/jira/browse/IGNITE-9244
> Project: Ignite
>  Issue Type: Bug
> Environment: In the current implementation, GridDhtPartitionsEvictor 
> reset partition to evict one by one.
> GridDhtPartitionsEvictor is created for each cache group, if we try to evict 
> too many groups as sys pool size, group evictors will take all available 
> threads in sys pool. It leads to hangs send a message via sys pool. As a fix, 
> I suggest to limit concurrent execution via sys pool or use another pool for 
> this purpose.
>Reporter: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>
> In the current implementation, GridDhtPartitionsEvictor evicts partitions one 
> by one.
> A GridDhtPartitionsEvictor is created for each cache group; if we try to evict 
> as many groups as there are threads in the sys pool, the group evictors will 
> take all available threads in the sys pool. This leads to hangs when sending a 
> message via the sys pool. As a fix, I suggest limiting concurrent eviction in 
> the sys pool or using another pool for this purpose.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-9244:
--

 Summary: Partition eviction may use all threads in sys pool, it 
leads to hangs send a message via sys pool 
 Key: IGNITE-9244
 URL: https://issues.apache.org/jira/browse/IGNITE-9244
 Project: Ignite
  Issue Type: Bug
 Environment: In the current implementation, GridDhtPartitionsEvictor 
reset partition to evict one by one.
GridDhtPartitionsEvictor is created for each cache group, if we try to evict 
too many groups as sys pool size, group evictors will take all available 
threads in sys pool. It leads to hangs send a message via sys pool. As a fix, I 
suggest to limit concurrent execution via sys pool or use another pool for this 
purpose.
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-9244:
---
Fix Version/s: 2.7

> Partition eviction may use all threads in sys pool, it leads to hangs send a 
> message via sys pool 
> --
>
> Key: IGNITE-9244
> URL: https://issues.apache.org/jira/browse/IGNITE-9244
> Project: Ignite
>  Issue Type: Bug
> Environment: In the current implementation, GridDhtPartitionsEvictor 
> reset partition to evict one by one.
> GridDhtPartitionsEvictor is created for each cache group, if we try to evict 
> too many groups as sys pool size, group evictors will take all available 
> threads in sys pool. It leads to hangs send a message via sys pool. As a fix, 
> I suggest to limit concurrent execution via sys pool or use another pool for 
> this purpose.
>Reporter: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)