[jira] [Commented] (IGNITE-11936) Avoid changing AffinityTopologyVersion on a server node join/left event from not baseline topology.

2019-07-10 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882305#comment-16882305
 ] 

Ignite TC Bot commented on IGNITE-11936:


{panel:title=-- Run :: All: No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4300185&buildTypeId=IgniteTests24Java8_RunAll]

> Avoid changing AffinityTopologyVersion on a server node join/left event from 
> not baseline topology.
> ---
>
> Key: IGNITE-11936
> URL: https://issues.apache.org/jira/browse/IGNITE-11936
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Amelchev Nikita
>Assignee: Amelchev Nikita
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, a client join/left event does not change the AffinityTopologyVersion 
> (see IGNITE-9558). Likewise, it shouldn't be changed on a join/left event of a 
> server node that is not part of the baseline topology.
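To illustrate the intended condition (not the actual exchange code), here is a minimal sketch that checks whether a joining/leaving server node belongs to the current baseline topology via the public cluster API; the surrounding variables are assumptions:
{noformat}
import java.util.Collection;
import org.apache.ignite.Ignite;
import org.apache.ignite.cluster.BaselineNode;
import org.apache.ignite.cluster.ClusterNode;

/** Sketch only: true if the given server node is part of the current baseline topology. */
static boolean inBaseline(Ignite ignite, ClusterNode node) {
    if (node.isClient())
        return false; // client events already do not change the affinity version (IGNITE-9558)

    Collection<BaselineNode> baseline = ignite.cluster().currentBaselineTopology();

    return baseline != null && baseline.stream()
        .anyMatch(b -> b.consistentId().equals(node.consistentId()));
}
{noformat}
Only when a check like this returns true would the topology event need to produce a new affinity version; the actual decision point inside the exchange code is not shown here.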



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-11951) Set ThreadLocal node name only once in JdkMarshaller

2019-07-10 Thread Aleksey Plekhanov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882217#comment-16882217
 ] 

Aleksey Plekhanov commented on IGNITE-11951:


[~Pavlukhin], I've looked at your PR. It looks good to me.

> Set ThreadLocal node name only once in JdkMarshaller
> 
>
> Key: IGNITE-11951
> URL: https://issues.apache.org/jira/browse/IGNITE-11951
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.7.5
>Reporter: Ivan Pavlukhin
>Assignee: Ivan Pavlukhin
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently {{JdkMarshaller}} saves the node name twice in a couple of 
> marshal/unmarshal methods. The code can be improved to do it only once.
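As an illustration of the pattern (a simplified sketch, not the actual JdkMarshaller code; the class and helper names are hypothetical), the thread-local node name can be set once at the entry point instead of in each nested helper:
{noformat}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

/** Sketch only: set the thread-local node name once per marshalling call. */
public class NodeNameMarshallerSketch {
    private static final ThreadLocal<String> NODE_NAME = new ThreadLocal<>();

    public byte[] marshal(Object obj, String nodeName) throws IOException {
        NODE_NAME.set(nodeName); // set once here...

        try {
            return marshal0(obj); // ...so nested helpers do not set it again
        }
        finally {
            NODE_NAME.remove();
        }
    }

    private byte[] marshal0(Object obj) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(obj); // NODE_NAME is already available to any code that needs it
        }

        return out.toByteArray();
    }
}
{noformat}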



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-11958) JDBC connection validation should use it's own task instead of cache validation task

2019-07-10 Thread Yury Gerzhedovich (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882079#comment-16882079
 ] 

Yury Gerzhedovich commented on IGNITE-11958:


Looks ok to me

> JDBC connection validation should use it's own task instead of cache 
> validation task
> 
>
> Key: IGNITE-11958
> URL: https://issues.apache.org/jira/browse/IGNITE-11958
> Project: Ignite
>  Issue Type: Bug
>Reporter: Denis Chudov
>Assignee: Denis Chudov
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> A JDBC connection is validated using GridCacheQueryJdbcValidationTask. We 
> should create our own validation task for this activity.
>  
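For illustration only (the class and the check are assumptions, not the actual fix), a dedicated validation job could be as small as a callable that confirms the cluster is usable, without touching the cache-query machinery:
{noformat}
import org.apache.ignite.Ignite;
import org.apache.ignite.lang.IgniteCallable;
import org.apache.ignite.resources.IgniteInstanceResource;

/** Sketch only: a dedicated connection-validation job independent of cache queries. */
public class JdbcConnectionValidationJob implements IgniteCallable<Boolean> {
    @IgniteInstanceResource
    private Ignite ignite;

    @Override public Boolean call() {
        // A trivial liveness check; the real task may verify more.
        return ignite.cluster().active();
    }
}
{noformat}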



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11974) infinite loop and 100% cpu in GridDhtPartitionsEvictor: Eviction in progress ...

2019-07-10 Thread Igor Kamyshnikov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Kamyshnikov updated IGNITE-11974:
--
Attachment: server-node-restarts-1.png

> infinite loop and 100% cpu in GridDhtPartitionsEvictor: Eviction in progress 
> ...
> 
>
> Key: IGNITE-11974
> URL: https://issues.apache.org/jira/browse/IGNITE-11974
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.5
>Reporter: Igor Kamyshnikov
>Priority: Blocker
> Attachments: image-2019-07-10-16-07-37-185.png, 
> server-node-restarts-1.png
>
>
> Note: RCA was not done.
> Sometimes Ignite server nodes fall into an infinite loop and consume 100% CPU:
> {noformat}
> "sys-#260008" #260285 prio=5 os_prio=0 tid=0x7fabb020a800 nid=0x1e850 
> runnable [0x7fab26fef000]
>java.lang.Thread.State: RUNNABLE
>   at 
> java.util.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3339)
>   at 
> java.util.concurrent.ConcurrentHashMap$ValueIterator.next(ConcurrentHashMap.java:3439)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionsEvictor$1.call(GridDhtPartitionsEvictor.java:84)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionsEvictor$1.call(GridDhtPartitionsEvictor.java:73)
>   at 
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6695)
>   at 
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
>Locked ownable synchronizers:
>   - <0x000649b9cba0> (a 
> java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
> The following appears in the logs every 2 minutes:
> {noformat}
>  INFO  2019-07-08 12:21:45.081 (1562581305081) [sys-#98168] 
> [GridDhtPartitionsEvictor] > Eviction in progress 
> [grp=CUSTPRODINVOICEDISCUSAGE, remainingCnt=102]
> {noformat}
> remainingCnt has stayed the same since it reached 102 (the very first such line in 
> the logs had a value of 101).
> Some other facts:
> We have a heapdump taken at *topVer = 900*. The problem appeared after 
> *topVer = 790*, but it looks like it had been silently pending since *topVer = 641* 
> (about 24 hours earlier).
> There were 259 topology changes between 641 and 900.
> All 102 GridDhtLocalPartitions can be found in the heapdump:
> {noformat}
> select * from 
> "org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition"
>  t where delayedRenting = true
> {noformat}
> They all have status = 65537 , which means (according to 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition#state):
> reservations(65537) = 1
> getPartState(65537) = OWNING
> There are also 26968 instances of 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl$$Lambda$70,
>  that are created by 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl#checkEvictions
>  method.
> 26418 of 26968 refer to AtomicInteger instance with value = 102:
> 26418/102 = 259 = 900 - 641 (see topology info above).
> The key thing seen from the heapdump is that topVer = 641 or topVer = 642 was 
> the last topology in which these 102 partitions were assigned to the current 
> Ignite server node.
> {noformat}
> select
>   t.this
>  ,t.this['clientEvtChange'] as clientEvtChange
>  ,t.this['topVer.topVer'] as topVer
>  
> ,t.this['assignment.elementData'][555]['elementData'][0]['hostNames.elementData'][0]
>  as primary_part
>  
> ,t.this['assignment.elementData'][555]['elementData'][1]['hostNames.elementData'][0]
>  as secondary_part
> from org.apache.ignite.internal.processors.affinity.HistoryAffinityAssignment 
> t where length(t.this['assignment.elementData']) = 1024
> order by topVer
> {noformat}
>  !image-2019-07-10-16-07-37-185.png! 
> The connection of a client node at topVer = 790 somehow triggered the 
> GridDhtPartitionsEvictor loop to execute.
> Summary:
> 1) It is seen that the 102 partitions have one reservation each and are in OWNING state.
> 2) They were backup partitions.
> 3) For some reason their eviction has been silently delayed (because of the 
> reservations), but each topology change seemed to trigger an eviction attempt.
> 4) Something caused 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionsEvictor#evictPartitionAsync
>  to run without ever exiting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (IGNITE-11974) infinite loop and 100% cpu in GridDhtPartitionsEvictor: Eviction in progress ...

2019-07-10 Thread Igor Kamyshnikov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Kamyshnikov updated IGNITE-11974:
--
Description: 
Note: RCA was not done.

Sometimes Ignite server nodes fall into an infinite loop and consume 100% CPU:
{noformat}
"sys-#260008" #260285 prio=5 os_prio=0 tid=0x7fabb020a800 nid=0x1e850 
runnable [0x7fab26fef000]
   java.lang.Thread.State: RUNNABLE
at 
java.util.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3339)
at 
java.util.concurrent.ConcurrentHashMap$ValueIterator.next(ConcurrentHashMap.java:3439)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionsEvictor$1.call(GridDhtPartitionsEvictor.java:84)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionsEvictor$1.call(GridDhtPartitionsEvictor.java:73)
at 
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6695)
at 
org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
- <0x000649b9cba0> (a 
java.util.concurrent.ThreadPoolExecutor$Worker)
{noformat}

The following appears in the logs every 2 minutes:
{noformat}
 INFO  2019-07-08 12:21:45.081 (1562581305081) [sys-#98168] 
[GridDhtPartitionsEvictor] > Eviction in progress 
[grp=CUSTPRODINVOICEDISCUSAGE, remainingCnt=102]
{noformat}

remainingCnt has stayed the same since it reached 102 (the very first such line in 
the logs had a value of 101).

Some other facts:
We have a heapdump taken at *topVer = 900*. The problem appeared after 
*topVer = 790*, but it looks like it had been silently pending since *topVer = 641* 
(about 24 hours earlier).
There were 259 topology changes between 641 and 900.

All 102 GridDhtLocalPartitions can be found in the heapdump:
{noformat}
select * from 
"org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition"
 t where delayedRenting = true
{noformat}

They all have status = 65537, which means (according to 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition#state):

reservations(65537) = 1
getPartState(65537) = OWNING
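For the record, the arithmetic behind that reading (assuming, as the two lines above imply, that the reservation count and the partition state are packed into a single value, with the reservations in the upper bits):
{noformat}
// 65537 = 0x0001_0001 = (1 << 16) | 1
int status = 65537;

int reservations = status >>> 16;   // 1 -> one outstanding reservation
int stateBits    = status & 0xFFFF; // 1 -> interpreted above as OWNING
{noformat}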

There are also 26968 instances of 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl$$Lambda$70,
 that are created by 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl#checkEvictions
 method.
26418 of 26968 refer to AtomicInteger instance with value = 102:
26418/102 = 259 = 900 - 641 (see topology info above).

The key thing seen from the heapdump is that topVer = 641 or topVer = 642 was 
the last topology in which these 102 partitions were assigned to the current 
Ignite server node.

{noformat}
select
  t.this
 ,t.this['clientEvtChange'] as clientEvtChange
 ,t.this['topVer.topVer'] as topVer
 
,t.this['assignment.elementData'][555]['elementData'][0]['hostNames.elementData'][0]
 as primary_part
 
,t.this['assignment.elementData'][555]['elementData'][1]['hostNames.elementData'][0]
 as secondary_part
from org.apache.ignite.internal.processors.affinity.HistoryAffinityAssignment t 
where length(t.this['assignment.elementData']) = 1024
order by topVer
{noformat}

 !image-2019-07-10-16-07-37-185.png! 

The connection of a client node at topVer = 790 somehow triggered the 
GridDhtPartitionsEvictor loop to execute.

Summary:
1) It is seen that the 102 partitions have one reservation each and are in OWNING state.
2) They were backup partitions.
3) For some reason their eviction has been silently delayed (because of the 
reservations), but each topology change seemed to trigger an eviction attempt.
4) Something caused 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionsEvictor#evictPartitionAsync
 to run without ever exiting.

Additional info:
topVer = 641 was in a chain of server node restarts (not sure if rebalancing 
actually succeeded):
 !server-node-restarts-1.png! 

  was:
Note: RCA was not done:

Sometimes ignite server nodes fall into infinite loop and consume 100% cpu:
{noformat}
"sys-#260008" #260285 prio=5 os_prio=0 tid=0x7fabb020a800 nid=0x1e850 
runnable [0x7fab26fef000]
   java.lang.Thread.State: RUNNABLE
at 
java.util.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3339)
at 
java.util.concurrent.ConcurrentHashMap$ValueIterator.next(ConcurrentHashMap.java:3439)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionsEvictor$1.call(GridDhtPartitionsEvictor.java:84)

[jira] [Created] (IGNITE-11974) infinite loop and 100% cpu in GridDhtPartitionsEvictor: Eviction in progress ...

2019-07-10 Thread Igor Kamyshnikov (JIRA)
Igor Kamyshnikov created IGNITE-11974:
-

 Summary: infinite loop and 100% cpu in GridDhtPartitionsEvictor: 
Eviction in progress ...
 Key: IGNITE-11974
 URL: https://issues.apache.org/jira/browse/IGNITE-11974
 Project: Ignite
  Issue Type: Bug
  Components: cache
Affects Versions: 2.5
Reporter: Igor Kamyshnikov
 Attachments: image-2019-07-10-16-07-37-185.png

Note: RCA was not done.

Sometimes Ignite server nodes fall into an infinite loop and consume 100% CPU:
{noformat}
"sys-#260008" #260285 prio=5 os_prio=0 tid=0x7fabb020a800 nid=0x1e850 
runnable [0x7fab26fef000]
   java.lang.Thread.State: RUNNABLE
at 
java.util.concurrent.ConcurrentHashMap$Traverser.advance(ConcurrentHashMap.java:3339)
at 
java.util.concurrent.ConcurrentHashMap$ValueIterator.next(ConcurrentHashMap.java:3439)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionsEvictor$1.call(GridDhtPartitionsEvictor.java:84)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionsEvictor$1.call(GridDhtPartitionsEvictor.java:73)
at 
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6695)
at 
org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
- <0x000649b9cba0> (a 
java.util.concurrent.ThreadPoolExecutor$Worker)
{noformat}

The following appears in the logs every 2 minutes:
{noformat}
 INFO  2019-07-08 12:21:45.081 (1562581305081) [sys-#98168] 
[GridDhtPartitionsEvictor] > Eviction in progress 
[grp=CUSTPRODINVOICEDISCUSAGE, remainingCnt=102]
{noformat}

remainingCnt has stayed the same since it reached 102 (the very first such line in 
the logs had a value of 101).

Some other facts:
We have a heapdump taken at *topVer = 900*. The problem appeared after 
*topVer = 790*, but it looks like it had been silently pending since *topVer = 641* 
(about 24 hours earlier).
There were 259 topology changes between 641 and 900.

All 102 GridDhtLocalPartitions can be found in the heapdump:
{noformat}
select * from 
"org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition"
 t where delayedRenting = true
{noformat}

They all have status = 65537 , which means (according to 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition#state):

reservations(65537) = 1
getPartState(65537) = OWNING

There are also 26968 instances of 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl$$Lambda$70,
 that are created by 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl#checkEvictions
 method.
26418 of 26968 refer to AtomicInteger instance with value = 102:
26418/102 = 259 = 900 - 641 (see topology info above).

The key thing seen from the heapdump is that topVer = 641 or topVer = 642 was 
the last topology in which these 102 partitions were assigned to the current 
Ignite server node.

{noformat}
select
  t.this
 ,t.this['clientEvtChange'] as clientEvtChange
 ,t.this['topVer.topVer'] as topVer
 
,t.this['assignment.elementData'][555]['elementData'][0]['hostNames.elementData'][0]
 as primary_part
 
,t.this['assignment.elementData'][555]['elementData'][1]['hostNames.elementData'][0]
 as secondary_part
from org.apache.ignite.internal.processors.affinity.HistoryAffinityAssignment t 
where length(t.this['assignment.elementData']) = 1024
order by topVer
{noformat}

 !image-2019-07-10-16-07-37-185.png! 

The connection of a client node at topVer = 790 somehow triggered the 
GridDhtPartitionsEvictor loop to execute.

Summary:
1) It is seen that the 102 partitions have one reservation each and are in OWNING state.
2) They were backup partitions.
3) For some reason their eviction has been silently delayed (because of the 
reservations), but each topology change seemed to trigger an eviction attempt.
4) Something caused 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionsEvictor#evictPartitionAsync
 to run without ever exiting.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-11907) Registration of continuous query should fail if nodes don't have remote filter class

2019-07-10 Thread Ivan Pavlukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881891#comment-16881891
 ] 

Ivan Pavlukhin commented on IGNITE-11907:
-

[~rkondakov], [~jooger], thank you for the review. I fixed all points except
{quote} 1) IncompleteDeserializationExceptionTest - commented code at the 
end{quote}
I would prefer to keep the commented lines because they explain how a file with a 
serialized object is generated. Unfortunately, it is not possible to uncomment 
them because the test would not work as expected (the class should not be 
available at runtime).

> Registration of continuous query should fail if nodes don't have remote 
> filter class
> 
>
> Key: IGNITE-11907
> URL: https://issues.apache.org/jira/browse/IGNITE-11907
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.7
>Reporter: Denis Mekhanikov
>Assignee: Ivan Pavlukhin
>Priority: Major
> Attachments: 
> ContinuousQueryRemoteFilterMissingInClassPathSelfTest.java
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If one of the data nodes doesn't have the remote filter class, then registration of 
> the continuous query should fail with an exception. Currently, the nodes fail 
> instead.
> Reproducer is attached: 
> [^ContinuousQueryRemoteFilterMissingInClassPathSelfTest.java]
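For context, a minimal registration sketch (the filter class and cache are assumptions, not taken from the attached reproducer); with the fix, the query() call below should throw if some data node cannot load the remote filter class, rather than failing those nodes:
{noformat}
import javax.cache.configuration.FactoryBuilder;
import javax.cache.event.CacheEntryEvent;
import javax.cache.event.CacheEntryEventFilter;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.ContinuousQuery;
import org.apache.ignite.cache.query.QueryCursor;

public class CqRegistrationSketch {
    /** Hypothetical remote filter; imagine it is missing from some data nodes' classpath. */
    public static class MyRemoteFilter implements CacheEntryEventFilter<Integer, String> {
        @Override public boolean evaluate(CacheEntryEvent<? extends Integer, ? extends String> evt) {
            return true;
        }
    }

    static void registerQuery(IgniteCache<Integer, String> cache) {
        ContinuousQuery<Integer, String> qry = new ContinuousQuery<>();

        qry.setLocalListener(evts -> { /* consume updates */ });
        qry.setRemoteFilterFactory(FactoryBuilder.factoryOf(MyRemoteFilter.class));

        // Expected behaviour after the fix: an exception here instead of remote node failure.
        try (QueryCursor<?> cur = cache.query(qry)) {
            // Query registered on all data nodes successfully.
        }
    }
}
{noformat}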



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (IGNITE-11907) Registration of continuous query should fail if nodes don't have remote filter class

2019-07-10 Thread Yury Gerzhedovich (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881874#comment-16881874
 ] 

Yury Gerzhedovich edited comment on IGNITE-11907 at 7/10/19 9:13 AM:
-

[~Pavlukhin]

The pull request looks good. Just two additional minor points:

1) IncompleteDeserializationExceptionTest - commented code at the end.
2) The 
ContinuousQueryRemoteFilterMissingInClassPathSelfTest#testServerMissingClassFailsRegistration
 method doesn't check that the exception is not thrown.


was (Author: jooger):
[~Pavlukhin]

Pull request looks good. Just two additional minors:

1) IncompleteDeserializationExceptionTest - commented code at the end
2) testServerMissingClassFailsRegistration method - doesn't check that 
exception not thrown. 

> Registration of continuous query should fail if nodes don't have remote 
> filter class
> 
>
> Key: IGNITE-11907
> URL: https://issues.apache.org/jira/browse/IGNITE-11907
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.7
>Reporter: Denis Mekhanikov
>Assignee: Ivan Pavlukhin
>Priority: Major
> Attachments: 
> ContinuousQueryRemoteFilterMissingInClassPathSelfTest.java
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If one of the data nodes doesn't have the remote filter class, then registration of 
> the continuous query should fail with an exception. Currently, the nodes fail 
> instead.
> Reproducer is attached: 
> [^ContinuousQueryRemoteFilterMissingInClassPathSelfTest.java]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-11907) Registration of continuous query should fail if nodes don't have remote filter class

2019-07-10 Thread Yury Gerzhedovich (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881874#comment-16881874
 ] 

Yury Gerzhedovich commented on IGNITE-11907:


[~Pavlukhin]

The pull request looks good. Just two additional minor points:

1) IncompleteDeserializationExceptionTest - commented code at the end.
2) The testServerMissingClassFailsRegistration method doesn't check that 
the exception is not thrown.

> Registration of continuous query should fail if nodes don't have remote 
> filter class
> 
>
> Key: IGNITE-11907
> URL: https://issues.apache.org/jira/browse/IGNITE-11907
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.7
>Reporter: Denis Mekhanikov
>Assignee: Ivan Pavlukhin
>Priority: Major
> Attachments: 
> ContinuousQueryRemoteFilterMissingInClassPathSelfTest.java
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If one of the data nodes doesn't have the remote filter class, then registration of 
> the continuous query should fail with an exception. Currently, the nodes fail 
> instead.
> Reproducer is attached: 
> [^ContinuousQueryRemoteFilterMissingInClassPathSelfTest.java]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11973) testAccountTxNodeRestart cause unexpected repairs in case of ReadRepair usage

2019-07-10 Thread Anton Vinogradov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-11973:
--
Ignite Flags:   (was: Docs Required)

> testAccountTxNodeRestart cause unexpected repairs in case of ReadRepair usage
> -
>
> Key: IGNITE-11973
> URL: https://issues.apache.org/jira/browse/IGNITE-11973
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov
>Assignee: Anton Vinogradov
>Priority: Major
>  Labels: iep-31
> Fix For: 2.8
>
>
> Just add the withReadRepair() proxy to the test's cache and you'll see unexpected 
> data repairs (values differ on backups while the primary is locked).
> To debug this, add a breakpoint to 
> {{GridNearReadRepairFuture#recordConsistencyViolation}} after the 
> fixedRaw.isEmpty() check.
> A non-empty map means a repair happened.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11973) testAccountTxNodeRestart cause unexpected repairs in case of ReadRepair usage

2019-07-10 Thread Anton Vinogradov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-11973:
--
Fix Version/s: 2.8

> testAccountTxNodeRestart cause unexpected repairs in case of ReadRepair usage
> -
>
> Key: IGNITE-11973
> URL: https://issues.apache.org/jira/browse/IGNITE-11973
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov
>Assignee: Anton Vinogradov
>Priority: Major
>  Labels: iep-31
> Fix For: 2.8
>
>
> Just add the withReadRepair() proxy to the test's cache and you'll see unexpected 
> data repairs (values differ on backups while the primary is locked).
> To debug this, add a breakpoint to 
> {{GridNearReadRepairFuture#recordConsistencyViolation}} after the 
> fixedRaw.isEmpty() check.
> A non-empty map means a repair happened.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11971) Consistency check on test finish

2019-07-10 Thread Anton Vinogradov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-11971:
--
Ignite Flags:   (was: Docs Required)

> Consistency check on test finish
> 
>
> Key: IGNITE-11971
> URL: https://issues.apache.org/jira/browse/IGNITE-11971
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov
>Priority: Major
>  Labels: iep-31
> Fix For: 2.8
>
>
> Tests based on GridAbstractTest should automatically check that the cache's content 
> is consistent on test finish.
> A good place to check this is the tearDown method.
> An additional check can be added to the awaitPartitionMapExchange() method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11971) Consistency check on test finish

2019-07-10 Thread Anton Vinogradov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-11971:
--
Fix Version/s: 2.8

> Consistency check on test finish
> 
>
> Key: IGNITE-11971
> URL: https://issues.apache.org/jira/browse/IGNITE-11971
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov
>Priority: Major
>  Labels: iep-31
> Fix For: 2.8
>
>
> Tests based on GridAbstractTest should automatically check that the cache's content 
> is consistent on test finish.
> A good place to check this is the tearDown method.
> An additional check can be added to the awaitPartitionMapExchange() method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11972) Jepsen tests should check consistency

2019-07-10 Thread Anton Vinogradov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-11972:
--
Fix Version/s: 2.8

> Jepsen tests should check consistency
> -
>
> Key: IGNITE-11972
> URL: https://issues.apache.org/jira/browse/IGNITE-11972
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov
>Assignee: Mikhail Filatov
>Priority: Major
>  Labels: iep-31
> Fix For: 2.8
>
>
> We have to check that data is consistent during and after the tests.
> A good approach is to use:
> - idle_verify on test finish
> - ReadRepair 
> -- during the test (some/(all?) gets should go through the RR proxy) 
> -- after the test finishes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11972) Jepsen tests should check consistency

2019-07-10 Thread Anton Vinogradov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-11972:
--
Ignite Flags:   (was: Docs Required)

> Jepsen tests should check consistency
> -
>
> Key: IGNITE-11972
> URL: https://issues.apache.org/jira/browse/IGNITE-11972
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov
>Assignee: Mikhail Filatov
>Priority: Major
>  Labels: iep-31
>
> We have to check that data is consistent during and after the tests.
> A good approach is to use:
> - idle_verify on test finish
> - ReadRepair 
> -- during the test (some/(all?) gets should go through the RR proxy) 
> -- after the test finishes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-11972) Jepsen tests should check consistency

2019-07-10 Thread Anton Vinogradov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov reassigned IGNITE-11972:
-

Assignee: Mikhail Filatov

> Jepsen tests should check consistency
> -
>
> Key: IGNITE-11972
> URL: https://issues.apache.org/jira/browse/IGNITE-11972
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov
>Assignee: Mikhail Filatov
>Priority: Major
>  Labels: iep-31
>
> We have to check that data is consistent during and after the tests.
> A good approach is to use:
> - idle_verify on test finish
> - ReadRepair 
> -- during the test (some/(all?) gets should go through the RR proxy) 
> -- after the test finishes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-11973) testAccountTxNodeRestart cause unexpected repairs in case of ReadRepair usage

2019-07-10 Thread Anton Vinogradov (JIRA)
Anton Vinogradov created IGNITE-11973:
-

 Summary: testAccountTxNodeRestart cause unexpected repairs in case 
of ReadRepair usage
 Key: IGNITE-11973
 URL: https://issues.apache.org/jira/browse/IGNITE-11973
 Project: Ignite
  Issue Type: Task
Reporter: Anton Vinogradov
Assignee: Anton Vinogradov


Just add the withReadRepair() proxy to the test's cache and you'll see unexpected data 
repairs (values differ on backups while the primary is locked).

To debug this, add a breakpoint to 
{{GridNearReadRepairFuture#recordConsistencyViolation}} after the 
fixedRaw.isEmpty() check.

A non-empty map means a repair happened.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-11972) Jepsen tests should check consistency

2019-07-10 Thread Anton Vinogradov (JIRA)
Anton Vinogradov created IGNITE-11972:
-

 Summary: Jepsen tests should check consistency
 Key: IGNITE-11972
 URL: https://issues.apache.org/jira/browse/IGNITE-11972
 Project: Ignite
  Issue Type: Task
Reporter: Anton Vinogradov


We have to check that data is consistent during and after the tests.

A good approach is to use:
- idle_verify on test finish
- ReadRepair 
-- during the test (some/(all?) gets should go through the RR proxy) 
-- after the test finishes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-11927) [IEP-35] Add ability to configure subset of metrics

2019-07-10 Thread Nikolay Izhikov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881779#comment-16881779
 ] 

Nikolay Izhikov commented on IGNITE-11927:
--

[~agura] Sorry, I should have read the ticket description (written by myself) more 
carefully.

I will add an enabling/disabling feature to this PR shortly.

> [IEP-35] Add ability to configure subset of metrics
> ---
>
> Key: IGNITE-11927
> URL: https://issues.apache.org/jira/browse/IGNITE-11927
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Nikolay Izhikov
>Assignee: Nikolay Izhikov
>Priority: Major
>  Labels: IEP-35
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ignite should be able to:
> * Enable or disable an arbitrary subset of the metrics. The user should be able 
> to do it at runtime.
> * Configure Histogram metrics.
> * Configure HitRate metrics.
> We should provide two ways to configure metrics:
> 1. -Configuration file.- Discussed on the dev list; agreed to go with the 
> simplest solution - the JMX method.
> 2. JMX method.
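A purely illustrative sketch of what the JMX-based configuration could look like; the interface and method names below are assumptions for discussion, not the actual IEP-35 API:
{noformat}
/** Hypothetical MXBean sketch for runtime metric configuration (names are illustrative). */
public interface MetricsConfigMxBean {
    /** Enables or disables all metrics in the given registry, e.g. "cache.myCache". */
    void enableMetrics(String registryName, boolean enabled);

    /** Sets bucket bounds for a histogram metric, e.g. {10, 100, 1000} (milliseconds). */
    void configureHistogram(String metricName, long[] bounds);

    /** Sets the rate-time interval, in milliseconds, for a hit-rate metric. */
    void configureHitRate(String metricName, long rateTimeIntervalMs);
}
{noformat}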



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-11971) Consistency check on test finish

2019-07-10 Thread Anton Vinogradov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881778#comment-16881778
 ] 

Anton Vinogradov commented on IGNITE-11971:
---

ReadRepair

> Consistency check on test finish
> 
>
> Key: IGNITE-11971
> URL: https://issues.apache.org/jira/browse/IGNITE-11971
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov
>Priority: Major
>  Labels: iep-31
>
> Tests based on GridAbstractTest should automatically check that the cache's content 
> is consistent on test finish.
> A good place to check this is the tearDown method.
> An additional check can be added to the awaitPartitionMapExchange() method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (IGNITE-11971) Consistency check on test finish

2019-07-10 Thread Anton Vinogradov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-11971:
--
Comment: was deleted

(was: ReadRepair)

> Consistency check on test finish
> 
>
> Key: IGNITE-11971
> URL: https://issues.apache.org/jira/browse/IGNITE-11971
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov
>Priority: Major
>  Labels: iep-31
>
> Tests based on GridAbstractTest should automatically check that the cache's content 
> is consistent on test finish.
> A good place to check this is the tearDown method.
> An additional check can be added to the awaitPartitionMapExchange() method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-11971) Consistency check on test finish

2019-07-10 Thread Anton Vinogradov (JIRA)
Anton Vinogradov created IGNITE-11971:
-

 Summary: Consistency check on test finish
 Key: IGNITE-11971
 URL: https://issues.apache.org/jira/browse/IGNITE-11971
 Project: Ignite
  Issue Type: Task
Reporter: Anton Vinogradov


Tests based on GridAbstractTest should automatically check that the cache's content 
is consistent on test finish.
A good place to check this is the tearDown method.

An additional check can be added to the awaitPartitionMapExchange() method.
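A rough illustration of the tearDown-style check (the cache name, key range, and standalone method are assumptions; it relies on the withReadRepair() proxy from IEP-31 to compare primary and backup copies on each read):
{noformat}
import java.util.List;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

/** Sketch only: read every tracked key through the Read Repair proxy after a test. */
static void checkCacheConsistencyAfterTest() {
    List<Ignite> nodes = Ignition.allGrids();

    if (nodes.isEmpty())
        return;

    IgniteCache<Integer, Object> cache = nodes.get(0).cache("testCache").withReadRepair();

    // Each get compares the value across primary and backups; a mismatch
    // would be repaired and reported as a consistency-violation event.
    for (int key = 0; key < 1_000; key++)
        cache.get(key);
}
{noformat}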



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (IGNITE-10663) Implement cache mode allows reads with consistency check and fix

2019-07-10 Thread Anton Vinogradov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881765#comment-16881765
 ] 

Anton Vinogradov edited comment on IGNITE-10663 at 7/10/19 6:35 AM:


Merged to the master branch.


was (Author: avinogradov):
Merger to the master branch.

> Implement cache mode allows reads with consistency check and fix
> 
>
> Key: IGNITE-10663
> URL: https://issues.apache.org/jira/browse/IGNITE-10663
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov
>Assignee: Anton Vinogradov
>Priority: Major
>  Labels: iep-31
> Fix For: 2.8
>
>  Time Spent: 18h
>  Remaining Estimate: 0h
>
> The main idea is to provide a special "read from cache" mode which reads a 
> value from the primary and all backups and checks that the values are the same.
> In case the values differ, they should be fixed according to the appropriate 
> strategy.
> ToDo list:
> 1) {{cache.withReadRepair().get(key)}} should guarantee that values will be 
> checked across the topology and fixed if necessary.
> 2) The LWW (Last Write Wins) strategy should be used for validation.
> 3) Since LWW or any other strategy does not guarantee that the correct value 
> will be chosen, we have to record an event containing all values and the 
> chosen one.
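For illustration, how the mode is meant to be used from application code (a minimal sketch; the cache name, key, and value are assumptions):
{noformat}
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

// Sketch only: a read through the Read Repair proxy compares the value on the
// primary and all backups, fixes mismatches per the LWW strategy described above,
// and records an event containing all observed values and the chosen one.
public class ReadRepairExample {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache("myCache");

            cache.put(1, "value");

            String checked = cache.withReadRepair().get(1); // consistency-checked read

            System.out.println(checked);
        }
    }
}
{noformat}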



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)