[jira] [Commented] (IGNITE-14055) Deadlock in timeoutObjectProcessor between 'send message' & 'handshake timeout'

2021-02-08 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280853#comment-17280853
 ] 

Ivan Bessonov commented on IGNITE-14055:


[~akalashnikov] looks good, I'll merge it right now.

> Deadlock in timeoutObjectProcessor between 'send message' & 'handshake 
> timeout'
> ---
>
> Key: IGNITE-14055
> URL: https://issues.apache.org/jira/browse/IGNITE-14055
> Project: Ignite
> Issue Type: Bug
> Reporter: Anton Kalashnikov
> Assignee: Anton Kalashnikov
> Priority: Major
> Attachments: StartServerWithTxPuts (1).java, freeze (1).sh
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The cluster hangs after JVM pauses on one of the server nodes.
>  Scenario:
>  1. Start three server nodes with put operations using StartServerWithTxPuts.
>  2. Emulate JVM freezes on one server node by running the attached script:
>  {{*sh freeze.sh *}}
>  3. Wait until the script has finished.
> Result:
>  The cluster hangs on tx put operations.
> The first server node continuously prints:
> {noformat}
> [2020-11-03 09:36:01,719][INFO ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100,
> rmtAddr=/127.0.0.1:57714]
> [2020-11-03 09:36:01,720][INFO ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi]
> Received incoming connection from remote node while connecting to this node,
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1,
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
> [2020-11-03 09:36:01,922][INFO ][grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100,
> rmtAddr=/127.0.0.1:57716]
> [2020-11-03 09:36:01,922][INFO ][grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%][TcpCommunicationSpi]
> Received incoming connection from remote node while connecting to this node,
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1,
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
> [2020-11-03 09:36:02,124][INFO ][grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100,
> rmtAddr=/127.0.0.1:57718]
> [2020-11-03 09:36:02,125][INFO ][grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%][TcpCommunicationSpi]
> Received incoming connection from remote node while connecting to this node,
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1,
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
> [2020-11-03 09:36:02,326][INFO ][grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100,
> rmtAddr=/127.0.0.1:57720]
> [2020-11-03 09:36:02,327][INFO ][grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi%][TcpCommunicationSpi]
> Received incoming connection from remote node while connecting to this node,
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1,
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
> [2020-11-03 09:36:02,528][INFO ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100,
> rmtAddr=/127.0.0.1:57722]
> [2020-11-03 09:36:02,529][INFO ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi]
> Received incoming connection from remote node while connecting to this node,
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1,
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
> {noformat}
>  The second node prints long running transactions in prepared state ignoring 
> the default tx timeout:
>  
> {noformat}
> [2020-11-03 09:36:46,199][WARN 
> ][sys-#83%56b4f715-82d6-4d63-ba99-441ffcd673b4%][diagnostic] >>> Future 
> [startTime=09:33:08.496, curTime=09:36:46.181, fut=GridNearTxFinishFuture 
> [futId=425decc8571-4ce98554-8c56-4daf-a7a9-5b9bff52fa08, tx=GridNearTxLocal 
> [mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping 
> [entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey 
> [key=KeyCacheObjectImpl [part=833, val=833, hasValBytes=true], 
> cacheId=-923393186], val=TxEntryValueHolder [val=CacheObjectByteArrayImpl 
> [arrLen=1048576], op=CREATE], prevVal=TxEntryValueHolder [val=null, op=NOOP], 
> oldVal=TxEntryValueHolder [val=null, op=NOOP], entryProcessorsCol=null, 
> ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, 
> dhtVer=null, filters=CacheEntryPredicate[] [], filtersPassed=false, 
> filtersSe

[jira] [Updated] (IGNITE-14102) Create escaping and searching util methods for configuration framework

2021-02-05 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14102:
---
Fix Version/s: 3.0.0-alpha2

> Create escaping and searching util methods for configuration framework
> --
>
> Key: IGNITE-14102
> URL: https://issues.apache.org/jira/browse/IGNITE-14102
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Fix For: 3.0.0-alpha2
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Right off the bat, I can think of two useful things to do:
>  * escaping / unescaping;
>  * a replacement for BaseSelectors#find that'll work on new trees.
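
As a rough illustration of the first bullet, escaping could look like the sketch below. It assumes that configuration keys are joined into dot-separated paths and that a backslash is the escape character; neither is stated in the ticket, and the class name ConfigKeyEscaper and its methods are hypothetical, not part of Ignite.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; the name and the escaping rules are assumptions, not the Ignite API. */
final class ConfigKeyEscaper {
    private ConfigKeyEscaper() {
    }

    /** Escapes '.' and '\' so that a key can be embedded into a dot-separated path. */
    static String escape(String key) {
        StringBuilder sb = new StringBuilder(key.length());

        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);

            if (c == '.' || c == '\\')
                sb.append('\\');

            sb.append(c);
        }

        return sb.toString();
    }

    /** Reverses escape(): drops each escaping backslash and keeps the character after it. */
    static String unescape(String key) {
        StringBuilder sb = new StringBuilder(key.length());

        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);

            if (c == '\\' && i + 1 < key.length())
                c = key.charAt(++i);

            sb.append(c);
        }

        return sb.toString();
    }

    /** Splits a path on unescaped dots and unescapes every segment along the way. */
    static List<String> split(String path) {
        List<String> res = new ArrayList<>();
        StringBuilder cur = new StringBuilder();

        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);

            if (c == '\\' && i + 1 < path.length())
                cur.append(path.charAt(++i));
            else if (c == '.') {
                res.add(cur.toString());
                cur.setLength(0);
            }
            else
                cur.append(c);
        }

        res.add(cur.toString());

        return res;
    }
}
```

With these rules, a key that itself contains a dot survives a round trip through a path: splitting "rest." + escape("my.key") yields the two segments "rest" and "my.key".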



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-14121) Implement ability to generate configuration trees from arbitrary sources

2021-02-05 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14121:
---
Fix Version/s: 3.0.0-alpha2

> Implement ability to generate configuration trees from arbitrary sources
> 
>
> Key: IGNITE-14121
> URL: https://issues.apache.org/jira/browse/IGNITE-14121
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Fix For: 3.0.0-alpha2
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> A prototype is already present here: 
> [https://github.com/apache/ignite-3/pull/34/files]
> Now we need to adapt it to the current configuration code and implement automatic 
> generation of the construction methods' implementations.





[jira] [Updated] (IGNITE-14087) Implement code generation for interfaces introduced in IGNITE-14062

2021-02-05 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14087:
---
Fix Version/s: 3.0.0-alpha2

> Implement code generation for interfaces introduced in IGNITE-14062
> ---
>
> Key: IGNITE-14087
> URL: https://issues.apache.org/jira/browse/IGNITE-14087
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Fix For: 3.0.0-alpha2
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> I expect the following code to be generated, implementing all used 
> interfaces:
> {code:java}
> public final class RestNode extends InnerNode implements RestView, RestChange, RestInit {
>     private Integer port;
>     private Integer portRange;
>     @Override
>     public int port() {
>         return port;
>     }
>     @Override
>     public RestChange changePort(int port) {
>         this.port = port;
>         return this;
>     }
>     @Override
>     public RestInit initPort(int port) {
>         this.port = port;
>         return this;
>     }
>     @Override
>     public int portRange() {
>         return portRange;
>     }
>     @Override
>     public RestChange changePortRange(int portRange) {
>         this.portRange = portRange;
>         return this;
>     }
>     @Override
>     public RestInit initPortRange(int portRange) {
>         this.portRange = portRange;
>         return this;
>     }
>     /**
>      * {@inheritDoc}
>      */
>     @Override
>     public void traverseChildren(ConfigurationVisitor visitor) {
>         visitor.visitLeafNode("port", port);
>         visitor.visitLeafNode("portRange", portRange);
>     }
>     /**
>      * {@inheritDoc}
>      */
>     @Override
>     public void traverseChild(String key, ConfigurationVisitor visitor) throws NoSuchElementException {
>         switch (key) {
>             case "port":
>                 visitor.visitLeafNode("port", port);
>                 break;
>             case "portRange":
>                 visitor.visitLeafNode("portRange", portRange);
>                 break;
>             default:
>                 throw new NoSuchElementException(key);
>         }
>     }
> }
> {code}
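
To show how such a generated node might be consumed, here is a minimal, self-contained sketch of a visitor that flattens the leaves into a map. The ConfigurationVisitor interface below is an assumed shape that models only the visitLeafNode call used in the description; RestNodeSketch is a simplified stand-in that omits InnerNode and the Rest* interfaces, and CollectingVisitor is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Assumed shape of the visitor; only the method used in the description is modelled. */
interface ConfigurationVisitor {
    void visitLeafNode(String key, Object val);
}

/** Simplified stand-in for the generated RestNode (no InnerNode, no Rest* interfaces). */
final class RestNodeSketch {
    private Integer port;
    private Integer portRange;

    RestNodeSketch changePort(int port) {
        this.port = port;
        return this;
    }

    RestNodeSketch changePortRange(int portRange) {
        this.portRange = portRange;
        return this;
    }

    /** Visits every leaf in declaration order. */
    void traverseChildren(ConfigurationVisitor visitor) {
        visitor.visitLeafNode("port", port);
        visitor.visitLeafNode("portRange", portRange);
    }

    /** Visits a single leaf by name, or fails if there is no such leaf. */
    void traverseChild(String key, ConfigurationVisitor visitor) throws NoSuchElementException {
        switch (key) {
            case "port":
                visitor.visitLeafNode("port", port);
                break;
            case "portRange":
                visitor.visitLeafNode("portRange", portRange);
                break;
            default:
                throw new NoSuchElementException(key);
        }
    }
}

/** Hypothetical visitor that collects leaves into an insertion-ordered map. */
final class CollectingVisitor implements ConfigurationVisitor {
    final Map<String, Object> leaves = new LinkedHashMap<>();

    @Override
    public void visitLeafNode(String key, Object val) {
        leaves.put(key, val);
    }
}
```

Passing a CollectingVisitor to traverseChildren gives a flat view of the node, while traverseChild visits one leaf by name, which is roughly what a path-based lookup would need.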





[jira] [Updated] (IGNITE-14094) Configuration storage interface

2021-02-05 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14094:
---
Fix Version/s: 3.0.0-alpha2

> Configuration storage interface
> ---
>
> Key: IGNITE-14094
> URL: https://issues.apache.org/jira/browse/IGNITE-14094
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Semyon Danilov
> Assignee: Semyon Danilov
> Priority: Major
> Fix For: 3.0.0-alpha2
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> Create configuration storage interface with its probable "metastorage" 
> implementation in mind (meaning string keys and primitive values). Support 
> write retries (based on versioning).
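
One possible reading of "write retries (based on versioning)" is an optimistic compare-and-write loop, sketched below. This is not the interface that was actually merged: ConfigurationStorageSketch, InMemoryStorage, StorageWrites and all method names are hypothetical, and values are typed as Object for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical storage contract: string keys, simple values, versioned writes. */
interface ConfigurationStorageSketch {
    /** Snapshot of the stored values together with the version it was read at. */
    final class Data {
        final Map<String, Object> values;
        final long version;

        Data(Map<String, Object> values, long version) {
            this.values = values;
            this.version = version;
        }
    }

    Data readAll();

    /** Applies the changes only if the storage is still at expectedVersion. */
    boolean write(Map<String, Object> changes, long expectedVersion);
}

/** In-memory stand-in for a metastorage-backed implementation. */
final class InMemoryStorage implements ConfigurationStorageSketch {
    private final Map<String, Object> map = new HashMap<>();
    private long version;

    @Override
    public synchronized Data readAll() {
        return new Data(new HashMap<>(map), version);
    }

    @Override
    public synchronized boolean write(Map<String, Object> changes, long expectedVersion) {
        if (expectedVersion != version)
            return false; // Somebody else wrote first; the caller has to re-read and retry.

        map.putAll(changes);
        version++;

        return true;
    }
}

final class StorageWrites {
    private StorageWrites() {
    }

    /** Re-reads and recomputes the change until the versioned write succeeds; returns the attempt count. */
    static int writeWithRetry(
        ConfigurationStorageSketch storage,
        Function<Map<String, Object>, Map<String, Object>> changeFn,
        int maxAttempts
    ) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            ConfigurationStorageSketch.Data data = storage.readAll();

            if (storage.write(changeFn.apply(data.values), data.version))
                return attempt;
        }

        throw new IllegalStateException("Gave up after " + maxAttempts + " attempts");
    }
}
```

The design choice here is that a stale write is rejected rather than merged, so the caller always recomputes its change against the latest snapshot.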





[jira] [Updated] (IGNITE-14062) Create basic classes and interfaces for traversable configuration tree.

2021-02-05 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14062:
---
Fix Version/s: 3.0.0-alpha2

> Create basic classes and interfaces for traversable configuration tree.
> ---
>
> Key: IGNITE-14062
> URL: https://issues.apache.org/jira/browse/IGNITE-14062
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Fix For: 3.0.0-alpha2
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Prototype code is presented in this PR: 
> https://github.com/apache/ignite-3/pull/34





[jira] [Commented] (IGNITE-12982) NullPointerException on TcpCommunicationMetricsListener for some of the cases

2021-02-04 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278889#comment-17278889
 ] 

Ivan Bessonov commented on IGNITE-12982:


[~sergeychugunov] fix looks good to me, thank you!

> NullPointerException on TcpCommunicationMetricsListener for some of the cases
> -
>
> Key: IGNITE-12982
> URL: https://issues.apache.org/jira/browse/IGNITE-12982
> Project: Ignite
> Issue Type: Bug
> Components: networking
> Reporter: Maxim Muzafarov
> Assignee: Sergey Chugunov
> Priority: Major
> Fix For: 2.11
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The code block below throws a {{NullPointerException}} in some cases. 
> Investigation required.
> {code}
> @Override public void onMessageSent(GridNioSession ses, Message msg) {
>     Object consistentId = ses.meta(CONSISTENT_ID_META);
>     if (consistentId != null)
>         metricsLsnr.onMessageSent(msg, consistentId);
> }
> {code}
> {code}
> [2020-05-04 
> 18:12:12,991][ERROR][grid-nio-worker-tcp-comm-0-#543%snapshot.IgniteClusterSnapshotSelfTest2%][TestRecordingCommunicationSpi]
>  Failed to process selector key [ses=GridSelectorNioSessionImpl 
> [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=0, 
> bytesRcvd=42, bytesSent=18, bytesRcvd0=42, bytesSent0=18, select=true, 
> super=GridWorker [name=grid-nio-worker-tcp-comm-0, 
> igniteInstanceName=snapshot.IgniteClusterSnapshotSelfTest2, finished=false, 
> heartbeatTs=1588605131981, hashCode=1038334332, interrupted=false, 
> runner=grid-nio-worker-tcp-comm-0-#543%snapshot.IgniteClusterSnapshotSelfTest2%]]],
>  writeBuf=java.nio.DirectByteBuffer[pos=10 lim=32768 cap=32768], 
> readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], 
> inRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=0, 
> sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
> [id=7f78d082-6ce9-42b1-ab08-da1fde40, 
> consistentId=snapshot.IgniteClusterSnapshotSelfTest0, addrs=ArrayList 
> [127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47500], discPort=47500, order=1, 
> intOrder=1, lastExchangeTime=1588605131971, loc=false, 
> ver=2.9.0#20200428-sha1:e551fa71, isClient=false], connected=true, 
> connectCnt=0, queueLimit=4096, reserveCnt=1, pairedConnections=false], 
> outRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=0, 
> sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
> [id=7f78d082-6ce9-42b1-ab08-da1fde40, 
> consistentId=snapshot.IgniteClusterSnapshotSelfTest0, addrs=ArrayList 
> [127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47500], discPort=47500, order=1, 
> intOrder=1, lastExchangeTime=1588605131971, loc=false, 
> ver=2.9.0#20200428-sha1:e551fa71, isClient=false], connected=true, 
> connectCnt=0, queueLimit=4096, reserveCnt=1, pairedConnections=false], 
> closeSocket=true, 
> outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric@69a257d1,
>  super=GridNioSessionImpl [locAddr=/127.0.0.1:47102, 
> rmtAddr=/127.0.0.1:50655, createTime=1588605131981, closeTime=0, 
> bytesSent=18, bytesRcvd=42, bytesSent0=18, bytesRcvd0=42, 
> sndSchedTime=1588605131981, lastSndTime=1588605131981, 
> lastRcvTime=1588605131981, readsPaused=false, 
> filterChain=FilterChain[filters=[GridNioCodecFilter 
> [parser=o.a.i.i.util.nio.GridDirectParser@fc19b0b, directMode=true], 
> GridConnectionBytesVerifyFilter], accepted=true, markedForClose=true]]]
> java.lang.NullPointerException
>   at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$1.onMessageSent(TcpCommunicationSpi.java:803)
>   at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$1.onMessageSent(TcpCommunicationSpi.java:472)
>   at 
> org.apache.ignite.internal.util.nio.GridNioServer.onMessageWritten(GridNioServer.java:1764)
>   at 
> org.apache.ignite.internal.util.nio.GridNioServer.access$1800(GridNioServer.java:99)
>   at 
> org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite0(GridNioServer.java:1665)
>   at 
> org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite(GridNioServer.java:1365)
>   at 
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2437)
>   at 
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2201)
>   at 
> org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1842)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>   at java.lang.Thr

[jira] [Commented] (IGNITE-14094) Configuration storage interface

2021-02-03 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17277971#comment-17277971
 ] 

Ivan Bessonov commented on IGNITE-14094:


[~sdanilov]  looks good to me, thank you! Let's proceed with other improvements 
in other issues.

> Configuration storage interface
> ---
>
> Key: IGNITE-14094
> URL: https://issues.apache.org/jira/browse/IGNITE-14094
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Semyon Danilov
> Assignee: Semyon Danilov
> Priority: Major
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> Create configuration storage interface with its probable "metastorage" 
> implementation in mind (meaning string keys and primitive values). Support 
> write retries (based on versioning).





[jira] [Created] (IGNITE-14121) Implement ability to generate configuration trees from arbitrary sources

2021-02-03 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14121:
--

 Summary: Implement ability to generate configuration trees from 
arbitrary sources
 Key: IGNITE-14121
 URL: https://issues.apache.org/jira/browse/IGNITE-14121
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


A prototype is already present here: 
[https://github.com/apache/ignite-3/pull/34/files]
Now we need to adapt it to the current configuration code and implement automatic 
generation of the construction methods' implementations.





[jira] [Created] (IGNITE-14102) Create escaping and searching util methods for configuration framework

2021-01-29 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14102:
--

 Summary: Create escaping and searching util methods for 
configuration framework
 Key: IGNITE-14102
 URL: https://issues.apache.org/jira/browse/IGNITE-14102
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


Right off the bat, I can think of two useful things to do:
 * escaping / unescaping;
 * a replacement for BaseSelectors#find that'll work on new trees.





[jira] [Updated] (IGNITE-14087) Implement code generation for interfaces introduced in IGNITE-14062

2021-01-28 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14087:
---
Description: 
I expect the following code to be generated, implementing all used 
interfaces:
{code:java}
public final class RestNode extends InnerNode implements RestView, RestChange, RestInit {
    private Integer port;

    private Integer portRange;

    @Override
    public int port() {
        return port;
    }

    @Override
    public RestChange changePort(int port) {
        this.port = port;
        return this;
    }

    @Override
    public RestInit initPort(int port) {
        this.port = port;
        return this;
    }

    @Override
    public int portRange() {
        return portRange;
    }

    @Override
    public RestChange changePortRange(int portRange) {
        this.portRange = portRange;
        return this;
    }

    @Override
    public RestInit initPortRange(int portRange) {
        this.portRange = portRange;
        return this;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void traverseChildren(ConfigurationVisitor visitor) {
        visitor.visitLeafNode("port", port);
        visitor.visitLeafNode("portRange", portRange);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void traverseChild(String key, ConfigurationVisitor visitor) throws NoSuchElementException {
        switch (key) {
            case "port":
                visitor.visitLeafNode("port", port);
                break;
            case "portRange":
                visitor.visitLeafNode("portRange", portRange);
                break;
            default:
                throw new NoSuchElementException(key);
        }
    }
}
{code}

> Implement code generation for interfaces introduced in IGNITE-14062
> ---
>
> Key: IGNITE-14087
> URL: https://issues.apache.org/jira/browse/IGNITE-14087
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
>
> I expect the following code to be generated, implementing all used 
> interfaces:
> {code:java}
> public final class RestNode extends InnerNode implements RestView, RestChange, RestInit {
>     private Integer port;
>     private Integer portRange;
>     @Override
>     public int port() {
>         return port;
>     }
>     @Override
>     public RestChange changePort(int port) {
>         this.port = port;
>         return this;
>     }
>     @Override
>     public RestInit initPort(int port) {
>         this.port = port;
>         return this;
>     }
>     @Override
>     public int portRange() {
>         return portRange;
>     }
>     @Override
>     public RestChange changePortRange(int portRange) {
>         this.portRange = portRange;
>         return this;
>     }
>     @Override
>     public RestInit initPortRange(int portRange) {
>         this.portRange = portRange;
>         return this;
>     }
>     /**
>      * {@inheritDoc}
>      */
>     @Override
>     public void traverseChildren(ConfigurationVisitor visitor) {
>         visitor.visitLeafNode("port", port);
>         visitor.visitLeafNode("portRange", portRange);
>     }
>     /**
>      * {@inheritDoc}
>      */
>     @Override
>     public void traverseChild(String key, ConfigurationVisitor visitor) throws NoSuchElementException {
>         switch (key) {
>             case "port":
>                 visitor.visitLeafNode("port", port);
>                 break;
>             case "portRange":
>                 visitor.visitLeafNode("portRange", portRange);
>                 break;
>             default:
>                 throw new NoSuchElementException(key);
>         }
>     }
> }
> {code}





[jira] [Updated] (IGNITE-14062) Create basic classes and interfaces for traversable configuration tree.

2021-01-28 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14062:
---
Ignite Flags:   (was: Release Notes Required)

> Create basic classes and interfaces for traversable configuration tree.
> ---
>
> Key: IGNITE-14062
> URL: https://issues.apache.org/jira/browse/IGNITE-14062
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Prototype code is presented in this PR: 
> https://github.com/apache/ignite-3/pull/34





[jira] [Created] (IGNITE-14087) Implement code generation for interfaces introduced in IGNITE-14062

2021-01-28 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14087:
--

 Summary: Implement code generation for interfaces introduced in 
IGNITE-14062
 Key: IGNITE-14087
 URL: https://issues.apache.org/jira/browse/IGNITE-14087
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov








[jira] [Created] (IGNITE-14062) Create basic classes and interfaces for traversable configuration tree.

2021-01-26 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14062:
--

 Summary: Create basic classes and interfaces for traversable 
configuration tree.
 Key: IGNITE-14062
 URL: https://issues.apache.org/jira/browse/IGNITE-14062
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


Prototype code is presented in this PR: 
https://github.com/apache/ignite-3/pull/34





[jira] [Commented] (IGNITE-13986) Proof of concept - SWIM group membership protocol for discovery

2021-01-18 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267364#comment-17267364
 ] 

Ivan Bessonov commented on IGNITE-13986:


OK, now the same thing, but in a new context. The following thoughts might go into 
an IEP in the future.

The new component will conceptually include group membership and a p2p messaging 
subsystem. It's going to be a general network component. The required API consists 
of the following:
 * start / stop;
 * retrieving information about current group members;
 * listening to membership events:
 ** node appeared;
 ** node left;
 ** node failed;
 * listening for incoming messages;
 * sending messages:
 ** there's a requirement to be able to send idempotent messages with very 
weak guarantees:
 *** no delivery guarantees required;
 *** multiple copies of the same message might be sent;
 *** no need to have any kind of acknowledgement;
 ** there's another requirement for the common use:
 *** message must be sent exactly once with an acknowledgement that it has 
actually been received (not processed);
 *** messages must be received in the same order they were sent.
These types of messages might utilize the current recovery protocol, with acks every 
32 (or so) messages. This setting must be flexible enough so that we won't get 
OOM in big topologies.
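
The API listed above and the ack-every-N recovery idea could be sketched as follows. All type and method names (NetworkCluster, MembershipListener, SenderRecoveryState, ReceiverRecoveryState) are hypothetical and chosen only for illustration, members are identified by plain strings for brevity, and the recovery classes are a simplification rather than Ignite's actual recovery descriptor.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

/** Membership events from the list above. */
interface MembershipListener {
    void onAppeared(String member);
    void onLeft(String member);
    void onFailed(String member);
}

/** Hypothetical API surface of the network component. */
interface NetworkCluster {
    void start();
    void stop();

    Collection<String> members();

    void addMembershipListener(MembershipListener lsnr);

    /** The listener receives the sender and the message. */
    void addMessageListener(BiConsumer<String, Object> lsnr);

    /** Weak send: no delivery guarantee, duplicates possible, no acknowledgement. */
    void weakSend(String member, Object msg);

    /** Ordered send; the future completes once receipt (not processing) is acknowledged. */
    CompletableFuture<Void> send(String member, Object msg);
}

/** Sender side: keeps messages until the receiver acknowledges them. */
final class SenderRecoveryState {
    private final Deque<Object> unacked = new ArrayDeque<>();
    private final int queueLimit;
    private long acked;

    SenderRecoveryState(int queueLimit) {
        this.queueLimit = queueLimit;
    }

    /** Returns false when the buffer is full, so the caller can back off instead of growing memory. */
    boolean onSend(Object msg) {
        if (unacked.size() >= queueLimit)
            return false;

        unacked.add(msg);

        return true;
    }

    /** rcvCnt is the total number of messages the receiver has seen so far. */
    void onAck(long rcvCnt) {
        while (acked < rcvCnt && !unacked.isEmpty()) {
            unacked.poll();
            acked++;
        }
    }

    int pending() {
        return unacked.size();
    }
}

/** Receiver side: emits an acknowledgement once per ackEvery messages. */
final class ReceiverRecoveryState {
    private final int ackEvery;
    private long rcvCnt;

    ReceiverRecoveryState(int ackEvery) {
        this.ackEvery = ackEvery;
    }

    /** Returns the received count to send back as an ack, or -1 if no ack is due yet. */
    long onReceive() {
        rcvCnt++;

        return rcvCnt % ackEvery == 0 ? rcvCnt : -1;
    }
}
```

The queue limit on the sender side is what bounds memory per connection: with many peers, a smaller ack interval and limit trade extra ack traffic for a smaller buffer.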

Possibility of SSL connections should be considered.

Given that SWIM membership events appear in no particular order, it's possible 
to receive a message from a node before knowing that it exists or after it's 
already gone. This is one of the reasons why we should basically merge 
"discovery" and "communication" into one thing. Another reason is that it's 
more convenient to use a single port per node instead of two (i.e. use 
multiplexing).

There might be a requirement to stream data from one node to another (in the SQL 
engine, for example). The implementation is not obvious to me; maybe such a 
thing will be implemented on top of the current component. Anyway, the 
implementation will be tightly coupled with the netty integration, currently 
being investigated by [~sergeychugunov].

Everything related to actual "Discovery" and messages like "NodeJoined" is not 
discussed here. Since we're moving to a RAFT-based distributed metadata storage 
as a means to send ordered messages, discussing them is just out of scope. 
The current "Discovery", with its join protocol and strict ordering of events, will 
cease to exist.

There's a bunch of settings that should be extracted from the scalecube 
configuration and joined with some specific netty settings. The whole set of 
settings will be determined during implementation. The new configuration framework 
will be used, probably for the first time.

> Proof of concept - SWIM group membership protocol for discovery
> ---
>
> Key: IGNITE-13986
> URL: https://issues.apache.org/jira/browse/IGNITE-13986
> Project: Ignite
> Issue Type: New Feature
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Labels: iep-61, ignite-3
>
> IEP-61 mentions that the discovery protocol will be updated. We need to 
> experiment with the mentioned options to determine whether they match our 
> needs:
> [http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf]
> [https://github.com/scalecube/scalecube-cluster]





[jira] [Comment Edited] (IGNITE-13986) Proof of concept - SWIM group membership protocol for discovery

2021-01-14 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264891#comment-17264891
 ] 

Ivan Bessonov edited comment on IGNITE-13986 at 1/14/21, 2:16 PM:
--

I found the first really interesting problem: the default 
{{scalecube-transport-netty}} can't be excluded from the Maven dependencies when 
you create your own custom implementation. The use of a static method from class 
{{TransportImpl}} is hardcoded in {{ClusterImpl}}, which is a shame. This can 
only be fixed inside the scalecube code itself. 

But there was some activity: version 2.6.6 was released to Maven Central 
2 days ago.


was (Author: ibessonov):
I found first really interesting problem - default 
{{scalecube-transport-netty}} can't be excluded from maven dependencies when 
you create your custom implementation. Usage of static method from class 
{{TransportImpl}} is hardcoded in {{ClusterImpl}}, which is a shame. This can 
only be fixed inside of scalecube code itself. 

But, there was some activity. Version 2.6.6 is released somewhere.

> Proof of concept - SWIM group membership protocol for discovery
> ---
>
> Key: IGNITE-13986
> URL: https://issues.apache.org/jira/browse/IGNITE-13986
> Project: Ignite
> Issue Type: New Feature
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Labels: iep-61, ignite-3
>
> IEP-61 mentions that the discovery protocol will be updated. We need to 
> experiment with the mentioned options to determine whether they match our 
> needs:
> [http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf]
> [https://github.com/scalecube/scalecube-cluster]





[jira] [Comment Edited] (IGNITE-13986) Proof of concept - SWIM group membership protocol for discovery

2021-01-14 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264891#comment-17264891
 ] 

Ivan Bessonov edited comment on IGNITE-13986 at 1/14/21, 1:36 PM:
--

I found the first really interesting problem: the default 
{{scalecube-transport-netty}} can't be excluded from the Maven dependencies when 
you create your own custom implementation. The use of a static method from class 
{{TransportImpl}} is hardcoded in {{ClusterImpl}}, which is a shame. This can 
only be fixed inside the scalecube code itself. 

But there was some activity: version 2.6.6 was released somewhere.


was (Author: ibessonov):
I found first really interesting problem - default 
{{scalecube-transport-netty}} can't be excluded from maven dependencies when 
you create your custom implementation. Usage of static method from class 
{{TransportImpl}} is hardcoded in {{ClusterImpl}}, which is a shame. This can 
only be fixed inside of scalecube code itself.

 

But, there was some activity. Version 2.6.6 is released somewhere.

> Proof of concept - SWIM group membership protocol for discovery
> ---
>
> Key: IGNITE-13986
> URL: https://issues.apache.org/jira/browse/IGNITE-13986
> Project: Ignite
> Issue Type: New Feature
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Labels: iep-61, ignite-3
>
> IEP-61 mentions that the discovery protocol will be updated. We need to 
> experiment with the mentioned options to determine whether they match our 
> needs:
> [http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf]
> [https://github.com/scalecube/scalecube-cluster]





[jira] [Comment Edited] (IGNITE-13986) Proof of concept - SWIM group membership protocol for discovery

2021-01-14 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264891#comment-17264891
 ] 

Ivan Bessonov edited comment on IGNITE-13986 at 1/14/21, 1:34 PM:
--

I found the first really interesting problem: the default 
{{scalecube-transport-netty}} can't be excluded from the Maven dependencies when 
you create your own custom implementation. The use of a static method from class 
{{TransportImpl}} is hardcoded in {{ClusterImpl}}, which is a shame. This can 
only be fixed inside the scalecube code itself.

 

But there was some activity: version 2.6.6 was released somewhere.


was (Author: ibessonov):
I found first really interesting problem - default 
{{scalecube-transport-netty}} can't be excluded from maven dependencies when 
you create your custom implementation. Usage of static method from class 
{{TransportImpl}} is hardcoded in {{ClusterImpl}}, which is a shame. This can 
only be fixed inside of scalecube code itself.

> Proof of concept - SWIM group membership protocol for discovery
> ---
>
> Key: IGNITE-13986
> URL: https://issues.apache.org/jira/browse/IGNITE-13986
> Project: Ignite
> Issue Type: New Feature
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Labels: iep-61, ignite-3
>
> IEP-61 mentions that the discovery protocol will be updated. We need to 
> experiment with the mentioned options to determine whether they match our 
> needs:
> [http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf]
> [https://github.com/scalecube/scalecube-cluster]





[jira] [Commented] (IGNITE-13986) Proof of concept - SWIM group membership protocol for discovery

2021-01-14 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264891#comment-17264891
 ] 

Ivan Bessonov commented on IGNITE-13986:


I found the first really interesting problem: the default 
{{scalecube-transport-netty}} can't be excluded from the Maven dependencies when 
you create your custom implementation. {{ClusterImpl}} hardcodes a call to a 
static method of {{TransportImpl}}, which is a shame. This can only be fixed 
inside the scalecube code itself.

> Proof of concept - SWIM group membership protocol for discovery
> ---
>
> Key: IGNITE-13986
> URL: https://issues.apache.org/jira/browse/IGNITE-13986
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: iep-61, ignite-3
>
> In IEP-61 it is mentioned that discovery protocol will be updated. We need to 
> play with mentioned options for a little bit to conclude if they match our 
> needs:
> [http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf]
> [https://github.com/scalecube/scalecube-cluster]





[jira] [Commented] (IGNITE-13986) Proof of concept - SWIM group membership protocol for discovery

2021-01-13 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264203#comment-17264203
 ] 

Ivan Bessonov commented on IGNITE-13986:


So, the library in question is {{scalecube-cluster}}. The last update was 4 
months ago and it feels like nothing is happening with it right now. Maven 
Central lacks the latest release (2.6.2) and only has {{2.6.0-RC7}}, which is 
weird. I'll assume that the latest release is stable though.

Usage examples can be seen in {{scalecube-cluster-examples}}; they serve as a 
good introduction.

Scalecube forms a cluster from a given set of _seed_ addresses. The set can 
contain a single node or all nodes in the cluster. A new node cannot join the 
cluster if all seed nodes are offline. This means that, to be safe, we should 
list all potential IP addresses and ports as seed addresses, which is a lot. As 
far as I understand, nodes will periodically probe these addresses in the 
background, so the size of the list can affect cluster bootstrap time or 
network load.

Every node has associated _metadata_ that can be modified at any moment. 
Basically, metadata is any Java object. Serialization for these objects can be 
customized either explicitly via configuration or implicitly via the Java 
{{ServiceLoader}} mechanism. Metadata can be used as the {{JoiningNodeData}} 
object, but everything else in the joining process has to be revisited (more on 
that later).
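
The explicit-versus-implicit lookup described above could look roughly like the sketch below. It is only an illustration: {{MetadataCodec}}, {{StringMetadataCodec}} and {{MetadataCodecs}} are hypothetical names made up for this example, not scalecube types; the only real API used is {{java.util.ServiceLoader}}.

```java
import java.nio.charset.StandardCharsets;
import java.util.ServiceLoader;

/** Hypothetical codec interface, made up for this sketch (not a scalecube type). */
interface MetadataCodec {
    byte[] encode(Object metadata);

    Object decode(byte[] bytes);
}

/** Fallback used when nothing else is registered: stores toString() as UTF-8. */
final class StringMetadataCodec implements MetadataCodec {
    @Override public byte[] encode(Object metadata) {
        return String.valueOf(metadata).getBytes(StandardCharsets.UTF_8);
    }

    @Override public Object decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

final class MetadataCodecs {
    /** Explicit configuration wins, then ServiceLoader providers, then the fallback. */
    static MetadataCodec resolve(MetadataCodec explicit) {
        if (explicit != null)
            return explicit;

        // Implicit path: providers listed under META-INF/services on the classpath.
        for (MetadataCodec codec : ServiceLoader.load(MetadataCodec.class))
            return codec;

        return new StringMetadataCodec();
    }

    public static void main(String[] args) {
        MetadataCodec codec = resolve(null);

        byte[] bytes = codec.encode("consistentId=node-1");

        System.out.println(codec.decode(bytes));
    }
}
```

With no provider registered, {{resolve(null)}} falls through to the string fallback; a real integration would have to plug into whatever extension point the library actually exposes.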

Overall, there are 4 types of events that nodes can handle:
 * ADDED - node is joining;
 * REMOVED - node is disconnected unexpectedly;
 * LEAVING - node is being stopped gracefully;
 * UPDATED - node metadata has been updated.

These don't come in any specific order, which means that the current discovery 
event ordering can't be easily replicated. Message mutability is also not 
supported. There is a built-in way to broadcast custom messages via the gossip 
protocol, but it has no ordering either. This means that joining the _group 
membership subsystem_ and joining the _Ignite cluster_ are very distinct 
processes.
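
To illustrate what "no specific order" means for a consumer, here is a minimal sketch of a local membership view that never assumes one event precedes another. The enum mirrors the four event types listed above, but both it and {{MembershipView}} are hypothetical and do not use the scalecube API. Treating LEAVING the same way as REMOVED is an assumption of this sketch. Note that it does not solve stale metadata (an old ADDED arriving after a newer UPDATED); that would need versioning.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Mirrors the four event types listed above; the enum itself is hypothetical. */
enum MembershipEventType { ADDED, REMOVED, LEAVING, UPDATED }

/** Local view of the group that does not rely on event ordering. */
final class MembershipView {
    private final Map<String, String> metadataByNode = new ConcurrentHashMap<>();

    /** Metadata must be non-null for ADDED/UPDATED and is ignored otherwise. */
    void onEvent(MembershipEventType type, String nodeId, String metadata) {
        switch (type) {
            case ADDED:
            case UPDATED:
                // UPDATED may be observed before ADDED, so both are upserts.
                metadataByNode.put(nodeId, metadata);
                break;

            case LEAVING:
            case REMOVED:
                // Removal is idempotent: the node may be unknown or already gone.
                metadataByNode.remove(nodeId);
                break;
        }
    }

    boolean contains(String nodeId) {
        return metadataByNode.containsKey(nodeId);
    }

    int size() {
        return metadataByNode.size();
    }

    public static void main(String[] args) {
        MembershipView view = new MembershipView();

        view.onEvent(MembershipEventType.UPDATED, "node-2", "role=storage");
        view.onEvent(MembershipEventType.ADDED, "node-2", "role=storage");

        System.out.println(view.contains("node-2"));
    }
}
```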

The transport layer can be reconfigured. It consists of two entities: 
{{TransportFactory}} and {{MessageCodec}}. The second one has a weird interface 
that isn't used anywhere publicly (only in {{TransportImpl}}, which is just a 
part of the default transport factory implementation).

The default transport uses Netty. Even though we're going to use Netty as well, 
I expect that we'll need a custom transport implementation. The reasons are 
simple:
 * version incompatibilities will be completely avoided;
 * we could use the same underlying code as in the communication protocol;
 * the log format is messed up in the default implementation and many 
excessive messages are logged when a node is leaving.

Speaking of logs: it uses slf4j. I don't know which logging library we're 
going to use, but it's clear that we should find or write an adapter.

In short, the problems that I see:
 * somewhat excessive logging and an explicit slf4j dependency instead of 
{{java.util.logging.Logger}};
 * possible problems with big seed nodes lists.

Otherwise, looks good if we're ok with eventual consistency.

> Proof of concept - SWIM group membership protocol for discovery
> ---
>
> Key: IGNITE-13986
> URL: https://issues.apache.org/jira/browse/IGNITE-13986
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: iep-61, ignite-3
>
> In IEP-61 it is mentioned that discovery protocol will be updated. We need to 
> play with mentioned options for a little bit to conclude if they match our 
> needs:
> [http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf]
> [https://github.com/scalecube/scalecube-cluster]





[jira] [Created] (IGNITE-13986) Proof of concept - SWIM group membership protocol for discovery

2021-01-13 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13986:
--

 Summary: Proof of concept - SWIM group membership protocol for 
discovery
 Key: IGNITE-13986
 URL: https://issues.apache.org/jira/browse/IGNITE-13986
 Project: Ignite
  Issue Type: New Feature
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


In IEP-61 it is mentioned that discovery protocol will be updated. We need to 
play with mentioned options for a little bit to conclude if they match our 
needs:

[http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf]

[https://github.com/scalecube/scalecube-cluster]





[jira] [Commented] (IGNITE-13837) Configuration initialization

2020-12-31 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257002#comment-17257002
 ] 

Ivan Bessonov commented on IGNITE-13837:


So, here's my train of thought before the new year:

Regardless of the configuration framework implementation, we should have an 
independent configuration subsystem that will provide:
 * discovery configuration to get cluster information;
 * local metastorage configuration to read stored values;
 * distributed metastorage should be available to read cluster configuration as 
well.

First of all, let's talk about local node configuration, that's easier. Is 
there a necessity to give it the initial configuration that can't have default 
values in principle? That's a hard question. I'd say that local "init" 
configuration should not exist at all - we just reconfigure already started 
node with existing API, that's it. BUT, even if it's required, it still can be 
applied on started node.

I see it as a special mode, "limbo". The process is started, but most of the 
components are not. They're waiting for a command. During this time "readonly" 
configuration (like pool sizes) can be changed and applied later. How do we 
achieve it? Either a Java API method call or a command-line tool command moves 
the node to the configured state.

So let me repeat: the idea is that when a node is starting, it doesn't require 
"init" configuration to be passed to it. We only read values from the local 
metastorage, and wait for a command if the metastorage indicates a clean new node.
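
The "limbo" idea above can be sketched as a tiny state machine. Everything here is hypothetical ({{NodeLifecycle}}, the state names, the key names); it only shows the rule that "readonly" values may change before the start command and are frozen afterwards.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical node lifecycle: the process is up, components wait for a command. */
final class NodeLifecycle {
    enum State { LIMBO, CONFIGURED }

    private State state = State.LIMBO;

    private final Map<String, String> readOnlyCfg = new HashMap<>();

    /** "Readonly" values (e.g. pool sizes) may only change while in limbo. */
    synchronized void setReadOnly(String key, String value) {
        if (state != State.LIMBO)
            throw new IllegalStateException("Node is already configured, cannot change " + key);

        readOnlyCfg.put(key, value);
    }

    /**
     * Stands for the Java API call or CLI command that leaves limbo.
     * Returns the frozen values that components would be started with.
     */
    synchronized Map<String, String> start() {
        if (state != State.LIMBO)
            throw new IllegalStateException("Node is already started");

        state = State.CONFIGURED;

        return Collections.unmodifiableMap(new HashMap<>(readOnlyCfg));
    }

    synchronized State state() {
        return state;
    }

    public static void main(String[] args) {
        NodeLifecycle node = new NodeLifecycle();

        node.setReadOnly("systemPoolSize", "8");

        System.out.println(node.start());
    }
}
```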

Next is distributed configuration. Basic idea is the same, but with a broader 
scope. When node is started, it gets distributed configuration and only then 
declares itself as a part of topology, starting its components at the same 
time. Node components must not start with outdated distributed configuration 
values. I'm not sure about how absurdly strong this requirement should be, 
maybe we shouldn't join node with outdated configuration and +force it to 
invalidate its components, update configuration and try to start again+. It 
would actually make sense.

New cluster will also be started in limbo state, but on a global scale. It'll 
wait for "activation" or something. And only when cluster receives additional 
command with every required configuration value, it'll become fully functional.

You know, this whole thing comes very close to the topic of upgrading to the 
new version. I see it this way:
 * you restart your cluster with updated Ignite 3.1, for example;
 * cluster either waits for manual "new version activation" OR works like the 
old Ignite 3.0 +until it receives all configuration values required for 3.1 
features+.

I know that my ideas look kinda extreme; we should discuss them. I get that 
everything I wrote is tricky to implement, so the solution will need 
compromises, or at least simplification.

> Configuration initialization
> 
>
> Key: IGNITE-13837
> URL: https://issues.apache.org/jira/browse/IGNITE-13837
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Ivan Bessonov
>Priority: Major
>
> We need to think about what the first initialization of a node/cluster should 
> look like. What is the format of the initial properties (JSON, HOCON, etc.)? 
> How should they be handled?





[jira] [Assigned] (IGNITE-13837) Configuration initialization

2020-12-31 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov reassigned IGNITE-13837:
--

Assignee: Ivan Bessonov

> Configuration initialization
> 
>
> Key: IGNITE-13837
> URL: https://issues.apache.org/jira/browse/IGNITE-13837
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Ivan Bessonov
>Priority: Major
>
> We need to think about what the first initialization of a node/cluster should 
> look like. What is the format of the initial properties (JSON, HOCON, etc.)? 
> How should they be handled?





[jira] [Commented] (IGNITE-13928) Index defragmentation: only one index defragmented

2020-12-29 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255949#comment-17255949
 ] 

Ivan Bessonov commented on IGNITE-13928:


[~sdanilov] looks good, thank you! Let's wait for tests before merging.

> Index defragmentation: only one index defragmented
> --
>
> Key: IGNITE-13928
> URL: https://issues.apache.org/jira/browse/IGNITE-13928
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Reporter: Semyon Danilov
>Assignee: Semyon Danilov
>Priority: Major
> Fix For: 2.10
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In IndexingDefragmentation.java:
> {code:java}
> H2TreeIndex oldH2Idx = (H2TreeIndex)indexes.get(2);
> {code}
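
For illustration only, the difference between picking one hardcoded position and visiting every index can be sketched as below. The types ({{Index}}, {{TreeIndex}}, {{OtherIndex}}) are hypothetical stand-ins, not Ignite classes, and this is not claimed to be the actual fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-ins for the table's indexes; not Ignite classes. */
interface Index {
    String name();
}

final class TreeIndex implements Index {
    private final String name;

    TreeIndex(String name) { this.name = name; }

    @Override public String name() { return name; }
}

final class OtherIndex implements Index {
    @Override public String name() { return "other"; }
}

final class DefragSketch {
    /** Visits every tree index instead of a single hardcoded position. */
    static List<String> defragmentAll(List<Index> indexes) {
        List<String> processed = new ArrayList<>();

        for (Index idx : indexes) {
            if (!(idx instanceof TreeIndex))
                continue; // Only tree indexes are assumed to need defragmentation.

            processed.add(idx.name()); // Placeholder for the real per-index work.
        }

        return processed;
    }

    public static void main(String[] args) {
        List<Index> indexes = Arrays.asList(
            new OtherIndex(), new TreeIndex("_key_PK"), new TreeIndex("IDX_A"), new TreeIndex("IDX_B"));

        System.out.println(defragmentAll(indexes));
    }
}
```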





[jira] [Updated] (IGNITE-13833) PersistenceBasicCompatibilityTest lacks recent releases

2020-12-10 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13833:
---
Description: The last version covered by the test is 2.6

> PersistenceBasicCompatibilityTest lacks recent releases
> ---
>
> Key: IGNITE-13833
> URL: https://issues.apache.org/jira/browse/IGNITE-13833
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The last version covered by the test is 2.6





[jira] [Created] (IGNITE-13833) PersistenceBasicCompatibilityTest lacks recent releases

2020-12-10 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13833:
--

 Summary: PersistenceBasicCompatibilityTest lacks recent releases
 Key: IGNITE-13833
 URL: https://issues.apache.org/jira/browse/IGNITE-13833
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov








[jira] [Created] (IGNITE-13832) disco-notifier-worker handles IgniteInterruptedCheckedException incorrectly

2020-12-09 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13832:
--

 Summary: disco-notifier-worker handles 
IgniteInterruptedCheckedException incorrectly
 Key: IGNITE-13832
 URL: https://issues.apache.org/jira/browse/IGNITE-13832
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


DiscoveryMessageNotifierWorker#body handles InterruptedException correctly, but 
if it catches IgniteInterruptedCheckedException, it follows a different code 
path, which is incorrect. I believe all interruption exceptions should be 
handled in the same way.

 
{code:java}
[org.gridgain:gridgain-compatibility] [2020-04-13 
08:19:15,109][ERROR][disco-notifier-worker-#69754%top2_node_rcv%][root] 
Critical system error detected. Will be handled accordingly to configured 
handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class 
o.a.i.IgniteException: Failed to wait for handling disconnect event.]]
[08:19:15]W: [org.gridgain:gridgain-compatibility] class 
org.apache.ignite.IgniteException: Failed to wait for handling disconnect event.
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.awaitDisconnectEvent(GridDiscoveryManager.java:3128)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.access$6400(GridDiscoveryManager.java:2793)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:868)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:519)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2686)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2724)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
java.lang.Thread.run(Thread.java:748)
[08:19:15]W: [org.gridgain:gridgain-compatibility] Caused by: class 
org.apache.ignite.internal.IgniteInterruptedCheckedException: Got interrupted 
while waiting for future to complete.
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:185)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.awaitDisconnectEvent(GridDiscoveryManager.java:3125)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  ... 7 more
{code}
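
A minimal sketch of "handle both in the same way" is shown below. The exception class here is a local stand-in that only shares its name with Ignite's {{IgniteInterruptedCheckedException}}, and {{DiscoveryTask}} and {{NotifierLoop}} are hypothetical. What the shared handling should actually do is an assumption of this sketch (restore the interrupt flag and stop quietly).

```java
/** Local stand-in sharing the name of Ignite's exception, so the sketch compiles on its own. */
class IgniteInterruptedCheckedException extends Exception {
    IgniteInterruptedCheckedException(String msg) {
        super(msg);
    }
}

/** Hypothetical unit of work executed by the notifier worker. */
interface DiscoveryTask {
    void run() throws InterruptedException, IgniteInterruptedCheckedException;
}

final class NotifierLoop {
    /** Returns true if the task completed, false if the worker was interrupted. */
    static boolean runOnce(DiscoveryTask task) {
        try {
            task.run();

            return true;
        }
        catch (InterruptedException | IgniteInterruptedCheckedException e) {
            // One path for both exception types: restore the flag and let the
            // worker wind down instead of reporting a critical failure.
            Thread.currentThread().interrupt();

            return false;
        }
    }

    public static void main(String[] args) {
        boolean completed = runOnce(() -> {
            throw new IgniteInterruptedCheckedException("Got interrupted while waiting for future to complete.");
        });

        // Thread.interrupted() reads and clears the flag set by runOnce.
        System.out.println("completed=" + completed + ", interrupted=" + Thread.interrupted());
    }
}
```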





[jira] [Commented] (IGNITE-13823) WAL iterators require WRITE permissions

2020-12-09 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246484#comment-17246484
 ] 

Ivan Bessonov commented on IGNITE-13823:


The failed test in Cache 5 is not caused by my change; it's flaky.

> WAL iterators require WRITE permissions
> ---
>
> Key: IGNITE-13823
> URL: https://issues.apache.org/jira/browse/IGNITE-13823
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> org.apache.ignite.internal.processors.cache.persistence.wal.FileDescriptor#toIO
>  uses default permissions, i.e. "CREATE, READ, WRITE"





[jira] [Commented] (IGNITE-13775) U.ReentrantReadWriteLockTracer improper realization.

2020-12-08 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246332#comment-17246332
 ] 

Ivan Bessonov commented on IGNITE-13775:


[~zstan] code looks good, thank you!

> U.ReentrantReadWriteLockTracer improper realization.
> 
>
> Key: IGNITE-13775
> URL: https://issues.apache.org/jira/browse/IGNITE-13775
> Project: Ignite
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 2.9
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Attachments: image-2020-12-01-13-51-39-048.png, screenshot-1.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> ReentrantReadWriteLockTracer accepts ReentrantReadWriteLock as a delegate and 
> stores delegates for readLock and writeLock. But 
> ReentrantReadWriteLock#isWriteLockedByCurrentThread uses sync object to 
> evaluate the result instead of writeLock, and ReentrantReadWriteLockTracer 
> has its own sync object.
> As a result, if ReentrantReadWriteLockTracer is used to create checkpoint 
> lock (when IGNITE_PDS_LOG_CP_READ_LOCK_HOLDERS=true), 
> GridCacheDatabaseSharedManager#checkpointLockIsHeldByThread doesn't work 
> correctly: it returns false when checkpoint lock is acquired.
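
The effect described above can be reproduced with plain {{java.util.concurrent}} classes. The sketch below is a simplified stand-in for the tracer ({{DelegatingRwLock}} and {{FixedDelegatingRwLock}} are hypothetical names, and tracing itself is omitted): the wrapper hands out the delegate's locks, but its inherited {{isWriteLockedByCurrentThread()}} consults the wrapper's own, never-locked sync state. Delegating that query too is one possible fix, shown here as an assumption rather than as the actual patch.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Simplified stand-in for the tracer: exposes the delegate's locks only. */
class DelegatingRwLock extends ReentrantReadWriteLock {
    protected final ReentrantReadWriteLock delegate;

    DelegatingRwLock(ReentrantReadWriteLock delegate) {
        this.delegate = delegate;
    }

    @Override public ReentrantReadWriteLock.ReadLock readLock() {
        return delegate.readLock();
    }

    @Override public ReentrantReadWriteLock.WriteLock writeLock() {
        return delegate.writeLock();
    }
}

/** Same wrapper, but state queries are forwarded to the delegate as well. */
class FixedDelegatingRwLock extends DelegatingRwLock {
    FixedDelegatingRwLock(ReentrantReadWriteLock delegate) {
        super(delegate);
    }

    @Override public boolean isWriteLockedByCurrentThread() {
        return delegate.isWriteLockedByCurrentThread();
    }
}

final class TracerDemo {
    public static void main(String[] args) {
        ReentrantReadWriteLock real = new ReentrantReadWriteLock();

        DelegatingRwLock broken = new DelegatingRwLock(real);
        DelegatingRwLock fixed = new FixedDelegatingRwLock(real);

        broken.writeLock().lock();

        try {
            // The wrapper's own sync object was never locked, so this is false.
            System.out.println("broken: " + broken.isWriteLockedByCurrentThread());
            System.out.println("fixed:  " + fixed.isWriteLockedByCurrentThread());
        }
        finally {
            broken.writeLock().unlock();
        }
    }
}
```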





[jira] [Assigned] (IGNITE-13101) Metastore may leave uncompleted write futures during node stop

2020-12-07 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov reassigned IGNITE-13101:
--

Assignee: Ivan Bessonov

> Metastore may leave uncompleted write futures during node stop
> --
>
> Key: IGNITE-13101
> URL: https://issues.apache.org/jira/browse/IGNITE-13101
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexey Goncharuk
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.10
>
>
> I've got the following thread-dump (only relevant parts are retained) during 
> one of the teamcity runs:
> {code}
> "sys-#103862%baseline.IgniteStableBaselineBinObjFieldsQuerySelfTest0%" 
> #107048 prio=5 os_prio=0 tid=0x7fa2d8009800 nid=0x480d waiting on 
> condition [0x7fa1d1cdc000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>   at 
> org.apache.ignite.internal.processors.metric.GridMetricManager.remove(GridMetricManager.java:411)
>   at 
> org.apache.ignite.internal.processors.cache.CacheGroupMetricsImpl.remove(CacheGroupMetricsImpl.java:497)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.cleanup(GridCacheProcessor.java:512)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.stopCacheGroup(GridCacheProcessor.java:2901)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.stopCacheGroup(GridCacheProcessor.java:2889)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCacheStopRequestOnExchangeDone(GridCacheProcessor.java:2781)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.onExchangeDone(GridCacheProcessor.java:2878)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:2431)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3832)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3608)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.processSingleMessage(GridDhtPartitionsExchangeFuture.java:3207)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.access$200(GridDhtPartitionsExchangeFuture.java:154)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:2994)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:2982)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:354)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onReceiveSingleMessage(GridDhtPartitionsExchangeFuture.java:2982)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.processSinglePartitionUpdate(GridCachePartitionExchangeManager.java:1989)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.preprocessSingleMessage(GridCachePartitionExchangeManager.java:524)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.access$1100(GridCachePartitionExchangeManager.java:182)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$2.onMessage(GridCachePartitionExchangeManager.java:407)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$2.onMessage(GridCachePartitionExchangeManager.java:389)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:3715)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$MessageHandler.apply(GridCachePartitionExchangeManager.java:3694)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(Gri

[jira] [Created] (IGNITE-13823) WAL iterators require WRITE permissions

2020-12-07 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13823:
--

 Summary: WAL iterators require WRITE permissions
 Key: IGNITE-13823
 URL: https://issues.apache.org/jira/browse/IGNITE-13823
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


org.apache.ignite.internal.processors.cache.persistence.wal.FileDescriptor#toIO 
uses default permissions, i.e. "CREATE, READ, WRITE"
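
For comparison, opening a file strictly for reading with the standard NIO API looks like the sketch below. {{ReadOnlySegmentReader}} is a hypothetical helper written for this example and does not use Ignite's file IO factories; it only shows that passing {{StandardOpenOption.READ}} alone is enough to iterate over a file that has no write permission.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: reads a whole file through a READ-only channel. */
final class ReadOnlySegmentReader {
    static byte[] readAll(Path segment) throws IOException {
        // No CREATE or WRITE option, so write permission on the file is not needed.
        try (FileChannel ch = FileChannel.open(segment, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int)ch.size());

            while (buf.hasRemaining() && ch.read(buf) >= 0) {
                // Keep reading until the buffer is full or end of file is reached.
            }

            return buf.array();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("segment", ".wal");

        try {
            Files.write(tmp, new byte[] {1, 2, 3});

            tmp.toFile().setWritable(false); // Simulate a read-only WAL archive.

            System.out.println(readAll(tmp).length);
        }
        finally {
            tmp.toFile().setWritable(true);

            Files.deleteIfExists(tmp);
        }
    }
}
```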





[jira] [Created] (IGNITE-13814) Long restorePartitionStates triggers FailureHandler on node startup

2020-12-04 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13814:
--

 Summary: Long restorePartitionStates triggers FailureHandler on 
node startup
 Key: IGNITE-13814
 URL: https://issues.apache.org/jira/browse/IGNITE-13814
 Project: Ignite
  Issue Type: Bug
 Environment: {noformat}
Thread [name="sys-stripe-4-#5%EPE_CLUSTER_PERF%", id=24, state=WAITING, 
blockCnt=4, waitCnt=70836]
at java.base@11.0.8/jdk.internal.misc.Unsafe.park(Native Method)
at 
java.base@11.0.8/java.util.concurrent.locks.LockSupport.park(LockSupport.java:323)
at 
app//o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:186)
at 
app//o.a.i.i.util.future.GridFutureAdapter.getUninterruptibly(GridFutureAdapter.java:154)
at 
app//o.a.i.i.processors.cache.persistence.file.AsyncFileIO.read(AsyncFileIO.java:128)
at 
app//o.a.i.i.processors.cache.persistence.file.AbstractFileIO$2.run(AbstractFileIO.java:89)
at 
app//o.a.i.i.processors.cache.persistence.file.AbstractFileIO.fully(AbstractFileIO.java:52)
at 
app//o.a.i.i.processors.cache.persistence.file.AbstractFileIO.readFully(AbstractFileIO.java:87)
at 
app//o.a.i.i.processors.cache.persistence.file.FilePageStore.readWithFailover(FilePageStore.java:794)
at 
app//o.a.i.i.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:418)
at 
app//o.a.i.i.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:519)
at 
app//o.a.i.i.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:503)
at 
app//o.a.i.i.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:874)
at 
app//o.a.i.i.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:700)
at 
app//o.a.i.i.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:689)
at 
app//o.a.i.i.processors.cache.persistence.DataStructure.acquirePage(DataStructure.java:157)
at 
app//o.a.i.i.processors.cache.persistence.freelist.PagesList.init(PagesList.java:274)
at 
app//o.a.i.i.processors.cache.persistence.freelist.AbstractFreeList.<init>(AbstractFreeList.java:390)
at 
app//o.a.i.i.processors.cache.persistence.freelist.CacheFreeList.<init>(CacheFreeList.java:57)
at 
app//o.a.i.i.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore$1.<init>(GridCacheOffheapManager.java:1806)
at 
app//o.a.i.i.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.init0(GridCacheOffheapManager.java:1805)
at 
app//o.a.i.i.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.init(GridCacheOffheapManager.java:2130)
at 
app//o.a.i.i.processors.cache.persistence.GridCacheOffheapManager.restorePartitionStates(GridCacheOffheapManager.java:544)
at 
app//o.a.i.i.processors.cache.GridCacheProcessor$CacheRecoveryLifecycle.lambda$restorePartitionStates$0(GridCacheProcessor.java:5253)
at 
app//o.a.i.i.processors.cache.GridCacheProcessor$CacheRecoveryLifecycle$$Lambda$633/0x000800717040.run(Unknown
 Source)
at 
app//o.a.i.i.util.StripedExecutor$Stripe.body(StripedExecutor.java:559)
at app//o.a.i.i.util.worker.GridWorker.run(GridWorker.java:119)
at java.base@11.0.8/java.lang.Thread.run(Thread.java:834){noformat}
In this case warm-up is on, but the client also reports that this happens 
without warm-up. I don't think that restoring partition states should trigger 
the failure handler (FH). It may take a lot of time with PDS. Also, why do we 
run it in the striped pool? Imagine two large caches get the same stripe: the 
restore time doubles.
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


The following would be printed to log:
{noformat}
[2020-10-30T17:32:26,190][WARN ][grid-timeout-worker-#22%EPE_CLUSTER_PERF%][] 
Possible failure suppressed accordingly to a configured handler 
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class 
o.a.i.IgniteException: GridWorker [name=sys-stripe-4, 
igniteInstanceName=EPE_CLUSTER_PERF, finished=false, 
heartbeatTs=1604104192954]]]
org.apache.ignite.IgniteException: GridWorker [name=sys-stripe-4, 
igniteInstanceName=EPE_CLUSTER_PERF, finished=false, heartbeatTs=1604104192954]
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1859)
 [ignite-core-8.7.28.jar:8.7.28]
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1854)
 [ignite-core-8.7.28.jar:8.7.28]
at 
org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
 [ignite-core-8.7.28.jar:8.7.28

[jira] [Created] (IGNITE-13813) SKIP_GARBAGE WAL compression doesn't work for binary recovery

2020-12-04 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13813:
--

 Summary: SKIP_GARBAGE WAL compression doesn't work for binary 
recovery
 Key: IGNITE-13813
 URL: https://issues.apache.org/jira/browse/IGNITE-13813
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


{noformat}
class org.apache.ignite.IgniteCheckedException: Failed to apply page snapshot

at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$performBinaryMemoryRestore$14(GridCacheDatabaseSharedManager.java:2419)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$stripedApplyPage$18(GridCacheDatabaseSharedManager.java:2603)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$stripedApply$19(GridCacheDatabaseSharedManager.java:2641)
at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:559)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.AssertionError: 4096
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyPageSnapshot(GridCacheDatabaseSharedManager.java:2671)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$performBinaryMemoryRestore$14(GridCacheDatabaseSharedManager.java:2412)
... 5 more{noformat}





[jira] [Created] (IGNITE-13812) CheckpointEntry is read from WAL right after its creation.

2020-12-03 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13812:
--

 Summary: CheckpointEntry is read from WAL right after its creation.
 Key: IGNITE-13812
 URL: https://issues.apache.org/jira/browse/IGNITE-13812
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


{noformat}
[2020-07-31 16:33:15,545][INFO ][pitr-ctx-exec-#304][WalStateManager] WAL 
logging disabled
[2020-07-31 16:33:15,545][INFO 
][db-checkpoint-thread-#152][GridCacheDatabaseSharedManager] Checkpoint 
finished [cpId=e1a57b48-1610-4280-a3e2-4d808a5f0343, pages=64, 
markPos=FileWALPointer [idx=5, fileOff=45749881, len=186791], 
walSegmentsCleared=0, walSegmentsCovered=[], markDuration=49ms, pagesWrite=0ms, 
fsync=5ms, total=79ms]
[2020-07-31 16:33:15,546][INFO ][pitr-ctx-exec-#304][GridRecoveryProcessor] 
Start apply segment idx=1
[2020-07-31 16:33:16,012][INFO ][pitr-ctx-exec-#304][GridRecoveryProcessor] 
Segment idx=1 applied
[2020-07-31 16:33:16,373][INFO ][pitr-ctx-exec-#304][GridRecoveryProcessor] 
Segment idx=2 applied
[2020-07-31 16:33:16,553][ERROR][db-checkpoint-thread-#152][root] Critical 
system error detected. Will be handled accordingly to configured handler 
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=CRITICAL_ERROR, 
err=java.lang.ClassCastException: class 
o.a.i.i.pagemem.wal.record.MemoryRecoveryRecord cannot be cast to class 
o.a.i.i.pagemem.wal.record.CheckpointRecord 
(o.a.i.i.pagemem.wal.record.MemoryRecoveryRecord and 
o.a.i.i.pagemem.wal.record.CheckpointRecord are in unnamed module of loader 
'app')]]
java.lang.ClassCastException: class 
org.apache.ignite.internal.pagemem.wal.record.MemoryRecoveryRecord cannot be 
cast to class org.apache.ignite.internal.pagemem.wal.record.CheckpointRecord 
(org.apache.ignite.internal.pagemem.wal.record.MemoryRecoveryRecord and 
org.apache.ignite.internal.pagemem.wal.record.CheckpointRecord are in unnamed 
module of loader 'app')
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry$GroupStateLazyStore.initIfNeeded(CheckpointEntry.java:353)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry$GroupStateLazyStore.access$300(CheckpointEntry.java:245)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry.initIfNeeded(CheckpointEntry.java:124)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry.groupState(CheckpointEntry.java:106)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.addCpToEarliestCpMap(CheckpointHistory.java:246)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.addCheckpoint(CheckpointHistory.java:179)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:4221)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:3732)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:3621)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
at java.base/java.lang.Thread.run(Thread.java:834){noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13811) ServerImpl#pingNode(InetSocketAddress, UUID, UUID) fails to ping nodes with unresolved addresses

2020-12-03 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13811:
--

 Summary: ServerImpl#pingNode(InetSocketAddress, UUID, UUID) fails 
to ping nodes with unresolved addresses
 Key: IGNITE-13811
 URL: https://issues.apache.org/jira/browse/IGNITE-13811
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


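The wrong key is removed from the map: the future is registered under the 
original, possibly unresolved address, but addr is reassigned to a resolved 
address before the removal, so the two keys no longer match.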
{code:java}
pingMap.putIfAbsent(addr, fut)
{code}
{code:java}
if (addr.isUnresolved())
 addr = new InetSocketAddress(InetAddress.getByName(addr.getHostName()), 
addr.getPort());
{code}
{code:java}
boolean b = pingMap.remove(addr, fut);

assert b;
{code}
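
The effect can be reproduced outside Ignite. The following is only a minimal 
sketch, not the ServerImpl code: the class PingMapKeyDemo, its method names and 
the port 47500 are made up for illustration, and a plain Object stands in for 
the ping future. It relies on the fact that an unresolved InetSocketAddress is 
never equal to a resolved one.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical demo class; not part of Ignite. */
public class PingMapKeyDemo {
    /** Mirrors the reported pattern: the key variable is reassigned before removal. */
    public static boolean removeWithReassignedKey() {
        ConcurrentMap<InetSocketAddress, Object> pingMap = new ConcurrentHashMap<>();
        Object fut = new Object(); // stands in for the ping future

        // A literal IP is used so that no DNS lookup is needed to resolve it.
        InetSocketAddress addr = InetSocketAddress.createUnresolved("127.0.0.1", 47500);
        pingMap.putIfAbsent(addr, fut);

        if (addr.isUnresolved())
            addr = resolve(addr);

        // The map still holds the unresolved key, so this removal misses.
        return pingMap.remove(addr, fut);
    }

    /** One possible remedy (an assumption, not the merged fix): keep the original key. */
    public static boolean removeWithOriginalKey() {
        ConcurrentMap<InetSocketAddress, Object> pingMap = new ConcurrentHashMap<>();
        Object fut = new Object();

        InetSocketAddress key = InetSocketAddress.createUnresolved("127.0.0.1", 47500);
        pingMap.putIfAbsent(key, fut);

        // The resolved address would be used for connecting only, never as a map key.
        InetSocketAddress connectAddr = key.isUnresolved() ? resolve(key) : key;

        return pingMap.remove(key, fut);
    }

    private static InetSocketAddress resolve(InetSocketAddress addr) {
        try {
            return new InetSocketAddress(InetAddress.getByName(addr.getHostName()), addr.getPort());
        }
        catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reassigned key removed: " + removeWithReassignedKey());
        System.out.println("original key removed:   " + removeWithOriginalKey());
    }
}
```

In the first method the removal returns false, which is the condition that 
trips the "assert b" above; in the second it returns true.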





[jira] [Commented] (IGNITE-13802) GridCacheOffheapManager#addPartitions ignores candidate pages count for index partition

2020-12-03 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243182#comment-17243182
 ] 

Ivan Bessonov commented on IGNITE-13802:


TC Bot visa is just impossible to get. [~sergeychugunov] can you please take a 
look at the fix?

> GridCacheOffheapManager#addPartitions ignores candidate pages count for index 
> partition
> ---
>
> Key: IGNITE-13802
> URL: https://issues.apache.org/jira/browse/IGNITE-13802
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It also marks the page as dirty despite doing nothing with it.





[jira] [Created] (IGNITE-13808) Control.sh validate_indexes throws CorruptedTreeException and fails server node during check

2020-12-03 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13808:
--

 Summary: Control.sh validate_indexes throws CorruptedTreeException 
and fails server node during check
 Key: IGNITE-13808
 URL: https://issues.apache.org/jira/browse/IGNITE-13808
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


A CorruptedTreeException thrown during the validate_indexes command invokes the 
failure handler and stops the server node:
{code:java}
[21:44:26,257][WARNING][pool-5-thread-2][ValidateIndexesClosure] Current 
progress of ValidateIndexesClosure: checked integrity of 1 index partitions of 
14 cache groups
[21:44:26,852][SEVERE][pool-5-thread-16][] Critical system error detected. Will 
be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler 
[tryStop=false, timeout=0, super=AbstractFailureHandler 
[ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, 
SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext 
[type=CRITICAL_ERROR, err=class 
o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is 
corrupted [pages(groupId, pageId)=[], msg=Runtime failure on bounds: 
[lower=null, upper=null
class 
org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
 B+Tree is corrupted [pages(groupId, pageId)=[], msg=Runtime failure on bounds: 
[lower=null, upper=null]]
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.corruptedTreeException(BPlusTree.java:5126)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1029)
at 
org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.find(H2TreeIndex.java:243)
at 
org.apache.ignite.internal.visor.verify.ValidateIndexesClosure.processIndex(ValidateIndexesClosure.java:651)
at 
org.apache.ignite.internal.visor.verify.ValidateIndexesClosure.access$200(ValidateIndexesClosure.java:93)
at 
org.apache.ignite.internal.visor.verify.ValidateIndexesClosure$4.call(ValidateIndexesClosure.java:631)
at 
org.apache.ignite.internal.visor.verify.ValidateIndexesClosure$4.call(ValidateIndexesClosure.java:629)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException:
 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException:
 java.lang.IllegalStateException: Item not found: 11
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:987)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1014)
... 9 more
Caused by: 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException:
 java.lang.IllegalStateException: Item not found: 11
at 
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:203)
at 
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:104)
at 
org.apache.ignite.internal.processors.query.h2.database.H2RowFactory.getRow(H2RowFactory.java:61)
at 
org.apache.ignite.internal.processors.query.h2.database.H2Tree.createRowFromLink(H2Tree.java:246)
at 
org.apache.ignite.internal.processors.query.h2.database.io.H2ExtrasLeafIO.getLookupRow(H2ExtrasLeafIO.java:126)
at 
org.apache.ignite.internal.processors.query.h2.database.io.H2ExtrasLeafIO.getLookupRow(H2ExtrasLeafIO.java:36)
at 
org.apache.ignite.internal.processors.query.h2.database.H2Tree.getRow(H2Tree.java:264)
at 
org.apache.ignite.internal.processors.query.h2.database.H2Tree.getRow(H2Tree.java:56)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.fillFromBuffer(BPlusTree.java:4808)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.init(BPlusTree.java:4710)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.access$5000(BPlusTree.java:4646)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:976)
... 10 more
Caused by: java.lang.IllegalStateException: Item not found: 11
at 
org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.findIndirectItemIndex(AbstractDataPageIO.java:341)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.getDataOffset(Abstra

[jira] [Commented] (IGNITE-13742) Fix failed WalModeChangeAdvancedSelfTest.testMaintenanceIsSkippedIfWasFixedManuallyOnDowntime

2020-12-02 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242397#comment-17242397
 ] 

Ivan Bessonov commented on IGNITE-13742:


None of those failures have anything to do with my changes; I checked them. 
Waiting for the MVCC suite to rerun makes no sense: it fails with an execution 
timeout most of the time. .NET is even worse; it seems like we don't have 
enough agents. Scala Examples fail in master.

> Fix failed 
> WalModeChangeAdvancedSelfTest.testMaintenanceIsSkippedIfWasFixedManuallyOnDowntime
> -
>
> Key: IGNITE-13742
> URL: https://issues.apache.org/jira/browse/IGNITE-13742
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This code is to blame:
>  
> {code:java}
> // Reinitialized discovery manager won't have a valid consistentId on 
> creation.
> discoMgr.consistentId(ctx.pdsFolderResolver().resolveFolders().consistentId());
> {code}
> More specifically: "***.consistentId" invocation with valid consistent id 
> from ANY source.
>  
>  
> [https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=5803772702668480758&tab=testDetails]





[jira] [Commented] (IGNITE-13742) Fix failed WalModeChangeAdvancedSelfTest.testMaintenanceIsSkippedIfWasFixedManuallyOnDowntime

2020-12-02 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242398#comment-17242398
 ] 

Ivan Bessonov commented on IGNITE-13742:


[~sergeychugunov] can you merge it?

> Fix failed 
> WalModeChangeAdvancedSelfTest.testMaintenanceIsSkippedIfWasFixedManuallyOnDowntime
> -
>
> Key: IGNITE-13742
> URL: https://issues.apache.org/jira/browse/IGNITE-13742
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This code is to blame:
>  
> {code:java}
> // Reinitialized discovery manager won't have a valid consistentId on 
> creation.
> discoMgr.consistentId(ctx.pdsFolderResolver().resolveFolders().consistentId());
> {code}
> More specifically: "***.consistentId" invocation with valid consistent id 
> from ANY source.
>  
>  
> [https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=5803772702668480758&tab=testDetails]





[jira] [Created] (IGNITE-13802) GridCacheOffheapManager#addPartitions ignores candidate pages count for index partition

2020-12-02 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13802:
--

 Summary: GridCacheOffheapManager#addPartitions ignores candidate 
pages count for index partition
 Key: IGNITE-13802
 URL: https://issues.apache.org/jira/browse/IGNITE-13802
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


It also marks the page as dirty despite doing nothing with it.





[jira] [Updated] (IGNITE-13795) java.nio.file.InvalidPathException: Illegal char <:> at lock page on windows

2020-12-02 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13795:
---
Ignite Flags:   (was: Docs Required,Release Notes Required)

> java.nio.file.InvalidPathException: Illegal char <:> at lock page on windows
> 
>
> Key: IGNITE-13795
> URL: https://issues.apache.org/jira/browse/IGNITE-13795
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>
> {code:java}
> Exception in thread "Thread-1" java.nio.file.InvalidPathException: Illegal 
> char <:> at index 109: 
> C:\BuildAgent\work\d501ae8146bd8253\i2test\var\suite-thin_clients\art-gg-ult\work\diagnostic\page_lock_dump_0:0:0:0:0:0:0:1,127.0.0.1,172.23.240.1,172.25.2.217:47500_2020_06_22_17_24_06_377
>   at sun.nio.fs.WindowsPathParser.normalize(WindowsPathParser.java:182)
>   at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:153)
>   at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:77)
>   at sun.nio.fs.WindowsPath.parse(WindowsPath.java:94)
>   at sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:255)
>   at java.io.File.toPath(File.java:2234)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.dumpprocessors.ToFileDumpProcessor.saveToFile(ToFileDumpProcessor.java:69)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.dumpprocessors.ToFileDumpProcessor.toFileDump(ToFileDumpProcessor.java:53)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.PageLockTrackerManager.onHangThreads(PageLockTrackerManager.java:123)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.SharedPageLockTracker$TimeOutWorker.run(SharedPageLockTracker.java:385)
> {code}





[jira] [Created] (IGNITE-13795) java.nio.file.InvalidPathException: Illegal char <:> at lock page on windows

2020-12-02 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13795:
--

 Summary: java.nio.file.InvalidPathException: Illegal char <:> at 
lock page on windows
 Key: IGNITE-13795
 URL: https://issues.apache.org/jira/browse/IGNITE-13795
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


{code:java}
Exception in thread "Thread-1" java.nio.file.InvalidPathException: Illegal char 
<:> at index 109: 
C:\BuildAgent\work\d501ae8146bd8253\i2test\var\suite-thin_clients\art-gg-ult\work\diagnostic\page_lock_dump_0:0:0:0:0:0:0:1,127.0.0.1,172.23.240.1,172.25.2.217:47500_2020_06_22_17_24_06_377
at sun.nio.fs.WindowsPathParser.normalize(WindowsPathParser.java:182)
at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:153)
at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:77)
at sun.nio.fs.WindowsPath.parse(WindowsPath.java:94)
at sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:255)
at java.io.File.toPath(File.java:2234)
at 
org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.dumpprocessors.ToFileDumpProcessor.saveToFile(ToFileDumpProcessor.java:69)
at 
org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.dumpprocessors.ToFileDumpProcessor.toFileDump(ToFileDumpProcessor.java:53)
at 
org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.PageLockTrackerManager.onHangThreads(PageLockTrackerManager.java:123)
at 
org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.SharedPageLockTracker$TimeOutWorker.run(SharedPageLockTracker.java:385)
{code}
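
The dump file name in the trace embeds node addresses, and the colons in them 
are what Windows rejects. One way to avoid this, shown here only as a sketch 
and not as the actual fix for this issue, is to replace the characters that 
Windows reserves in file names (< > : " / \ | ? *) before building the path. 
The class and method names below are hypothetical.

```java
/** Hypothetical helper; not part of Ignite. */
public class DumpFileNames {
    /** Replaces every character that Windows reserves in file names with '_'. */
    public static String sanitize(String name) {
        // Character class: < > : " / \ | ? *
        return name.replaceAll("[<>:\"/\\\\|?*]", "_");
    }

    public static void main(String[] args) {
        String raw = "page_lock_dump_0:0:0:0:0:0:0:1,127.0.0.1:47500_2020_06_22";

        System.out.println(sanitize(raw));
    }
}
```

Only the file name should be passed through such a helper, not the whole path, 
because the drive letter and the directory separators contain the same 
characters legitimately.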





[jira] [Commented] (IGNITE-11998) Fix DataPageScan for fragmented pages.

2020-12-02 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242332#comment-17242332
 ] 

Ivan Bessonov commented on IGNITE-11998:


Hi [~mmuzaf], just found out that this issue is on you now.

The DataPageScan optimization was disabled as a part of 
https://issues.apache.org/jira/browse/IGNITE-11982. That change comes with a 
bunch of other fixes, which sucks. But you can still use it as a reference; 
many of those changes will have to be reapplied.

BTW, what's your plan? How are you going to fix it?

> Fix DataPageScan for fragmented pages.
> --
>
> Key: IGNITE-11998
> URL: https://issues.apache.org/jira/browse/IGNITE-11998
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ivan Bessonov
>Assignee: Maxim Muzafarov
>Priority: Critical
> Fix For: 2.10
>
>
> Fragmented pages crash JVM when accessed by DataPageScan scanner/query 
> optimized scanner. It happens when scanner accesses data in later chunk in 
> fragmented entry but treats it like the first one, expecting length of the 
> payload, which is absent and replaced with raw entry data.





[jira] [Commented] (IGNITE-11346) Remote client authentication failed for the CommandHandler in the case where it optional on the server

2020-12-02 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242268#comment-17242268
 ] 

Ivan Bessonov commented on IGNITE-11346:


[~Maxoid]

I know it's been a long time, but what's the status of this issue? It was ready 
to merge, like, 18 months ago.

> Remote client authentication failed for the CommandHandler in the case where 
> it optional on the server
> --
>
> Key: IGNITE-11346
> URL: https://issues.apache.org/jira/browse/IGNITE-11346
> Project: Ignite
>  Issue Type: Bug
>  Components: clients, security, thin client
>Affects Versions: 2.7
>Reporter: Maxim Karavaev
>Assignee: Maxim Karavaev
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> h2. Precondition:
> A custom _GridSecurityProcessor_ implementation allows optional 
> authentication. In other words, if credentials are present, authentication 
> is performed; otherwise it is not (a restricted SecurityContext is returned). 
> The REST API works fine. If credentials are present or an auth request was 
> made, auth works as desired; if not, it also works, but only for some 
> authorized requests.
> h2. The problem:
> _CommandHandler_, which is used for controlling a cluster through the CLI 
> script _command.sh|bat_, doesn't respect credential parameters and sends an 
> auth request only when a regular request fails with an authentication 
> exception. In the described case of optional authentication this never 
> happens, so the result always depends on the "default" permissions.
> h2. Possible solution:
> Change _GridClientNioTcpConnection_ to always send an auth request first 
> when credentials are provided.





[jira] [Commented] (IGNITE-13190) Core defragmentation functions

2020-12-02 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242136#comment-17242136
 ] 

Ivan Bessonov commented on IGNITE-13190:


[~mmuzaf] thank you, I'll take a look at the performance profiling tool.

And, of course, we won't merge anything without TC results.

> Core defragmentation functions
> --
>
> Key: IGNITE-13190
> URL: https://issues.apache.org/jira/browse/IGNITE-13190
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Sergey Chugunov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 20h 40m
>  Remaining Estimate: 0h
>
> The following set of functions covering the defragmentation happy case is needed:
>  * Initialization of defragmentation manager when node is started in 
> maintenance mode.
>  * Information about partition files is gathered by defrag mgr.
>  * For each partition file corresponding file of defragmented partition is 
> created and initialized.
>  * Keys are transferred from old partitions to new partitions.
>  * Checkpointer is aware of new partition files and flushes defragmented 
> memory to new partition files.
>  
> No fault-tolerance code nor index defragmentation mappings are needed in this 
> task.





[jira] [Commented] (IGNITE-13190) Core defragmentation functions

2020-12-01 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241511#comment-17241511
 ] 

Ivan Bessonov commented on IGNITE-13190:


Hi, [~mmuzaf],

it's kinda hard to keep up with all the comments, but I hope that I fixed 
everything that you asked for. TimeTracker is removed.

The last questionable thing, I guess, is passing the checkpoint lock as a 
method parameter. We either pass it as is or somehow access it through 
GridCacheDatabaseSharedManager. The latter would mean that we must provide a 
public accessor method for "CachePartitionDefragmentationManager", which is 
also strange.

Please take a look, thank you!

> Core defragmentation functions
> --
>
> Key: IGNITE-13190
> URL: https://issues.apache.org/jira/browse/IGNITE-13190
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Sergey Chugunov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 20h 40m
>  Remaining Estimate: 0h
>
> The following set of functions covering the defragmentation happy case is needed:
>  * Initialization of defragmentation manager when node is started in 
> maintenance mode.
>  * Information about partition files is gathered by defrag mgr.
>  * For each partition file corresponding file of defragmented partition is 
> created and initialized.
>  * Keys are transferred from old partitions to new partitions.
>  * Checkpointer is aware of new partition files and flushes defragmented 
> memory to new partition files.
>  
> No fault-tolerance code nor index defragmentation mappings are needed in this 
> task.





[jira] [Updated] (IGNITE-13786) PDS defragmentation can inflate index size

2020-12-01 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13786:
---
Parent: IGNITE-13143
Issue Type: Sub-task  (was: Bug)

> PDS defragmentation can inflate index size
> --
>
> Key: IGNITE-13786
> URL: https://issues.apache.org/jira/browse/IGNITE-13786
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>
> For huge caches, defragmentation can lead to a larger index size.
> The reason is that we only append new data to index trees and never insert 
> into the middle, which leads to under-utilization of B+Tree page space.





[jira] [Created] (IGNITE-13786) PDS defragmentation can inflate index size

2020-12-01 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13786:
--

 Summary: PDS defragmentation can inflate index size
 Key: IGNITE-13786
 URL: https://issues.apache.org/jira/browse/IGNITE-13786
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


For huge caches, defragmentation can lead to a larger index size.

The reason is that we only append new data to index trees and never insert into 
the middle, which leads to under-utilization of B+Tree page space.
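
The effect of append-only inserts can be illustrated with a toy model. The 
sketch below is not Ignite's BPlusTree: it tracks only leaf sizes and assumes 
that a full leaf is split in half, with the new key going to the right half. 
The class name, method name and parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of leaf pages only; not Ignite's BPlusTree. */
public class AppendOnlyLeafFill {
    /**
     * Appends n ascending keys and returns the average leaf fill ratio,
     * assuming a full leaf is split in half (an assumption of this model).
     */
    public static double fillAfterSortedAppends(int n, int capacity) {
        List<Integer> leafSizes = new ArrayList<>();
        leafSizes.add(0);

        for (int i = 0; i < n; i++) {
            int last = leafSizes.size() - 1;

            if (leafSizes.get(last) == capacity) {
                // Ascending keys always land in the rightmost leaf, so only it ever splits.
                int left = capacity / 2;

                leafSizes.set(last, left);
                leafSizes.add(capacity - left);

                last++;
            }

            leafSizes.set(last, leafSizes.get(last) + 1);
        }

        long total = 0;

        for (int size : leafSizes)
            total += size;

        return (double) total / ((long) leafSizes.size() * capacity);
    }

    public static void main(String[] args) {
        System.out.println(fillAfterSortedAppends(10_000, 100));
    }
}
```

In this model every leaf except the rightmost one stays half full forever, 
because nothing is ever inserted into it again, so the average fill ratio 
settles at about one half.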





[jira] [Commented] (IGNITE-12885) Checkpoint thread executes partitions fsync in single thread

2020-11-30 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240758#comment-17240758
 ] 

Ivan Bessonov commented on IGNITE-12885:


Fix was rejected by [~ilyak]

> Checkpoint thread executes partitions fsync in single thread
> 
>
> Key: IGNITE-12885
> URL: https://issues.apache.org/jira/browse/IGNITE-12885
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It should use "asyncRunner" if one is configured; this would speed up 
> checkpoints.





[jira] [Commented] (IGNITE-13190) Core defragmentation functions

2020-11-27 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239581#comment-17239581
 ] 

Ivan Bessonov commented on IGNITE-13190:


[~mmuzaf] thank you for review!

I addressed your latest comments and made the necessary changes to the code, 
please take a look. I hope the code is a bit cleaner now.

> Core defragmentation functions
> --
>
> Key: IGNITE-13190
> URL: https://issues.apache.org/jira/browse/IGNITE-13190
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Sergey Chugunov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 18h 40m
>  Remaining Estimate: 0h
>
> The following set of functions covering the defragmentation happy case is needed:
>  * Initialization of defragmentation manager when node is started in 
> maintenance mode.
>  * Information about partition files is gathered by defrag mgr.
>  * For each partition file corresponding file of defragmented partition is 
> created and initialized.
>  * Keys are transferred from old partitions to new partitions.
>  * Checkpointer is aware of new partition files and flushes defragmented 
> memory to new partition files.
>  
> No fault-tolerance code nor index defragmentation mappings are needed in this 
> task.





[jira] [Assigned] (IGNITE-13742) Fix failed WalModeChangeAdvancedSelfTest.testMaintenanceIsSkippedIfWasFixedManuallyOnDowntime

2020-11-24 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov reassigned IGNITE-13742:
--

Assignee: Ivan Bessonov

> Fix failed 
> WalModeChangeAdvancedSelfTest.testMaintenanceIsSkippedIfWasFixedManuallyOnDowntime
> -
>
> Key: IGNITE-13742
> URL: https://issues.apache.org/jira/browse/IGNITE-13742
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>
> This code is to blame:
>  
> {code:java}
> // Reinitialized discovery manager won't have a valid consistentId on 
> creation.
> discoMgr.consistentId(ctx.pdsFolderResolver().resolveFolders().consistentId());
> {code}
> More specifically: "***.consistentId" invocation with valid consistent id 
> from ANY source.
>  
>  
> [https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=5803772702668480758&tab=testDetails]





[jira] [Updated] (IGNITE-13743) Defragmentation JMX API for schedule/cancel/status

2020-11-23 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13743:
---
Labels: IEP-47  (was: )

> Defragmentation JMX API for schedule/cancel/status
> --
>
> Key: IGNITE-13743
> URL: https://issues.apache.org/jira/browse/IGNITE-13743
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Ivan Bessonov
>Assignee: Semyon Danilov
>Priority: Major
>  Labels: IEP-47
>






[jira] [Created] (IGNITE-13743) Defragmentation JMX API for schedule/cancel/status

2020-11-23 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13743:
--

 Summary: Defragmentation JMX API for schedule/cancel/status
 Key: IGNITE-13743
 URL: https://issues.apache.org/jira/browse/IGNITE-13743
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Semyon Danilov








[jira] [Commented] (IGNITE-13190) Core defragmentation functions

2020-11-20 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236075#comment-17236075
 ] 

Ivan Bessonov commented on IGNITE-13190:


[~agoncharuk], [~mmuzaf]

guys, I addressed your issues, can you please take a look one more time? It 
would be awesome to merge it next week. Thank you!

> Core defragmentation functions
> --
>
> Key: IGNITE-13190
> URL: https://issues.apache.org/jira/browse/IGNITE-13190
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Sergey Chugunov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> The following set of functions covering the defragmentation happy case is needed:
>  * Initialization of defragmentation manager when node is started in 
> maintenance mode.
>  * Information about partition files is gathered by defrag mgr.
>  * For each partition file corresponding file of defragmented partition is 
> created and initialized.
>  * Keys are transferred from old partitions to new partitions.
>  * Checkpointer is aware of new partition files and flushes defragmented 
> memory to new partition files.
>  
> No fault-tolerance code nor index defragmentation mappings are needed in this 
> task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13742) Fix failed WalModeChangeAdvancedSelfTest.testMaintenanceIsSkippedIfWasFixedManuallyOnDowntime

2020-11-20 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13742:
--

 Summary: Fix failed 
WalModeChangeAdvancedSelfTest.testMaintenanceIsSkippedIfWasFixedManuallyOnDowntime
 Key: IGNITE-13742
 URL: https://issues.apache.org/jira/browse/IGNITE-13742
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov


https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=5803772702668480758&tab=testDetails





[jira] [Updated] (IGNITE-13709) Control.sh API - status

2020-11-16 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13709:
---
Description: 
_Prerequisites:_ command can be sent to nodes in maintenance mode and in normal 
operations as well.
 
_Command output:_
 # For node in normal operations:
defragmentation is scheduled for caches: 


 # For node in maintenance mode executing defragmentation:
defragmentation is completed for the caches:
    cache0 - size before/after: 200GB/150GB, time took: 15 mins 42 secs
defragmentation is in progress for cache:
    cache1 - partitions processed/all: 177/512, time elapsed: 7 mins 11 secs
awaiting defragmentation: cache2, cache3, cache4.


 # For node in maintenance mode for other reason:
no defragmentation is scheduled for the node, the node is in maintenance to 
perform tasks: 

> Control.sh API - status
> ---
>
> Key: IGNITE-13709
> URL: https://issues.apache.org/jira/browse/IGNITE-13709
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>
> _Prerequisites:_ command can be sent to nodes in maintenance mode and in 
> normal operations as well.
>  
> _Command output:_
>  # For node in normal operations:
> defragmentation is scheduled for caches: 
>  # For node in maintenance mode executing defragmentation:
> defragmentation is completed for the caches:
>     cache0 - size before/after: 200GB/150GB, time took: 15 mins 42 secs
> defragmentation is in progress for cache:
>     cache1 - partitions processed/all: 177/512, time elapsed: 7 mins 11 secs
> awaiting defragmentation: cache2, cache3, cache4.
>  # For node in maintenance mode for other reason:
> no defragmentation is scheduled for the node, the node is in maintenance to 
> perform tasks: 





[jira] [Updated] (IGNITE-13709) Control.sh API - status

2020-11-16 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13709:
---
Labels: IEP-47  (was: )

> Control.sh API - status
> ---
>
> Key: IGNITE-13709
> URL: https://issues.apache.org/jira/browse/IGNITE-13709
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>






[jira] [Created] (IGNITE-13709) Control.sh API - status

2020-11-16 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13709:
--

 Summary: Control.sh API - status
 Key: IGNITE-13709
 URL: https://issues.apache.org/jira/browse/IGNITE-13709
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov








[jira] [Updated] (IGNITE-13697) Control.sh API - schedule & cancel

2020-11-16 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13697:
---
Description: 
 
 From original draft by [~sergeychugunov]:
  
 Schedule
 *control.sh defragmentation schedule nodes 
nodeConsistentId0[,nodeConsistentId1] [caches cacheName0,cacheName1,cacheName2]*
  
 Optional list of caches is passed to perform defragmentation for a particular 
set of caches. By default all caches are defragmented.
  
 _Prerequisites_: command is sent to node in normal operations, node in 
maintenance mode should not accept it

_Command output:_
 Defragmentation is successfully scheduled on nodes , on next 
restart the following caches will be defragmented: .
 Cancel
 *control.sh defragmentation cancel nodeHost nodePort [cache cacheName0]*

_Prerequisites_: command is sent to node in maintenance mode or in normal mode

_Command output:_
 Defragmentation is already completed for caches: 
 Defragmentation is cancelled for caches: ; all intermediate files 
are cleaned up.
  
 *Note:* Caches list for cancel command will not be implemented here.

  was:
 
From original draft by [~sergeychugunov]:
 
Schedule
*control.sh defragmentation schedule nodes 
nodeConsistentId0[,nodeConsistentId1] [caches cacheName0,cacheName1,cacheName2]*
 
Optional list of caches is passed to perform defragmentation for a particular 
set of caches. By default all caches are defragmented.
 
_Prerequisites_: command is sent to node in normal operations, node in 
maintenance mode should not accept it

_Command output:_
Defragmentation is successfully scheduled on nodes , on next 
restart the following caches will be defragmented: .
Cancel
*control.sh defragmentation cancel nodeHost nodePort [cache cacheName0]*

_Prerequisites_: command is sent to node in maintenance mode, node in normal 
operations should not accept it.

_Command output:_
Defragmentation is already completed for caches: 
Defragmentation is cancelled for caches: ; all intermediate files 
are cleaned up.
 
*Note:* Caches list for cancel command will not be implemented here.


> Control.sh API - schedule & cancel
> --
>
> Key: IGNITE-13697
> URL: https://issues.apache.org/jira/browse/IGNITE-13697
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
>  From original draft by [~sergeychugunov]:
>   
>  Schedule
>  *control.sh defragmentation schedule nodes 
> nodeConsistentId0[,nodeConsistentId1] [caches 
> cacheName0,cacheName1,cacheName2]*
>   
>  Optional list of caches is passed to perform defragmentation for a 
> particular set of caches. By default all caches are defragmented.
>   
>  _Prerequisites_: command is sent to node in normal operations, node in 
> maintenance mode should not accept it
> _Command output:_
>  Defragmentation is successfully scheduled on nodes , on next 
> restart the following caches will be defragmented: .
>  Cancel
>  *control.sh defragmentation cancel nodeHost nodePort [cache cacheName0]*
> _Prerequisites_: command is sent to node in maintenance mode or in normal mode
> _Command output:_
>  Defragmentation is already completed for caches: 
>  Defragmentation is cancelled for caches: ; all intermediate 
> files are cleaned up.
>   
>  *Note:* Caches list for cancel command will not be implemented here.
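
A minimal sketch of how the arguments of the schedule command described above might be parsed. This is an assumption made for illustration: the class name, field names and error messages are hypothetical, and the real control.sh argument handling may look quite different.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical holder for the arguments of "defragmentation schedule". */
final class ScheduleArgsSketch {
    /** Consistent IDs of the target nodes. */
    final List<String> nodes;

    /** Cache names; an empty list means "all caches", which is the default. */
    final List<String> caches;

    private ScheduleArgsSketch(List<String> nodes, List<String> caches) {
        this.nodes = nodes;
        this.caches = caches;
    }

    /** Parses the tokens that follow "defragmentation schedule". */
    static ScheduleArgsSketch parse(String... args) {
        if (args.length < 2 || !"nodes".equals(args[0]))
            throw new IllegalArgumentException("Expected: nodes id0[,id1] [caches name0,name1]");

        List<String> nodes = Arrays.asList(args[1].split(","));
        List<String> caches = Collections.emptyList();

        if (args.length > 2) {
            if (args.length != 4 || !"caches".equals(args[2]))
                throw new IllegalArgumentException("Expected optional: caches name0,name1");

            caches = Arrays.asList(args[3].split(","));
        }

        return new ScheduleArgsSketch(nodes, caches);
    }

    public static void main(String[] args) {
        ScheduleArgsSketch parsed = parse("nodes", "nodeConsistentId0,nodeConsistentId1");

        System.out.println(parsed.nodes + ", all caches: " + parsed.caches.isEmpty());
    }
}
```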





[jira] [Updated] (IGNITE-13697) Control.sh API - schedule & cancel

2020-11-12 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13697:
---
Labels: IEP-47  (was: )

> Control.sh API - schedule & cancel
> --
>
> Key: IGNITE-13697
> URL: https://issues.apache.org/jira/browse/IGNITE-13697
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
> From original draft by [~sergeychugunov]:
>  
> Schedule
> *control.sh defragmentation schedule nodes 
> nodeConsistentId0[,nodeConsistentId1] [caches 
> cacheName0,cacheName1,cacheName2]*
>  
> Optional list of caches is passed to perform defragmentation for a particular 
> set of caches. By default all caches are defragmented.
>  
> _Prerequisites_: command is sent to node in normal operations, node in 
> maintenance mode should not accept it
> _Command output:_
> Defragmentation is successfully scheduled on nodes , on next 
> restart the following caches will be defragmented: .
> Cancel
> *control.sh defragmentation cancel nodeHost nodePort [cache cacheName0]*
> _Prerequisites_: command is sent to node in maintenance mode, node in normal 
> operations should not accept it.
> _Command output:_
> Defragmentation is already completed for caches: 
> Defragmentation is cancelled for caches: ; all intermediate 
> files are cleaned up.
>  
> *Note:* Caches list for cancel command will not be implemented here.





[jira] [Updated] (IGNITE-13681) Non markers checkpoint implementation

2020-11-12 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13681:
---
Labels: IEP-47  (was: )

> Non markers checkpoint implementation
> -
>
> Key: IGNITE-13681
> URL: https://issues.apache.org/jira/browse/IGNITE-13681
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
>  Labels: IEP-47
> Fix For: 2.10
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> A new, simpler version of checkpoint needs to be implemented. The main 
> differences compared to the current checkpoint:
> * It doesn't contain any write operation to WAL.
> * It doesn't create checkpoint markers.
> * It should be possible to configure a checkpoint listener only on the exact 
> data region.
> This checkpoint will be helpful for defragmentation and for recovery (it is 
> not possible to use the current checkpoint during recovery right now).





[jira] [Updated] (IGNITE-13683) Added MVCC validation to ValidateIndexesClosure

2020-11-12 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13683:
---
Labels: IEP-47  (was: )

> Added MVCC validation to ValidateIndexesClosure
> ---
>
> Key: IGNITE-13683
> URL: https://issues.apache.org/jira/browse/IGNITE-13683
> Project: Ignite
>  Issue Type: Sub-task
>Affects Versions: 2.9
>Reporter: Anton Kalashnikov
>Assignee: Semyon Danilov
>Priority: Major
>  Labels: IEP-47
> Fix For: 2.10
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> MVCC indexes validation should be added to ValidateIndexesClosure





[jira] [Updated] (IGNITE-13682) Add generic to maintenance mode feature

2020-11-12 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13682:
---
Labels: IEP-47  (was: )

> Add generic to maintenance mode feature
> ---
>
> Key: IGNITE-13682
> URL: https://issues.apache.org/jira/browse/IGNITE-13682
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
>  Labels: IEP-47
> Fix For: 2.10
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> MaintenanceAction has no generic type parameter right now, which leads to a 
> parameterization problem.





[jira] [Updated] (IGNITE-13684) Prepare PageStore/B+Tree for usage outside of standard lifecycle

2020-11-12 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13684:
---
Labels: IEP-47  (was: )

> Prepare PageStore/B+Tree for usage outside of standard lifecycle
> ---
>
> Key: IGNITE-13684
> URL: https://issues.apache.org/jira/browse/IGNITE-13684
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
> Fix For: 2.10
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Right now, PageStore and some other classes responsible for persistence are 
> too coupled with many other dependencies, which does not allow using them 
> under different initial conditions (e.g. defragmentation). So some places 
> need to be refactored in order to improve this situation.
> Changes are:
> * static constant for cache group meta page;
> * PageStore allocation tracker replaced with a more generic LongConsumer to 
> decouple it from metrics framework;
> * PageReadWriteManager added to basically allow having same cache group in 
> different data regions;
> * several methods and fields exposed as internally public/protected API;
> * several inner classes refactored so that they become static classes;
> * PageIOResolver interface created and used to make data structure more 
> flexible;
> * InsertLast interface for B+Tree added that will optimize comparisons on 
> inserts. Unused for now;
> * All this code doesn't affect existing behavior.
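
The allocation tracker item in the list above can be illustrated with a small sketch. It assumes a page store that reports each allocation through a plain java.util.function.LongConsumer; the class and method names are hypothetical and do not mirror the actual PageStore API.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

/** Hypothetical page store that knows nothing about the metrics framework. */
final class PageStoreSketch {
    /** Receives the number of newly allocated pages. */
    private final LongConsumer allocatedTracker;

    /** Index of the next page to allocate. */
    private long nextPageIdx;

    PageStoreSketch(LongConsumer allocatedTracker) {
        this.allocatedTracker = allocatedTracker;
    }

    /** Allocates one page and notifies the tracker. */
    long allocatePage() {
        allocatedTracker.accept(1L);

        return nextPageIdx++;
    }

    public static void main(String[] args) {
        AtomicLong metric = new AtomicLong();

        // A metrics object, a test counter or a no-op lambda can all serve as the tracker.
        PageStoreSketch store = new PageStoreSketch(metric::addAndGet);

        store.allocatePage();
        store.allocatePage();

        System.out.println("Allocated pages: " + metric.get());
    }
}
```

With this shape, a caller outside the normal node lifecycle (defragmentation, for instance) can pass `delta -> {}` when no metric is wanted.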





[jira] [Assigned] (IGNITE-13190) Core defragmentation functions

2020-11-12 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov reassigned IGNITE-13190:
--

Assignee: Ivan Bessonov  (was: Semyon Danilov)

> Core defragmentation functions
> --
>
> Key: IGNITE-13190
> URL: https://issues.apache.org/jira/browse/IGNITE-13190
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Sergey Chugunov
>Assignee: Ivan Bessonov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The following set of functions covering the defragmentation happy case is needed:
>  * Initialization of defragmentation manager when node is started in 
> maintenance mode.
>  * Information about partition files is gathered by defrag mgr.
>  * For each partition file corresponding file of defragmented partition is 
> created and initialized.
>  * Keys are transferred from old partitions to new partitions.
>  * Checkpointer is aware of new partition files and flushes defragmented 
> memory to new partition files.
>  
> No fault-tolerance code nor index defragmentation mappings are needed in this 
> task.





[jira] [Updated] (IGNITE-13697) Control.sh API - schedule & cancel

2020-11-12 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13697:
---
Description: 
 
From original draft by [~sergeychugunov]:
 
Schedule
*control.sh defragmentation schedule nodes 
nodeConsistentId0[,nodeConsistentId1] [caches cacheName0,cacheName1,cacheName2]*
 
Optional list of caches is passed to perform defragmentation for a particular 
set of caches. By default all caches are defragmented.
 
_Prerequisites_: command is sent to node in normal operations, node in 
maintenance mode should not accept it

_Command output:_
Defragmentation is successfully scheduled on nodes , on next 
restart the following caches will be defragmented: .
Cancel
*control.sh defragmentation cancel nodeHost nodePort [cache cacheName0]*

_Prerequisites_: command is sent to node in maintenance mode, node in normal 
operations should not accept it.

_Command output:_
Defragmentation is already completed for caches: 
Defragmentation is cancelled for caches: ; all intermediate files 
are cleaned up.
 
*Note:* Caches list for cancel command will not be implemented here.

> Control.sh API - schedule & cancel
> --
>
> Key: IGNITE-13697
> URL: https://issues.apache.org/jira/browse/IGNITE-13697
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>
>  
> From original draft by [~sergeychugunov]:
>  
> Schedule
> *control.sh defragmentation schedule nodes 
> nodeConsistentId0[,nodeConsistentId1] [caches 
> cacheName0,cacheName1,cacheName2]*
>  
> Optional list of caches is passed to perform defragmentation for a particular 
> set of caches. By default all caches are defragmented.
>  
> _Prerequisites_: command is sent to node in normal operations, node in 
> maintenance mode should not accept it
> _Command output:_
> Defragmentation is successfully scheduled on nodes , on next 
> restart the following caches will be defragmented: .
> Cancel
> *control.sh defragmentation cancel nodeHost nodePort [cache cacheName0]*
> _Prerequisites_: command is sent to node in maintenance mode, node in normal 
> operations should not accept it.
> _Command output:_
> Defragmentation is already completed for caches: 
> Defragmentation is cancelled for caches: ; all intermediate 
> files are cleaned up.
>  
> *Note:* Caches list for cancel command will not be implemented here.





[jira] [Created] (IGNITE-13697) Control.sh API - schedule & cancel

2020-11-12 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13697:
--

 Summary: Control.sh API - schedule & cancel
 Key: IGNITE-13697
 URL: https://issues.apache.org/jira/browse/IGNITE-13697
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov








[jira] [Commented] (IGNITE-13684) Rewrite PageIo resolver from static to explicit dependency

2020-11-09 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17228442#comment-17228442
 ] 

Ivan Bessonov commented on IGNITE-13684:


I think I should add an explanation. This ticket contains refactoring elements 
necessary for defragmentation. Given that defragmentation patch is huge, we 
decided to split it. This way it's going to be easier to review and to track 
history.

Changes are:
 * static constant for cache group meta page;
 * PageStore allocation tracker replaced with a more generic LongConsumer to 
decouple it from metrics framework;
 * PageReadWriteManager added to basically allow having same cache group in 
different data regions;
 * several methods and fields exposed as internally public/protected API;
 * several inner classes refactored so that they become static classes;
 * PageIOResolver interface created and used to make data structure more 
flexible;
 * InsertLast interface for B+Tree added that will optimize comparisons on 
inserts. Unused for now;

All this code doesn't affect existing behavior.

> Rewrite PageIo resolver from static to explicit dependency
> --
>
> Key: IGNITE-13684
> URL: https://issues.apache.org/jira/browse/IGNITE-13684
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now, Ignite has a static pageIo resolver, which does not allow 
> substituting a different implementation if needed. So the current 
> implementation needs to be rewritten to make this possible.
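
To make the idea of an explicit dependency concrete, here is a minimal sketch in which the resolver is passed to the data structure through its constructor instead of being looked up statically. All names below are hypothetical, and the string results only stand in for real page IO objects.

```java
/** Hypothetical tree that receives its page IO resolver explicitly. */
final class TreeSketch {
    /** Hypothetical resolver: maps a page type code to an IO handler (a name here). */
    interface IoResolver {
        String resolve(int pageType);
    }

    /** Resolver injected by the caller rather than read from a static field. */
    private final IoResolver ioResolver;

    TreeSketch(IoResolver ioResolver) {
        this.ioResolver = ioResolver;
    }

    /** Resolves the IO for the given page type through the injected resolver. */
    String io(int pageType) {
        return ioResolver.resolve(pageType);
    }

    public static void main(String[] args) {
        IoResolver dflt = type -> "DefaultIO#" + type;

        // A special-purpose resolver overrides one page type and delegates the rest.
        IoResolver special = type -> type == 1 ? "SpecialIO#1" : dflt.resolve(type);

        System.out.println(new TreeSketch(dflt).io(1));
        System.out.println(new TreeSketch(special).io(1));
    }
}
```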





[jira] [Reopened] (IGNITE-13558) GridCacheProcessor should implement better parallelization when restoring partition states on startup

2020-10-21 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov reopened IGNITE-13558:


Reopened due to a deadlock in PDS suites, see attached issue.

> GridCacheProcessor should implement better parallelization when restoring 
> partition states on startup
> -
>
> Key: IGNITE-13558
> URL: https://issues.apache.org/jira/browse/IGNITE-13558
> Project: Ignite
>  Issue Type: Improvement
>  Components: persistence
>Reporter: Sergey Chugunov
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.10
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> GridCacheProcessor#restorePartitionStates method tries to employ striped pool 
> to restore partition states in parallel, but the level of parallelization 
> goes down only to one cache group per thread.
> This is not enough and does not utilize resources effectively when one cache 
> group is much bigger than the others.
> We need to parallelize the restore process down to individual partitions to 
> get the most from the available resources and speed up node startup.
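
The difference between per-group and per-partition parallelism can be sketched as follows. This is not the Ignite implementation: it uses a plain fixed thread pool instead of the striped pool, and `restorePartition` is a hypothetical placeholder that only counts calls.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: one task per (cache group, partition) pair instead of one task per group. */
final class PartitionRestoreSketch {
    /** Placeholder for the real per-partition work; here it only counts invocations. */
    private static void restorePartition(String grp, int part, AtomicInteger restored) {
        restored.incrementAndGet();
    }

    /** Restores all partitions of all groups and returns how many were processed. */
    static int restoreAll(Map<String, Integer> partsPerGrp, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger restored = new AtomicInteger();

        try {
            List<Future<?>> futs = new ArrayList<>();

            for (Map.Entry<String, Integer> e : partsPerGrp.entrySet()) {
                final String grp = e.getKey();

                for (int p = 0; p < e.getValue(); p++) {
                    final int part = p;

                    // The unit of work is a single partition, so one big group
                    // is spread over all threads instead of occupying just one.
                    Runnable task = () -> restorePartition(grp, part, restored);

                    futs.add(pool.submit(task));
                }
            }

            for (Future<?> f : futs)
                f.get();
        }
        catch (InterruptedException ex) {
            Thread.currentThread().interrupt();

            throw new IllegalStateException(ex);
        }
        catch (ExecutionException ex) {
            throw new IllegalStateException(ex);
        }
        finally {
            pool.shutdown();
        }

        return restored.get();
    }

    public static void main(String[] args) {
        Map<String, Integer> groups = new LinkedHashMap<>();

        groups.put("bigGroup", 512);
        groups.put("smallGroup", 4);

        System.out.println("Restored partitions: " + restoreAll(groups, 4));
    }
}
```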





[jira] [Commented] (IGNITE-13597) Execution timeout in PDS 2

2020-10-20 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17217536#comment-17217536
 ] 

Ivan Bessonov commented on IGNITE-13597:


PDS 2: 
https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_Pds2?branch=pull%2F8373%2Fhead&buildTypeTab=overview&mode=builds

> Execution timeout in PDS 2 
> ---
>
> Key: IGNITE-13597
> URL: https://issues.apache.org/jira/browse/IGNITE-13597
> Project: Ignite
>  Issue Type: Test
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_Pds2/5677092?buildTab=log&focusLine=3





[jira] [Created] (IGNITE-13597) Execution timeout in PDS 2

2020-10-20 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13597:
--

 Summary: Execution timeout in PDS 2 
 Key: IGNITE-13597
 URL: https://issues.apache.org/jira/browse/IGNITE-13597
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_Pds2/5677092?buildTab=log&focusLine=3





[jira] [Commented] (IGNITE-12489) Error during purges by expiration: Unknown page type

2020-10-19 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17216796#comment-17216796
 ] 

Ivan Bessonov commented on IGNITE-12489:


Hi [~alex_pl], [~cyberdemon],

can you please take a look at the fix? I'd like you to review at least the 
approach to the fix.

Current naming is subject to change. I will also create all required Javadocs 
once everyone is fine with the fix. I already asked [~agoncharuk] and 
[~sergey-chugunov] to look at the PR as well.

Thank you!

> Error during purges by expiration: Unknown page type
> 
>
> Key: IGNITE-12489
> URL: https://issues.apache.org/jira/browse/IGNITE-12489
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.7, 2.7.6
>Reporter: Ruslan Kamashev
>Assignee: Ivan Bessonov
>Priority: Blocker
> Fix For: 2.10
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{*logger*}}
> {code:java}
> org.apache.ignite.internal.processors.cache.GridCacheIoManager
> {code}
> {{*message*}}
> {code:java}
> Failed to process message [senderId=969d56ba-4b46-40cf-886e-ac445cf6a95d, 
> messageType=class 
> o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicUpdateRequest]{code}
> {{*thread*}}
> {code:java}
> sys-stripe-19-#20{code}
> {{*trace*}}
> {code:java}
> java.lang.IllegalStateException: Unknown page type: 1 pageId: 00010303117d
>   at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.io(BPlusTree.java:5058)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$200(BPlusTree.java:90)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$AbstractForwardCursor.nextPage(BPlusTree.java:5330)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.next(BPlusTree.java:5566)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpiredInternal(GridCacheOffheapManager.java:2232)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpired(GridCacheOffheapManager.java:2157)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.expire(GridCacheOffheapManager.java:845)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:207)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheUtils.unwindEvicts(GridCacheUtils.java:888)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessageProcessed(GridCacheIoManager.java:1103)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1076)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1569)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1197)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:127)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1093)
>   at 
> org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:505)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>   at java.lang.Thread.run(Thread.java:748)
>   Dec 23, 2019 @ 18:28:28.457 {code}





[jira] [Assigned] (IGNITE-12489) Error during purges by expiration: Unknown page type

2020-10-13 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov reassigned IGNITE-12489:
--

Assignee: Ivan Bessonov

> Error during purges by expiration: Unknown page type
> 
>
> Key: IGNITE-12489
> URL: https://issues.apache.org/jira/browse/IGNITE-12489
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.7, 2.7.6
>Reporter: Ruslan Kamashev
>Assignee: Ivan Bessonov
>Priority: Blocker
> Fix For: 2.10
>
>
> {{*logger*}}
> {code:java}
> org.apache.ignite.internal.processors.cache.GridCacheIoManager
> {code}
> {{*message*}}
> {code:java}
> Failed to process message [senderId=969d56ba-4b46-40cf-886e-ac445cf6a95d, 
> messageType=class 
> o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicUpdateRequest]{code}
> {{*thread*}}
> {code:java}
> sys-stripe-19-#20{code}
> {{*trace*}}
> {code:java}
> java.lang.IllegalStateException: Unknown page type: 1 pageId: 00010303117d
>   at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.io(BPlusTree.java:5058)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$200(BPlusTree.java:90)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$AbstractForwardCursor.nextPage(BPlusTree.java:5330)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.next(BPlusTree.java:5566)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpiredInternal(GridCacheOffheapManager.java:2232)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpired(GridCacheOffheapManager.java:2157)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.expire(GridCacheOffheapManager.java:845)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:207)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheUtils.unwindEvicts(GridCacheUtils.java:888)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessageProcessed(GridCacheIoManager.java:1103)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1076)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1569)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1197)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:127)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1093)
>   at 
> org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:505)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>   at java.lang.Thread.run(Thread.java:748)
>   Dec 23, 2019 @ 18:28:28.457 {code}





[jira] [Assigned] (IGNITE-13558) GridCacheProcessor should implement better parallelization when restoring partition states on startup

2020-10-08 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov reassigned IGNITE-13558:
--

Assignee: Ivan Bessonov

> GridCacheProcessor should implement better parallelization when restoring 
> partition states on startup
> -
>
> Key: IGNITE-13558
> URL: https://issues.apache.org/jira/browse/IGNITE-13558
> Project: Ignite
>  Issue Type: Improvement
>  Components: persistence
>Reporter: Sergey Chugunov
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.10
>
>
> GridCacheProcessor#restorePartitionStates method tries to employ striped pool 
> to restore partition states in parallel, but the level of parallelization 
> goes down only to one cache group per thread.
> This is not enough and does not utilize resources effectively when one cache 
> group is much bigger than the others.
> We need to parallelize the restore process down to individual partitions to 
> get the most from the available resources and speed up node startup.





[jira] [Commented] (IGNITE-13366) Special mode for maintenance of Ignite node. Employing Maintenance Mode for clearing corrupted PDS files.

2020-09-20 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199194#comment-17199194
 ] 

Ivan Bessonov commented on IGNITE-13366:


Hi [~sergeychugunov],

I tried to adapt your code early in IGNITE-13190 and have a few notes:
 * when you start your node clean, the maintenance record file won't be created. 
Is this by design? This detail makes writing tests a bit harder: you have to 
manually start the maintenance component one more time.
 * it's impossible to pass an empty string or null into record parameters. For 
example, if my record doesn't have any parameters, it won't work. The problem 
is in the way you split that string from the maintenance file.
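
One plausible way such a problem arises, shown purely as an illustration: in Java, `String.split(regex)` drops trailing empty strings, so a record line whose last field is empty yields fewer fields than expected, while `split(regex, -1)` keeps them. The tab separator, the field layout and the class name below are assumptions and are not taken from the maintenance code.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical parser for a one-line maintenance record. */
final class MaintenanceRecordLineSketch {
    /** Assumed field separator; the real file format may differ. */
    private static final String SEP = "\t";

    /** Naive split: trailing empty fields are silently dropped. */
    static List<String> parseNaive(String line) {
        return Arrays.asList(line.split(SEP));
    }

    /** A negative limit keeps trailing empty fields. */
    static List<String> parseKeepingEmpty(String line) {
        return Arrays.asList(line.split(SEP, -1));
    }

    public static void main(String[] args) {
        // Record with an id, a description and an empty parameters field.
        String line = "recordId" + SEP + "description" + SEP;

        System.out.println(parseNaive(line).size());        // 2
        System.out.println(parseKeepingEmpty(line).size()); // 3
    }
}
```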

> Special mode for maintenance of Ignite node. Employing Maintenance Mode for 
> clearing corrupted PDS files.
> -
>
> Key: IGNITE-13366
> URL: https://issues.apache.org/jira/browse/IGNITE-13366
> Project: Ignite
>  Issue Type: New Feature
>  Components: persistence
>Affects Versions: 2.8.1
>Reporter: Sergey Chugunov
>Assignee: Sergey Chugunov
>Priority: Critical
>  Labels: IEP-53
> Fix For: 2.10
>
>   Original Estimate: 168h
>  Time Spent: 10m
>  Remaining Estimate: 167h 50m
>
> If node with persistence is stopped when WAL was disabled for a cache (no 
> matter whether because of rebalancing in progress or by explicit user 
> request), on next node start all data files of that cache are removed 
> automatically and unconditionally.
> This behavior may be unexpected for users as they may not understand all 
> consequences of disabling WAL locally (for rebalancing) or globally (via 
> IgniteCluster API call). Also it is not smart enough as there is no point in 
> deleting consistent data files.
> We should change this behavior to the following list: no automatic deletions 
> whatsoever. If data files are consistent (equivalent to: no checkpoint was 
> running when node was stopped) start up normally. If data files are 
> corrupted, don't let the node start.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-13425) Add task name to the log messages related to this task

2020-09-14 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195277#comment-17195277
 ] 

Ivan Bessonov commented on IGNITE-13425:


[~Nikita T] looks good, thank you!

> Add task name to the log messages related to this task
> --
>
> Key: IGNITE-13425
> URL: https://issues.apache.org/jira/browse/IGNITE-13425
> Project: Ignite
>  Issue Type: Task
>Reporter: Nikita Tolstunov
>Assignee: Nikita Tolstunov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now, it's pretty hard to understand anything related to the Compute 
> task from this log message:
> WARN  o.a.i.i.p.task.GridTaskProcessor - Received job execution response 
> while stopping grid (will ignore): GridJobExecuteResponse 
> [nodeId=cc2429d3-b1a0-4bc1-9358-edb919d5c64e, 
> sesId=e2c961e4371-cc2429d3-b1a0-4bc1-9358-edb919d5c64e, 
> jobId=23c961e4371-cc2429d3-b1a0-4bc1-9358-edb919d5c64e, gridEx=null, 
> isCancelled=true, retry=null]
> Potential solution: add taskName to GridJobExecuteResponse.





[jira] [Commented] (IGNITE-13207) Checkpointer code refactoring: Splitting GridCacheDatabaseSharedManager and Checkpointer

2020-09-08 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192186#comment-17192186
 ] 

Ivan Bessonov commented on IGNITE-13207:


[~akalashnikov] looks good, thank you! You can merge it once you have the TC visa.

> Checkpointer code refactoring: Splitting GridCacheDatabaseSharedManager and 
> Checkpointer
> 
>
> Key: IGNITE-13207
> URL: https://issues.apache.org/jira/browse/IGNITE-13207
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (IGNITE-13207) Checkpointer code refactoring: Splitting GridCacheDatabaseSharedManager and Checkpointer

2020-09-08 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13207:
---
Labels: IEP-47  (was: )

> Checkpointer code refactoring: Splitting GridCacheDatabaseSharedManager and 
> Checkpointer
> 
>
> Key: IGNITE-13207
> URL: https://issues.apache.org/jira/browse/IGNITE-13207
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (IGNITE-13013) Thick client must not open server sockets when used by serverless functions

2020-09-01 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188346#comment-17188346
 ] 

Ivan Bessonov commented on IGNITE-13013:


Hi [~alex_pl],

I prepared a PR for ignite-2.9: [https://github.com/apache/ignite/pull/8199/files]

It differs a little bit from the master code because we have a huge 
TcpCommunicationSpi refactoring in master. I was careful, and the tests don't 
show new issues in this branch. I think it's safe to merge it.

Thank you!

> Thick client must not open server sockets when used by serverless functions
> ---
>
> Key: IGNITE-13013
> URL: https://issues.apache.org/jira/browse/IGNITE-13013
> Project: Ignite
>  Issue Type: Improvement
>  Components: networking
>Affects Versions: 2.8
>Reporter: Denis A. Magda
>Assignee: Ivan Bessonov
>Priority: Critical
> Fix For: 2.10
>
> Attachments: image-2020-07-30-18-42-01-266.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A thick client fails to start if being used inside of a serverless function 
> such as AWS Lambda or Azure Functions. Cloud providers prohibit opening 
> network ports to accept connections on the function's end. In short, the 
> function can only connect to a remote address.
> To reproduce, you can follow this tutorial and swap the thin client (used in 
> the tutorial) with the thick one: 
> https://www.gridgain.com/docs/tutorials/serverless/azure_functions_tutorial
> The thick client needs to support a mode when the communication SPI doesn't 
> create a server socket if the client is used for serverless computing. This 
> improvement looks like an extra task of this initiative: 
> https://issues.apache.org/jira/browse/IGNITE-12438





[jira] [Commented] (IGNITE-13013) Thick client must not open server sockets when used by serverless functions

2020-08-05 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171470#comment-17171470
 ] 

Ivan Bessonov commented on IGNITE-13013:


[~akalashnikov] done, thank you for the suggestion.

> Thick client must not open server sockets when used by serverless functions
> ---
>
> Key: IGNITE-13013
> URL: https://issues.apache.org/jira/browse/IGNITE-13013
> Project: Ignite
>  Issue Type: Improvement
>  Components: networking
>Affects Versions: 2.8
>Reporter: Denis A. Magda
>Assignee: Ivan Bessonov
>Priority: Critical
> Fix For: 2.10
>
> Attachments: image-2020-07-30-18-42-01-266.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A thick client fails to start if being used inside of a serverless function 
> such as AWS Lambda or Azure Functions. Cloud providers prohibit opening 
> network ports to accept connections on the function's end. In short, the 
> function can only connect to a remote address.
> To reproduce, you can follow this tutorial and swap the thin client (used in 
> the tutorial) with the thick one: 
> https://www.gridgain.com/docs/tutorials/serverless/azure_functions_tutorial
> The thick client needs to support a mode when the communication SPI doesn't 
> create a server socket if the client is used for serverless computing. This 
> improvement looks like an extra task of this initiative: 
> https://issues.apache.org/jira/browse/IGNITE-12438





[jira] [Commented] (IGNITE-12934) Change test start|stop log format for correct TC and build.log visibility.

2020-07-31 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168542#comment-17168542
 ] 

Ivan Bessonov commented on IGNITE-12934:


[~ilyak] changes look good, thank you!

> Change test start|stop log format for correct TC and build.log visibility.
> --
>
> Key: IGNITE-12934
> URL: https://issues.apache.org/jira/browse/IGNITE-12934
> Project: Ignite
>  Issue Type: Improvement
>  Components: build
>Reporter: Ilya Kasnacheev
>Assignee: Ilya Kasnacheev
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For correct visibility in TC and build.log, the test start/stop log lines need 
> to follow the TeamCity format, changing from: 
> ">>> Stopping test: "
> to 
> "##teamcity[testFinished name='"
> Additional info:
> https://www.jetbrains.com/help/teamcity/build-script-interaction-with-teamcity.html#BuildScriptInteractionwithTeamCity-ReportingTests
> Also, make “Starting test”, “Stopping test” messages visible in Quiet mode!
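
A minimal sketch of producing such lines, assuming the TeamCity service-message escaping rules from the page linked above (`|` is the escape character). The class and method names are hypothetical and are not taken from the Ignite test framework; `testStarted` is the TeamCity counterpart of the `testFinished` message quoted in the ticket.

```java
// Hypothetical helper that formats TeamCity test service messages.
final class TeamCityMessageSketch {
    /** Escapes characters that are special inside a service message value. */
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();

        for (char c : value.toCharArray()) {
            switch (c) {
                case '|':  sb.append("||"); break;
                case '\'': sb.append("|'"); break;
                case '\n': sb.append("|n"); break;
                case '\r': sb.append("|r"); break;
                case '[':  sb.append("|["); break;
                case ']':  sb.append("|]"); break;
                default:   sb.append(c);
            }
        }

        return sb.toString();
    }

    static String testStarted(String name) {
        return "##teamcity[testStarted name='" + escape(name) + "']";
    }

    static String testFinished(String name) {
        return "##teamcity[testFinished name='" + escape(name) + "']";
    }

    public static void main(String[] args) {
        System.out.println(testStarted("testPut[mode=ATOMIC]"));
        System.out.println(testFinished("testPut[mode=ATOMIC]"));
    }
}
```

Escaping matters here because parameterized test names often contain brackets or quotes, which would otherwise terminate the message early.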





[jira] [Commented] (IGNITE-13306) CpuLoad metric return -1 under Java 11

2020-07-28 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166905#comment-17166905
 ] 

Ivan Bessonov commented on IGNITE-13306:


[~maliev] I'm fine with the current change if it's urgent and if we'll fix it 
properly in the future.

> CpuLoad metric return -1 under Java 11
> --
>
> Key: IGNITE-13306
> URL: https://issues.apache.org/jira/browse/IGNITE-13306
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Mirza Aliev
>Assignee: Mirza Aliev
>Priority: Major
> Fix For: 2.9
>
>
> Start cluster under Java 11.
> Observed: 
>  CpuLoad metric will return -1
> Expected:
>  Real CpuLoad.
> We investigated this issue and found that under Java 11 the code fails with 
> the following trace:
> {code:java}
> class org.apache.ignite.IgniteException: Failed to get property value [property=processCpuTime, obj=com.sun.management.internal.OperatingSystemImpl@1dd92fe2]
>     at org.apache.ignite.internal.util.IgniteUtils.property(IgniteUtils.java:8306)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$MetricsUpdater.getCpuLoad(GridDiscoveryManager.java:3131)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$MetricsUpdater.run(GridDiscoveryManager.java:3093)
>     at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$CancelableTask.onTimeout(GridTimeoutProcessor.java:364)
>     at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:233)
>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make public long com.sun.management.internal.OperatingSystemImpl.getProcessCpuTime() accessible: module jdk.management does not "opens com.sun.management.internal" to unnamed module @35fb3008
>     at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:340)
>     at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:280)
>     at java.base/java.lang.reflect.Method.checkCanSetAccessible(Method.java:198)
>     at java.base/java.lang.reflect.Method.setAccessible(Method.java:192)
>     at org.apache.ignite.internal.util.IgniteUtils.property(IgniteUtils.java:8297)
>     ... 6 more
> {code}
> Under Java 8 metric has expected value.
>  
> Solution:
> The behaviour is expected because in Java 11 the CPU load metrics were moved 
> to a JDK internal module which is not accessible by default. Adding the 
> following option to the JVM in which the Ignite node is started should solve 
> the issue:
> {noformat}
> --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED
> {noformat}





[jira] [Commented] (IGNITE-13306) CpuLoad metric return -1 under Java 11

2020-07-28 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166301#comment-17166301
 ] 

Ivan Bessonov commented on IGNITE-13306:


Hi [~maliev],

 

I have two possible fixes for you to consider:
 * you could skip the "setAccessible" call in 
org.apache.ignite.internal.util.IgniteUtils#property if the method is already 
accessible. This would avoid the exception;
 * you could use the public methods of com.sun.management.OperatingSystemMXBean 
for these metrics.

What do you think?
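
A minimal sketch of the second option, assuming the JVM provides the com.sun.management extension of the platform OperatingSystemMXBean (HotSpot-based JDKs do). The class and method names are hypothetical; only getProcessCpuLoad() and getProcessCpuTime() are real JDK methods. No reflection or --add-opens is involved, because the calls go through the exported public interface rather than the internal implementation class.

```java
import java.lang.management.ManagementFactory;

// Hypothetical sketch: read CPU metrics through the public MXBean interface.
final class CpuLoadSketch {
    /** Returns the process CPU load in [0.0, 1.0], or -1 if it is unavailable. */
    static double processCpuLoad() {
        java.lang.management.OperatingSystemMXBean os =
            ManagementFactory.getOperatingSystemMXBean();

        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            double load =
                ((com.sun.management.OperatingSystemMXBean) os).getProcessCpuLoad();

            // The JDK reports a negative value when the load is not available.
            return (Double.isNaN(load) || load < 0) ? -1 : load;
        }

        return -1; // extension interface not present on this JVM
    }

    /** Returns the process CPU time in nanoseconds, or -1 if it is unavailable. */
    static long processCpuTime() {
        java.lang.management.OperatingSystemMXBean os =
            ManagementFactory.getOperatingSystemMXBean();

        if (os instanceof com.sun.management.OperatingSystemMXBean)
            return ((com.sun.management.OperatingSystemMXBean) os).getProcessCpuTime();

        return -1;
    }

    public static void main(String[] args) {
        System.out.println("cpuLoad=" + processCpuLoad());
        System.out.println("cpuTime=" + processCpuTime());
    }
}
```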

> CpuLoad metric return -1 under Java 11
> --
>
> Key: IGNITE-13306
> URL: https://issues.apache.org/jira/browse/IGNITE-13306
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Mirza Aliev
>Assignee: Mirza Aliev
>Priority: Major
> Fix For: 2.9
>
>
> Start cluster under Java 11.
> Observed: 
>  CpuLoad metric will return -1
> Expected:
>  Real CpuLoad.
> We investigated this issue and found that under Java 11 the code fails with 
> the following trace:
> {code:java}
> class org.apache.ignite.IgniteException: Failed to get property value [property=processCpuTime, obj=com.sun.management.internal.OperatingSystemImpl@1dd92fe2]
>     at org.apache.ignite.internal.util.IgniteUtils.property(IgniteUtils.java:8306)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$MetricsUpdater.getCpuLoad(GridDiscoveryManager.java:3131)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$MetricsUpdater.run(GridDiscoveryManager.java:3093)
>     at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$CancelableTask.onTimeout(GridTimeoutProcessor.java:364)
>     at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:233)
>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make public long com.sun.management.internal.OperatingSystemImpl.getProcessCpuTime() accessible: module jdk.management does not "opens com.sun.management.internal" to unnamed module @35fb3008
>     at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:340)
>     at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:280)
>     at java.base/java.lang.reflect.Method.checkCanSetAccessible(Method.java:198)
>     at java.base/java.lang.reflect.Method.setAccessible(Method.java:192)
>     at org.apache.ignite.internal.util.IgniteUtils.property(IgniteUtils.java:8297)
>     ... 6 more
> {code}
> Under Java 8 metric has expected value.
>  
> Solution:
> The behaviour is expected because in Java 11 the CPU load metrics were moved 
> to a JDK internal module which is not accessible by default. Adding the 
> following option to the JVM in which the Ignite node is started should solve 
> the issue:
> {noformat}
> --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED
> {noformat}





[jira] [Updated] (IGNITE-13266) PDS (Indexing) fails with 'Exit code 137'

2020-07-24 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13266:
---
Summary: PDS (Indexing) fails with 'Exit code 137'  (was: PDS (Indexing) 
fails with 'Exit code 137)

> PDS (Indexing) fails with 'Exit code 137'
> -
>
> Key: IGNITE-13266
> URL: https://issues.apache.org/jira/browse/IGNITE-13266
> Project: Ignite
>  Issue Type: Test
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing&branch_IgniteTests24Java8=%3Cdefault%3E&tab=buildTypeHistoryList]





[jira] [Updated] (IGNITE-13266) PDS (Indexing) fails with 'Exit code 137

2020-07-24 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13266:
---
Summary: PDS (Indexing) fails with 'Exit code 137  (was: PDS (Indexing) 
fails with 'Exit code 137")

> PDS (Indexing) fails with 'Exit code 137
> 
>
> Key: IGNITE-13266
> URL: https://issues.apache.org/jira/browse/IGNITE-13266
> Project: Ignite
>  Issue Type: Test
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing&branch_IgniteTests24Java8=%3Cdefault%3E&tab=buildTypeHistoryList]





[jira] [Commented] (IGNITE-13266) PDS (Indexing) fails with 'Exit code 137"

2020-07-23 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164186#comment-17164186
 ] 

Ivan Bessonov commented on IGNITE-13266:


"Cache 7" suite: 
[https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Cache7&branch_IgniteTests24Java8=pull%2F8051%2Fhead&tab=buildTypeStatusDiv]

> PDS (Indexing) fails with 'Exit code 137"
> -
>
> Key: IGNITE-13266
> URL: https://issues.apache.org/jira/browse/IGNITE-13266
> Project: Ignite
>  Issue Type: Test
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing&branch_IgniteTests24Java8=%3Cdefault%3E&tab=buildTypeHistoryList]





[jira] [Commented] (IGNITE-13266) PDS (Indexing) fails with 'Exit code 137"

2020-07-22 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163263#comment-17163263
 ] 

Ivan Bessonov commented on IGNITE-13266:


So there are two main problems with the suite.

 

Exit with error code 130 - caused by {{StopNodeOrHaltFailureHandler}} in 
{{WalRolloverRecordLoggingTest}}.

 

Exit with error code 137 - the JVM's excessive memory consumption, caused by 
rapidly started and stopped threads that allocate a lot of memory in parallel. 
This makes "malloc" behave really weirdly; it wasn't meant to be used this 
way.

A popular fix found on the internet is to set the MALLOC_ARENA_MAX environment 
variable to some low value like 1, 2 or 4. I tested it with 4 both locally and 
on TC, and it looks good. We should consider propagating this setting to all 
suites; I've seen that at least PDS 4 suffers from the same problem.

In general - the longer the suite, the more chances it has to fail with the 
same error, purely because of the thread count and constant memory allocation.

[https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing&tab=buildTypeStatusDiv&branch_IgniteTests24Java8=pull%2F8051%2Fhead]
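
For illustration, a minimal sketch of passing MALLOC_ARENA_MAX to a child JVM from Java. How the variable is actually set on TC is not shown in this ticket; the class and method names are hypothetical, and the "java -version" command is only a placeholder. The variable is read by glibc's malloc, so it has to be present in the environment when the process starts and has no effect with other allocators.

```java
// Hypothetical sketch: launch a child process with a limited number of
// glibc malloc arenas by setting MALLOC_ARENA_MAX in its environment.
final class MallocArenaSketch {
    static ProcessBuilder withArenaLimit(int arenas, String... command) {
        ProcessBuilder pb = new ProcessBuilder(command);

        // Applies to the child only; the current JVM is not affected.
        pb.environment().put("MALLOC_ARENA_MAX", Integer.toString(arenas));
        pb.inheritIO();

        return pb;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder command; a real suite would start its test JVM here.
        Process p = withArenaLimit(4, "java", "-version").start();

        System.out.println("exit code: " + p.waitFor());
    }
}
```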

> PDS (Indexing) fails with 'Exit code 137"
> -
>
> Key: IGNITE-13266
> URL: https://issues.apache.org/jira/browse/IGNITE-13266
> Project: Ignite
>  Issue Type: Test
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing&branch_IgniteTests24Java8=%3Cdefault%3E&tab=buildTypeHistoryList]





[jira] [Created] (IGNITE-13266) PDS (Indexing) fails with 'Exit code 137"

2020-07-17 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13266:
--

 Summary: PDS (Indexing) fails with 'Exit code 137"
 Key: IGNITE-13266
 URL: https://issues.apache.org/jira/browse/IGNITE-13266
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


[https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing&branch_IgniteTests24Java8=%3Cdefault%3E&tab=buildTypeHistoryList]





[jira] [Updated] (IGNITE-13235) Deadlock in IgniteServiceProcessor

2020-07-16 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13235:
---
Fix Version/s: 2.10  (was: 2.9)

> Deadlock in IgniteServiceProcessor
> --
>
> Key: IGNITE-13235
> URL: https://issues.apache.org/jira/browse/IGNITE-13235
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.10
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code:java}
> "main" #1 prio=5 os_prio=0 tid=0x7ff9ac00f000 nid=0x86d in Object.wait() 
> [0x7ff9b418b000]
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.join(GridWorker.java:242)
>   - locked <0x000776ee2028> (a java.lang.Object)
>   at 
> org.apache.ignite.internal.util.IgniteUtils.join(IgniteUtils.java:5009)
>   at 
> org.apache.ignite.internal.processors.service.ServiceDeploymentManager.stopProcessing(ServiceDeploymentManager.java:145)
>   at 
> org.apache.ignite.internal.processors.service.IgniteServiceProcessor.stopProcessor(IgniteServiceProcessor.java:261)
>   at 
> org.apache.ignite.internal.processors.service.IgniteServiceProcessor.onKernalStop(IgniteServiceProcessor.java:248)
>   at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2466)
>   at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2414)
>   at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2577)
>   - locked <0x000776424138> (a 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
>   at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2540)
>   at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:333)
>   at org.apache.ignite.Ignition.stop(Ignition.java:221)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1225)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest.stopAllGrids(GridAbstractTest.java:1268)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest.stopAllGrids(GridAbstractTest.java:1246)
>   at 
> org.apache.ignite.events.ClusterActivationStartedEventTest.afterTest(ClusterActivationStartedEventTest.java:41)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest.cleanUpTestEnviroment(GridAbstractTest.java:701)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest.runTest(GridAbstractTest.java:2165)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest.access$600(GridAbstractTest.java:172)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest$2.evaluate(GridAbstractTest.java:207)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at 
> org.apache.ignite.testframework.junits.SystemPropertiesRule.lambda$methodStatement$1(SystemPropertiesRule.java:109)
>   at 
> org.apache.ignite.testframework.junits.SystemPropertiesRule$$Lambda$6/167185492.evaluate(Unknown
>  Source)
>   at 
> org.apache.ignite.testframework.junits.DelegatingJUnitStatement.evaluate(DelegatingJUnitStatement.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest.evaluateInsideFixture(GridAbstractTest.java:2669)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest.access$500(GridAbstractTest.java:172)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest$BeforeFirstAndAfterLastTestRule$1.evaluate(GridAbstractTest.java:2649)
>   at 
> org.apache.ignite.testframework.junits.SystemPropertiesRule.lambda$classStatement$0(SystemPropertiesRule.java:93)
>   at 
> org.apache.ignite.testframework.junits.SystemPropertiesRule$$Lambda$2/1879492184.evaluate(Unknown
>  Source)
>   at 
> org.apache.ignite.testframework.junits.DelegatingJUnitStatement.evaluate(DelegatingJUnitStatement.java:48)
>   at org.junit.r

[jira] [Commented] (IGNITE-13246) Implement EVT_BASELINE_XXX events

2020-07-15 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158121#comment-17158121
 ] 

Ivan Bessonov commented on IGNITE-13246:


PDS 4 failures are the same in master branch: 
[https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_Pds4?branch=%3Cdefault%3E&mode=builds]

> Implement EVT_BASELINE_XXX events
> -
>
> Key: IGNITE-13246
> URL: https://issues.apache.org/jira/browse/IGNITE-13246
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In order to notify external tools we need events EVT_BASELINE_CHANGED, 
> EVT_BASELINE_AUTO_ADJUST_ENABLED_CHANGED and 
> EVT_BASELINE_AUTO_ADJUST_AWAITING_TIME_CHANGED to correctly update baseline 
> info on UI.





[jira] [Commented] (IGNITE-13246) Implement EVT_BASELINE_XXX events

2020-07-13 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17156767#comment-17156767
 ] 

Ivan Bessonov commented on IGNITE-13246:


"Streamers" suite fails because of IGNITE-12362

> Implement EVT_BASELINE_XXX events
> -
>
> Key: IGNITE-13246
> URL: https://issues.apache.org/jira/browse/IGNITE-13246
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In order to notify external tools we need events EVT_BASELINE_CHANGED, 
> EVT_BASELINE_AUTO_ADJUST_ENABLED_CHANGED and 
> EVT_BASELINE_AUTO_ADJUST_AWAITING_TIME_CHANGED to correctly update baseline 
> info on UI.





[jira] [Commented] (IGNITE-12362) Migrate MQTT module to ignite-extensions

2020-07-13 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17156727#comment-17156727
 ] 

Ivan Bessonov commented on IGNITE-12362:


Hi [~samaitra],

I see that this particular suite fails after your change: 
[https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_Streamers?branch=%3Cdefault%3E&mode=builds]

I guess you merged this ticket without running tests.

Can you please fix it?

> Migrate MQTT module to ignite-extensions
> 
>
> Key: IGNITE-12362
> URL: https://issues.apache.org/jira/browse/IGNITE-12362
> Project: Ignite
>  Issue Type: Sub-task
>  Components: streaming
>Affects Versions: 2.8
>Reporter: Saikat Maitra
>Assignee: Saikat Maitra
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Migrate MQTT module to ignite-extensions
> [https://github.com/apache/ignite-extensions] 
> Details: 
> [https://cwiki.apache.org/confluence/display/IGNITE/IEP-36%3A+Modularization#IEP-36:Modularization-IndependentIntegrations]
> Discussion : 
> [http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSS-Proposal-for-Ignite-Extensions-as-a-separate-Bahir-module-or-Incubator-project-td44064.html#a44107]





[jira] [Created] (IGNITE-13246) Implement EVT_BASELINE_XXX events

2020-07-13 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13246:
--

 Summary: Implement EVT_BASELINE_XXX events
 Key: IGNITE-13246
 URL: https://issues.apache.org/jira/browse/IGNITE-13246
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


In order to notify external tools we need events EVT_BASELINE_CHANGED, 
EVT_BASELINE_AUTO_ADJUST_ENABLED_CHANGED and 
EVT_BASELINE_AUTO_ADJUST_AWAITING_TIME_CHANGED to correctly update baseline 
info on UI.





[jira] [Assigned] (IGNITE-13013) Thick client must not open server sockets when used by serverless functions

2020-07-13 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov reassigned IGNITE-13013:
--

Assignee: Ivan Bessonov

> Thick client must not open server sockets when used by serverless functions
> ---
>
> Key: IGNITE-13013
> URL: https://issues.apache.org/jira/browse/IGNITE-13013
> Project: Ignite
>  Issue Type: Improvement
>  Components: networking
>Affects Versions: 2.8
>Reporter: Denis A. Magda
>Assignee: Ivan Bessonov
>Priority: Critical
> Fix For: 2.9
>
>
> A thick client fails to start if being used inside of a serverless function 
> such as AWS Lambda or Azure Functions. Cloud providers prohibit opening 
> network ports to accept connections on the function's end. In short, the 
> function can only connect to a remote address.
> To reproduce, you can follow this tutorial and swap the thin client (used in 
> the tutorial) with the thick one: 
> https://www.gridgain.com/docs/tutorials/serverless/azure_functions_tutorial
> The thick client needs to support a mode when the communication SPI doesn't 
> create a server socket if the client is used for serverless computing. This 
> improvement looks like an extra task of this initiative: 
> https://issues.apache.org/jira/browse/IGNITE-12438





[jira] [Commented] (IGNITE-13242) LocalWalModeChangeDuringRebalancingSelfTest.testDataClearedAfterRestartWithDisabledWal fails

2020-07-10 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155373#comment-17155373
 ] 

Ivan Bessonov commented on IGNITE-13242:


[https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Pds2&tab=buildTypeStatusDiv&branch_IgniteTests24Java8=pull%2F8021%2Fhead]

> LocalWalModeChangeDuringRebalancingSelfTest.testDataClearedAfterRestartWithDisabledWal
>  fails
> 
>
> Key: IGNITE-13242
> URL: https://issues.apache.org/jira/browse/IGNITE-13242
> Project: Ignite
>  Issue Type: Test
>Reporter: Ivan Bessonov
>Assignee: Ivan Bessonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=-5966400795288779246&tab=testDetails]





[jira] [Created] (IGNITE-13242) LocalWalModeChangeDuringRebalancingSelfTest.testDataClearedAfterRestartWithDisabledWal fails

2020-07-10 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13242:
--

 Summary: 
LocalWalModeChangeDuringRebalancingSelfTest.testDataClearedAfterRestartWithDisabledWal
 fails
 Key: IGNITE-13242
 URL: https://issues.apache.org/jira/browse/IGNITE-13242
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


[https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=-5966400795288779246&tab=testDetails]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-13013) Thick client must not open server sockets when used by serverless functions

2020-07-09 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154625#comment-17154625
 ] 

Ivan Bessonov commented on IGNITE-13013:


I guess that would be better, "connIdx" affected too many methods. We should 
roll it back.

> Thick client must not open server sockets when used by serverless functions
> ---
>
> Key: IGNITE-13013
> URL: https://issues.apache.org/jira/browse/IGNITE-13013
> Project: Ignite
>  Issue Type: Improvement
>  Components: networking
>Affects Versions: 2.8
>Reporter: Denis A. Magda
>Priority: Critical
> Fix For: 2.9
>
>
> A thick client fails to start if being used inside of a serverless function 
> such as AWS Lambda or Azure Functions. Cloud providers prohibit opening 
> network ports to accept connections on the function's end. In short, the 
> function can only connect to a remote address.
> To reproduce, you can follow this tutorial and swap the thin client (used in 
> the tutorial) with the thick one: 
> https://www.gridgain.com/docs/tutorials/serverless/azure_functions_tutorial
> The thick client needs to support a mode when the communication SPI doesn't 
> create a server socket if the client is used for serverless computing. This 
> improvement looks like an extra task of this initiative: 
> https://issues.apache.org/jira/browse/IGNITE-12438





[jira] [Created] (IGNITE-13235) Deadlock in IgniteServiceProcessor

2020-07-09 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13235:
--

 Summary: Deadlock in IgniteServiceProcessor
 Key: IGNITE-13235
 URL: https://issues.apache.org/jira/browse/IGNITE-13235
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.9


{code:java}
"main" #1 prio=5 os_prio=0 tid=0x7ff9ac00f000 nid=0x86d in Object.wait() [0x7ff9b418b000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at org.apache.ignite.internal.util.worker.GridWorker.join(GridWorker.java:242)
    - locked <0x000776ee2028> (a java.lang.Object)
    at org.apache.ignite.internal.util.IgniteUtils.join(IgniteUtils.java:5009)
    at org.apache.ignite.internal.processors.service.ServiceDeploymentManager.stopProcessing(ServiceDeploymentManager.java:145)
    at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.stopProcessor(IgniteServiceProcessor.java:261)
    at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.onKernalStop(IgniteServiceProcessor.java:248)
    at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2466)
    at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2414)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2577)
    - locked <0x000776424138> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2540)
    at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:333)
    at org.apache.ignite.Ignition.stop(Ignition.java:221)
    at org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1225)
    at org.apache.ignite.testframework.junits.GridAbstractTest.stopAllGrids(GridAbstractTest.java:1268)
    at org.apache.ignite.testframework.junits.GridAbstractTest.stopAllGrids(GridAbstractTest.java:1246)
    at org.apache.ignite.events.ClusterActivationStartedEventTest.afterTest(ClusterActivationStartedEventTest.java:41)
    at org.apache.ignite.testframework.junits.GridAbstractTest.cleanUpTestEnviroment(GridAbstractTest.java:701)
    at org.apache.ignite.testframework.junits.GridAbstractTest.runTest(GridAbstractTest.java:2165)
    at org.apache.ignite.testframework.junits.GridAbstractTest.access$600(GridAbstractTest.java:172)
    at org.apache.ignite.testframework.junits.GridAbstractTest$2.evaluate(GridAbstractTest.java:207)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    at org.apache.ignite.testframework.junits.SystemPropertiesRule.lambda$methodStatement$1(SystemPropertiesRule.java:109)
    at org.apache.ignite.testframework.junits.SystemPropertiesRule$$Lambda$6/167185492.evaluate(Unknown Source)
    at org.apache.ignite.testframework.junits.DelegatingJUnitStatement.evaluate(DelegatingJUnitStatement.java:48)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.apache.ignite.testframework.junits.GridAbstractTest.evaluateInsideFixture(GridAbstractTest.java:2669)
    at org.apache.ignite.testframework.junits.GridAbstractTest.access$500(GridAbstractTest.java:172)
    at org.apache.ignite.testframework.junits.GridAbstractTest$BeforeFirstAndAfterLastTestRule$1.evaluate(GridAbstractTest.java:2649)
    at org.apache.ignite.testframework.junits.SystemPropertiesRule.lambda$classStatement$0(SystemPropertiesRule.java:93)
    at org.apache.ignite.testframework.junits.SystemPropertiesRule$$Lambda$2/1879492184.evaluate(Unknown Source)
    at org.apache.ignite.testframework.junits.DelegatingJUnitStatement.evaluate(DelegatingJUnitStatement.java:48)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRu

[jira] [Commented] (IGNITE-13200) SQL create index on invalid data type

2020-07-08 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153574#comment-17153574
 ] 

Ivan Bessonov commented on IGNITE-13200:


Hi [~tledkov-gridgain],

The fix looks good, but one particular change in BPlusTree looks excessive. 
Could you please take a look at it? I left a comment in the PR. Thank you!
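
For readers skimming the quoted report below: the root failure in its stack trace is an ordinary Java downcast. `java.sql.Date` extends `java.util.Date`, so an object created as a plain `java.util.Date` cannot be cast to the subclass. The following is only a minimal sketch of that failure and of one possible defensive conversion. The names `DateCastDemo`, `blindCastFails`, and `toSqlDate` are hypothetical, and this is not the fix applied in the ticket.

```java
import java.util.Date;

/** Hypothetical demo class; not part of Ignite. */
final class DateCastDemo {
    /** Returns true if a blind downcast to java.sql.Date throws ClassCastException. */
    static boolean blindCastFails(Object v) {
        try {
            java.sql.Date ignored = (java.sql.Date) v; // fails for a plain java.util.Date
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    /** Converts any java.util.Date to java.sql.Date without relying on a downcast. */
    static java.sql.Date toSqlDate(Date d) {
        if (d instanceof java.sql.Date)
            return (java.sql.Date) d;

        return new java.sql.Date(d.getTime());
    }

    public static void main(String[] args) {
        Date utilDate = new Date(0L);

        System.out.println(blindCastFails(utilDate));      // prints true
        System.out.println(toSqlDate(utilDate).getTime()); // prints 0
    }
}
```

Whether a conversion like this or an up-front type check that rejects the index is preferable depends on the semantics the SQL engine wants; the sketch only illustrates why the cast in the trace cannot succeed.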

> SQL create index on invalid data type
> -
>
> Key: IGNITE-13200
> URL: https://issues.apache.org/jira/browse/IGNITE-13200
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.8.1
>Reporter: Taras Ledkov
>Assignee: Taras Ledkov
>Priority: Major
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Reproduce*
> - Create a cache with the value class
> {code}
> private static class Value {
>     @QuerySqlField
>     int val_int;
>     java.util.Date val_date;
> }
> {code}
> - Alter the table with the command
> {{ALTER TABLE TEST ADD COLUMN (VAL_DATE DATE)}}
> - Try to create an index with the command
> {{CREATE INDEX TEST_VAL_DATE_IDX ON TEST(VAL_DATE)}}
> {{CorruptedTreeException}} is thrown and the node is stopped.
> {code}
> class org.apache.ignite.IgniteCheckedException: Runtime failure on row: Row@6a2853cd[ key: 0, val: org.apache.ignite.internal.processors.query.CreateIndexOnInvalidDataTypeTest$Value [idHash=1693430008, hash=1583713321, val_int=0, val_date=Thu Jan 01 03:00:00 MSK 1970] ][ 0,  java.util.Date cannot be cast to java.sql.Date> ]
>   at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doPut(BPlusTree.java:2438)
>   at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putx(BPlusTree.java:2388)
>   at org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.putx(H2TreeIndex.java:434)
>   at org.apache.ignite.internal.processors.query.h2.IndexBuildClosure.apply(IndexBuildClosure.java:52)
>   at org.apache.ignite.internal.processors.query.schema.SchemaIndexCachePartitionWorker$SchemaIndexCacheVisitorClosureWrapper.apply(SchemaIndexCachePartitionWorker.java:298)
>   at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.updateIndex(GridCacheMapEntry.java:4494)
>   at org.apache.ignite.internal.processors.query.schema.SchemaIndexCachePartitionWorker.processKey(SchemaIndexCachePartitionWorker.java:231)
>   at org.apache.ignite.internal.processors.query.schema.SchemaIndexCachePartitionWorker.processPartition(SchemaIndexCachePartitionWorker.java:188)
>   at org.apache.ignite.internal.processors.query.schema.SchemaIndexCachePartitionWorker.body(SchemaIndexCachePartitionWorker.java:127)
>   at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: class org.apache.ignite.internal.processors.query.IgniteSQLException: Failed to wrap object into H2 Value. java.util.Date cannot be cast to java.sql.Date
>   at org.apache.ignite.internal.processors.query.h2.opt.H2CacheRow.wrap(H2CacheRow.java:177)
>   at org.apache.ignite.internal.processors.query.h2.opt.H2CacheRow.getValue0(H2CacheRow.java:109)
>   at org.apache.ignite.internal.processors.query.h2.opt.H2CacheRow.getValue(H2CacheRow.java:91)
>   at org.apache.ignite.internal.processors.query.h2.database.io.AbstractH2ExtrasLeafIO.storeByOffset(AbstractH2ExtrasLeafIO.java:115)
>   at org.apache.ignite.internal.processors.query.h2.database.io.AbstractH2ExtrasLeafIO.storeByOffset(AbstractH2ExtrasLeafIO.java:37)
>   at org.apache.ignite.internal.processors.cache.persistence.tree.io.BPlusIO.store(BPlusIO.java:185)
>   at org.apache.ignite.internal.processors.cache.persistence.tree.io.BPlusIO.insert(BPlusIO.java:272)
>   at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.insertSimple(BPlusTree.java:3685)
>   at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.insert(BPlusTree.java:3667)
>   at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.access$1900(BPlusTree.java:3539)
>   at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Insert.run0(BPlusTree.java:452)
>   at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Insert.run0(BPlusTree.java:433)
>   at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$GetPageHandler.run(BPlusTree.java:5889)
>   at org.apache.ignite.internal.processors.cache.persistence.tree.BPlus

[jira] [Updated] (IGNITE-13151) Checkpointer code refactoring: extracting classes from GridCacheDatabaseSharedManager

2020-07-02 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13151:
---
Labels: IEP-47  (was: )

> Checkpointer code refactoring: extracting classes from 
> GridCacheDatabaseSharedManager
> -
>
> Key: IGNITE-13151
> URL: https://issues.apache.org/jira/browse/IGNITE-13151
> Project: Ignite
>  Issue Type: Sub-task
>  Components: persistence
>Reporter: Sergey Chugunov
>Assignee: Anton Kalashnikov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The checkpointer is at the center of the Ignite persistence subsystem, and the 
> more people from the community understand it, the more stable and efficient 
> it becomes.
> However, for now the checkpointer code sits inside the 
> GridCacheDatabaseSharedManager class and is entangled with this higher-level 
> and more general component.
> To take a step toward a more modular checkpointer, we need to do two things:
>  # Move the checkpointer code out of the database manager into a separate 
> class. (That's what this ticket is about.)
>  # Create a well-defined checkpointer API that will allow us to create new 
> checkpointer implementations in the future. An example is the new 
> checkpointer implementation needed for the defragmentation feature. 
> (Should be done in a separate ticket.)
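
The two steps in the quoted description (extract the checkpointer into its own class, then put it behind an API so that implementations can be swapped) can be sketched roughly as follows. This is only an illustration of the general pattern. Every name here (`Checkpointer`, `FullCheckpointer`, `BoundedCheckpointer`, `DatabaseManagerSketch`) is hypothetical and does not reflect Ignite's actual classes or the design chosen in the ticket.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical demo entry point; not part of Ignite. */
final class CheckpointerDemo {
    public static void main(String[] args) {
        List<Long> dirty = Arrays.asList(1L, 2L, 3L);

        // The manager works with either implementation without knowing which one it has.
        System.out.println(new DatabaseManagerSketch(new FullCheckpointer()).flush(dirty));     // prints 3
        System.out.println(new DatabaseManagerSketch(new BoundedCheckpointer(2)).flush(dirty)); // prints 2
    }
}

/** Hypothetical checkpointer API (assumption, not Ignite's real interface). */
interface Checkpointer {
    /** Writes the given dirty page ids and returns how many were written. */
    int checkpoint(List<Long> dirtyPageIds);
}

/** Hypothetical default implementation: writes every dirty page. */
final class FullCheckpointer implements Checkpointer {
    final List<Long> written = new ArrayList<>();

    @Override public int checkpoint(List<Long> dirtyPageIds) {
        written.addAll(dirtyPageIds);
        return dirtyPageIds.size();
    }
}

/** Hypothetical alternative implementation: writes at most maxPages pages per call. */
final class BoundedCheckpointer implements Checkpointer {
    private final int maxPages;
    final List<Long> written = new ArrayList<>();

    BoundedCheckpointer(int maxPages) {
        this.maxPages = maxPages;
    }

    @Override public int checkpoint(List<Long> dirtyPageIds) {
        int n = Math.min(maxPages, dirtyPageIds.size());
        written.addAll(dirtyPageIds.subList(0, n));
        return n;
    }
}

/** Hypothetical stand-in for the database manager; it depends only on the interface. */
final class DatabaseManagerSketch {
    private final Checkpointer checkpointer;

    DatabaseManagerSketch(Checkpointer checkpointer) {
        this.checkpointer = checkpointer;
    }

    int flush(List<Long> dirtyPageIds) {
        return checkpointer.checkpoint(dirtyPageIds);
    }
}
```

Because the manager sees only the interface, a specialised implementation (such as the one the description mentions for defragmentation) could be introduced without touching the manager itself.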




