[jira] [Created] (IGNITE-14527) CVE-2021-2816[3,4,5] in Jetty

2021-04-13 Thread Alexander Belyak (Jira)
Alexander Belyak created IGNITE-14527:
-

 Summary: CVE-2021-2816[3,4,5] in Jetty
 Key: IGNITE-14527
 URL: https://issues.apache.org/jira/browse/IGNITE-14527
 Project: Ignite
  Issue Type: Task
  Components: integrations
Reporter: Alexander Belyak
Assignee: Alexander Belyak


Vulnerabilities found:

[https://nvd.nist.gov/vuln/detail/CVE-2021-28163]
[https://nvd.nist.gov/vuln/detail/CVE-2021-28164]
[https://nvd.nist.gov/vuln/detail/CVE-2021-28165]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-11783) Open file limit for deb distribution

2019-04-18 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-11783:
-

 Summary: Open file limit for deb distribution
 Key: IGNITE-11783
 URL: https://issues.apache.org/jira/browse/IGNITE-11783
 Project: Ignite
  Issue Type: Bug
  Components: persistence
Affects Versions: 2.7
 Environment: ubuntu-16.04
Reporter: Alexander Belyak


Steps to reproduce:
1) Install ignite from deb package on ubuntu 16.04
2) Start with persistence
3) Create 5 caches (or one with 4000+ partitions)
Error text:
{noformat}
[18:29:44,369][INFO][exchange-worker-#43][GridCacheDatabaseSharedManager] Restoring partition state for local groups [cntPartStateWal=0, lastCheckpointId=bd24ff23-da6f-46e5-bafd-b643db3870d4]
[18:29:51,864][SEVERE][exchange-worker-#43][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize partition file: /usr/share/apache-ignite/work/db/node00-f49af718-48da-4186-b664-62aca736bdc9/cache-SQL_PUBLIC_VERTEX_TBL/part-913.bin]]
class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to initialize partition file: /usr/share/apache-ignite/work/db/node00-f49af718-48da-4186-b664-62aca736bdc9/cache-SQL_PUBLIC_VERTEX_TBL/part-913.bin
	at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:444)
	at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.ensure(FilePageStore.java:650)
	at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.ensure(FilePageStoreManager.java:712)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restorePartitionStates(GridCacheDatabaseSharedManager.java:2472)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyLastUpdates(GridCacheDatabaseSharedManager.java:2419)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreState(GridCacheDatabaseSharedManager.java:1628)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.beforeExchange(GridCacheDatabaseSharedManager.java:1302)
	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1453)
	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:806)
	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2667)
	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2539)
	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.file.FileSystemException: /usr/share/apache-ignite/work/db/node00-f49af718-48da-4186-b664-62aca736bdc9/cache-SQL_PUBLIC_VERTEX_TBL/part-913.bin: Too many open files
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.newAsynchronousFileChannel(UnixFileSystemProvider.java:196)
	at java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:248)
	at java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:301)
	at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.<init>(AsyncFileIO.java:57)
	at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory.create(AsyncFileIOFactory.java:53)
	at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:416)
	... 12 more
{noformat}

It happens because the systemd service description 
(/etc/systemd/system/apache-ignite@.service) does not contain
{noformat}
LimitNOFILE=50
(possibly together with) LimitNPROC=50
{noformat}
see: https://fredrikaverpil.github.io/2016/04/27/systemd-and-resource-limits/
Possibly, the installation script should also add:
*  "fs.file-max = 2097152" to "/etc/sysctl.conf"
*  into /etc/security/limits.conf:
{noformat}
* hard nofile  50
* soft nofile  50
root hard nofile  50
root
{noformat}
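Not part of the original report, but useful for verifying the fix: a minimal sketch of how to read the descriptor limit the service actually got from inside the JVM. It assumes a HotSpot/OpenJDK JVM on a Unix-like OS, where the com.sun.management extension interface is available:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdLimitCheck {
    /** Returns the max open-file-descriptor limit seen by this JVM, or -1 if unknown. */
    public static long maxFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean)
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getMaxFileDescriptorCount();
        return -1; // non-Unix platform or a JVM without the com.sun.management extension
    }

    public static void main(String[] args) {
        // With 4000+ partition files per cache, a default limit of 1024 is easily exhausted.
        System.out.println("Max file descriptors: " + maxFileDescriptors());
    }
}
```

Running this under the systemd unit shows whether LimitNOFILE took effect.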

[jira] [Created] (IGNITE-8407) Wrong memory size printing in IgniteCacheDatabaseSharedManager

2018-04-27 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-8407:


 Summary: Wrong memory size printing in IgniteCacheDatabaseSharedManager
 Key: IGNITE-8407
 URL: https://issues.apache.org/jira/browse/IGNITE-8407
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.4
Reporter: Alexander Belyak


In checkDataRegionSize, the regCfg sizes are printed in SI format (based on 1000, not 
1024). We need to fix it, and any other usages of getInitialSize()/getMaxSize() 
with U.readableSize(8, true):
{noformat}
throw new IgniteCheckedException("DataRegion must have size more than 10MB (use " +
    "DataRegionConfiguration.initialSize and .maxSize properties to set correct size in bytes) " +
    "[name=" + regCfg.getName() + ", initialSize=" + U.readableSize(regCfg.getInitialSize(), true) +
    ", maxSize=" + U.readableSize(regCfg.getMaxSize(), true) + "]"
{noformat}
should be replaced with
{noformat}
throw new IgniteCheckedException("DataRegion must have size more than 10MB (use " +
    "DataRegionConfiguration.initialSize and .maxSize properties to set correct size in bytes) " +
    "[name=" + regCfg.getName() + ", initialSize=" + U.readableSize(regCfg.getInitialSize(), false) +
    ", maxSize=" + U.readableSize(regCfg.getMaxSize(), false) + "]"
{noformat}
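The practical difference between the two flags can be seen with a small stand-in for U.readableSize. This is a sketch of the usual "human-readable size" algorithm, not the actual Ignite implementation:

```java
import java.util.Locale;

public class ReadableSize {
    /** Formats a byte count; si=true uses powers of 1000 (SI), si=false powers of 1024 (binary). */
    public static String readableSize(long bytes, boolean si) {
        int unit = si ? 1000 : 1024;
        if (bytes < unit)
            return bytes + " B";
        int exp = (int) (Math.log(bytes) / Math.log(unit));
        char prefix = "KMGTPE".charAt(exp - 1);
        return String.format(Locale.ROOT, "%.1f %sB", bytes / Math.pow(unit, exp), prefix);
    }

    public static void main(String[] args) {
        long tenMb = 10L * 1024 * 1024; // 10485760 bytes, i.e. exactly 10 MiB
        System.out.println(readableSize(tenMb, true));  // SI: "10.5 MB" - misleading for a 10MB region
        System.out.println(readableSize(tenMb, false)); // binary: "10.0 MB"
    }
}
```

For memory-region sizes configured in powers of two, the binary (false) form matches user expectations.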





[jira] [Created] (IGNITE-8288) ScanQuery ignores readFromBackups

2018-04-17 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-8288:


 Summary: ScanQuery ignores readFromBackups
 Key: IGNITE-8288
 URL: https://issues.apache.org/jira/browse/IGNITE-8288
 Project: Ignite
  Issue Type: Bug
Reporter: Alexander Belyak


1) Create partitioned cache on





[jira] [Created] (IGNITE-8286) ScanQuery ignores setLocal with non-local partition

2018-04-16 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-8286:


 Summary: ScanQuery ignores setLocal with non-local partition
 Key: IGNITE-8286
 URL: https://issues.apache.org/jira/browse/IGNITE-8286
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.4
Reporter: Alexander Belyak


1) Create a partitioned cache on a 2+ node cluster
2) Select some partition N; the local node should not be the OWNER of partition N
3) Execute: cache.query(new ScanQuery<>().setLocal(true).setPartition(N))
Expected result:
an empty result (probably with logging something like "Trying to execute local query 
with non-local partition N"), or even a thrown exception.
Actual result:
the query is executed (via ScanQueryFallbackClosableIterator) on a remote node.
The problem is that we execute a local query on a remote node.
The same behaviour can occur if we get an empty node list from 
GridCacheQueryAdapter.node() for any reason, for example if we run a "local" 
query from a non-data node for the given cache (see 
GridDiscoveryManager.cacheAffinityNode(ClusterNode node, String cacheName) in 
GridCacheQueryAdapter.executeScanQuery()).
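The guard the report asks for could look roughly like this. This is a plain-Java sketch of the proposed check; the localPartitions set, the method name, and the exception type are illustrative and not Ignite API:

```java
import java.util.Set;

public class LocalQueryGuard {
    /**
     * Validates a local scan query against the set of partitions owned by this node.
     * Returns true if the query may run locally; throws if the partition is not local,
     * instead of silently falling back to a remote node.
     */
    public static boolean checkLocalPartition(Set<Integer> localPartitions, int part) {
        if (!localPartitions.contains(part))
            throw new IllegalStateException(
                "Trying to execute local query with non-local partition " + part);
        return true;
    }

    public static void main(String[] args) {
        Set<Integer> owned = Set.of(0, 1, 2); // partitions this node owns
        System.out.println(checkLocalPartition(owned, 1)); // prints true
        try {
            checkLocalPartition(owned, 5); // not owned: should fail, not go remote
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```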





[jira] [Created] (IGNITE-8119) NPE on clear DB and unclear WAL/WAL_ARCHIVE

2018-04-02 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-8119:


 Summary: NPE on clear DB and unclear WAL/WAL_ARCHIVE
 Key: IGNITE-8119
 URL: https://issues.apache.org/jira/browse/IGNITE-8119
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.4
Reporter: Alexander Belyak
 Attachments: ClearTestP.java

1) Start a grid (1 node is enough), activate it, and populate some data
2) Stop the node and clear the db folder
3) Start the grid and activate it
Expected result:
An error about an inconsistent storage configuration, with or without starting the 
node with such a store.
Actual result:
The exchange worker on the node stops with an NPE; this can prevent the whole 
cluster from completing any PME operations.
{noformat}
Failed to reinitialize local partitions (preloading will be stopped): GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], ...
java.lang.NullPointerException
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyUpdate(GridCacheDatabaseSharedManager.java:2354)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyLastUpdates(GridCacheDatabaseSharedManager.java:2099)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreState(GridCacheDatabaseSharedManager.java:1325)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.beforeExchange(GridCacheDatabaseSharedManager.java:1113)
	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1063)
	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:661)
	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2329)
	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
	at java.lang.Thread.run(Thread.java:748)
{noformat}





[jira] [Created] (IGNITE-8105) Close() while auto activate

2018-04-02 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-8105:


 Summary: Close() while auto activate
 Key: IGNITE-8105
 URL: https://issues.apache.org/jira/browse/IGNITE-8105
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.4
Reporter: Alexander Belyak


1) Start one node, activate it, fill in some data
2) Close the node
3) Start the node and, right after start, call close()
Expected result:
The node starts and closes correctly (with or without auto activation/deactivation).
Actual result:
The node starts and throws java.nio.channels.ClosedByInterruptException during the 
activation process, because the close() call closes the checkpoint file channel.
The exception is:
{noformat}
[2018-04-02 
19:57:27,831][ERROR][exchange-worker-#94%srv1%][GridCachePartitionExchangeManager]
 Failed to wait for completion of partition map exchange (preloading will not 
start): GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryCustomEvent 
[customMsg=null, affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], 
super=DiscoveryEvent [evtNode=TcpDiscoveryNode 
[id=a9fa729f-613f-496b-8e7c-e53142817226, addrs=[0:0:0:0:0:0:0:1%lo, 10.0.3.1, 
10.38.184.66, 10.42.1.107, 127.0.0.1, 172.17.0.1], 
sockAddrs=[/10.38.184.66:47500, /172.17.0.1:47500, /0:0:0:0:0:0:0:1%lo:47500, 
/10.0.3.1:47500, /10.42.1.107:47500, /127.0.0.1:47500], discPort=47500, 
order=1, intOrder=1, lastExchangeTime=1522673846477, loc=true, 
ver=2.4.0#19700101-sha1:, isClient=false], topVer=1, nodeId8=a9fa729f, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1522673847732]], 
crd=TcpDiscoveryNode [id=a9fa729f-613f-496b-8e7c-e53142817226, 
addrs=[0:0:0:0:0:0:0:1%lo, 10.0.3.1, 10.38.184.66, 10.42.1.107, 127.0.0.1, 
172.17.0.1], sockAddrs=[/10.38.184.66:47500, /172.17.0.1:47500, 
/0:0:0:0:0:0:0:1%lo:47500, /10.0.3.1:47500, /10.42.1.107:47500, 
/127.0.0.1:47500], discPort=47500, order=1, intOrder=1, 
lastExchangeTime=1522673846477, loc=true, ver=2.4.0#19700101-sha1:, 
isClient=false], exchId=GridDhtPartitionExchangeId 
[topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], 
discoEvt=DiscoveryCustomEvent [customMsg=null, 
affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], 
super=DiscoveryEvent [evtNode=TcpDiscoveryNode 
[id=a9fa729f-613f-496b-8e7c-e53142817226, addrs=[0:0:0:0:0:0:0:1%lo, 10.0.3.1, 
10.38.184.66, 10.42.1.107, 127.0.0.1, 172.17.0.1], 
sockAddrs=[/10.38.184.66:47500, /172.17.0.1:47500, /0:0:0:0:0:0:0:1%lo:47500, 
/10.0.3.1:47500, /10.42.1.107:47500, /127.0.0.1:47500], discPort=47500, 
order=1, intOrder=1, lastExchangeTime=1522673846477, loc=true, 
ver=2.4.0#19700101-sha1:, isClient=false], topVer=1, nodeId8=a9fa729f, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1522673847732]], nodeId=a9fa729f, 
evt=DISCOVERY_CUSTOM_EVT], added=true, initFut=GridFutureAdapter 
[ignoreInterrupts=false, state=DONE, res=false, hash=791289709], init=false, 
lastVer=null, partReleaseFut=PartitionReleaseFuture 
[topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], 
futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, 
minorTopVer=1], futures=[]], TxReleaseFuture [topVer=AffinityTopologyVersion 
[topVer=1, minorTopVer=1], futures=[]], AtomicUpdateReleaseFuture 
[topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]], 
DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, 
minorTopVer=1], futures=[, exchActions=null, affChangeMsg=null, 
initTs=1522673847742, centralizedAff=false, changeGlobalStateE=class 
o.a.i.IgniteCheckedException: Failed to read checkpoint pointer from marker 
file: 
/tmp/test/srv1/db/cons_srv1/cp/1522673842894-84776dc9-6fac-4aa0-804c-f56cbee68c12-START.bin,
 done=true, state=CRD, evtLatch=0, remaining=[], super=GridFutureAdapter 
[ignoreInterrupts=false, state=DONE, res=class o.a.i.IgniteCheckedException: 
Failed to read checkpoint pointer from marker file: 
/tmp/test/srv1/db/cons_srv1/cp/1522673842894-84776dc9-6fac-4aa0-804c-f56cbee68c12-START.bin,
 hash=1311860231]]
class org.apache.ignite.IgniteCheckedException: Failed to read checkpoint 
pointer from marker file: 
/tmp/test/srv1/db/cons_srv1/cp/1522673842894-84776dc9-6fac-4aa0-804c-f56cbee68c12-START.bin
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readPointer(GridCacheDatabaseSharedManager.java:1794)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readCheckpointStatus(GridCacheDatabaseSharedManager.java:1764)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreState(GridCacheDatabaseSharedManager.java:1321)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.beforeExchange(GridCacheDatabaseSharedManager.java:1114)
	at 

[jira] [Created] (IGNITE-8103) Node with BLT is not allowed to join cluster without one

2018-04-02 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-8103:


 Summary: Node with BLT is not allowed to join cluster without one
 Key: IGNITE-8103
 URL: https://issues.apache.org/jira/browse/IGNITE-8103
 Project: Ignite
  Issue Type: Improvement
  Components: general
Affects Versions: 2.4
Reporter: Alexander Belyak


1) Start a cluster of 2-3 nodes, activate it, and fill in some data
2) Stop the cluster and clear the LFS on the first node
3) Start the cluster from the first node (or start all nodes synchronously)
Expected result: ?
Actual result: "Node with set up BaselineTopology is not allowed to join 
cluster without one: cons_srv2"
From a technical point of view this is expected behaviour, because the first node 
with cleared storage becomes the grid coordinator and rejects any connection 
attempts from nodes with a different baseline. But it is bad for usability: if we 
always start all nodes together and want to clear the storage on one node for some 
reason, we need to define a start sequence.





[jira] [Created] (IGNITE-8066) Reset wal segment idx

2018-03-28 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-8066:


 Summary: Reset wal segment idx
 Key: IGNITE-8066
 URL: https://issues.apache.org/jira/browse/IGNITE-8066
 Project: Ignite
  Issue Type: New Feature
  Components: general
Affects Versions: 2.4
Reporter: Alexander Belyak


1) On activation the grid reads checkpoint status with segment idx=7742:

2018-03-21 02:34:04.465[INFO ][exchange-worker-#152%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Successfully activated caches [nodeId=9c0c2e76-fb7f-46df-8b0b-3379d0c91db9, client=false, topVer=AffinityTopologyVersion [topVer=161, minorTopVer=1]]
2018-03-21 02:34:04.479[INFO ][exchange-worker-#152%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Finished waiting for partition release future [topVer=AffinityTopologyVersion [topVer=161, minorTopVer=1], waitTime=0ms, futInfo=NA]
2018-03-21 02:34:04.487[INFO ][exchange-worker-#152%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Read checkpoint status [startMarker=/gridgain/ssd/data/10_126_1_172_47500/cp/1521587060132-aafbf88b-f783-40e8-8e3c-ef60cd383e21-START.bin, endMarker=/gridgain/ssd/data/10_126_1_172_47500/cp/1521587060132-aafbf88b-f783-40e8-8e3c-ef60cd383e21-END.bin]
2018-03-21 02:34:04.488[INFO ][exchange-worker-#152%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Applying lost cache updates since last checkpoint record [lastMarked=FileWALPointer [idx=7742, fileOff=1041057120, len=1470746], lastCheckpointId=aafbf88b-f783-40e8-8e3c-ef60cd383e21]

2) But right after that (with only two metrics messages in the log in between) it 
writes a checkpoint with WAL segment idx=0:

2018-03-21 02:35:21.875[INFO ][exchange-worker-#152%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Finished applying WAL changes [updatesApplied=0, time=77388ms]
2018-03-21 02:35:22.386[INFO ][db-checkpoint-thread-#243%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Checkpoint started [checkpointId=8cf946e6-a718-4388-8bef-c76bf79d93cd, startPtr=FileWALPointer [idx=0, fileOff=77196029, len=450864], checkpointLockWait=0ms, checkpointLockHoldTime=422ms, pages=16379, reason='node started']
2018-03-21 02:35:25.934[INFO ][db-checkpoint-thread-#243%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Checkpoint finished [cpId=8cf946e6-a718-4388-8bef-c76bf79d93cd, pages=16379, markPos=FileWALPointer [idx=0, fileOff=77196029, len=450864], walSegmentsCleared=0, markDuration=508ms, pagesWrite=155ms, fsync=3391ms, total=4054ms]

Then we get an AssertionError while trying to archive WAL segment 0 when 
lastArchivedIdx=7742.





[jira] [Created] (IGNITE-7995) Assertion on GridDhtPartitionDemandMessage

2018-03-20 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7995:


 Summary: Assertion on GridDhtPartitionDemandMessage
 Key: IGNITE-7995
 URL: https://issues.apache.org/jira/browse/IGNITE-7995
 Project: Ignite
  Issue Type: New Feature
  Components: general
Affects Versions: 2.4
Reporter: Alexander Belyak


After applying a new baseline topology we get:
Failed processing message [sender=..., 
msg=GridDhtPartitionDemandMessage [updateSeq=10524, timeout=1, workerId=-1, 
topVer=AffinityTopologyVersion [topVer=170, minorTopVer=1], partCnt=1, 
super=GridCacheGroupIdMessage [grpId=-1029020343]]]
java.lang.AssertionError: partCntr=5338946, reservations=Map []
from GridCacheOffheapManager.rebalanceIterator:704






[jira] [Created] (IGNITE-7951) Add metrics for remains to evict keys/partitions

2018-03-14 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7951:


 Summary: Add metrics for remains to evict keys/partitions
 Key: IGNITE-7951
 URL: https://issues.apache.org/jira/browse/IGNITE-7951
 Project: Ignite
  Issue Type: New Feature
  Components: general
Affects Versions: 2.4
Reporter: Alexander Belyak


We need to add metrics for the number of keys/partitions remaining to evict, to 
indicate the total amount of eviction work. In some cases we have synchronous 
eviction, and it is critically important to know how many keys need to be evicted 
before the exchange process ends and the cluster becomes operational again. In 
other cases we just want to know what is happening in the cluster right now 
(background eviction without workload) and when the cluster will become 100% healthy.
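A metric of this kind is essentially a pair of counters: keys scheduled for eviction and keys already evicted. A hedged sketch of what such a metric could expose; the class and method names are illustrative, not the Ignite metrics API:

```java
import java.util.concurrent.atomic.AtomicLong;

public class EvictionMetrics {
    private final AtomicLong scheduled = new AtomicLong(); // keys queued for eviction
    private final AtomicLong evicted = new AtomicLong();   // keys already evicted

    public void onScheduled(long keys) { scheduled.addAndGet(keys); }
    public void onEvicted(long keys)   { evicted.addAndGet(keys); }

    /** Keys remaining to evict - the number the issue asks to expose. */
    public long remaining() { return scheduled.get() - evicted.get(); }

    public static void main(String[] args) {
        EvictionMetrics m = new EvictionMetrics();
        m.onScheduled(1000);
        m.onEvicted(400);
        System.out.println("remaining=" + m.remaining()); // prints remaining=600
    }
}
```

Polling remaining() until it reaches zero answers the "when does the cluster become healthy" question.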





[jira] [Created] (IGNITE-7892) Remove acquisition of any locks from toString methods

2018-03-06 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7892:


 Summary: Remove acquisition of any locks from toString methods
 Key: IGNITE-7892
 URL: https://issues.apache.org/jira/browse/IGNITE-7892
 Project: Ignite
  Issue Type: Wish
  Components: general
Affects Versions: 2.4
Reporter: Alexander Belyak


In org.apache.ignite.internal.processors.cache.GridCacheMapEntry we have a 
thread-safe toString() method that can lead to hangs of monitoring threads such as 
grid-timeout-worker if we try to dump LongRunningOperations with a locked entry.
I think toString() methods never need to be thread-safe and may throw 
ConcurrentModificationException or print inconsistent data, so we should remove 
synchronization from every toString() method in the codebase. If we need a 
"consistent" string representation, let's add consistentToString() methods or use 
external synchronization.
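The proposal amounts to reading fields without taking the entry lock and accepting a possibly inconsistent snapshot. A minimal sketch of the idea; the class and field names are illustrative, not GridCacheMapEntry's actual fields:

```java
public class Entry {
    private final Object lock = new Object();
    private volatile Object val;
    private volatile long ver;

    public void update(Object v, long version) {
        synchronized (lock) { // writes stay synchronized
            val = v;
            ver = version;
        }
    }

    @Override public String toString() {
        // Deliberately no synchronization: a monitoring thread must never block
        // here, even while another thread holds the entry lock. The price is
        // that the printed snapshot may be inconsistent mid-update.
        return "Entry [val=" + val + ", ver=" + ver + ']';
    }
}
```

The volatile reads keep individual fields fresh; only cross-field consistency is sacrificed.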





[jira] [Created] (IGNITE-7776) Check calculated values in javadoc

2018-02-21 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7776:


 Summary: Check calculated values in javadoc
 Key: IGNITE-7776
 URL: https://issues.apache.org/jira/browse/IGNITE-7776
 Project: Ignite
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.3, 2.2, 2.1, 2.0
Reporter: Alexander Belyak
Assignee: Alexander Belyak


We have two issues with calculated values in javadoc:

1) wrong numbers, for example: #\{5 * 1024 * 102 * 1024}

2) int overflow, for example: #\{5 * 1024 * 1024 * 1024}

We need to check as many places as possible.
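The second problem is easy to demonstrate: Java evaluates an all-int expression in 32-bit arithmetic, so 5 * 1024 * 1024 * 1024 silently wraps:

```java
public class JavadocOverflow {
    public static void main(String[] args) {
        // 5 GiB does not fit in int (max 2147483647): the product wraps to 1 GiB.
        System.out.println(5 * 1024 * 1024 * 1024);  // prints 1073741824
        // Making one operand long forces 64-bit arithmetic and gives the intended value.
        System.out.println(5L * 1024 * 1024 * 1024); // prints 5368709120
    }
}
```

So any javadoc example computing sizes above 2 GiB needs an L suffix on at least one operand.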





[jira] [Created] (IGNITE-7765) walSegmentSize can be negative in config

2018-02-20 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7765:


 Summary: walSegmentSize can be negative in config
 Key: IGNITE-7765
 URL: https://issues.apache.org/jira/browse/IGNITE-7765
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.2
Reporter: Alexander Belyak


The grid silently uses the default (64MB) DataStorageConfiguration.walSegmentSize, 
without any warning, if a negative value is specified, for example if the XML 
configuration contains something like
 (overflow)
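The overflow can be reproduced directly: a segment size written as an int expression, e.g. 3 GiB, wraps to a negative number, which is exactly the kind of value the configuration should reject. The validation method below is a sketch of the missing check, not Ignite's actual code:

```java
public class WalSegmentSizeCheck {
    /** Sketch of the missing guard: reject non-positive segment sizes loudly. */
    public static int validateSegmentSize(int size) {
        if (size <= 0)
            throw new IllegalArgumentException("walSegmentSize must be positive, got: " + size);
        return size;
    }

    public static void main(String[] args) {
        int threeGib = 3 * 1024 * 1024 * 1024; // wraps to -1073741824 in 32-bit arithmetic
        System.out.println(threeGib);
        try {
            validateSegmentSize(threeGib);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // the warning the issue asks for
        }
    }
}
```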





[jira] [Created] (IGNITE-7760) Handle FS hangs

2018-02-19 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7760:


 Summary: Handle FS hangs
 Key: IGNITE-7760
 URL: https://issues.apache.org/jira/browse/IGNITE-7760
 Project: Ignite
  Issue Type: Improvement
  Components: general
Affects Versions: 2.2, 2.1, 2.0, 1.9, 1.8, 1.7, 1.6
Reporter: Alexander Belyak


We need to handle hangs of FS operations, for example copying WAL into the WAL 
archive (especially if the WAL archive is mounted as a network file system volume).





[jira] [Created] (IGNITE-7684) Ignore IGNITE_USE_ASYNC_FILE_IO_FACTORY in FileWriteAheadLogManager

2018-02-13 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7684:


 Summary: Ignore IGNITE_USE_ASYNC_FILE_IO_FACTORY in 
FileWriteAheadLogManager
 Key: IGNITE-7684
 URL: https://issues.apache.org/jira/browse/IGNITE-7684
 Project: Ignite
  Issue Type: Improvement
  Components: general
Affects Versions: 2.4
Reporter: Alexander Belyak


If IGNITE_USE_ASYNC_FILE_IO_FACTORY is specified and IGNITE_WAL_MMAP is not, we get:

{noformat}
java.lang.UnsupportedOperationException: AsynchronousFileChannel doesn't support mmap.
	at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.map(AsyncFileIO.java:173)
	at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.restoreWriteHandle(FileWriteAheadLogManager.java:1068)
	at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.resumeLogging(FileWriteAheadLogManager.java:552)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readCheckpointAndRestoreMemory(GridCacheDatabaseSharedManager.java:714)
	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onClusterStateChangeRequest(GridDhtPartitionsExchangeFuture.java:841)
	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:595)
	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2329)
	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
	at java.lang.Thread.run(Thread.java:748)
{noformat}





[jira] [Created] (IGNITE-7608) Sort keys in putAll/removeAll methods

2018-02-01 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7608:


 Summary: Sort keys in putAll/removeAll methods
 Key: IGNITE-7608
 URL: https://issues.apache.org/jira/browse/IGNITE-7608
 Project: Ignite
  Issue Type: Improvement
  Components: general
Affects Versions: 2.2, 2.1, 2.0
 Environment: We need to sort keys in cache putAll/removeAll operations 
to avoid deadlocks there.
Reporter: Alexander Belyak








[jira] [Created] (IGNITE-7565) Remove IgniteSet from heap

2018-01-29 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7565:


 Summary: Remove IgniteSet from heap
 Key: IGNITE-7565
 URL: https://issues.apache.org/jira/browse/IGNITE-7565
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.2
Reporter: Alexander Belyak


IgniteSet stores all data both in durable memory and in the java heap. This is not 
good for big clusters and big sets, so we need to remove the values from the heap.





[jira] [Created] (IGNITE-7564) Document IgniteSet memory consumption

2018-01-29 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7564:


 Summary: Document IgniteSet memory consumption
 Key: IGNITE-7564
 URL: https://issues.apache.org/jira/browse/IGNITE-7564
 Project: Ignite
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.2
Reporter: Alexander Belyak


We need to document the on-heap memory consumption of IgniteSet collections (all 
values are stored in durable memory AND in the java heap).





[jira] [Created] (IGNITE-7478) Too many HistoryAffinityAssignments in HistAffAssignmentsCache

2018-01-19 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7478:


 Summary: Too many HistoryAffinityAssignments in 
HistAffAssignmentsCache
 Key: IGNITE-7478
 URL: https://issues.apache.org/jira/browse/IGNITE-7478
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.4
Reporter: Alexander Belyak


Got trouble with GC: found over 26k instances of 
org.apache.ignite.internal.processors.affinity.HistoryAffinityAssignment, with 
about 12GB of ArrayList->Object[]->ArrayList->Object[], but can't find the 
ClusterNode objects there!





[jira] [Created] (IGNITE-7448) destroy*() API for datastructures

2018-01-17 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7448:


 Summary: destroy*() API for datastructures
 Key: IGNITE-7448
 URL: https://issues.apache.org/jira/browse/IGNITE-7448
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.3, 2.2, 2.1, 2.0, 1.9, 1.8, 1.7, 1.6
 Environment: In the public API we have ignite.destroyCache(String) and 
ignite.destroyCaches(Collection) methods to destroy cache(s) by name, 
and ignite.services().cancel(String) to undeploy services, but no methods for 
data structures like:
 * destroyAtomicSequence()
 * destroyAtomicLong()
 * destroyAtomicReference()
 * destroyAtomicStamped()
 * destroyQueue()
 * destroySet()
 * destroySemaphore()
 * destroyCountDownLatch()
Reporter: Alexander Belyak








[jira] [Created] (IGNITE-7385) Fix GridToStringBuilder

2018-01-11 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7385:


 Summary: Fix GridToStringBuilder
 Key: IGNITE-7385
 URL: https://issues.apache.org/jira/browse/IGNITE-7385
 Project: Ignite
  Issue Type: Bug
Reporter: Alexander Belyak
Assignee: Semen Boikov


Need to review and merge ignite-7195-hotfix.





[jira] [Created] (IGNITE-7246) MarshallerContextImpl.putAtIndex

2017-12-19 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7246:


 Summary: MarshallerContextImpl.putAtIndex
 Key: IGNITE-7246
 URL: https://issues.apache.org/jira/browse/IGNITE-7246
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.4
Reporter: Alexander Belyak
Priority: Minor


1) putAtIndex in org.apache.ignite.internal.MarshallerContextImpl contains code 
for unordered insertion, but it doesn't work (it only appends to the tail of the 
allCaches collection). Test (the generic type parameters were stripped from the 
original message; raw types are used below):
{panel}
public static void main(String[] args) {
    ArrayList<ConcurrentMap> all = new ArrayList<>();
    ConcurrentMap m0 = new ConcurrentHashMap<>();
    ConcurrentMap m1 = new ConcurrentHashMap<>();
    putAtIndex(m1, all, (byte)1, all.size());
    putAtIndex(m0, all, (byte)0, all.size());
    System.out.println(all.get(0) == m0);
    System.out.println(all.get(1) == m1);
    System.out.println(all.size());
}
{panel}
2) The Collection interface is unordered (javadoc: "Some are ordered and others 
unordered"), so it's better to use the List interface;
3) putAtIndex is called only from the getCacheFor(byte) method inside a 
synchronized block, so it can get the size of allCaches by itself.
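For point 1, indexed insertion into a growable list is what the method presumably intended. The sketch below is a simplified stand-in with an illustrative signature, not the real MarshallerContextImpl method:

```java
import java.util.ArrayList;
import java.util.List;

public class PutAtIndexDemo {
    /** Puts an item at the given index, padding the list with nulls if it is too short. */
    public static <T> void putAtIndex(T item, List<T> list, int index) {
        while (list.size() <= index)
            list.add(null); // grow so that set(index, ...) cannot throw
        list.set(index, item);
    }

    public static void main(String[] args) {
        List<String> all = new ArrayList<>();
        // Same insertion order as the reporter's test: index 1 first, then index 0.
        putAtIndex("m1", all, 1);
        putAtIndex("m0", all, 0);
        System.out.println(all); // prints [m0, m1]
    }
}
```

With this semantics the reporter's assertions (get(0) == m0, get(1) == m1) would hold regardless of insertion order.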





[jira] [Created] (IGNITE-7146) Assertion in GridCacheTxFinishSync

2017-12-07 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7146:


 Summary: Assertion in GridCacheTxFinishSync
 Key: IGNITE-7146
 URL: https://issues.apache.org/jira/browse/IGNITE-7146
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.1
Reporter: Alexander Belyak


Got an assertion error in a clean log:
{noformat}
2017-12-07 17:24:10.358 [ERROR][sys-#2376%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.GridClosureProcessor] Closure execution failed with error.
java.lang.AssertionError: null
	at org.apache.ignite.internal.processors.cache.distributed.GridCacheTxFinishSync$TxFinishSync.onSend(GridCacheTxFinishSync.java:250)
	at org.apache.ignite.internal.processors.cache.distributed.GridCacheTxFinishSync$ThreadFinishSync.onSend(GridCacheTxFinishSync.java:163)
	at org.apache.ignite.internal.processors.cache.distributed.GridCacheTxFinishSync.onFinishSend(GridCacheTxFinishSync.java:70)
	at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager.beforeFinishRemote(IgniteTxManager.java:1522)
	at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:750)
	at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:690)
	at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:430)
	at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3314)
	at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.access$4900(GridNearTxLocal.java:122)
	at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal$26.run(GridNearTxLocal.java:4130)
	at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6685)
	at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2017-12-07 17:24:10.358 [ERROR][sys-#2376%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.GridClosureProcessor] Runtime error caught during grid runnable execution: GridWorker [name=closure-proc-worker, igniteInstanceName=DPL_GRID%DplGridNodeName, finished=false, hashCode=1220995949, interrupted=false, runner=sys-#2376%DPL_GRID%DplGridNodeName%]
{noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-7130) Simplify message related code in communication

2017-12-06 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7130:


 Summary: Simplify message related code in communication
 Key: IGNITE-7130
 URL: https://issues.apache.org/jira/browse/IGNITE-7130
 Project: Ignite
  Issue Type: Improvement
  Components: general
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Minor


All code auto-generated in 
org.apache.ignite.plugin.extensions.communication.Message implementations by 
MessageCodeGenerator should be annotated with a link to the generator.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-7078) *Names() API for datastructures

2017-11-30 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7078:


 Summary: *Names() API for datastructures
 Key: IGNITE-7078
 URL: https://issues.apache.org/jira/browse/IGNITE-7078
 Project: Ignite
  Issue Type: Wish
  Components: general
Affects Versions: 2.3, 2.2, 2.1, 2.0, 1.9, 1.8, 1.7, 1.6
Reporter: Alexander Belyak


The public API has the ignite.cacheNames() method to get all cache names and 
ignite.services().serviceDescriptors() to get services, but no such methods for 
data structures, like:
* atomicSequenceNames()
* atomicLongNames()
* atomicReferenceNames()
* atomicStampedNames()
* queueNames()
* setNames()
* semaphoreNames()
* countDownLatchNames()




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-7076) NPE while stopping with GridDhtLockFuture

2017-11-29 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-7076:


 Summary: NPE while stopping with GridDhtLockFuture
 Key: IGNITE-7076
 URL: https://issues.apache.org/jira/browse/IGNITE-7076
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Minor


Got an NPE after the "Stopped cache" message:
{noformat}
2017-11-29 08:18:20.994 
[ERROR][grid-timeout-worker-#119%DPL_GRID%DplGridNodeName%][o.a.i.i.p.t.GridTimeoutProcessor]
 Error when executing timeout callback: LockTimeoutObject []

java.lang.NullPointerException: null
at 
org.apache.ignite.internal.processors.cache.GridCacheContext.loadPreviousValue(GridCacheContext.java:1446)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.loadMissingFromStore(GridDhtLockFuture.java:1030)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:731)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$900(GridDhtLockFuture.java:82)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1133)
at 
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:163)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:748)
{noformat}
because GridCacheContext.java:1446 is trying to read from the cacheCfg local 
variable, but cacheCfg was zeroed out while the cache was stopping.
The probability of this error would be significantly lowered if, in 
GridDhtLockFuture.LockTimeoutObject.onTimeout, we passed the actual value of the 
nodeStopping flag (GridDhtLockFuture:1133) instead of a hardcoded false.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6977) Wrong initial BitSet size in GridPartitionStateMap

2017-11-21 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6977:


 Summary: Wrong initial BitSet size in GridPartitionStateMap
 Key: IGNITE-6977
 URL: https://issues.apache.org/jira/browse/IGNITE-6977
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.1
Reporter: Alexander Belyak


In the constructor of org.apache.ignite.internal.util.GridPartitionStateMap:
{panel}
GridPartitionStateMap(int parts) {
    states = new BitSet(parts);
}
{panel}
we initialize the BitSet with parts bits, but we use BITS (a private static 
final int) bits for each partition state. As a result, the long[] inside the 
BitSet gets a hard-to-predict size: depending on the access order it can be 
exactly as needed, or almost twice as big, with at least one additional array 
copy.
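The sizing mismatch is easy to reproduce with a plain java.util.BitSet. BITS = 3 here is an illustrative value, not necessarily the Ignite constant; the point is only that a capacity of parts bits is not parts * BITS bits:

```java
import java.util.BitSet;

public class BitSetSizingDemo {
    public static void main(String[] args) {
        int parts = 1024;
        int BITS = 3;   // bits per partition state; illustrative value

        BitSet tight = new BitSet(parts);           // what the constructor does today
        System.out.println(tight.size());           // 1024: room for 1 bit per partition

        tight.set(parts * BITS - 1);                // touching the last state bit forces a grow + array copy
        System.out.println(tight.size());           // 3072 after reallocation

        BitSet sized = new BitSet(parts * BITS);    // sized up front: no reallocation later
        System.out.println(sized.size());           // 3072
    }
}
```

Passing parts * BITS to the constructor gives the long[] its final size immediately and avoids the copy.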



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6962) Reduce ExchangeHistory memory consumption

2017-11-20 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6962:


 Summary: Reduce ExchangeHistory memory consumption
 Key: IGNITE-6962
 URL: https://issues.apache.org/jira/browse/IGNITE-6962
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.1
Reporter: Alexander Belyak


GridDhtPartitionExchangeManager$ExchangeFutureSet stores the huge 
GridDhtPartitionsFullMessage, with an IgniteDhtPartitionCountersMap2 for each 
cache group containing two long[partCount] arrays. On a big grid (100+ nodes) 
with a large number of cache groups and partitions, the 
CachePartitionFullCountersMap (long[] initialUpdCntrs; long[] updCntrs;) 
consumes a lot of memory.

[jira] [Created] (IGNITE-6958) Reduce FilePageStore allocation on start

2017-11-20 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6958:


 Summary: Reduce FilePageStore allocation on start
 Key: IGNITE-6958
 URL: https://issues.apache.org/jira/browse/IGNITE-6958
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.1
Reporter: Alexander Belyak


On cache start, Ignite creates a FilePageStore for every partition in the 
CacheGroup, even for partitions that are never assigned to the particular node. 
See the FilePageStoreManager.initForCache method.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6898) Datastreamers can lead to OOM on server side

2017-11-14 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6898:


 Summary: Datastreamers can lead to OOM on server side
 Key: IGNITE-6898
 URL: https://issues.apache.org/jira/browse/IGNITE-6898
 Project: Ignite
  Issue Type: Bug
  Security Level: Public (Viewable by anyone)
  Components: general
Affects Versions: 2.1
Reporter: Alexander Belyak


If a grid server node processes many datastreamers at the same time (from many 
clients, with many cache backups and persistence, i.e. when processing takes 
some time), it can lead to an OutOfMemoryError in the server JVM. To fix this 
we can:
1) specify buffer sizes in bytes instead of entries
2) use pageMemory to store streamer buffers
I hit this problem on a 16-server-node grid with a 45g heap each and 15 clients 
with 2 datastreamers each, with these settings:
autoFlushFrequency=0
allowOverwrite=false
perNodeParallelOperations=8
perNodeBufferSize=1
Each client has a 64g heap.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6866) Allocate offheap on client

2017-11-10 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6866:


 Summary: Allocate offheap on client
 Key: IGNITE-6866
 URL: https://issues.apache.org/jira/browse/IGNITE-6866
 Project: Ignite
  Issue Type: Bug
  Security Level: Public (Viewable by anyone)
  Components: general
Affects Versions: 2.1
Reporter: Alexander Belyak


Often a client uses the same config file as a server, and Ignite starts offheap 
memory for the client too... but never uses it. How it happens:
1) The default memory configuration for a server is created in the 
IgnitionEx.initializeConfiguration() method:
if (!myCfg.isClientMode() && myCfg.getMemoryConfiguration() == null) 
so if the Ignite configuration already contains a memoryConfiguration, it stays 
there.
2) The IgniteCacheDatabaseSharedManager activation method does nothing but:
if (cctx.kernalContext().clientNode() && 
cctx.kernalContext().config().getMemoryConfiguration() == null)
return;
So if the Ignite configuration contains a memory configuration, it will be 
allocated. Why this is bad:
1) Memory allocation spends virtual memory (the OS doesn't really allocate 
memory before the first access to it), and if the overcommit_memory strategy is 
set to OVERCOMMIT_NEVER it can block the start of a client node (maybe the 
first or second one) on the same host (see /proc/sys/vm/overcommit_memory and 
/proc/sys/vm/overcommit_ratio).
2) In IgniteKernal.checkPhysicalRam() we use the maxSize of offheap memory and 
log a warning about memory overuse.
The only good news: often in the memory configuration only maxSize is really 
big, while initialSize is just about 256Mb, so each client doesn't actually 
allocate that much RAM.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6832) handle IO errors while checkpointing

2017-11-06 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6832:


 Summary: handle IO errors while checkpointing
 Key: IGNITE-6832
 URL: https://issues.apache.org/jira/browse/IGNITE-6832
 Project: Ignite
  Issue Type: Bug
  Security Level: Public (Viewable by anyone)
Affects Versions: 2.1
Reporter: Alexander Belyak


If we get an IO error (like "No space left on device") during checkpointing 
(GridCacheDatabaseSharedManager$WriteCheckpointPages:2509), the node doesn't 
stop, as it does when the same error happens while writing the WAL, and clients 
get "Long running cache futures". We must stop the node in this case! Better 
yet, add some internal healthcheck and stop the node anyway if it doesn't pass 
a few times (to be done in a separate issue).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6825) Unhandled interruption in GridH2Table

2017-11-03 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6825:


 Summary: Unhandled interruption in GridH2Table
 Key: IGNITE-6825
 URL: https://issues.apache.org/jira/browse/IGNITE-6825
 Project: Ignite
  Issue Type: Bug
  Security Level: Public (Viewable by anyone)
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Blocker


In the GridH2Table.lock(ses, excl, force) method we:
1) put the session in the sessions table;
2) add the lock to the H2 session locks;
3) try to lock(excl); but if at GridH2Table.lock(excl):277, while the thread is 
in lock.lockInterruptibly(), it gets interrupted, the session with the lock 
stays alive in the GridH2Table sessions map although no lock was really 
acquired. When the session later tries to unlock all acquired locks, it will 
try to unlock this one too, and we get the exception:
{noformat}
[ERROR][pub-#3855%DPL_GRID%DplGridNodeName%][o.a.i.i.p.q.h.t.GridMapQueryExecutor]
 Failed to run map query on local node.
 
org.apache.ignite.IgniteCheckedException: Failed to execute SQL query.
at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQuery(IgniteH2Indexing.java:970)
at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQueryWithTimer(IgniteH2Indexing.java:1029)
at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQueryWithTimer(IgniteH2Indexing.java:1008)
at 
org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onQueryRequest0(GridMapQueryExecutor.java:660)
at 
org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onQueryRequest(GridMapQueryExecutor.java:506)
at 
org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onMessage(GridMapQueryExecutor.java:206)
at 
org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor$1.applyx(GridReduceQueryExecutor.java:145)
at 
org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor$1.applyx(GridReduceQueryExecutor.java:143)
at 
org.apache.ignite.internal.util.lang.IgniteInClosure2X.apply(IgniteInClosure2X.java:38)
at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.send(IgniteH2Indexing.java:2066)
at 
org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.send(GridReduceQueryExecutor.java:1273)
at 
org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.query(GridReduceQueryExecutor.java:733)
at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$8.iterator(IgniteH2Indexing.java:1214)
at 
org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:95)
at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$9.iterator(IgniteH2Indexing.java:1256)
at 
org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:95)
at 
com.sbt.dpl.gridgain.collection.dataselectors.executor.cachequery.IgniteCacheQueryExecutor.iterator(IgniteCacheQueryExecutor.java:131)
at 
com.sbt.dpl.gridgain.collection.dataselectors.executor.cachequery.impl.SqlQueryExecutor.iterator(SqlQueryExecutor.java:58)
at 
com.sbt.dpl.gridgain.collection.dataselectors.executor.cachequery.impl.SqlQueryExecutor.iterator(SqlQueryExecutor.java:23)
at 
com.sbt.dpl.gridgain.collection.dataselectors.impl.H2IndexesDataSelector.binaryIterator(H2IndexesDataSelector.java:142)
at 
com.sbt.dpl.gridgain.collection.dataselectors.AbstractDataSelector.getIterator(AbstractDataSelector.java:110)
at 
com.sbt.dpl.gridgain.collection.dataselectors.IndexesSwitchSelectDataSelector.getIterator(IndexesSwitchSelectDataSelector.java:106)
at 
com.sbt.dpl.gridgain.collection.base.GGAbstractCollectionWithDataSelector.iterator(GGAbstractCollectionWithDataSelector.java:390)
at 
ru.sbt.deposit_pf_api.comparators.EntityService.findDepositByProduct(EntityService.java:846)
at 
ru.sbt.deposit_pf_api.comparators.EntityService.findDepositByProduct(EntityService.java:807)
at 
ru.sbt.deposit_pf_api.comparators.EntityService.getDepositByObjectInner(EntityService.java:1350)
at 
ru.sbt.deposit_pf_api.comparators.EntityService.getDepositByObject(EntityService.java:1169)
at 
ru.sbt.deposit_pf_api.comparators.EntityService.getGroupingObject(EntityService.java:1098)
at 
ru.sbt.deposit_pf_api.comparators.UnknownClassMapFunction$FindUnknownMapFunctionPredicate.apply(UnknownClassMapFunction.java:183)
at 
ru.sbt.deposit_pf_api.comparators.UnknownClassMapFunction$FindUnknownMapFunctionPredicate.apply(UnknownClassMapFunction.java:1)
at ru.sbt.deposit_pf_api.CollectionUtils.filter(CollectionUtils.java:55)
at 

[jira] [Created] (IGNITE-6817) CME in GridCacheIoManager.cacheHandlers access

2017-11-02 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6817:


 Summary: CME in GridCacheIoManager.cacheHandlers access
 Key: IGNITE-6817
 URL: https://issues.apache.org/jira/browse/IGNITE-6817
 Project: Ignite
  Issue Type: Bug
  Security Level: Public (Viewable by anyone)
  Components: general
Affects Versions: 2.1
Reporter: Alexander Belyak


Got exception:
{noformat}
java.util.ConcurrentModificationException: null

at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)

at java.util.HashMap$EntryIterator.next(HashMap.java:1471)

at java.util.HashMap$EntryIterator.next(HashMap.java:1469)

at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:355)

   at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)

at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)

at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)

at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1562)

at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1190)

at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)

at 
org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1097)

at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)

at java.lang.Thread.run(Thread.java:748)
{noformat}
because in GridCacheIoManager.handleMessage access to 
GridCacheIoManager.cacheHandlers is protected by GridCacheIoManager.rw.readLock, 
but in GridCacheIoManager.addHandler the same collection is modified without 
acquiring rw.writeLock, and idxClsHandlers is just a HashMap in the 
GridCacheIoManager.MessageHandlers class.
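The fail-fast behavior behind this trace is the plain HashMap contract. A standalone illustration (not Ignite code): a structural modification during iteration throws even in a single thread, so an unguarded writer racing with a readLock-only iterator can produce exactly this exception.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class HashMapCmeDemo {
    public static void main(String[] args) {
        Map<Integer, String> handlers = new HashMap<>();
        handlers.put(1, "handlerA");
        handlers.put(2, "handlerB");

        try {
            // The handleMessage side: iterating the handler map...
            for (Map.Entry<Integer, String> e : handlers.entrySet()) {
                // ...while the addHandler side performs an unguarded structural
                // modification. HashMap fails fast even in a single thread:
                handlers.put(3, "handlerC");
            }
        }
        catch (ConcurrentModificationException e) {
            System.out.println("ConcurrentModificationException, as in the trace above");
        }
    }
}
```

Taking rw.writeLock in addHandler (or using a concurrent map) removes the race.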



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6798) Ignite start without WAL with no exceptions

2017-10-31 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6798:


 Summary: Ignite start without WAL with no exceptions
 Key: IGNITE-6798
 URL: https://issues.apache.org/jira/browse/IGNITE-6798
 Project: Ignite
  Issue Type: Bug
  Security Level: Public (Viewable by anyone)
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Critical


Ignite starts without any WAL log files. Steps to reproduce:
1) Start a node with persistence (WAL_MODE != NONE)
2) Create a cache with some data
3) Stop the node
4) Delete the WAL
5) Start the node
Expected:
If the last checkpoint was finished: start, with an error in the log.
If the last checkpoint wasn't finished: the LFS can be corrupted, so maybe we 
shouldn't start at all (with a message like "if you really want to start with a 
possibly corrupt database, just remove the last CP_start marker").
Actual:
Starts without any errors/warnings.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6797) Handle IO errors in LFS files

2017-10-30 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6797:


 Summary: Handle IO errors in LFS files
 Key: IGNITE-6797
 URL: https://issues.apache.org/jira/browse/IGNITE-6797
 Project: Ignite
  Issue Type: Bug
  Security Level: Public (Viewable by anyone)
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Minor


If a thread is interrupted during an IO operation on an LFS file (for example, 
a page read), the JVM closes the FileChannel of that file and marks it as 
closed by interrupt. If the next thread tries to load any page from the closed 
file, it gets a ClosedChannelException, but PageMemoryImpl has already 
registered the page in the segment's FullPageIdTable (loadedPages) and doesn't 
clear it after the IO error, so a third thread will find an empty page there 
and throw an "Unknown page type: 0" IgniteCheckedException.
To fix it, we should try to restore the FileChannel after a 
ClosedChannelException (the first time) and stop the node if we get any other 
exception, or any error while reopening after the ClosedChannelException, in 
FilePageStore.
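The JDK mechanics behind "closed by interrupt" can be shown with plain NIO (no Ignite code): if a thread's interrupt status is set when it invokes an interruptible FileChannel operation, the channel is closed and stays closed for every later caller.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class InterruptClosesChannelDemo {
    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("page", ".bin");
        Files.write(f, new byte[4096]);

        FileChannel ch = FileChannel.open(f, StandardOpenOption.READ);

        Thread.currentThread().interrupt();        // simulate the interrupted IO thread
        try {
            ch.read(ByteBuffer.allocate(64));      // interruptible channel op: closes the channel
        }
        catch (ClosedByInterruptException e) {
            System.out.println("channel closed by interrupt");
        }
        Thread.interrupted();                      // clear the interrupt flag

        try {
            ch.read(ByteBuffer.allocate(64));      // the "second thread": channel stays closed
        }
        catch (ClosedChannelException e) {
            System.out.println("subsequent reads fail with " + e.getClass().getSimpleName());
        }
        Files.delete(f);
    }
}
```

This is why reopening the FileChannel (or stopping the node) is the only way out once any one reader has been interrupted.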
Read from closed channel exception:
{noformat}
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.google.common.eventbus.EventSubscriber.handleEvent(EventSubscriber.java:74)
at com.google.common.eventbus.EventBus.dispatch(EventBus.java:322)
at 
com.google.common.eventbus.AsyncEventBus.access$001(AsyncEventBus.java:34)
at 
com.google.common.eventbus.AsyncEventBus$1.run(AsyncEventBus.java:117)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.ignite.IgniteCheckedException: Runtime failure on lookup 
row: 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$SearchRow@5678e76a
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findOne(BPlusTree.java:1070)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.find(IgniteCacheOffheapManagerImpl.java:1476)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.find(GridCacheOffheapManager.java:1276)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.read(IgniteCacheOffheapManagerImpl.java:406)
at 
org.apache.ignite.internal.processors.cache.GridCacheAdapter.getAllAsync0(GridCacheAdapter.java:1902)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.getDhtAllAsync(GridDhtCacheAdapter.java:780)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.getAsync(GridDhtGetSingleFuture.java:360)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.map0(GridDhtGetSingleFuture.java:254)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.map(GridDhtGetSingleFuture.java:237)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.init(GridDhtGetSingleFuture.java:161)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.getDhtSingleAsync(GridDhtCacheAdapter.java:878)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.processNearSingleGetRequest(GridDhtCacheAdapter.java:892)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$2.apply(GridDhtTransactionalCacheAdapter.java:131)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$2.apply(GridDhtTransactionalCacheAdapter.java:129)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1562)
at 

[jira] [Created] (IGNITE-6759) URL not using in http rest API

2017-10-26 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6759:


 Summary: URL not using in http rest API
 Key: IGNITE-6759
 URL: https://issues.apache.org/jira/browse/IGNITE-6759
 Project: Ignite
  Issue Type: Bug
  Security Level: Public (Viewable by anyone)
  Components: general
Affects Versions: 2.2, 2.1, 2.0
Reporter: Alexander Belyak
 Fix For: 3.0


In the http REST API I can send:

curl "http://localhost:8080/ignite?cmd=get"

{"successStatus":1,"sessionToken":null,"error":"Failed to handle request: 
[req=CACHE_GET, err=Failed to find mandatory parameter in request: 
key]","response":null}

and 

curl "http://localhost:8080/ignite2/2/2/2/2/?cmd=get"

{"successStatus":1,"sessionToken":null,"error":"Failed to handle request: 
[req=CACHE_GET, err=Failed to find mandatory parameter in request: 
key]","response":null}

With the same result, i.e. we don't validate the whole request URL (only the 
/ignite prefix is mandatory). Btw, it's a REST antipattern to use a single URL 
for everything (set Ignite version 3.0 as the fix version to be able to change 
the API): http://www.restapitutorial.com/lessons/restfulresourcenaming.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6758) Slow memory releasing while deactivation

2017-10-25 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6758:


 Summary: Slow memory releasing while deactivation
 Key: IGNITE-6758
 URL: https://issues.apache.org/jira/browse/IGNITE-6758
 Project: Ignite
  Issue Type: Bug
  Security Level: Public (Viewable by anyone)
  Components: general
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Minor
 Fix For: 2.4


It takes about 1 minute to fill each page with 0 and release it to the page 
pool in PageMemoryImpl.ClearSegmentRunnable() from 
GridCacheDatabaseSharedManager.onCacheGroupsStopped(). When we have 100M+ pages 
in hundreds of Gb of pageCache, GridUnsafe.setMemory to 0 takes quite long, and 
in the logs we get a lot of "Failed to wait for partition map exchange 
[topVer=AffinityTopologyVersion [topVer=56, minorTopVer=1], 
node=3676f020-0bf0-4145-861e-689c96d7e853]. Dumping pending objects 
that might be the cause: " without any cause or progress indicator. So a full 
grid reboot takes a longer downtime, with unnecessary warnings. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6750) Return "wrong command" error in http rest api

2017-10-25 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6750:


 Summary: Return "wrong command" error in http rest api
 Key: IGNITE-6750
 URL: https://issues.apache.org/jira/browse/IGNITE-6750
 Project: Ignite
  Issue Type: Bug
  Security Level: Public (Viewable by anyone)
  Components: general
Affects Versions: 2.2, 2.1, 2.0, 1.9
Reporter: Alexander Belyak
Priority: Minor
 Fix For: 2.4


If I make a mistake in the command name, for example

curl "http://localhost:8080/ignite?cmd=wrongcmd"



I get no error message, and nothing is logged in the Ignite log (even in 
IGNITE_QUIET=false mode); only by getting the response code

curl -I "http://localhost:8080/ignite?cmd=wrongcmd"

HTTP/1.1 400 Bad Request
Date: Wed, 25 Oct 2017 10:03:06 GMT
Content-Type: application/json; charset=UTF-8
Content-Length: 0
Server: Jetty(9.2.11.v20150529)

can I see something, but without the root cause.
We need to:
1) return the error text

curl "http://localhost:8080/ignite?cmd=wrongcmd"

{"successStatus":1,"sessionToken":null,"error":"Failed to handle request: 
[req=UNKNOWN, err=Failed to find command: wrongcmd]","response":null}

 as usual:

curl "http://localhost:8080/ignite?cmd=get"

{"successStatus":1,"sessionToken":null,"error":"Failed to handle request: 
[req=CACHE_GET, err=Failed to find mandatory parameter in request: 
key]","response":null}

2)  set status code in http response to 400 ( 
http://www.restapitutorial.com/httpstatuscodes.html )



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6749) Illegal comparsion in NodeOrderComparator

2017-10-25 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6749:


 Summary: Illegal comparsion in NodeOrderComparator
 Key: IGNITE-6749
 URL: https://issues.apache.org/jira/browse/IGNITE-6749
 Project: Ignite
  Issue Type: Bug
  Security Level: Public (Viewable by anyone)
  Components: general
Affects Versions: 2.1
Reporter: Alexander Belyak
 Fix For: 2.4


In the org.apache.ignite.internal.cluster.NodeOrderComparator.compare method, the code
{panel}
Object consId1 = n1.consistentId();
Object consId2 = n2.consistentId();

if (consId1 instanceof Comparable && consId2 instanceof Comparable) {
return ((Comparable)consId1).compareTo(consId2);
}
{panel}
checks only that consId1 and consId2 are Comparable, but they may not be 
Comparable to each other. For example, String and UUID are both Comparable, but 
UUID.compareTo(String) throws a ClassCastException.
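A standalone demonstration (plain JDK, with hypothetical consistent-ID values): both objects pass the instanceof Comparable guard, yet the raw compareTo call fails at runtime.

```java
import java.util.UUID;

public class RawComparableDemo {
    @SuppressWarnings({"rawtypes", "unchecked"})
    public static void main(String[] args) {
        Object consId1 = UUID.randomUUID();   // hypothetical consistent ID: Comparable<UUID>
        Object consId2 = "node-1";            // hypothetical consistent ID: Comparable<String>

        // The guard passes: both values are Comparable...
        System.out.println(consId1 instanceof Comparable && consId2 instanceof Comparable); // true

        try {
            // ...but the raw compareTo crosses type families and fails at runtime.
            ((Comparable)consId1).compareTo(consId2);
        }
        catch (ClassCastException e) {
            System.out.println("ClassCastException: UUID is not comparable to String");
        }
    }
}
```

A safer guard would also require the two consistent IDs to be of the same class before comparing.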



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6616) WebConsole cache config parse

2017-10-12 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6616:


 Summary: WebConsole cache config parse
 Key: IGNITE-6616
 URL: https://issues.apache.org/jira/browse/IGNITE-6616
 Project: Ignite
  Issue Type: Bug
  Components: wizards
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Minor
 Fix For: 2.4


1) Go to /monitoring/dashboard
2) Press the Start cache button
3) Add  (without quotes in the value)
4) Press the Start button
Expected result: a warning about the xml format
Actual result: an "Are you sure you want to start cache with name: ?" 
message



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6606) Web console agent download

2017-10-12 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6606:


 Summary: Web console agent download
 Key: IGNITE-6606
 URL: https://issues.apache.org/jira/browse/IGNITE-6606
 Project: Ignite
  Issue Type: Improvement
  Components: wizards
Affects Versions: 2.1
Reporter: Alexander Belyak
Assignee: Alexey Kuznetsov
Priority: Minor
 Fix For: 2.4


To connect the web console to an Ignite cluster I must use the web agent, but 
at first it's not obvious where to get it.
1) The documentation ( https://apacheignite-tools.readme.io/docs/getting-started ) 
says "Ignite Web Agent zip ships with ignite-web-agent.{sh|bat} script". That's 
wrong.
2) On the web console cluster configure screen I see big red buttons "Save 
project" and "Save and download projects", but to download the web agent I must 
find a small "Download agent" link at the bottom (near the Feedback and Apache 
Ignite logo, which is the wrong place). Moreover, the agent configuration 
contains one parameter from the cluster configuration (IGNITE_JETTY_PORT), so 
the download link should be cluster-wide, not web-console-wide.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6604) Log exchange progress

2017-10-12 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6604:


 Summary: Log exchange progress
 Key: IGNITE-6604
 URL: https://issues.apache.org/jira/browse/IGNITE-6604
 Project: Ignite
  Issue Type: Improvement
  Components: general
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Minor


Sometimes the exchange process hangs (because of errors, OOMs, deadlocks, 
etc.), and sometimes it requires significant time to finish (finishing 
eviction, a long Full GC, etc.).
We need some logging that shows progress, because the exchange often blocks the 
whole cluster, and the support team wants to know what is happening and how 
long it will continue. The main point is to simplify troubleshooting, e.g. by 
grepping a standard message/logging class, for example: "Exchange progress: 
evicting partition " or "Exchange progress: waiting for response from  nodes".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6578) Too many diagnostic: Found long running cache future

2017-10-09 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6578:


 Summary: Too many diagnostic: Found long running cache future
 Key: IGNITE-6578
 URL: https://issues.apache.org/jira/browse/IGNITE-6578
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Critical


Got about 100Mb of the message:
 [WARN][grid-timeout-worker-...][o.apache.ignite.internal.diagnostic] 
Found long running cache future 
Several identical messages per millisecond! Logs can be lost to rotation, and 
can't be read without prefiltering!





[jira] [Created] (IGNITE-6559) Wrong JMX CacheClusterMetrics

2017-10-04 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6559:


 Summary: Wrong JMX CacheClusterMetrics
 Key: IGNITE-6559
 URL: https://issues.apache.org/jira/browse/IGNITE-6559
 Project: Ignite
  Issue Type: Bug
Reporter: Alexander Belyak


In the JMX bean 
org.apache.ignite.internal.processors.cache.CacheClusterMetrics
 I see:
1) the same values as in CacheLocalMetrics: same Size and KeySize (cluster metrics 
should represent cluster-wide numbers, right?)
2) zero in CacheClusterMetrics.TotalPartitionsCount (it should contain the real 
partition count in the cluster) and cacheConfiguration.partitions in 
CacheLocalMetrics.TotalPartitionsCount (it should contain the real partition count 
owned by the local node)
3) zero in all rebalancing* keys in CacheClusterMetrics






[jira] [Created] (IGNITE-6544) Can't switch WalMode from LOG_ONLY/BACKGROUND to DEFAULT

2017-10-02 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6544:


 Summary: Can't switch WalMode from LOG_ONLY/BACKGROUND to DEFAULT
 Key: IGNITE-6544
 URL: https://issues.apache.org/jira/browse/IGNITE-6544
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.1
Reporter: Alexander Belyak
 Fix For: 2.4


To reproduce:
1) Start Ignite with persistence in LOG_ONLY/BACKGROUND WAL mode
2) Stop it and restart in DEFAULT WAL mode
The exception is:
{noformat}
Exception in thread "main" class org.apache.ignite.IgniteException: Failed to 
start processor: GridProcessorAdapter []
at 
org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:966)
at org.apache.ignite.Ignition.start(Ignition.java:325)
at 
org.apache.ignite.examples.datagrid.CacheApiExample.main(CacheApiExample.java:59)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to start 
processor: GridProcessorAdapter []
at 
org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1813)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:931)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1904)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1646)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1074)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:594)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:519)
at org.apache.ignite.Ignition.start(Ignition.java:322)
... 1 more
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to initialize 
WAL log segment (WAL segment size change is not 
supported):/tmp/s1/wal/0_0_0_0_0_0_0_1_lo_10_0_3_1_10_42_1_107_127_0_0_1_172_17_0_1_47500/.wal
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.checkFiles(FileWriteAheadLogManager.java:1420)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.checkOrPrepareFiles(FileWriteAheadLogManager.java:934)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.start0(FileWriteAheadLogManager.java:274)
at 
org.apache.ignite.internal.processors.cache.GridCacheSharedManagerAdapter.start(GridCacheSharedManagerAdapter.java:61)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.start(GridCacheProcessor.java:614)
at 
org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1810)
... 8 more
{noformat}





[jira] [Created] (IGNITE-6477) Add cache index metric to represent index size

2017-09-22 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6477:


 Summary: Add cache index metric to represent index size
 Key: IGNITE-6477
 URL: https://issues.apache.org/jira/browse/IGNITE-6477
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.1, 2.0, 1.9, 1.8
Reporter: Alexander Belyak
Priority: Minor
 Fix For: 2.2


Currently we can't estimate the space used by a particular cache index. Let's add a metric for it!





[jira] [Created] (IGNITE-6451) AssertionError: null in GridCacheIoManager.onSend on stop

2017-09-20 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6451:


 Summary: AssertionError: null in GridCacheIoManager.onSend on stop
 Key: IGNITE-6451
 URL: https://issues.apache.org/jira/browse/IGNITE-6451
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 1.8
Reporter: Alexander Belyak
Priority: Minor


If we stop a node while a message is being sent (after GridCacheIoManager.onSend checks 
whether the grid is stopping), we get an AssertionError, for example:
{noformat}
java.lang.AssertionError: null
at 
org.apache.ignite.internal.processors.cache.GridCacheMessage.marshalCollection(GridCacheMessage.java:481)
 ~[ignite-core-1.10.3.ea15-SNAPSHOT.jar:2.0.0-SNAPSHOT]
at 
org.apache.ignite.internal.processors.cache.query.GridCacheQueryResponse.prepareMarshal(GridCacheQueryResponse.java:134)
 ~[ignite-core-1.10.3.ea15-SNAPSHOT.jar:2.0.0-SNAPSHOT]
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onSend(GridCacheIoManager.java:917)
 [ignite-core-1.10.3.ea15-SNAPSHOT.jar:2.0.0-SNAPSHOT]
{noformat}
I think we need a more reliable approach to stopping the grid: ideally, we should stop all 
activity as the first step of shutdown and move to the next step only after that. Or we 
could add checks throughout the code, e.g. after each
{noformat}
cctx = ctx.getCacheContext(cacheId)
{noformat}
do
{noformat}
if (cctx == null && ...kernalContext().isStopping())
    return false; // <= handle parallel stop here to correctly cancel the operation
{noformat}
I think this is important because nobody can trust a database with assertion errors in its logs!





[jira] [Created] (IGNITE-6062) IllegalArgumentException thrown while getHeapMemoryUsage()

2017-08-15 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-6062:


 Summary: IllegalArgumentException thrown while getHeapMemoryUsage()
 Key: IGNITE-6062
 URL: https://issues.apache.org/jira/browse/IGNITE-6062
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 1.8
Reporter: Alexander Belyak
 Fix For: 1.8


In org.apache.ignite.internal.managers.discovery.GridDiscoveryManager
we can't simply use getHeapMemoryUsage():
{noformat}
private static final MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
mem.getHeapMemoryUsage().getCommitted();
{noformat}
because of 
https://bugs.openjdk.java.net/browse/JDK-6870537
It should be wrapped somehow to catch IllegalArgumentException.
We also need to check the whole codebase and use the wrapped version of the 
getHeapMemoryUsage() method everywhere.
In version 2.1 this is already fixed.
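A minimal sketch of such a wrapper (the helper name and the -1 fallback are illustrative assumptions, not the actual Ignite fix):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class SafeHeapUsage {
    private static final MemoryMXBean MEM = ManagementFactory.getMemoryMXBean();

    /**
     * Returns the committed heap size in bytes, or -1 when the MXBean throws
     * IllegalArgumentException because of JDK-6870537 (a transient inconsistent
     * snapshot where committed < init).
     */
    public static long committedHeap() {
        try {
            return MEM.getHeapMemoryUsage().getCommitted();
        }
        catch (IllegalArgumentException ignored) {
            return -1; // Inconsistent snapshot: report "unknown" instead of crashing.
        }
    }

    public static void main(String[] args) {
        System.out.println(committedHeap());
    }
}
```

Callers then treat -1 as "metric unavailable" rather than letting a JVM bug propagate out of the discovery manager.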





[jira] [Created] (IGNITE-5755) Wrong msg: calculation of memory policy size

2017-07-14 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5755:


 Summary: Wrong msg: calculation of memory policy size
 Key: IGNITE-5755
 URL: https://issues.apache.org/jira/browse/IGNITE-5755
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Trivial
 Fix For: 2.3


In PageMemoryNoStoreImpl:
{noformat}
throw new IgniteOutOfMemoryException("Not enough memory allocated " +
    "(consider increasing memory policy size or enabling evictions) " +
    "[policyName=" + memoryPolicyCfg.getName() +
    ", size=" + U.readableSize(memoryPolicyCfg.getMaxSize(), true) + "]");
{noformat}
This is a wrong usage of U.readableSize: we should use the non-SI multiplier (1024 instead 
of 1000). The correct code is:
{noformat}
throw new IgniteOutOfMemoryException("Not enough memory allocated " +
    "(consider increasing memory policy size or enabling evictions) " +
    "[policyName=" + memoryPolicyCfg.getName() +
    ", size=" + U.readableSize(memoryPolicyCfg.getMaxSize(), false) + "]");
{noformat}





[jira] [Created] (IGNITE-5733) Activate/deactivate cluster through http-rest api

2017-07-11 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5733:


 Summary: Activate/deactivate cluster through http-rest api
 Key: IGNITE-5733
 URL: https://issues.apache.org/jira/browse/IGNITE-5733
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 2.0
Reporter: Alexander Belyak
Priority: Minor
 Fix For: 2.1


Need to add commands to get/set the cluster active flag to the HTTP REST API.





[jira] [Created] (IGNITE-5709) Node stopped on OutOfMemoryException with persistence

2017-07-06 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5709:


 Summary: Node stopped on OutOfMemoryException with persistence
 Key: IGNITE-5709
 URL: https://issues.apache.org/jira/browse/IGNITE-5709
 Project: Ignite
  Issue Type: Bug
  Components: persistence
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Critical


Under long heavy (100%) load, a node with persistence configured can stop with an 
"org.apache.ignite.internal.mem.OutOfMemoryException: Failed to find a page for 
eviction" exception. In my test it failed after 23 hours of 100% load while expiring 
outdated entries (via CreatedExpiryPolicy).





[jira] [Created] (IGNITE-5631) Can't write value greater then wal segment

2017-06-30 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5631:


 Summary: Can't write value greater then wal segment
 Key: IGNITE-5631
 URL: https://issues.apache.org/jira/browse/IGNITE-5631
 Project: Ignite
  Issue Type: Bug
  Components: persistence
Affects Versions: 2.1
Reporter: Alexander Belyak
Priority: Minor
 Fix For: 2.1


Steps to reproduce: insert a value larger than the WAL segment size.
Expected behavior: the value spans several WAL segments and is inserted.
Current behavior: infinite writing of the WAL archive.
For the test I used a 256 KB WAL segment size and a value built from a 10M-character String.





[jira] [Created] (IGNITE-5445) ServerImpl can't process NodeFailedMessage about itself

2017-06-07 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5445:


 Summary: ServerImpl can't process NodeFailedMessage about itself
 Key: IGNITE-5445
 URL: https://issues.apache.org/jira/browse/IGNITE-5445
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 1.9
Reporter: Alexander Belyak
Priority: Minor
 Fix For: 2.1


If for some reason (a GC pause or heavy load) a node receives a NodeLeft(FAILED) message 
about itself, it can't handle it correctly: it calls TcpDiscoveryNodesRing.removeNode 
with the local node ID and gets an assertion error.
I think the node should detect this situation and fire something like a "segmented" 
event instead.





[jira] [Created] (IGNITE-5390) Bug in IgniteCacheTxStoreSessionWriteBehindCoalescingTest

2017-06-02 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5390:


 Summary: Bug in IgniteCacheTxStoreSessionWriteBehindCoalescingTest
 Key: IGNITE-5390
 URL: https://issues.apache.org/jira/browse/IGNITE-5390
 Project: Ignite
  Issue Type: Bug
 Environment: 1.9.3
Reporter: Alexander Belyak
Assignee: Alexander Belyak
Priority: Trivial


IgniteCacheTxStoreSessionWriteBehindCoalescingTest overrides the cacheConfiguration 
method from IgniteCacheStoreSessionWriteBehindAbstractTest to switch TestStore to 
TestNonCoalescingStore.
But in IgniteCacheStoreSessionWriteBehindAbstractTest.getConfiguration the 
cacheStoreFactory is explicitly set to TestStore for ccfg1.
This needs to be removed.





[jira] [Created] (IGNITE-5184) Collect write behind batch with out of order

2017-05-09 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5184:


 Summary: Collect write behind batch with out of order
 Key: IGNITE-5184
 URL: https://issues.apache.org/jira/browse/IGNITE-5184
 Project: Ignite
  Issue Type: Improvement
Reporter: Alexander Belyak


Currently the write-behind flusher batches cache operations only in natural order, i.e. 
if the cache has "insert1, update2, delete3, insert4, delete5" operations, they will be 
split into 4 batch operations:
1) insert1, update2
2) delete3
3) insert4
4) delete5
Or even worse with two flush threads, which can pick up the operations as:
thread 1: 1) insert1
thread 2: 1) update2
thread 1: 2) delete3
thread 2: 2) insert4
thread 1: 3) delete5
and we get 5 "batch" operations against the store.
Since write-behind already doesn't preserve the real historical order (with insert 
key1=1, delete key2, update key1=3 the store will get writeAll(key1=3) and then 
deleteAll(key2)), it would be better if the flusher skipped cache entries with a 
different operation type, i.e. processed the first example as:
1) insert1, update2, insert4 (skip delete3 and process it later)
2) delete3, delete5 (process the delete3 operation)
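The proposed grouping can be sketched as follows (the names and types are illustrative, not the real GridCacheWriteBehindStore API). Since write-behind coalescing keeps at most one pending operation per key, operations on different keys can be reordered freely, so one pass splits the queue into one write batch and one delete batch:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchGrouping {
    enum OpType { WRITE, DELETE }

    /**
     * Splits a pending-operation queue into type-homogeneous batches:
     * all writes go to one writeAll(...) batch, all deletes to one deleteAll(...)
     * batch, instead of one batch per contiguous run of same-typed operations.
     */
    static List<List<String>> groupByType(LinkedHashMap<String, OpType> queue) {
        List<String> writes = new ArrayList<>();
        List<String> deletes = new ArrayList<>();

        for (Map.Entry<String, OpType> e : queue.entrySet())
            (e.getValue() == OpType.WRITE ? writes : deletes).add(e.getKey());

        List<List<String>> batches = new ArrayList<>();

        if (!writes.isEmpty()) batches.add(writes);   // one writeAll(...) call
        if (!deletes.isEmpty()) batches.add(deletes); // one deleteAll(...) call

        return batches;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, OpType> q = new LinkedHashMap<>();
        q.put("k1", OpType.WRITE);  // insert1
        q.put("k2", OpType.WRITE);  // update2
        q.put("k3", OpType.DELETE); // delete3
        q.put("k4", OpType.WRITE);  // insert4
        q.put("k5", OpType.DELETE); // delete5

        // Two batches instead of four: [[k1, k2, k4], [k3, k5]]
        System.out.println(groupByType(q));
    }
}
```

For the five-operation example from the description this produces exactly the two batches requested above.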





[jira] [Created] (IGNITE-5062) Support new parameters in .Net

2017-04-24 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5062:


 Summary: Support new parameters in .Net
 Key: IGNITE-5062
 URL: https://issues.apache.org/jira/browse/IGNITE-5062
 Project: Ignite
  Issue Type: Bug
Reporter: Alexander Belyak
Assignee: Pavel Tupitsyn


Need to support new values and remove old ones:
In TcpDiscoverySpi:
remove maxMissedHeartbeats
remove maxMissedClientHeartbeats
remove heartbeatFrequency
rename hbFreq to metricsUpdateFrequency
In IgniteConfiguration:
add clientFailureDetectionTimeout (a long, bounded between metricsUpdateFrequency 
and Integer.MAX_VALUE)





[jira] [Created] (IGNITE-5060) Check configuration parameters on the Integer overflowing

2017-04-24 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5060:


 Summary: Check configuration parameters on the Integer overflowing
 Key: IGNITE-5060
 URL: https://issues.apache.org/jira/browse/IGNITE-5060
 Project: Ignite
  Issue Type: Bug
Reporter: Alexander Belyak


Time-related configuration parameters use the long data type (and expect a value in 
ms), but the standard java.net.Socket class expects an integer for soTimeout, and long 
timeouts from the configuration are usually cast with a plain (int) cast, which 
overflows if the configured timeout > Integer.MAX_VALUE.
Configuration checks need to be added for:
* IgniteConfiguration.failureDetectionTimeout
* IgniteConfiguration.clientFailureDetectionTimeout
* TcpDiscoverySpi.ackTimeout
* TcpDiscoverySpi.netTimeout
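The kind of check that could be added can be sketched like this (the helper name is hypothetical); it rejects a millisecond timeout that would overflow the int expected by java.net.Socket instead of silently wrapping around:

```java
public class TimeoutCheck {
    /**
     * Validates that a millisecond timeout fits into a non-negative int
     * (e.g. before passing it to Socket.setSoTimeout(int)).
     * Throws IllegalArgumentException instead of silently overflowing.
     */
    public static int toIntTimeout(String name, long timeoutMs) {
        if (timeoutMs < 0 || timeoutMs > Integer.MAX_VALUE)
            throw new IllegalArgumentException(name + " must be in [0, "
                + Integer.MAX_VALUE + "], but was: " + timeoutMs);

        return (int)timeoutMs; // Safe: bounds checked above, no overflow possible.
    }

    public static void main(String[] args) {
        System.out.println(toIntTimeout("failureDetectionTimeout", 10_000L));

        try {
            // One more than Integer.MAX_VALUE: a plain (int) cast would wrap to a negative value.
            toIntTimeout("netTimeout", (long)Integer.MAX_VALUE + 1);
        }
        catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With such a helper, a misconfigured timeout fails fast at startup with a clear message rather than producing a negative socket timeout at runtime.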






[jira] [Created] (IGNITE-5043) Support CacheConfiguration.writeBehindCoalescing in .Net

2017-04-20 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5043:


 Summary: Support CacheConfiguration.writeBehindCoalescing in .Net
 Key: IGNITE-5043
 URL: https://issues.apache.org/jira/browse/IGNITE-5043
 Project: Ignite
  Issue Type: Bug
Reporter: Alexander Belyak
Assignee: Pavel Tupitsyn


Please support new parameter CacheConfiguration.writeCoalescing in .Net.





[jira] [Created] (IGNITE-5042) Add internal ring wide msg for status check

2017-04-20 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5042:


 Summary: Add internal ring wide msg for status check
 Key: IGNITE-5042
 URL: https://issues.apache.org/jira/browse/IGNITE-5042
 Project: Ignite
  Issue Type: Improvement
Reporter: Alexander Belyak


The Ignite cluster perhaps needs a special ring message for fast node status checks, 
because the metrics update message is currently too heavy: it requires 
unmarshalling/marshalling on each node to go through the ring (and in a big cluster 
this can take a lot of time).
The new ring status check message must work with the keep-binary approach.





[jira] [Created] (IGNITE-5015) In TcpCommunicationSpi use IgniteConfiguration.clientFailureDetectionTimeout

2017-04-18 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5015:


 Summary: In TcpCommunicationSpi use 
IgniteConfiguration.clientFailureDetectionTimeout
 Key: IGNITE-5015
 URL: https://issues.apache.org/jira/browse/IGNITE-5015
 Project: Ignite
  Issue Type: Bug
Reporter: Alexander Belyak


Need to use new IgniteConfiguration.clientFailureDetectionTimeout in 
CommunicationSpi when interacting with client nodes.





[jira] [Created] (IGNITE-5005) WriteBehindStore - split flusher's to different classes by writeCoalescing

2017-04-17 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5005:


 Summary: WriteBehindStore - split flusher's to different classes 
by writeCoalescing
 Key: IGNITE-5005
 URL: https://issues.apache.org/jira/browse/IGNITE-5005
 Project: Ignite
  Issue Type: Improvement
Reporter: Alexander Belyak


GridCacheWriteBehindStore.Flusher contains too many if statements, because its 
behavior depends too heavily on the writeCoalescing flag. This class needs to be split 
into two (with one abstract base Flusher class).





[jira] [Created] (IGNITE-5004) GridCacheWriteBehindStore - remove StatefulValue

2017-04-17 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5004:


 Summary: GridCacheWriteBehindStore - remove StatefulValue
 Key: IGNITE-5004
 URL: https://issues.apache.org/jira/browse/IGNITE-5004
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 1.9
Reporter: Alexander Belyak


If writeCoalescing=false, GridCacheWriteBehindStore doesn't need to create a 
StatefulValue for each KV entry. The write-behind store should be implemented without 
this wrapper at all (if the ABA problem in the cache maps can be solved) or with a 
thinner wrapper without unnecessary synchronization/state.





[jira] [Created] (IGNITE-5003) Parallel write same key in CacheWriteBehindStore

2017-04-17 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-5003:


 Summary: Parallel write same key in CacheWriteBehindStore
 Key: IGNITE-5003
 URL: https://issues.apache.org/jira/browse/IGNITE-5003
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 1.9
Reporter: Alexander Belyak


Currently GridCacheWriteBehindStore.updateCache waits for the writeLock in StatefulValue 
and, moreover, calls waitForFlush() if the value is in the pending (flushing) state. This 
waiting needs to be removed.





[jira] [Created] (IGNITE-4999) Use one thread pool to flush all CacheWriteBehindStore

2017-04-17 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-4999:


 Summary: Use one thread pool to flush all CacheWriteBehindStore
 Key: IGNITE-4999
 URL: https://issues.apache.org/jira/browse/IGNITE-4999
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 1.9
Reporter: Alexander Belyak


Currently we have dedicated flusher threads for each CacheWriteBehindStore, so we can't 
create many caches that use this mechanism (too many threads).
A single thread pool should be used for all CacheWriteBehindStore instances (as is 
done for TTL).
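The shared-pool idea can be sketched as follows (the class and method names are illustrative, not the real Ignite internals): one scheduled pool serves every store's periodic flush task instead of each store owning its own threads:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class SharedFlushPool {
    // One pool shared by all write-behind stores, instead of N threads per store.
    private static final ScheduledExecutorService POOL =
        Executors.newScheduledThreadPool(Runtime.getRuntime().availableProcessors());

    /** Registers a store's flush task to run at a fixed rate on the shared pool. */
    public static ScheduledFuture<?> scheduleFlush(Runnable flushTask, long freqMs) {
        return POOL.scheduleAtFixedRate(flushTask, freqMs, freqMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        // Two "stores" share the same pool; each just gets its own scheduled task.
        ScheduledFuture<?> storeA = scheduleFlush(() -> { /* flush batch for store A */ }, 50);
        ScheduledFuture<?> storeB = scheduleFlush(() -> { /* flush batch for store B */ }, 50);

        // When a cache is destroyed, only its task is cancelled; the pool lives on.
        storeA.cancel(false);
        storeB.cancel(false);
        POOL.shutdown();

        System.out.println("scheduled and cancelled two flush tasks");
    }
}
```

The thread count then scales with available CPUs rather than with the number of caches, which is exactly the problem described above.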





[jira] [Created] (IGNITE-4940) GridCacheWriteBehindStore lose more data then necessary

2017-04-11 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-4940:


 Summary: GridCacheWriteBehindStore lose more data then necessary
 Key: IGNITE-4940
 URL: https://issues.apache.org/jira/browse/IGNITE-4940
 Project: Ignite
  Issue Type: Bug
Affects Versions: 1.9
Reporter: Alexander Belyak
Priority: Minor


Unnecessary data loss happens in case of a slowdown or errors in the underlying store 
while new data is being put into the cache:
1) A writer adds a new cache entry and checks the cache size.
2) If cache size > criticalSize (by default criticalSize = 1.5 * cacheSize), the 
writer tries to flush a single value synchronously.
At this point we have:
N flusher threads trying to flush data in batch mode, and
1+ writer threads trying to flush a single value.
Both writers and flushers use the updateStore procedure, but if updateStore gets an 
exception from the underlying store, it checks the cache size, and if it is greater 
than criticalCacheSize it logs a cache overflow event and returns true (as if the data 
had been stored successfully). The data is then removed from the write-behind cache.
Moreover, we can lose not just a single value but one or more whole batches if the 
flusher threads get a store exception on an overflowed cache.
Reproducer:
{panel}
/**
 * Tests that cache would keep values if underlying store fails.
 *
 * @throws Exception If failed.
 */
private void testStoreFailure(boolean writeCoalescing) throws Exception {
    delegate.setShouldFail(true);

    initStore(2, writeCoalescing);

    Set<Integer> exp;

    try {
        Thread timer = new Thread(new Runnable() {
            @Override public void run() {
                try {
                    U.sleep(FLUSH_FREQUENCY * 2);
                }
                catch (IgniteInterruptedCheckedException e) {
                    assertTrue("Timer was interrupted", false);
                }

                delegate.setShouldFail(false);
            }
        });

        timer.start();

        exp = runPutGetRemoveMultithreaded(10, 10);

        timer.join();

        info(">>> There are " + store.getWriteBehindErrorRetryCount() + " entries in RETRY state");

        // Despite that we set shouldFail flag to false, flush thread may just have caught an exception.
        // If we move store to the stopping state right away, this value will be lost. That's why this sleep
        // is inserted here to let all exception handlers in write-behind store exit.
        U.sleep(1000);
    }
    finally {
        shutdownStore();
    }

    Map<Integer, String> map = delegate.getMap();

    Collection<Integer> extra = new HashSet<>(map.keySet());

    extra.removeAll(exp);

    assertTrue("The underlying store contains extra keys: " + extra, extra.isEmpty());

    Collection<Integer> missing = new HashSet<>(exp);

    missing.removeAll(map.keySet());

    assertTrue("Missing keys in the underlying store: " + missing, missing.isEmpty());

    for (Integer key : exp)
        assertEquals("Invalid value for key " + key, "val" + key, map.get(key));
}
{panel}
Solution: test the cache size before inserting a new value, either
a) with some kind of synchronization that prevents cacheSize from growing beyond 
criticalCacheSize (a strong restriction), or
b) by removing the cache size test from updateStore; the cache can then grow beyond 
criticalCacheSize at a single point if we hit a race in updateCache.
I prefer (b) because of the lower synchronization pressure (the cache may hold 1 or 2 
extra elements).
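The race that option (b) tolerates can be sketched as follows (class and field names are illustrative): the size check happens before the insert without a lock, so two racing writers may both pass the check at size == CRITICAL - 1 and the cache briefly holds one or two entries above the limit, which is the accepted price for avoiding synchronization:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OvershootSketch {
    static final int CRITICAL = 100;

    // Tracks the write-behind cache size; updated without any lock.
    static final AtomicInteger size = new AtomicInteger();

    /**
     * Option (b): check-then-insert with no lock. Two racing writers may both
     * observe size == CRITICAL - 1 and both insert, so the cache can briefly
     * hold a couple of entries above CRITICAL, but data is never silently dropped.
     */
    static boolean tryAdd() {
        if (size.get() >= CRITICAL)
            return false; // Back-pressure: caller flushes synchronously instead.

        size.incrementAndGet();

        return true;
    }

    public static void main(String[] args) {
        // Single-threaded, so no race: exactly CRITICAL entries are accepted.
        int accepted = 0;

        for (int i = 0; i < 150; i++)
            if (tryAdd())
                accepted++;

        System.out.println(accepted); // prints 100
    }
}
```

Crucially, updateStore no longer decides to drop data based on the size, so an overflowed cache slows writers down instead of losing entries.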





[jira] [Created] (IGNITE-4934) GridCacheWriteBehindStore broke if store backend throw a single exception

2017-04-09 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-4934:


 Summary: GridCacheWriteBehindStore broke if store backend throw a 
single exception
 Key: IGNITE-4934
 URL: https://issues.apache.org/jira/browse/IGNITE-4934
 Project: Ignite
  Issue Type: Bug
  Components: cache
Affects Versions: 1.9
Reporter: Alexander Belyak
Assignee: Alexey Dmitriev
Priority: Critical


If the flusher in GridCacheWriteBehindStore gets a runtime exception from the underlying 
CacheStore, it stops working entirely. All further operations with 
GridCacheWriteBehindStore are then performed by writer threads (in the 
flushSingleValue procedure): without batching, without write-behind, and moreover with 
a deadlock if a writer tries to override a key in the pending state that was being 
processed by the broken flusher thread.
Reproducer: GridCacheWriteBehindStoreMultithreadedSelfTest.testStoreFailure with
 exp = runPutGetRemoveMultithreaded(10, 10); 
changed to 
 exp = runPutGetRemoveMultithreaded(10, 500);
This test should be updated accordingly.





[jira] [Created] (IGNITE-4022) IgniteServices doesn't throw an exception if there are no server nodes

2016-10-04 Thread Alexander Belyak (JIRA)
Alexander Belyak created IGNITE-4022:


 Summary: IgniteServices doesn't throw an exception if there are no 
server nodes
 Key: IGNITE-4022
 URL: https://issues.apache.org/jira/browse/IGNITE-4022
 Project: Ignite
  Issue Type: Bug
Affects Versions: 1.8
Reporter: Alexander Belyak


If you call the deployNodeSingleton method but there are no server nodes in the 
IgniteServices base ClusterGroup, you will never know about it and can't find the 
deployed service instance. We should probably at least print these errors to the logs.


