[jira] [Commented] (IGNITE-2688) InterruptException for segmentation issues

2016-04-16 Thread Denis Magda (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244533#comment-15244533
 ] 

Denis Magda commented on IGNITE-2688:
-------------------------------------

TC looks good. [~yzhdanov] or [~sboikov], please review the changes incorporated 
into the IGNITE-2688 branch.

> InterruptException for segmentation issues
> -------------------------------------------
>
> Key: IGNITE-2688
> URL: https://issues.apache.org/jira/browse/IGNITE-2688
> Project: Ignite
>  Issue Type: Bug
>Reporter: Sergey Kozlov
>Assignee: Denis Magda
>Priority: Minor
>
> We're still seeing the following exception for segmentation issues:
> {noformat}
> [18:16:31,566][WARNING][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] Node 
> is out of topology (probably, due to short-time network problems).
> [18:16:31,566][WARNING][disco-event-worker-#46%null%][GridDiscoveryManager] 
> Local node SEGMENTED: TcpDiscoveryNode 
> [id=19cf4b0f-d520-4915-be9f-813a99f945a5, addrs=[0:0:0:0:0:0:0:1, 127.0.0.1, 
> 172.22.222.44, 192.168.1.117], sockAddrs=[work-pc/172.22.222.44:47501, 
> /0:0:0:0:0:0:0:1:47501, /172.22.222.44:47501, /127.0.0.1:47501, 
> /172.22.222.44:47501, /192.168.1.117:47501], discPort=47501, order=4, 
> intOrder=4, lastExchangeTime=1455808591566, loc=true, 
> ver=1.6.0#19700101-sha1:, isClient=false]
> [18:16:31,629][SEVERE][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] 
> TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node 
> in order to prevent cluster wide instability.
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2095)
>   at 
> java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:519)
>   at 
> java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:682)
>   at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:5786)
>   at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2160)
>   at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
> [18:16:31,851][WARNING][sys-#22%null%][GridDhtAtomicCache] 
>  Failed to send near update reply to node 
> because it left grid: fad03851-2077-4b50-92b3-00ec6d85fa39
> [18:16:31,866][WARNING][disco-event-worker-#46%null%][GridDiscoveryManager] 
> Stopping local node according to configured segmentation policy.
> {noformat}
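
For context, the "configured segmentation policy" from the last log line is set on 
the node configuration, and the segmentation event can be observed from user code. 
A minimal sketch using standard Ignite API (not part of this ticket's fix):

{code:java}
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.EventType;
import org.apache.ignite.plugin.segmentation.SegmentationPolicy;

public class SegmentationSetup {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // STOP is the default: the node shuts down once it detects it is segmented.
        cfg.setSegmentationPolicy(SegmentationPolicy.STOP);

        // Events must be enabled before they can be listened to.
        cfg.setIncludeEventTypes(EventType.EVT_NODE_SEGMENTED);

        Ignite ignite = Ignition.start(cfg);

        // Alert an operator when the local node gets segmented.
        ignite.events().localListen(evt -> {
            System.err.println("Local node segmented: " + evt);
            return true; // keep the listener registered
        }, EventType.EVT_NODE_SEGMENTED);
    }
}
{code}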





[jira] [Commented] (IGNITE-1248) Add a method to retrieve Spring application context

2016-04-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244298#comment-15244298
 ] 

ASF GitHub Bot commented on IGNITE-1248:
----------------------------------------

GitHub user samaitra opened a pull request:

https://github.com/apache/ignite/pull/651

IGNITE-1248 Add a method to retrieve Spring application context



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/samaitra/ignite IGNITE-1248

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/651.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #651


commit 67595aa658ef94b0a380246c661e522a2d081ca6
Author: samaitra 
Date:   2016-04-16T16:38:00Z

IGNITE-1248 Add a method to retrieve Spring application context




> Add a method to retrieve Spring application context
> ---------------------------------------------------
>
> Key: IGNITE-1248
> URL: https://issues.apache.org/jira/browse/IGNITE-1248
> Project: Ignite
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 1.1.4
>Reporter: Valentin Kulichenko
>Assignee: Saikat Maitra
>
> Currently there is a way to inject an application context instance into a task, 
> closure, or cache store using the {{@SpringApplicationContextResource}} annotation. 
> But there is no way to get the context associated with an {{Ignite}} instance 
> from user code.
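
For reference, the injection path that exists today looks roughly like this (a 
minimal sketch; the job class and bean name are illustrative):

{code:java}
import org.apache.ignite.lang.IgniteCallable;
import org.apache.ignite.resources.SpringApplicationContextResource;
import org.springframework.context.ApplicationContext;

// The Spring context can be injected into a task, closure, or cache store...
public class SpringAwareJob implements IgniteCallable<String> {
    @SpringApplicationContextResource
    private ApplicationContext appCtx;

    @Override public String call() {
        // ...but plain user code holding an Ignite instance has no way to
        // obtain this context, which is what the ticket asks for.
        return appCtx.getBean("someBean").toString(); // "someBean" is illustrative
    }
}
{code}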





[jira] [Assigned] (IGNITE-1248) Add a method to retrieve Spring application context

2016-04-16 Thread Saikat Maitra (JIRA)

 [ 
https://issues.apache.org/jira/browse/IGNITE-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saikat Maitra reassigned IGNITE-1248:
-------------------------------------

Assignee: Saikat Maitra

> Add a method to retrieve Spring application context
> ---------------------------------------------------
>
> Key: IGNITE-1248
> URL: https://issues.apache.org/jira/browse/IGNITE-1248
> Project: Ignite
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 1.1.4
>Reporter: Valentin Kulichenko
>Assignee: Saikat Maitra
>
> Currently there is a way to inject an application context instance into a task, 
> closure, or cache store using the {{@SpringApplicationContextResource}} annotation. 
> But there is no way to get the context associated with an {{Ignite}} instance 
> from user code.





[jira] [Commented] (IGNITE-2864) Need update local store from primary and backups

2016-04-16 Thread Anton Vinogradov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244076#comment-15244076
 ] 

Anton Vinogradov commented on IGNITE-2864:
------------------------------------------

Partially fixed the issues found on review.

The main problem is conflict resolution when a local store is used.
This problem exists in the current implementation too.
I think it should be solved differently from what was decided on review.

Env:
Each node has a local store.
The local store contains primary and backup partitions.
On node failure, the store can be used to restore entries.

Problem:
The cluster has partially failed, and the number of failed nodes > backups.

Initial solution:
Restart the failed nodes and load entries from the local stores after restart.
Resolve conflicts during rebalancing and on user requests.

Cons:
A lot of changes are required, and it is difficult to cover all the cases.

Better solution:
A topology validator should be used to prevent working with inconsistent data 
(see the sketch below).
Recovery steps:
1) Deny all user requests (admins should do that).
2) Restart all failed nodes.
3) Load all data from all available local stores once the topology is stable 
(after the final rebalancing has finished). 
All conflicts will be resolved using the conflict resolver in this case, 
correct?
All entries will be restored since we have backups in the local stores (in 
case lost stores <= backups).
4) Allow user requests.

Thoughts?
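
A minimal sketch of the topology-validator part of the proposal (the cache name 
and the expected node count are illustrative):

{code:java}
import java.util.Collection;
import org.apache.ignite.cluster.ClusterNode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.TopologyValidator;

CacheConfiguration<Integer, String> ccfg = new CacheConfiguration<>("myCache");

// Reject all cache operations while the topology is smaller than expected,
// so nobody can read or write potentially inconsistent data during recovery.
ccfg.setTopologyValidator(new TopologyValidator() {
    @Override public boolean validate(Collection<ClusterNode> nodes) {
        int expectedNodes = 4; // illustrative

        return nodes.size() >= expectedNodes;
    }
});
{code}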

> Need update local store from primary and backups
> ------------------------------------------------
>
> Key: IGNITE-2864
> URL: https://issues.apache.org/jira/browse/IGNITE-2864
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Reporter: Semen Boikov
>Assignee: Anton Vinogradov
> Fix For: 1.6
>
>
> Now the cache local store is updated only from primary nodes, which means that 
> data can be lost if the primary node is not restarted after a crash. We need to 
> fix this and update the store from both primaries and backups if the store is 
> local (for both tx and atomic caches).
> This test should work (a sketch follows below):
> - cache with 1 backup, two server nodes
> - execute a cache put for key K
> - stop both nodes
> - restart only the node which was backup for K
> - load data from the local store; the update for K should be restored
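
A rough sketch of that test (hypothetical; {{serverConfig()}} is an assumed 
helper returning an {{IgniteConfiguration}} with a cache named "test" that has 
1 backup and a local {{CacheStore}}):

{code:java}
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

Ignite node1 = Ignition.start(serverConfig("node-1")); // serverConfig(): hypothetical helper
Ignite node2 = Ignition.start(serverConfig("node-2"));

IgniteCache<Integer, String> cache = node1.cache("test");
cache.put(1, "value"); // with the fix, primary AND backup write to their local stores

node1.close();
node2.close();

// Restart only the node that was backup for the key.
Ignite backup = Ignition.start(serverConfig("node-2"));
backup.cache("test").loadCache(null); // reload entries from the local store

assert "value".equals(backup.cache("test").get(1)); // the update for K is restored
{code}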


