Re: Deadlock during cache loading

2018-07-03 Thread breischl
Also, I probably should have mentioned this earlier but we're not using WAL or any disk persistence. So everything should be in-memory, and generally on-heap. I think that makes it less likely that we were blocked on plain throughput of some hardware or virtual-hardware.

Re: Deadlock during cache loading

2018-07-02 Thread breischl
(OT: Sorry about the duplicate posts, for some reason Nabble was refusing to show me new posts so I thought my earlier ones had been lost.) >Why did you decide, that cluster is deadlocked in the first place? Because all of the Datastreamer threads were stuck waiting on locks, and no progress was

Re: Deadlock during cache loading

2018-07-02 Thread David Harvey
Denis does have a point. When we were trying to run using GP2 storage, the cluster would simply lock up for an hour. Once we moved to local SSDs on i3 instances, those issues went away (but we needed 2.5 to have the streaming rate hold up, as we had a lot of data loaded). The i3 instances

Re: Deadlock during cache loading

2018-07-02 Thread Denis Mekhanikov
Why did you decide that the cluster is deadlocked in the first place? > We've had several deployments in a row fail, apparently due to deadlocking in the loading process. What did you see in the logs of the failing nodes? Denis Mon, Jul 2, 2018 at 17:08, breischl : > Ah, I had not thought of that,

Re: Deadlock during cache loading

2018-07-02 Thread breischl
Ah, I had not thought of that, thanks. Interestingly, going to a smaller cluster seems to have worked around the problem. We were running a 44-node cluster using 3 backups of the data. Switching to two separate 22-node clusters, each with 1 backup, seems to work just fine. Is there some limit to

Re: Deadlock during cache loading

2018-07-02 Thread David Harvey
Transactions are easy to use: see the examples in org.apache.ignite.examples.datagrid.store.auto. We use them in the stream receiver. You simply bracket the get/put in the transaction, but use a timeout, then bracket that with an "until done" while loop, perhaps adding a sleep to back off. We ended
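The "until done" loop with a timeout and a backoff sleep described above might look roughly like the following. This is a minimal, library-free sketch, not code from the thread: `RetryUntilDone`, `maxAttempts`, and the doubling backoff are hypothetical choices, and `java.util.concurrent.TimeoutException` stands in for whatever exception the timed-out transaction would raise. In the real receiver, the `attempt` would be the get/put bracketed by a transaction opened with a timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not part of Ignite): retries an attempt that may time out. */
final class RetryUntilDone {
    /**
     * Calls attempt until it returns without a TimeoutException.
     * Sleeps between tries, doubling the delay up to one second.
     * maxAttempts is an assumption; the thread describes an unbounded loop.
     */
    static <T> T run(Callable<T> attempt, int maxAttempts, long initialBackoffMs) throws Exception {
        long backoffMs = initialBackoffMs;
        for (int tries = 1; ; tries++) {
            try {
                return attempt.call();            // e.g. the timed transaction around get/put
            } catch (TimeoutException e) {
                if (tries >= maxAttempts) {
                    throw e;                      // give up instead of spinning forever
                }
                Thread.sleep(backoffMs);          // back off before the next try
                backoffMs = Math.min(backoffMs * 2, 1_000L);
            }
        }
    }
}
```

Capping the number of attempts is a design choice made here so that a persistent lock conflict surfaces as an error rather than as a silent stall.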

Re: Deadlock during cache loading

2018-07-01 Thread breischl
@DaveHarvey, I'll look at that tomorrow. Seems potentially complicated, but if that's what has to happen we'll figure it out. Interestingly, cutting the cluster to half as many nodes (by reducing the number of backups) seems to have resolved the issue. Is there a guideline for how large a

Re: Deadlock during cache loading

2018-06-30 Thread David Harvey
You can start a transaction in the stream receiver to make it atomic. On Fri, Jun 29, 2018, 1:02 PM breischl wrote: > StreamTransformer does an invoke() pretty much exactly like what I'm doing, > so that would not seem to change anything. > > >

Re: Deadlock during cache loading

2018-06-29 Thread smovva
I have a fairly similar setup. What type of EC2 instances are you using? Just to compare with my setup. -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Deadlock during cache loading

2018-06-29 Thread breischl
StreamTransformer does an invoke() pretty much exactly like what I'm doing, so that would not seem to change anything. https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/stream/StreamTransformer.java#L50-L53 I may try using a put(), but since I need to

Re: Deadlock during cache loading

2018-06-29 Thread Denis Mekhanikov
Entries that are provided to the *receive()* method are immutable. But you can either do *cache.put()* inside the *receive()* method, just like *DataStreamerCacheUpdaters#Individual

Re: Deadlock during cache loading

2018-06-29 Thread breischl
Hi Denis, It was not clear to me that we could do the update from within the StreamReceiver without some sort of cache operation. Would we just use the CacheEntry.setValue() method to do that? Something roughly like the following? Thanks! public void receive(IgniteCache cache, Collection>

Re: Deadlock during cache loading

2018-06-29 Thread Denis Mekhanikov
Hi! Why do you do this inside an invoke()? All of this can be done just inside a receiver. Can you get rid of the invoke and check that the deadlocks disappear? Denis Fri, Jun 29, 2018 at 17:24, breischl : > That does seem to be what's happening, but we're only invoke()'ing on keys > that were

RE: Deadlock during cache loading

2018-06-29 Thread breischl
That does seem to be what's happening, but we're only invoke()'ing on keys that were passed into receive(), so that should not require going off-box. Right? Here's the relevant code... @Override public void receive(IgniteCache cache, Collection> newEntries) throws IgniteException { for

RE: Deadlock during cache loading

2018-06-28 Thread Dave Harvey
Your original stack trace shows a call to your custom stream receiver, which appears to itself call invoke(). I can only guess what your code does, but it appears to be making a call off-node to something that is not returning.

Re: Deadlock during cache loading

2018-06-28 Thread breischl
Just found a bunch of these in my logs as well. Note this is showing starvation in the system threadpool, not the datastreamer threadpool, but perhaps they're related? [2018-06-28T17:39:55,728Z](grid-timeout-worker-#23)([]) WARN - G - >>> Possible starvation in striped pool. Thread name:
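As a general illustration of how a bounded thread pool can stall (not a diagnosis of this cluster, and not Ignite code), the sketch below has a task block on work that it queues to the same pool. `PoolStarvationDemo` and its parameters are hypothetical names; the pool is a plain `java.util.concurrent` fixed pool rather than Ignite's striped pool.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical demo: a task that waits on work queued to its own pool. */
final class PoolStarvationDemo {
    /**
     * Returns true if the nested task finished within waitMs,
     * false if the outer task timed out waiting for it.
     */
    static boolean nestedSubmitCompletes(int poolSize, long waitMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            Future<Boolean> outer = pool.submit(() -> {
                // The inner task needs a free thread in the same pool.
                Future<String> inner = pool.submit(() -> "inner done");
                try {
                    inner.get(waitMs, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    return false; // no thread was free to run the inner task
                }
            });
            return outer.get();
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With a pool of one thread, the outer task occupies the only worker, so the inner task cannot start and the wait times out; with two threads the inner task runs immediately. Without the timeout, the single-thread case would block indefinitely.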

RE: Deadlock during cache loading

2018-06-28 Thread breischl
Also... >What you showed that the stream receiver called invoke() and did not get an answer, not a deadlock. It's not that I'm getting back a null, it's that all the threads are blocked waiting on the invoke() call, and no progress is being made. That sounds a lot like a deadlock. I guess you

RE: Deadlock during cache loading

2018-06-28 Thread breischl
>our a stream receiver called invoke() and that in turn did another invoke, which was the actual bug. So Ignite's invoke() implementation called itself? >It was helpful when we did the invoke using a custom thread pool, I'm not sure I understand the concept here. Is the idea to have an

RE: Deadlock during cache loading

2018-06-28 Thread Dave Harvey
2.4 should be OK. What you showed is that the stream receiver called invoke() and did not get an answer, not a deadlock. Nothing looks particularly wrong there. When we created this bug, it was our stream receiver that called invoke(), and that in turn did another invoke, which was the actual bug. It

RE: Deadlock during cache loading

2018-06-28 Thread breischl
Thanks Dave. I am using Ignite v2.4.0. Would a newer version potentially help? This problem seems to come and go. I didn't hit it for a few days, and now we've hit it on two deployments in a row. It may be some sort of timing or external factor that provokes it. The most recent case we hit the

RE: Deadlock during cache loading

2018-06-25 Thread Dave Harvey
"When receiver is invoked for key K, it’s holding the lock for K." is not correct, at least in the 2.4 code. When a custom stream receiver is called, the data streamer thread has a read-lock preventing termination, and there is a real-lock on the topology, but DataStreamerUpdateJob.call() does

RE: Deadlock during cache loading

2018-06-22 Thread breischl
In our case we're only using the receiver as you describe, to update the key that it was invoked for. Our actual use case is that the incoming stream of data sometimes sends us old data, which we want to discard rather than cache. So the StreamReceiver examines the value already in the cache and
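The "discard old data" rule described above can be sketched without Ignite as a compare-then-store on a concurrent map. Everything here is an assumption for illustration: the `NewestWins` and `Versioned` names, the use of a numeric timestamp to decide which value is older, and `ConcurrentHashMap` standing in for the cache. The thread does not say how staleness is actually determined.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for the cache: keeps only the newest value per key. */
final class NewestWins {
    /** Hypothetical value type: a payload plus the timestamp used to detect stale data. */
    static final class Versioned {
        final long timestamp;
        final String payload;

        Versioned(long timestamp, String payload) {
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    private final ConcurrentMap<String, Versioned> cache = new ConcurrentHashMap<>();

    /**
     * Stores incoming if the key is absent or incoming is strictly newer.
     * Returns true if incoming is the value the map holds afterwards.
     */
    boolean offer(String key, Versioned incoming) {
        // merge() applies the comparison atomically for the given key.
        Versioned kept = cache.merge(key, incoming,
            (existing, candidate) -> candidate.timestamp > existing.timestamp ? candidate : existing);
        return kept == incoming;
    }

    Versioned get(String key) {
        return cache.get(key);
    }
}
```

The atomic read-compare-write per key is the property the receiver needs; in the thread, that role is played by invoke() or by a transaction around get/put.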

RE: Deadlock during cache loading

2018-06-22 Thread Stanislav Lukyanov
Subject: RE: Deadlock during cache loading Hi Stan, Thanks for taking a look. I'm having trouble finding anywhere that it's documented what I can or can't call inside a receiver. Is it just put()/get() that are allowed? Also, I noticed that the default StreamTransformer implementation calls invoke

RE: Deadlock during cache loading

2018-06-22 Thread breischl
Hi Stan, Thanks for taking a look. I'm having trouble finding anywhere that it's documented what I can or can't call inside a receiver. Is it just put()/get() that are allowed? Also, I noticed that the default StreamTransformer implementation calls invoke() from within a receiver. So is that

RE: Deadlock during cache loading

2018-06-21 Thread Stanislav Lukyanov
Hi, Looks like you’re performing a cache operation (invoke()) from a StreamReceiver – this is not allowed. Check out this SO answer https://stackoverflow.com/questions/43891757/closures-stuck-in-2-0-when-try-to-add-an-element-into-the-queue. Stan From: breischl Sent: June 21, 2018 19:35 To: