Also, I probably should have mentioned this earlier but we're not using WAL
or any disk persistence. So everything should be in-memory, and generally
on-heap. I think that makes it less likely that we were blocked on plain
throughput of some hardware or virtual-hardware.
(OT: Sorry about the duplicate posts, for some reason Nabble was refusing to
show me new posts so I thought my earlier ones had been lost.)
>Why did you decide that the cluster is deadlocked in the first place?
Because all of the Datastreamer threads were stuck waiting on locks, and no
progress was
Denis does have a point. When we were trying to run using GP2 storage,
the cluster would simply lock up for an hour. Once we moved to local SSDs
on i3 instances those issues went away (but we needed 2.5 for the
streaming rate to hold up, as we had a lot of data loaded). The i3
instances
Why did you decide that the cluster is deadlocked in the first place?
> We've had several deployments in a row fail, apparently due to
deadlocking in the loading process.
What did you see in logs of the failing nodes?
Denis
Mon, Jul 2, 2018 at 17:08, breischl:
> Ah, I had not thought of that,
Ah, I had not thought of that, thanks.
Interestingly, going to a smaller cluster seems to have worked around the
problem. We were running a 44-node cluster using 3 backups of the data.
Switching to two separate 22-node clusters, each with 1 backup, seems to
work just fine. Is there some limit to
Transactions are easy to use: see the examples in
org.apache.ignite.examples.datagrid.store.auto
We use them in the stream receiver. You simply bracket the get/put in
the transaction, but use a timeout, then bracket that with an "until done"
while loop, perhaps adding a sleep to back off.
We ended
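
A rough sketch of the "until done" loop described above, in plain Java so it runs without a cluster. RetryUntilDone and its parameters are hypothetical names, and java.util.concurrent.TimeoutException only stands in for whatever exception a timed-out Ignite transaction raises. In a real receiver the attempt would presumably start a transaction with a timeout, do the get/put, and commit.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper, not part of Ignite: retries an attempt until it completes. */
class RetryUntilDone {
    /**
     * Calls the attempt until it returns without a TimeoutException,
     * sleeping a little longer after each failure. Returns the number of tries used.
     */
    static int run(Callable<Void> attempt, int maxAttempts, long backoffMillis) throws Exception {
        for (int tries = 1; ; tries++) {
            try {
                // Assumption: in a real receiver the transactional get/put goes here.
                attempt.call();
                return tries;
            } catch (TimeoutException e) {
                if (tries >= maxAttempts) {
                    throw e; // give up instead of looping forever
                }
                Thread.sleep(backoffMillis * tries); // linear backoff before the next try
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulate two lock timeouts followed by a successful attempt.
        int tries = run(() -> {
            if (++calls[0] < 3) {
                throw new TimeoutException("simulated lock timeout");
            }
            return null;
        }, 5, 10);
        System.out.println("succeeded after " + tries + " tries");
    }
}
```

Capping the number of attempts is a choice made for this sketch; the post only mentions looping until done.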
@DaveHarvey, I'll look at that tomorrow. Seems potentially complicated, but
if that's what has to happen we'll figure it out.
Interestingly, cutting the cluster to half as many nodes (by reducing the
number of backups) seems to have resolved the issue. Is there a guideline
for how large a
You can start a transaction in the stream receiver to make it atomic.
On Fri, Jun 29, 2018, 1:02 PM breischl wrote:
> StreamTransformer does an invoke() pretty much exactly like what I'm doing,
> so that would not seem to change anything.
>
>
>
I have a fairly similar setup. What type of EC2 instances are you using? Just
to compare with my setup.
--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
StreamTransformer does an invoke() pretty much exactly like what I'm doing,
so that would not seem to change anything.
https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/stream/StreamTransformer.java#L50-L53
I may try using a put(), but since I need to
Entries that are provided to the *receive()* method are immutable.
But you can either do *cache.put()* inside the *receive()* method, just
like *DataStreamerCacheUpdaters#Individual
Hi Denis,
It was not clear to me that we could do the update from within the
StreamReceiver without some sort of cache operation. Would we just use the
CacheEntry.setValue() method to do that? Something roughly like the
following?
Thanks!
public void receive(IgniteCache cache,
Collection>
Hi!
Why do you do this inside an invoke()?
All of this can be done just inside a receiver.
Can you get rid of the invoke and check that the deadlocks disappear?
Denis
Fri, Jun 29, 2018 at 17:24, breischl:
> That does seem to be what's happening, but we're only invoke()'ing on keys
> that were
That does seem to be what's happening, but we're only invoke()'ing on keys
that were passed into receive(), so that should not require going off-box.
Right?
Here's the relevant code...
@Override
public void receive(IgniteCache cache,
Collection> newEntries) throws IgniteException {
for
Your original stack trace shows a call to your custom stream receiver which
appears to itself call invoke(). I can only guess what your code does, but
it appears to be making a call off-node to something that is not returning.
Just found a bunch of these in my logs as well. Note this is showing
starvation in the system threadpool, not the datastreamer threadpool, but
perhaps they're related?
[2018-06-28T17:39:55,728Z](grid-timeout-worker-#23)([]) WARN - G - >>>
Possible starvation in striped pool.
Thread name:
Also...
>What you showed was that the stream receiver called invoke() and did not get an
answer, not a deadlock.
It's not that I'm getting back a null, it's that all the threads are blocked
waiting on the invoke() call, and no progress is being made. That sounds a
lot like a deadlock. I guess you
>our stream receiver called invoke() and that in turn did another invoke,
which was the actual bug.
So Ignite's invoke() implementation called itself?
>It was helpful when we did the invoke using a custom thread pool,
I'm not sure I understand the concept here. Is the idea to have an
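
To illustrate why a separate pool can matter, here is a small plain-Java analogy. It does not model Ignite's internals: PoolStarvationDemo and its pools are hypothetical. A task that blocks on a nested task submitted to the same exhausted pool never finishes, while the same nested call handed to a different pool completes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical demo, not Ignite code: nested blocking calls on one pool vs. two. */
class PoolStarvationDemo {
    /**
     * Runs an outer task on {@code outer} that blocks on a nested task submitted
     * to {@code inner}. Returns true if the outer task finished within the timeout.
     */
    static boolean nestedCallCompletes(ExecutorService outer, ExecutorService inner,
                                       long timeoutMillis) throws Exception {
        Callable<String> innerTask = () -> "done";
        // The outer task holds its worker thread while waiting for the inner result.
        Callable<String> outerTask = () -> inner.submit(innerTask).get();
        Future<String> result = outer.submit(outerTask);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // One thread, shared: the inner task queues behind the outer task forever.
        ExecutorService shared = Executors.newFixedThreadPool(1);
        boolean sharedOk = nestedCallCompletes(shared, shared, 1000);
        shared.shutdownNow(); // interrupt the stuck worker so the JVM can exit

        // Two pools: the inner task has its own thread, so the outer task returns.
        ExecutorService first = Executors.newFixedThreadPool(1);
        ExecutorService second = Executors.newFixedThreadPool(1);
        boolean separateOk = nestedCallCompletes(first, second, 1000);
        first.shutdown();
        second.shutdown();

        System.out.println("shared pool completed: " + sharedOk);
        System.out.println("separate pools completed: " + separateOk);
    }
}
```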
2.4 should be OK.
What you showed was that the stream receiver called invoke() and did not get an
answer, not a deadlock. Nothing looks particularly wrong there. When we
created this bug, it was our stream receiver that called invoke(), and that in
turn did another invoke, which was the actual bug.
It
Thanks Dave. I am using Ignite v2.4.0. Would a newer version potentially
help?
This problem seems to come and go. I didn't hit it for a few days, and now
we've hit it on two deployments in a row. It may be some sort of timing or
external factor that provokes it. The most recent case we hit the
"When receiver is invoked for key K, it’s holding the lock for K." is not
correct, at least in the 2.4 code.
When a custom stream receiver is called, the data streamer thread has a
read-lock preventing termination, and there is a real-lock on the topology,
but DataStreamerUpdateJob.call() does
In our case we're only using the receiver as you describe, to update the key
that it was invoked for. Our actual use case is that the incoming stream of
data sometimes sends us old data, which we want to discard rather than
cache. So the StreamReceiver examines the value already in the cache and
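
The discard-old-data rule described above could look roughly like this in plain Java. This is only a sketch: NewestWins, Versioned and the timestamp field are hypothetical, and a ConcurrentHashMap stands in for the cache, so it shows the comparison logic rather than the actual StreamReceiver.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch, not Ignite code: keep a value only if it is newer than the cached one. */
class NewestWins {
    /** Hypothetical value type: a payload plus the timestamp assigned by the source. */
    static final class Versioned {
        final long timestamp;
        final String payload;

        Versioned(long timestamp, String payload) {
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    /**
     * Stores {@code incoming} if the key is absent or {@code incoming} is strictly
     * newer than the cached value. Returns true if the value was stored.
     */
    static boolean offer(Map<String, Versioned> cache, String key, Versioned incoming) {
        boolean[] stored = {false};
        // compute() performs the read, compare and write as one step for this key
        // when the map is a ConcurrentHashMap.
        cache.compute(key, (k, existing) -> {
            if (existing == null || incoming.timestamp > existing.timestamp) {
                stored[0] = true;
                return incoming;
            }
            return existing; // stale or duplicate data: keep what is already cached
        });
        return stored[0];
    }

    public static void main(String[] args) {
        Map<String, Versioned> cache = new ConcurrentHashMap<>();
        System.out.println(offer(cache, "k", new Versioned(10, "first")));  // true
        System.out.println(offer(cache, "k", new Versioned(5, "stale")));   // false
        System.out.println(offer(cache, "k", new Versioned(20, "newer")));  // true
        System.out.println(cache.get("k").payload);                         // newer
    }
}
```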
Subject: RE: Deadlock during cache loading
Hi Stan,
Thanks for taking a look. I'm having trouble finding anywhere that it's
documented what I can or can't call inside a receiver. Is it just
put()/get() that are allowed?
Also, I noticed that the default StreamTransformer implementation calls
invoke
Hi Stan,
Thanks for taking a look. I'm having trouble finding anywhere that it's
documented what I can or can't call inside a receiver. Is it just
put()/get() that are allowed?
Also, I noticed that the default StreamTransformer implementation calls
invoke() from within a receiver. So is that
Hi,
Looks like you’re performing a cache operation (invoke()) from a StreamReceiver
– this is not allowed.
Check out this SO answer
https://stackoverflow.com/questions/43891757/closures-stuck-in-2-0-when-try-to-add-an-element-into-the-queue.
Stan
From: breischl
Sent: June 21, 2018 19:35
To: