Re: Operation block on Cluster recovery/rebalance.

2020-08-18 Thread John Smith
Hi Denis, for everyone's reference:
https://issues.apache.org/jira/browse/IGNITE-13372

On Mon, 17 Aug 2020 at 14:28, Denis Magda  wrote:

>> But on client reconnect, doesn't it mean it will still block until the
>> cluster is active even if I get new IgniteCache instance?
>
>
> No, the client will get an exception when it attempts to get an
> IgniteCache instance.
>
> -
> Denis

Re: Operation block on Cluster recovery/rebalance.

2020-08-17 Thread Denis Magda
>
> But on client reconnect, doesn't it mean it will still block until the
> cluster is active even if I get new IgniteCache instance?


No, the client will get an exception when it attempts to get an
IgniteCache instance.

-
Denis


On Fri, Aug 14, 2020 at 4:14 PM John Smith  wrote:

> Yeah, I can maybe use the Vert.x event bus or something to do this... But now I
> have to tie the Ignite instance to the IgniteCache repository I wrote.
>
> But on client reconnect, doesn't it mean it will still block until the
> cluster is active, even if I get a new IgniteCache instance?

Re: Operation block on Cluster recovery/rebalance.

2020-08-14 Thread John Smith
Yeah, I can maybe use the Vert.x event bus or something to do this... But now I
have to tie the Ignite instance to the IgniteCache repository I wrote.

But on client reconnect, doesn't it mean it will still block until the
cluster is active, even if I get a new IgniteCache instance?

On Fri, 14 Aug 2020 at 18:22, Denis Magda  wrote:

> @Evgenii Zhuravlev, @Ilya Kasnacheev, any thoughts on this?
>
> As a dirty workaround, you can update your cache references on client
> reconnect events. You will get an exception from calling
> ignite.cache(cacheName) while the cluster is not yet activated.
> Does this work for you?
>
> -
> Denis

Re: Operation block on Cluster recovery/rebalance.

2020-08-14 Thread Denis Magda
@Evgenii Zhuravlev, @Ilya Kasnacheev, any thoughts on this?

As a dirty workaround, you can update your cache references on client
reconnect events. You will get an exception from calling
ignite.cache(cacheName) while the cluster is not yet activated.
Does this work for you?
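
A minimal sketch of this workaround, written with JDK types only so that it runs without Ignite on the classpath. CacheHolder and its lookup supplier are hypothetical names, not Ignite APIs. In a real client the supplier would presumably be () -> ignite.cache(cacheName), and refresh() would be called from a listener for the client reconnect event (assumed here to be EventType.EVT_CLIENT_NODE_RECONNECTED, registered through ignite.events().localListen(...)). The sketch relies on the behavior described above: the lookup throws while the cluster is not activated, so a failed refresh leaves the holder empty.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Holds the current cache reference and swaps it on reconnect (hypothetical helper). */
final class CacheHolder<C> {
    // Stand-in for () -> ignite.cache(cacheName); assumed to throw while the cluster is inactive.
    private final Supplier<C> lookup;
    private final AtomicReference<C> current = new AtomicReference<>();

    CacheHolder(Supplier<C> lookup) {
        this.lookup = lookup;
    }

    /** Call on a client reconnect event. Returns true if a fresh reference was obtained. */
    boolean refresh() {
        try {
            C fresh = lookup.get();
            current.set(fresh);
            return fresh != null;
        } catch (RuntimeException e) {
            // Lookup failed (for example, cluster not activated yet): drop the stale reference.
            current.set(null);
            return false;
        }
    }

    /** Empty until a refresh succeeds, so callers can reject a request instead of blocking. */
    Optional<C> get() {
        return Optional.ofNullable(current.get());
    }
}
```

The repository would then read the reference through get() on every call rather than keeping the instance it was given at startup.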

-
Denis


On Fri, Aug 14, 2020 at 3:12 PM John Smith  wrote:

> Is there any workaround? I can't have an HTTP server block on all
> requests.
>
> 1- I need to figure out why I lose a server node every few weeks; rebooting
> the nodes then causes the inactive state until they are back.
>
> 2- Implement some kind of logic on the client side so the HTTP part doesn't
> block...
>
> Can an IgniteCache instance be notified of disconnect events, so I can maybe
> have the repository class set a flag to skip the operation?

Re: Operation block on Cluster recovery/rebalance.

2020-08-14 Thread John Smith
Is there any workaround? I can't have an HTTP server block on all requests.

1- I need to figure out why I lose a server node every few weeks; rebooting
the nodes then causes the inactive state until they are back.

2- Implement some kind of logic on the client side so the HTTP part doesn't
block...

Can an IgniteCache instance be notified of disconnect events, so I can maybe
have the repository class set a flag to skip the operation?
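
A minimal sketch of such a flag, using JDK types only. ClusterGate is a hypothetical helper, not an Ignite API: markDisconnected() and markReady() would be wired to whatever disconnect and reconnect notifications the client receives, and the repository would wrap each cache operation in run(...) so that requests are rejected immediately while the flag is down. Note the limitation: this only keeps new operations out of the blocking call; it does nothing for operations already in flight.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Fail-fast guard around cache operations (hypothetical helper). */
final class ClusterGate {
    // Starts closed: nothing runs until the client reports that the cluster is usable.
    private final AtomicBoolean ready = new AtomicBoolean(false);

    void markReady() {
        ready.set(true);
    }

    void markDisconnected() {
        ready.set(false);
    }

    boolean isReady() {
        return ready.get();
    }

    /** Runs the operation only if the gate is open; otherwise throws without calling it. */
    <T> T run(Supplier<T> operation) {
        if (!ready.get()) {
            throw new IllegalStateException("Cluster unavailable, skipping cache operation");
        }
        return operation.get();
    }
}
```

The HTTP handler can map the IllegalStateException to an error response instead of waiting on a worker thread.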


On Fri., Aug. 14, 2020, 5:17 p.m. Denis Magda,  wrote:

> My guess is that it's standard behavior for all operations (SQL, key-value,
> compute, etc.). But I'll let the maintainers of those modules clarify.
>
> -
> Denis

Re: Operation block on Cluster recovery/rebalance.

2020-08-14 Thread Denis Magda
My guess is that it's standard behavior for all operations (SQL, key-value,
compute, etc.). But I'll let the maintainers of those modules clarify.

-
Denis


On Fri, Aug 14, 2020 at 1:44 PM John Smith  wrote:

> Hi Denis, so just to understand: is it all operations or just the query?

Re: Operation block on Cluster recovery/rebalance.

2020-08-14 Thread John Smith
Hi Denis, so just to understand: is it all operations or just the query?

On Fri., Aug. 14, 2020, 12:53 p.m. Denis Magda,  wrote:

> John,
>
> Ok, we nailed it. That's the current expected behavior. Generally, I agree
> with you that the platform should support an option for operations to fail if
> the cluster is deactivated. Could you propose the change by starting a
> discussion on the dev list? You can point to this user list discussion for
> reference. Let me know if you need help with this.
>
> -
> Denis
>>> On Thu, Aug 13, 2020 at 6:37 AM John Smith 
>>> wrote:
>>>
>>>> The cache.query() call starts to block when the Ignite server nodes are being
>>>> restarted and there's no baseline topology yet. The server nodes do not
>>>> block. It's the client that blocks.
>>>>
>>>> The dump files are from the server nodes. The screenshot is from the
>>>> client app, taken with the YourKit profiler; on the client side the threads
>>>> are marked in red in YourKit.
>>>>
>>>> The app is simple: it takes an HTTP request, runs a cache SQL query on Ignite
>>>> and, if that succeeds, does a put back to Ignite.
>>>>
>>>> The client disconnected exception only happens when all server nodes in
>>>> the cluster are down. The blockage only happens when the cluster is trying
>>>> to establish the baseline topology.
>>>>
>>>> On Wed., Aug. 12, 2020, 6:28 p.m. Denis Magda wrote:
>>>>
>>>>> John,
>>>>>
>>>>> I don't see any signs of an application-caused deadlock in the thread
>>>>> dumps. Please elaborate on the following:
>>>>>
>>>>>> 7- Restart 1st node, run operation, operation fails with
>>>>>> ClientDisconectedException but application still able to complete its
>>>>>> request.
>>>>>
>>>>>
>>>>> What's the IP address of the server node the client app uses to join
>>>>> the cluster? If that's not the address of the 1st node, that is already
>>>>> restarted, the

Re: Operation block on Cluster recovery/rebalance.

2020-08-14 Thread Denis Magda
John,

Ok, we nailed it. That's the current expected behavior. Generally, I agree
with you that the platform should support an option for operations to fail if
the cluster is deactivated. Could you propose the change by starting a
discussion on the dev list? You can point to this user list discussion for
reference. Let me know if you need help with this.

-
Denis


On Thu, Aug 13, 2020 at 5:55 PM John Smith  wrote:

> No, I reuse the instance. The cache instance is created once at startup of
> the application and I pass it to my "repository" class:
>
> public abstract class AbstractIgniteRepository implements CacheRepository {
>     public final long DEFAULT_OPERATION_TIMEOUT = 2000;
>
>     private Vertx vertx;
>     private IgniteCache cache;
>
>     AbstractIgniteRepository(Vertx vertx, IgniteCache cache) {
>         this.vertx = vertx;
>         this.cache = cache;
>     }
>
>     ...
>
>     Future<List<JsonArray>> query(final String sql, final long timeoutMs, final Object... args) {
>         final Promise<List<JsonArray>> promise = Promise.promise();
>
>         // This fires if the blocking code below doesn't complete in time.
>         vertx.setTimer(timeoutMs, l -> {
>             promise.tryFail(new TimeoutException("Cache operation did not complete within: " + timeoutMs + " Ms."));
>         });
>
>         vertx.<List<JsonArray>>executeBlocking(code -> {
>             SqlFieldsQuery query = new SqlFieldsQuery(sql).setArgs(args);
>             query.setTimeout((int) timeoutMs, TimeUnit.MILLISECONDS);
>
>             try (QueryCursor<List<?>> cursor = cache.query(query)) { // <--- BLOCKS HERE.
>                 List<JsonArray> rows = new ArrayList<>();
>                 Iterator<List<?>> iterator = cursor.iterator();
>
>                 while (iterator.hasNext()) {
>                     List<?> currentRow = iterator.next();
>                     JsonArray row = new JsonArray();
>
>                     currentRow.forEach(o -> row.add(o));
>
>                     rows.add(row);
>                 }
>
>                 code.complete(rows);
>             } catch (Exception ex) {
>                 code.fail(ex);
>             }
>         }, result -> {
>             if (result.succeeded()) {
>                 promise.tryComplete(result.result());
>             } else {
>                 promise.tryFail(result.cause());
>             }
>         });
>
>         return promise.future();
>     }
>
>     public <T> T cache() {
>         return (T) cache;
>     }
> }
>
>
>
> On Thu, 13 Aug 2020 at 16:29, Denis Magda  wrote:
>
>> I've created a simple test and always getting the exception below on an
>> attempt to get a reference to an IgniteCache instance in cases when the
>> cluster is not activated:
>>
>> *Exception in thread "main" class org.apache.ignite.IgniteException: Can
>> not perform the operation because the cluster is inactive. Note, that the
>> cluster is considered inactive by default if Ignite Persistent Store is
>> used to let all the nodes join the cluster. To activate the cluster call
>> Ignite.active(true)*
>>
>> Are you trying to get a new IgniteCache reference whenever the client
>> reconnects successfully to the cluster? My guts feel that currently, Ignite
>> verifies the activation status and generates the exception above whenever
>> you're getting a reference to an IgniteCache or IgniteCompute. But once you
>> got those references and try to run some operations then those get stuck if
>> the cluster is not activated.
>> -
>> Denis
>>
>>
>> On Thu, Aug 13, 2020 at 6:37 AM John Smith 
>> wrote:
>>
>>> The cache.query() starts to block when ignite server nodes are being
>>> restarted and there's no baseline topology yet. The server nodes do not
>>> block. It's the client that blocks.
>>>
>>> The dumpfiles are of the server nodes. The screen shot is from the
>>> client app using your kit profiler on the client side the threads are
>>> marked as red on your kit.
>>>
>>> The app is simple, make http request, it runs cache Sql query on ignite
>>> and if it succeeds does a put back to ignite.
>>>
>>> The Client disconnected exception only happens when all server nodes in
>>> the cluster are down. The blockage only happens when the cluster is trying
>>> to establish baseline topology.
>>>
>>> On Wed., Aug. 12, 2020, 6:28 p.m. Denis Magda, 
>>> wrote:
>>>
>>>> John,
>>>>
>>>> I don't see any traits of an application-caused deadlock in the thread
>>>> dumps. Please elaborate on the following:
>>>>
>>>> 7- Restart 1st node, run operation, operation fails with
>>>>> ClientDisconectedException but application still able to complete it's
>>>>> request.
>>>>
>>>>
>>>> What's the IP address of the server node the client app uses to join
>>>> the cluster? If that's not the address of the 1st node, that is already
>>>> restarted, then the client couldn't join the cluster and it's expected that
>>>> it fails with the ClientDisconnectedException.
>>>>
>>>> 8- Start 2nd node, run operation, from here on all operations just
>>>>> block.
>>>>
>>>>
>>>> Are the operations unblocked and completed successfu

Re: Operation block on Cluster recovery/rebalance.

2020-08-13 Thread John Smith
No, I reuse the instance. The cache instance is created once at startup of
the application and I pass it to my "repository" class:

public abstract class AbstractIgniteRepository implements CacheRepository {
    public final long DEFAULT_OPERATION_TIMEOUT = 2000;

    private Vertx vertx;
    private IgniteCache cache;

    AbstractIgniteRepository(Vertx vertx, IgniteCache cache) {
        this.vertx = vertx;
        this.cache = cache;
    }

    ...

    Future<List<JsonArray>> query(final String sql, final long timeoutMs, final Object... args) {
        final Promise<List<JsonArray>> promise = Promise.promise();

        // This fires if the blocking code below doesn't complete in time.
        vertx.setTimer(timeoutMs, l -> {
            promise.tryFail(new TimeoutException("Cache operation did not complete within: " + timeoutMs + " Ms."));
        });

        vertx.<List<JsonArray>>executeBlocking(code -> {
            SqlFieldsQuery query = new SqlFieldsQuery(sql).setArgs(args);
            query.setTimeout((int) timeoutMs, TimeUnit.MILLISECONDS);

            try (QueryCursor<List<?>> cursor = cache.query(query)) { // <--- BLOCKS HERE.
                List<JsonArray> rows = new ArrayList<>();
                Iterator<List<?>> iterator = cursor.iterator();

                while (iterator.hasNext()) {
                    List<?> currentRow = iterator.next();
                    JsonArray row = new JsonArray();

                    currentRow.forEach(o -> row.add(o));

                    rows.add(row);
                }

                code.complete(rows);
            } catch (Exception ex) {
                code.fail(ex);
            }
        }, result -> {
            if (result.succeeded()) {
                promise.tryComplete(result.result());
            } else {
                promise.tryFail(result.cause());
            }
        });

        return promise.future();
    }

    public <T> T cache() {
        return (T) cache;
    }
}
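
The timer-plus-worker pattern in the repository above can be reduced to plain java.util.concurrent. The following is only a minimal sketch under stated assumptions: `BoundedCall` and `callWithTimeout` are hypothetical names, and the sleeping lambda is a stand-in for a `cache.query(...)` call that hangs. It shows that the caller gets control back after the timeout even though the underlying call has not returned.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedCall {
    // Daemon threads, so a call that never returns cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-call");
        t.setDaemon(true);
        return t;
    });

    /** Runs blockingCall on the pool and waits at most timeoutMs for its result. */
    static <T> T callWithTimeout(Callable<T> blockingCall, long timeoutMs) throws Exception {
        Future<T> future = POOL.submit(blockingCall);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Best effort: interrupts the worker, which the call may or may not honor.
            future.cancel(true);
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        String fast = callWithTimeout(() -> "fast result", 1_000);
        System.out.println(fast);
        try {
            // Hypothetical stand-in for a cache.query(...) that hangs.
            callWithTimeout(() -> { Thread.sleep(60_000); return "never"; }, 200);
        } catch (TimeoutException e) {
            System.out.println("timed out, caller is free to respond");
        }
    }
}
```

Note the limitation this shares with the Vert.x version: if the blocked call ignores interruption, its worker thread stays occupied until the call returns, so a long outage can still accumulate stuck threads.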



On Thu, 13 Aug 2020 at 16:29, Denis Magda  wrote:

> I've created a simple test and always getting the exception below on an
> attempt to get a reference to an IgniteCache instance in cases when the
> cluster is not activated:
>
> *Exception in thread "main" class org.apache.ignite.IgniteException: Can
> not perform the operation because the cluster is inactive. Note, that the
> cluster is considered inactive by default if Ignite Persistent Store is
> used to let all the nodes join the cluster. To activate the cluster call
> Ignite.active(true)*
>
> Are you trying to get a new IgniteCache reference whenever the client
> reconnects successfully to the cluster? My guts feel that currently, Ignite
> verifies the activation status and generates the exception above whenever
> you're getting a reference to an IgniteCache or IgniteCompute. But once you
> got those references and try to run some operations then those get stuck if
> the cluster is not activated.
> -
> Denis
>
>
> On Thu, Aug 13, 2020 at 6:37 AM John Smith  wrote:
>
>> The cache.query() starts to block when ignite server nodes are being
>> restarted and there's no baseline topology yet. The server nodes do not
>> block. It's the client that blocks.
>>
>> The dumpfiles are of the server nodes. The screen shot is from the client
>> app using your kit profiler on the client side the threads are marked as
>> red on your kit.
>>
>> The app is simple, make http request, it runs cache Sql query on ignite
>> and if it succeeds does a put back to ignite.
>>
>> The Client disconnected exception only happens when all server nodes in
>> the cluster are down. The blockage only happens when the cluster is trying
>> to establish baseline topology.
>>
>> On Wed., Aug. 12, 2020, 6:28 p.m. Denis Magda,  wrote:
>>
>>> John,
>>>
>>> I don't see any traits of an application-caused deadlock in the thread
>>> dumps. Please elaborate on the following:
>>>
>>> 7- Restart 1st node, run operation, operation fails with
>>>> ClientDisconectedException but application still able to complete it's
>>>> request.
>>>
>>>
>>> What's the IP address of the server node the client app uses to join the
>>> cluster? If that's not the address of the 1st node, that is already
>>> restarted, then the client couldn't join the cluster and it's expected that
>>> it fails with the ClientDisconnectedException.
>>>
>>> 8- Start 2nd node, run operation, from here on all operations just block.
>>>
>>>
>>> Are the operations unblocked and completed successfully when the third
>>> node joins the cluster and the cluster gets activated automatically?
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Wed, Aug 12, 2020 at 11:08 AM John Smith 
>>> wrote:
>>>
>>>> Ok Denis here they are...
>>>>
>>>> 3 nodes and I capture a yourlit screenshot of what it thinks are
>>>> deadlocks on the client app.
>>>>
>>>>
>>>> https://www.dropbox.com/sh/2cxjkngvx0ubw3b/AADa--HQg-rRsY3RBo2vQeJ9a?dl=0
>>>>
>>>> On Wed, 12 Aug 2020 at 11:07, John Smith
>>>> wrote:
>>>>
>>>>> Hi Denis. I will asap but you I think you were right it is the query
>>>>> that blocks.
>>>>>
>>>>> My application first first

Re: Operation block on Cluster recovery/rebalance.

2020-08-13 Thread Denis Magda
I've created a simple test and always get the exception below on an
attempt to get a reference to an IgniteCache instance when the
cluster is not activated:

*Exception in thread "main" class org.apache.ignite.IgniteException: Can
not perform the operation because the cluster is inactive. Note, that the
cluster is considered inactive by default if Ignite Persistent Store is
used to let all the nodes join the cluster. To activate the cluster call
Ignite.active(true)*

Are you trying to get a new IgniteCache reference whenever the client
reconnects successfully to the cluster? My gut feeling is that Ignite
currently verifies the activation status and throws the exception above
whenever you get a reference to an IgniteCache or IgniteCompute. But once
you have those references and try to run operations, those operations get
stuck if the cluster is not activated.
-
Denis
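
One way to act on the observation above is to check the activation state before touching an existing cache reference. This is only a sketch under stated assumptions: `ActivationGate` is a hypothetical helper, the state check is passed in as a `BooleanSupplier`, and the thread does not confirm that such a check returns promptly on a client while the cluster is recovering.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

final class ActivationGate {
    private final BooleanSupplier clusterActive;

    ActivationGate(BooleanSupplier clusterActive) {
        this.clusterActive = clusterActive;
    }

    /** Runs the operation only if the cluster reports itself active; otherwise fails fast. */
    <T> T run(Supplier<T> operation) {
        if (!clusterActive.getAsBoolean()) {
            throw new IllegalStateException("Cluster is inactive, operation skipped");
        }
        // The state can still change between the check and the call, so this
        // narrows the window for a stuck operation rather than closing it.
        return operation.get();
    }

    public static void main(String[] args) {
        AtomicBoolean active = new AtomicBoolean(false); // simulated cluster state
        // Assumption: with Ignite the supplier might be () -> ignite.cluster().active()
        ActivationGate gate = new ActivationGate(active::get);
        try {
            gate.run(() -> "rows");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        active.set(true);
        String rows = gate.run(() -> "rows");
        System.out.println(rows);
    }
}
```

Because of the remaining race between the check and the call, a gate like this would likely be combined with a timeout rather than replace it.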


On Thu, Aug 13, 2020 at 6:37 AM John Smith  wrote:

> The cache.query() starts to block when ignite server nodes are being
> restarted and there's no baseline topology yet. The server nodes do not
> block. It's the client that blocks.
>
> The dumpfiles are of the server nodes. The screen shot is from the client
> app using your kit profiler on the client side the threads are marked as
> red on your kit.
>
> The app is simple, make http request, it runs cache Sql query on ignite
> and if it succeeds does a put back to ignite.
>
> The Client disconnected exception only happens when all server nodes in
> the cluster are down. The blockage only happens when the cluster is trying
> to establish baseline topology.
>
> On Wed., Aug. 12, 2020, 6:28 p.m. Denis Magda,  wrote:
>
>> John,
>>
>> I don't see any traits of an application-caused deadlock in the thread
>> dumps. Please elaborate on the following:
>>
>> 7- Restart 1st node, run operation, operation fails with
>>> ClientDisconectedException but application still able to complete it's
>>> request.
>>
>>
>> What's the IP address of the server node the client app uses to join the
>> cluster? If that's not the address of the 1st node, that is already
>> restarted, then the client couldn't join the cluster and it's expected that
>> it fails with the ClientDisconnectedException.
>>
>> 8- Start 2nd node, run operation, from here on all operations just block.
>>
>>
>> Are the operations unblocked and completed successfully when the third
>> node joins the cluster and the cluster gets activated automatically?
>>
>> -
>> Denis
>>
>>
>> On Wed, Aug 12, 2020 at 11:08 AM John Smith 
>> wrote:
>>
>>> Ok Denis here they are...
>>>
>>> 3 nodes and I capture a yourlit screenshot of what it thinks are
>>> deadlocks on the client app.
>>>
>>> https://www.dropbox.com/sh/2cxjkngvx0ubw3b/AADa--HQg-rRsY3RBo2vQeJ9a?dl=0
>>>
>>> On Wed, 12 Aug 2020 at 11:07, John Smith  wrote:
>>>
 Hi Denis. I will asap but you I think you were right it is the query
 that blocks.

 My application first first runs a select on the cache and then does a
 put to cache.

 On Tue, 11 Aug 2020 at 19:22, Denis Magda  wrote:

> John,
>
> It sounds like a deadlock caused by the application logic. Is there
> any chance that the operation you run on step 8 accesses several keys in
> one order while the other operations work with the same keys but in a
> different order. The deadlocks are possible when you use Ignite 
> Transaction
> API or simply execute bulk operations such as cache.readAll() or
> cache.writeAll(..).
>
> Please take and attach thread dumps from all the cluster nodes for
> analysis if we need to dig deeper.
>
> -
> Denis
>
>
> On Mon, Aug 10, 2020 at 6:23 PM John Smith 
> wrote:
>
>> Hi Denis, I think you are right. It's the query that blocks the other
>> k/v operations are ok.
>>
>> Any thoughts on this?
>>
>> On Mon, 10 Aug 2020 at 15:28, John Smith 
>> wrote:
>>
>>> I tried with 2.8.1, same issue. Operations block indefinitely...
>>>
>>> 1- Start 3 node cluster
>>> 2- Start client application client = true with Ignition.start()
>>> 3- Run some cache operations, everything ok...
>>> 4- Shut down one node, run operation, still ok
>>> 5- Shut down 2nd node, run operation, still ok
>>> 6- Shut down 3rd node, run operation, still ok... Operations start
>>> failing with ClientDisconectedException...
>>> 7- Restart 1st node, run operation, operation fails
>>> with ClientDisconectedException but application still able to complete 
>>> it's
>>> request.
>>> 8- Start 2nd node, run operation, from here on all operations just
>>> block.
>>>
>>> Basically the client application is an HTTP Server on each HTTP
>>> request does cache exception.
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, 7 Aug 2020 at 19:46, John Smith 
>>> wrote:
>>>
 No, everything blocks... Also using 2.7.0 jus

Re: Operation block on Cluster recovery/rebalance.

2020-08-13 Thread John Smith
The cache.query() starts to block when the Ignite server nodes are being
restarted and there's no baseline topology yet. The server nodes do not
block. It's the client that blocks.

The dump files are from the server nodes. The screenshot is from the client
app, taken with the YourKit profiler on the client side; the threads are
marked red in YourKit.

The app is simple: on each HTTP request it runs a cache SQL query on Ignite
and, if it succeeds, does a put back to Ignite.

The client disconnected exception only happens when all server nodes in the
cluster are down. The blockage only happens when the cluster is trying to
establish baseline topology.

On Wed., Aug. 12, 2020, 6:28 p.m. Denis Magda,  wrote:

> John,
>
> I don't see any traits of an application-caused deadlock in the thread
> dumps. Please elaborate on the following:
>
> 7- Restart 1st node, run operation, operation fails with
>> ClientDisconectedException but application still able to complete it's
>> request.
>
>
> What's the IP address of the server node the client app uses to join the
> cluster? If that's not the address of the 1st node, that is already
> restarted, then the client couldn't join the cluster and it's expected that
> it fails with the ClientDisconnectedException.
>
> 8- Start 2nd node, run operation, from here on all operations just block.
>
>
> Are the operations unblocked and completed successfully when the third
> node joins the cluster and the cluster gets activated automatically?
>
> -
> Denis
>
>
> On Wed, Aug 12, 2020 at 11:08 AM John Smith 
> wrote:
>
>> Ok Denis here they are...
>>
>> 3 nodes and I capture a yourlit screenshot of what it thinks are
>> deadlocks on the client app.
>>
>> https://www.dropbox.com/sh/2cxjkngvx0ubw3b/AADa--HQg-rRsY3RBo2vQeJ9a?dl=0
>>
>> On Wed, 12 Aug 2020 at 11:07, John Smith  wrote:
>>
>>> Hi Denis. I will asap but you I think you were right it is the query
>>> that blocks.
>>>
>>> My application first first runs a select on the cache and then does a
>>> put to cache.
>>>
>>> On Tue, 11 Aug 2020 at 19:22, Denis Magda  wrote:
>>>
 John,

 It sounds like a deadlock caused by the application logic. Is there any
 chance that the operation you run on step 8 accesses several keys in one
 order while the other operations work with the same keys but in a different
 order. The deadlocks are possible when you use Ignite Transaction API or
 simply execute bulk operations such as cache.readAll() or
 cache.writeAll(..).

 Please take and attach thread dumps from all the cluster nodes for
 analysis if we need to dig deeper.

 -
 Denis


 On Mon, Aug 10, 2020 at 6:23 PM John Smith 
 wrote:

> Hi Denis, I think you are right. It's the query that blocks the other
> k/v operations are ok.
>
> Any thoughts on this?
>
> On Mon, 10 Aug 2020 at 15:28, John Smith 
> wrote:
>
>> I tried with 2.8.1, same issue. Operations block indefinitely...
>>
>> 1- Start 3 node cluster
>> 2- Start client application client = true with Ignition.start()
>> 3- Run some cache operations, everything ok...
>> 4- Shut down one node, run operation, still ok
>> 5- Shut down 2nd node, run operation, still ok
>> 6- Shut down 3rd node, run operation, still ok... Operations start
>> failing with ClientDisconectedException...
>> 7- Restart 1st node, run operation, operation fails
>> with ClientDisconectedException but application still able to complete 
>> it's
>> request.
>> 8- Start 2nd node, run operation, from here on all operations just
>> block.
>>
>> Basically the client application is an HTTP Server on each HTTP
>> request does cache exception.
>>
>>
>>
>>
>>
>>
>> On Fri, 7 Aug 2020 at 19:46, John Smith 
>> wrote:
>>
>>> No, everything blocks... Also using 2.7.0 just in case.
>>>
>>> Only time I get exception is if the cluster is completely off, then
>>> I get ClientDisconectedException...
>>>
>>> On Fri, 7 Aug 2020 at 18:52, Denis Magda  wrote:
>>>
 If I'm not mistaken, key-value operations (cache.get/put) and
 compute calls fail with an exception if the cluster is deactivated. Do
 those fail on your end?

 As for the async and SQL operations, let's see what other community
 members say.

 -
 Denis


 On Fri, Aug 7, 2020 at 1:06 PM John Smith 
 wrote:

> Hi any thoughts on this?
>
> On Thu, 6 Aug 2020 at 23:33, John Smith 
> wrote:
>
>> Here is another example where it blocks.
>>
>> SqlFieldsQuery query = new SqlFieldsQuery(
>> "select * from my_table")
>> .setArgs(providerId, carrierCode);
>> query.setTimeout(1000, TimeUnit.MILLISECONDS);
>>
>> try (QueryCursor> cursor = cache

Re: Operation block on Cluster recovery/rebalance.

2020-08-12 Thread Denis Magda
John,

I don't see any signs of an application-caused deadlock in the thread
dumps. Please elaborate on the following:

7- Restart 1st node, run operation, operation fails with
> ClientDisconectedException but application still able to complete it's
> request.


What's the IP address of the server node the client app uses to join the
cluster? If that's not the address of the 1st node, which has already been
restarted, then the client can't join the cluster and it's expected that
it fails with the ClientDisconnectedException.

8- Start 2nd node, run operation, from here on all operations just block.


Are the operations unblocked and completed successfully when the third node
joins the cluster and the cluster gets activated automatically?

-
Denis
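
The timeline being asked about (steps 4 to 8, then the third node joining) can be replayed with a toy model. This is a simplified illustration, not Ignite code: `BaselineModel` is hypothetical, it assumes a baseline of three nodes is already established, and it assumes the cluster only becomes inactive after a full shutdown and re-activates once every baseline node is back.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class BaselineModel {
    private final Set<String> baseline;
    private final Set<String> online = new HashSet<>();
    private boolean active;

    BaselineModel(Set<String> baselineNodeIds) {
        this.baseline = new HashSet<>(baselineNodeIds);
    }

    /** A node joins; activation happens only when all baseline nodes are online. */
    void nodeJoined(String nodeId) {
        online.add(nodeId);
        if (!active && online.containsAll(baseline)) {
            active = true;
        }
    }

    /** A node leaves; the model deactivates only when no node is left. */
    void nodeLeft(String nodeId) {
        online.remove(nodeId);
        if (online.isEmpty()) {
            active = false;
        }
    }

    boolean isActive() {
        return active;
    }

    public static void main(String[] args) {
        BaselineModel cluster = new BaselineModel(new HashSet<>(Arrays.asList("n1", "n2", "n3")));
        for (String n : Arrays.asList("n1", "n2", "n3")) cluster.nodeJoined(n);
        for (String n : Arrays.asList("n1", "n2", "n3")) cluster.nodeLeft(n);   // steps 4 to 6
        cluster.nodeJoined("n1");                                               // step 7
        cluster.nodeJoined("n2");                                               // step 8
        System.out.println("after step 8 active=" + cluster.isActive());
        cluster.nodeJoined("n3");
        System.out.println("after third node active=" + cluster.isActive());
    }
}
```

In this model, step 8 falls in the window where servers are reachable but the cluster is not yet active, which is where the thread reports operations blocking.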


On Wed, Aug 12, 2020 at 11:08 AM John Smith  wrote:

> Ok Denis here they are...
>
> 3 nodes and I capture a yourlit screenshot of what it thinks are deadlocks
> on the client app.
>
> https://www.dropbox.com/sh/2cxjkngvx0ubw3b/AADa--HQg-rRsY3RBo2vQeJ9a?dl=0
>
> On Wed, 12 Aug 2020 at 11:07, John Smith  wrote:
>
>> Hi Denis. I will asap but you I think you were right it is the query that
>> blocks.
>>
>> My application first first runs a select on the cache and then does a put
>> to cache.
>>
>> On Tue, 11 Aug 2020 at 19:22, Denis Magda  wrote:
>>
>>> John,
>>>
>>> It sounds like a deadlock caused by the application logic. Is there any
>>> chance that the operation you run on step 8 accesses several keys in one
>>> order while the other operations work with the same keys but in a different
>>> order. The deadlocks are possible when you use Ignite Transaction API or
>>> simply execute bulk operations such as cache.readAll() or
>>> cache.writeAll(..).
>>>
>>> Please take and attach thread dumps from all the cluster nodes for
>>> analysis if we need to dig deeper.
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Mon, Aug 10, 2020 at 6:23 PM John Smith 
>>> wrote:
>>>
 Hi Denis, I think you are right. It's the query that blocks the other
 k/v operations are ok.

 Any thoughts on this?

 On Mon, 10 Aug 2020 at 15:28, John Smith 
 wrote:

> I tried with 2.8.1, same issue. Operations block indefinitely...
>
> 1- Start 3 node cluster
> 2- Start client application client = true with Ignition.start()
> 3- Run some cache operations, everything ok...
> 4- Shut down one node, run operation, still ok
> 5- Shut down 2nd node, run operation, still ok
> 6- Shut down 3rd node, run operation, still ok... Operations start
> failing with ClientDisconectedException...
> 7- Restart 1st node, run operation, operation fails
> with ClientDisconectedException but application still able to complete 
> it's
> request.
> 8- Start 2nd node, run operation, from here on all operations just
> block.
>
> Basically the client application is an HTTP Server on each HTTP
> request does cache exception.
>
>
>
>
>
>
> On Fri, 7 Aug 2020 at 19:46, John Smith 
> wrote:
>
>> No, everything blocks... Also using 2.7.0 just in case.
>>
>> Only time I get exception is if the cluster is completely off, then I
>> get ClientDisconectedException...
>>
>> On Fri, 7 Aug 2020 at 18:52, Denis Magda  wrote:
>>
>>> If I'm not mistaken, key-value operations (cache.get/put) and
>>> compute calls fail with an exception if the cluster is deactivated. Do
>>> those fail on your end?
>>>
>>> As for the async and SQL operations, let's see what other community
>>> members say.
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Fri, Aug 7, 2020 at 1:06 PM John Smith 
>>> wrote:
>>>
 Hi any thoughts on this?

 On Thu, 6 Aug 2020 at 23:33, John Smith 
 wrote:

> Here is another example where it blocks.
>
> SqlFieldsQuery query = new SqlFieldsQuery(
> "select * from my_table")
> .setArgs(providerId, carrierCode);
> query.setTimeout(1000, TimeUnit.MILLISECONDS);
>
> try (QueryCursor> cursor = cache.query(query))
>
> cache.query just blocks even with the timeout set.
>
> Is there a way to timeout and at least have the application
> continue and respond with an appropriate message?
>
>
>
> On Thu, 6 Aug 2020 at 23:06, John Smith 
> wrote:
>
>> Hi running 2.7.0
>>
>> When I reboot a node and it begins to rejoin the cluster or the
>> cluster is not yet activated with baseline topology operations seem 
>> to
>> block forever, operations that are supposed to return IgniteFuture. 
>> I.e:
>> putAsync, getAsync etc... They just block, until the cluster 
>> resolves it's
>> state.
>>
>>
>>


Re: Operation block on Cluster recovery/rebalance.

2020-08-12 Thread John Smith
Ok Denis here they are...

3 nodes, and I captured a YourKit screenshot of what it thinks are deadlocks
on the client app.

https://www.dropbox.com/sh/2cxjkngvx0ubw3b/AADa--HQg-rRsY3RBo2vQeJ9a?dl=0

On Wed, 12 Aug 2020 at 11:07, John Smith  wrote:

> Hi Denis. I will asap but you I think you were right it is the query that
> blocks.
>
> My application first first runs a select on the cache and then does a put
> to cache.
>
> On Tue, 11 Aug 2020 at 19:22, Denis Magda  wrote:
>
>> John,
>>
>> It sounds like a deadlock caused by the application logic. Is there any
>> chance that the operation you run on step 8 accesses several keys in one
>> order while the other operations work with the same keys but in a different
>> order. The deadlocks are possible when you use Ignite Transaction API or
>> simply execute bulk operations such as cache.readAll() or
>> cache.writeAll(..).
>>
>> Please take and attach thread dumps from all the cluster nodes for
>> analysis if we need to dig deeper.
>>
>> -
>> Denis
>>
>>
>> On Mon, Aug 10, 2020 at 6:23 PM John Smith 
>> wrote:
>>
>>> Hi Denis, I think you are right. It's the query that blocks the other
>>> k/v operations are ok.
>>>
>>> Any thoughts on this?
>>>
>>> On Mon, 10 Aug 2020 at 15:28, John Smith  wrote:
>>>
 I tried with 2.8.1, same issue. Operations block indefinitely...

 1- Start 3 node cluster
 2- Start client application client = true with Ignition.start()
 3- Run some cache operations, everything ok...
 4- Shut down one node, run operation, still ok
 5- Shut down 2nd node, run operation, still ok
 6- Shut down 3rd node, run operation, still ok... Operations start
 failing with ClientDisconectedException...
 7- Restart 1st node, run operation, operation fails
 with ClientDisconectedException but application still able to complete it's
 request.
 8- Start 2nd node, run operation, from here on all operations just
 block.

 Basically the client application is an HTTP Server on each HTTP request
 does cache exception.






 On Fri, 7 Aug 2020 at 19:46, John Smith  wrote:

> No, everything blocks... Also using 2.7.0 just in case.
>
> Only time I get exception is if the cluster is completely off, then I
> get ClientDisconectedException...
>
> On Fri, 7 Aug 2020 at 18:52, Denis Magda  wrote:
>
>> If I'm not mistaken, key-value operations (cache.get/put) and compute
>> calls fail with an exception if the cluster is deactivated. Do those fail
>> on your end?
>>
>> As for the async and SQL operations, let's see what other community
>> members say.
>>
>> -
>> Denis
>>
>>
>> On Fri, Aug 7, 2020 at 1:06 PM John Smith 
>> wrote:
>>
>>> Hi any thoughts on this?
>>>
>>> On Thu, 6 Aug 2020 at 23:33, John Smith 
>>> wrote:
>>>
 Here is another example where it blocks.

 SqlFieldsQuery query = new SqlFieldsQuery(
 "select * from my_table")
 .setArgs(providerId, carrierCode);
 query.setTimeout(1000, TimeUnit.MILLISECONDS);

 try (QueryCursor> cursor = cache.query(query))

 cache.query just blocks even with the timeout set.

 Is there a way to timeout and at least have the application
 continue and respond with an appropriate message?



 On Thu, 6 Aug 2020 at 23:06, John Smith 
 wrote:

> Hi running 2.7.0
>
> When I reboot a node and it begins to rejoin the cluster or the
> cluster is not yet activated with baseline topology operations seem to
> block forever, operations that are supposed to return IgniteFuture. 
> I.e:
> putAsync, getAsync etc... They just block, until the cluster resolves 
> it's
> state.
>
>
>


Re: Operation block on Cluster recovery/rebalance.

2020-08-12 Thread John Smith
Hi Denis. I will ASAP, but I think you were right: it is the query that
blocks.

My application first runs a select on the cache and then does a put
to the cache.

On Tue, 11 Aug 2020 at 19:22, Denis Magda  wrote:

> John,
>
> It sounds like a deadlock caused by the application logic. Is there any
> chance that the operation you run on step 8 accesses several keys in one
> order while the other operations work with the same keys but in a different
> order. The deadlocks are possible when you use Ignite Transaction API or
> simply execute bulk operations such as cache.readAll() or
> cache.writeAll(..).
>
> Please take and attach thread dumps from all the cluster nodes for
> analysis if we need to dig deeper.
>
> -
> Denis
>
>
> On Mon, Aug 10, 2020 at 6:23 PM John Smith  wrote:
>
>> Hi Denis, I think you are right. It's the query that blocks the other k/v
>> operations are ok.
>>
>> Any thoughts on this?
>>
>> On Mon, 10 Aug 2020 at 15:28, John Smith  wrote:
>>
>>> I tried with 2.8.1, same issue. Operations block indefinitely...
>>>
>>> 1- Start 3 node cluster
>>> 2- Start client application client = true with Ignition.start()
>>> 3- Run some cache operations, everything ok...
>>> 4- Shut down one node, run operation, still ok
>>> 5- Shut down 2nd node, run operation, still ok
>>> 6- Shut down 3rd node, run operation, still ok... Operations start
>>> failing with ClientDisconectedException...
>>> 7- Restart 1st node, run operation, operation fails
>>> with ClientDisconectedException but application still able to complete it's
>>> request.
>>> 8- Start 2nd node, run operation, from here on all operations just block.
>>>
>>> Basically the client application is an HTTP Server on each HTTP request
>>> does cache exception.
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, 7 Aug 2020 at 19:46, John Smith  wrote:
>>>
 No, everything blocks... Also using 2.7.0 just in case.

 Only time I get exception is if the cluster is completely off, then I
 get ClientDisconectedException...

 On Fri, 7 Aug 2020 at 18:52, Denis Magda  wrote:

> If I'm not mistaken, key-value operations (cache.get/put) and compute
> calls fail with an exception if the cluster is deactivated. Do those fail
> on your end?
>
> As for the async and SQL operations, let's see what other community
> members say.
>
> -
> Denis
>
>
> On Fri, Aug 7, 2020 at 1:06 PM John Smith 
> wrote:
>
>> Hi any thoughts on this?
>>
>> On Thu, 6 Aug 2020 at 23:33, John Smith 
>> wrote:
>>
>>> Here is another example where it blocks.
>>>
>>> SqlFieldsQuery query = new SqlFieldsQuery(
>>> "select * from my_table")
>>> .setArgs(providerId, carrierCode);
>>> query.setTimeout(1000, TimeUnit.MILLISECONDS);
>>>
>>> try (QueryCursor> cursor = cache.query(query))
>>>
>>> cache.query just blocks even with the timeout set.
>>>
>>> Is there a way to timeout and at least have the application continue
>>> and respond with an appropriate message?
>>>
>>>
>>>
>>> On Thu, 6 Aug 2020 at 23:06, John Smith 
>>> wrote:
>>>
 Hi running 2.7.0

 When I reboot a node and it begins to rejoin the cluster or the
 cluster is not yet activated with baseline topology operations seem to
 block forever, operations that are supposed to return IgniteFuture. 
 I.e:
 putAsync, getAsync etc... They just block, until the cluster resolves 
 it's
 state.





Re: Operation block on Cluster recovery/rebalance.

2020-08-11 Thread Denis Magda
John,

It sounds like a deadlock caused by the application logic. Is there any
chance that the operation you run on step 8 accesses several keys in one
order while the other operations work with the same keys but in a different
order? Deadlocks are possible when you use the Ignite Transaction API or
simply execute bulk operations such as cache.getAll(..) or
cache.putAll(..).

Please take and attach thread dumps from all the cluster nodes for analysis
if we need to dig deeper.

-
Denis
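
The key-ordering hazard described above is commonly ruled out by giving every bulk operation the same key order, for example by passing a sorted map to a bulk put. A minimal sketch, assuming keys with a natural ordering; `OrderedBatch` and `ordered` are hypothetical names and no Ignite API is called here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class OrderedBatch {
    /** Copies a batch into a map that iterates keys in their natural order. */
    static <K extends Comparable<? super K>, V> SortedMap<K, V> ordered(Map<K, V> batch) {
        return new TreeMap<>(batch);
    }

    public static void main(String[] args) {
        // Two callers build batches over the same keys in opposite orders.
        Map<String, Integer> first = new LinkedHashMap<>();
        first.put("k2", 2);
        first.put("k1", 1);
        Map<String, Integer> second = new LinkedHashMap<>();
        second.put("k1", 10);
        second.put("k2", 20);

        // After sorting, both batches visit (and would lock) keys in the same order.
        List<String> firstOrder = new ArrayList<>(ordered(first).keySet());
        List<String> secondOrder = new ArrayList<>(ordered(second).keySet());
        System.out.println(firstOrder.equals(secondOrder));
    }
}
```

In this thread the thread dumps later showed no application-caused deadlock, so this addresses the hypothesis raised here rather than the eventual diagnosis.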


On Mon, Aug 10, 2020 at 6:23 PM John Smith  wrote:

> Hi Denis, I think you are right. It's the query that blocks the other k/v
> operations are ok.
>
> Any thoughts on this?
>
> On Mon, 10 Aug 2020 at 15:28, John Smith  wrote:
>
>> I tried with 2.8.1, same issue. Operations block indefinitely...
>>
>> 1- Start 3 node cluster
>> 2- Start client application client = true with Ignition.start()
>> 3- Run some cache operations, everything ok...
>> 4- Shut down one node, run operation, still ok
>> 5- Shut down 2nd node, run operation, still ok
>> 6- Shut down 3rd node, run operation, still ok... Operations start
>> failing with ClientDisconectedException...
>> 7- Restart 1st node, run operation, operation fails
>> with ClientDisconectedException but application still able to complete it's
>> request.
>> 8- Start 2nd node, run operation, from here on all operations just block.
>>
>> Basically the client application is an HTTP Server on each HTTP request
>> does cache exception.
>>
>>
>>
>>
>>
>>
>> On Fri, 7 Aug 2020 at 19:46, John Smith  wrote:
>>
>>> No, everything blocks... Also using 2.7.0 just in case.
>>>
>>> Only time I get exception is if the cluster is completely off, then I
>>> get ClientDisconectedException...
>>>
>>> On Fri, 7 Aug 2020 at 18:52, Denis Magda  wrote:
>>>
 If I'm not mistaken, key-value operations (cache.get/put) and compute
 calls fail with an exception if the cluster is deactivated. Do those fail
 on your end?

 As for the async and SQL operations, let's see what other community
 members say.

 -
 Denis


 On Fri, Aug 7, 2020 at 1:06 PM John Smith 
 wrote:

> Hi any thoughts on this?
>
> On Thu, 6 Aug 2020 at 23:33, John Smith 
> wrote:
>
>> Here is another example where it blocks.
>>
>> SqlFieldsQuery query = new SqlFieldsQuery(
>> "select * from my_table")
>> .setArgs(providerId, carrierCode);
>> query.setTimeout(1000, TimeUnit.MILLISECONDS);
>>
>> try (QueryCursor> cursor = cache.query(query))
>>
>> cache.query just blocks even with the timeout set.
>>
>> Is there a way to timeout and at least have the application continue
>> and respond with an appropriate message?
>>
>>
>>
>> On Thu, 6 Aug 2020 at 23:06, John Smith 
>> wrote:
>>
>>> Hi running 2.7.0
>>>
>>> When I reboot a node and it begins to rejoin the cluster or the
>>> cluster is not yet activated with baseline topology operations seem to
>>> block forever, operations that are supposed to return IgniteFuture. I.e:
>>> putAsync, getAsync etc... They just block, until the cluster resolves 
>>> it's
>>> state.
>>>
>>>
>>>


Re: Operation block on Cluster recovery/rebalance.

2020-08-10 Thread John Smith
Hi Denis, I think you are right. It's the query that blocks; the other k/v
operations are OK.

Any thoughts on this?

On Mon, 10 Aug 2020 at 15:28, John Smith  wrote:

> I tried with 2.8.1, same issue. Operations block indefinitely...
>
> 1- Start 3 node cluster
> 2- Start the client application (client = true) with Ignition.start()
> 3- Run some cache operations, everything ok...
> 4- Shut down one node, run operation, still ok
> 5- Shut down 2nd node, run operation, still ok
> 6- Shut down 3rd node, run operation, still ok... Operations start failing
> with IgniteClientDisconnectedException...
> 7- Restart 1st node, run operation, operation fails
> with IgniteClientDisconnectedException but the application is still able to
> complete its request.
> 8- Start 2nd node, run operation, from here on all operations just block.
>
> Basically, the client application is an HTTP server that performs a cache
> operation on each HTTP request.


Re: Operation block on Cluster recovery/rebalance.

2020-08-10 Thread John Smith
I tried with 2.8.1, same issue. Operations block indefinitely...

1- Start 3 node cluster
2- Start the client application (client = true) with Ignition.start()
3- Run some cache operations, everything ok...
4- Shut down one node, run operation, still ok
5- Shut down 2nd node, run operation, still ok
6- Shut down 3rd node, run operation, still ok... Operations start failing
with IgniteClientDisconnectedException...
7- Restart 1st node, run operation, operation fails
with IgniteClientDisconnectedException but the application is still able to
complete its request.
8- Start 2nd node, run operation, from here on all operations just block.

Basically, the client application is an HTTP server that performs a cache
operation on each HTTP request.
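
For illustration only, here is a minimal sketch of a client-side "gate"
that lets HTTP handlers fail fast instead of touching the cache while the
cluster is known to be unavailable. It uses plain JDK classes; ClusterGate,
markUnavailable and markAvailable are hypothetical names, not Ignite API.
How the application learns that the cluster is unavailable (for example,
from the disconnect exceptions seen in steps 6 and 7) is an assumption and
is not shown. The flag alone would not detect the blocking state in step 8.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

public class ClusterGate {
    // True while the application believes the cluster can serve requests.
    private final AtomicBoolean available = new AtomicBoolean(true);

    /** Call when the application detects a disconnect (wiring is app-specific). */
    public void markUnavailable() {
        available.set(false);
    }

    /** Call once the application considers the cluster usable again. */
    public void markAvailable() {
        available.set(true);
    }

    /** Runs the operation only while the gate is open; otherwise returns the fallback at once. */
    public <T> T run(Supplier<T> operation, T fallback) {
        return available.get() ? operation.get() : fallback;
    }

    public static void main(String[] args) {
        ClusterGate gate = new ClusterGate();
        // The lambdas are hypothetical stand-ins for a real cache call.
        System.out.println(gate.run(() -> "value-from-cache", "cache unavailable"));
        gate.markUnavailable();
        System.out.println(gate.run(() -> "value-from-cache", "cache unavailable"));
        gate.markAvailable();
        System.out.println(gate.run(() -> "value-from-cache", "cache unavailable"));
    }
}
```

While the gate is closed the operation is never invoked, so a handler
thread cannot get stuck inside it; the HTTP layer can map the fallback to
an error response of its choice.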






On Fri, 7 Aug 2020 at 19:46, John Smith  wrote:

> No, everything blocks... Also using 2.7.0 just in case.
>
> The only time I get an exception is when the cluster is completely off;
> then I get IgniteClientDisconnectedException...


Re: Operation block on Cluster recovery/rebalance.

2020-08-07 Thread John Smith
No, everything blocks... Also using 2.7.0 just in case.

The only time I get an exception is when the cluster is completely off;
then I get IgniteClientDisconnectedException...

On Fri, 7 Aug 2020 at 18:52, Denis Magda  wrote:

> If I'm not mistaken, key-value operations (cache.get/put) and compute
> calls fail with an exception if the cluster is deactivated. Do those fail
> on your end?
>
> As for the async and SQL operations, let's see what other community
> members say.
>
> -
> Denis


Re: Operation block on Cluster recovery/rebalance.

2020-08-07 Thread Denis Magda
If I'm not mistaken, key-value operations (cache.get/put) and compute calls
fail with an exception if the cluster is deactivated. Do those fail on your
end?

As for the async and SQL operations, let's see what other community members
say.

-
Denis


On Fri, Aug 7, 2020 at 1:06 PM John Smith  wrote:

> Hi, any thoughts on this?


Re: Operation block on Cluster recovery/rebalance.

2020-08-07 Thread John Smith
Hi, any thoughts on this?

On Thu, 6 Aug 2020 at 23:33, John Smith  wrote:

> Here is another example where it blocks.
>
> SqlFieldsQuery query = new SqlFieldsQuery(
>     "select * from my_table")
>     .setArgs(providerId, carrierCode);
> query.setTimeout(1000, TimeUnit.MILLISECONDS);
>
> try (QueryCursor<List<?>> cursor = cache.query(query))
>
> cache.query just blocks, even with the timeout set.
>
> Is there a way to time out and at least have the application continue and
> respond with an appropriate message?


Re: Operation block on Cluster recovery/rebalance.

2020-08-06 Thread John Smith
Here is another example where it blocks.

SqlFieldsQuery query = new SqlFieldsQuery(
    "select * from my_table")
    .setArgs(providerId, carrierCode);
query.setTimeout(1000, TimeUnit.MILLISECONDS);

try (QueryCursor<List<?>> cursor = cache.query(query))

cache.query just blocks, even with the timeout set.

Is there a way to time out and at least have the application continue and
respond with an appropriate message?
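
One possible client-side workaround for the question above, sketched with
plain JDK classes only: run the blocking call on a separate thread and
bound the wait with Future.get(timeout). BoundedCall and callWithTimeout
are hypothetical names, and the lambdas in main are stand-ins for a call
such as cache.query(query); nothing here is Ignite API. This only unblocks
the caller; it is an assumption, not a confirmed behaviour, that the
underlying call reacts to the interrupt sent by cancel(true).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedCall {
    // Daemon threads, so a call that never returns cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-call");
        t.setDaemon(true);
        return t;
    });

    /** Runs a possibly blocking call; returns the fallback if it fails or does not finish in time. */
    public static <T> T callWithTimeout(Callable<T> call, long timeoutMs, T fallback) {
        Future<T> future = POOL.submit(call);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker; the call may ignore the interrupt
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the call itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a query that hangs while the cluster recovers.
        String slow = callWithTimeout(() -> { Thread.sleep(5_000); return "rows"; }, 200, "cluster busy");
        // Hypothetical stand-in for a query that returns promptly.
        String fast = callWithTimeout(() -> "rows", 200, "cluster busy");
        System.out.println(slow);
        System.out.println(fast);
    }
}
```

Running main prints "cluster busy" followed by "rows". Note that every
call that stays stuck keeps one pool thread occupied, so under sustained
load a bounded pool or a fail-fast check in front of the call would be
needed as well.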



On Thu, 6 Aug 2020 at 23:06, John Smith  wrote:

> Hi, I'm running 2.7.0.
>
> When I reboot a node and it begins to rejoin the cluster, or the cluster is
> not yet activated with a baseline topology, operations seem to block
> forever, even operations that are supposed to return an IgniteFuture, i.e.
> putAsync, getAsync, etc. They just block until the cluster resolves its
> state.
>
>
>


Operation block on Cluster recovery/rebalance.

2020-08-06 Thread John Smith
Hi, I'm running 2.7.0.

When I reboot a node and it begins to rejoin the cluster, or the cluster is
not yet activated with a baseline topology, operations seem to block forever,
even operations that are supposed to return an IgniteFuture, i.e. putAsync,
getAsync, etc. They just block until the cluster resolves its state.