How to change backup number after cache is created?

2020-08-07 Thread Steven Zheng
Hi community,
I want to know how to modify the number of backups for an existing cache. In
some cases, the cache starts with 0 backups, but in production we must
increase that number.

Best Regards,
-
Steven Zheng
E-mail: closee...@gmail.com


Re: run ignitevisorcmd in k8s and docker?

2020-08-07 Thread Denis Magda
Hi,

Then Visor CMD should work fine. Please share your configs and logs for
analysis.

As for the alternate UI tools, yes, you can deploy those in Docker. Just
search for the respective documentation.

Denis

On Thursday, August 6, 2020, bbweb  wrote:

> Hi Denis, actually we are running Visor CMD in the same container as the
> Ignite server, using the same configuration file with the
> KubernetesIPFinder that the Ignite server uses. Should this mode be supported?
> And regarding the link provided, does it mean that the web console can also
> be used? Thanks! - Bright
>
> - Original Message -
>
> *From:* Denis Magda
>
> *Sent:* 2020-08-07 11:53:11
>
> *To:* user
>
> *Subject:* Re: run ignitevisorcmd in k8s and docker?
>
> Hi there,
>
> Visor CMD uses a legacy connection mode (daemon node) that might be tricky
> to deal with in K8S environments. Try to start it inside the same K8S
> network with your cluster and use a configuration with KubernetesIPFinder
> you used for the server nodes.
>
> Alternatively, go for an alternate monitoring tool that is better suited
> for K8S:
> https://ignite.apache.org/features/management-and-monitoring.html
>
> -
> Denis
>
>
> On Thu, Aug 6, 2020 at 6:43 PM bbweb < bb...@sohu.com> wrote:
>
> Hi,
>
> We are running into a problem when we run ignitevisorcmd in a K8S and Docker
> environment. After we start a cluster in K8S and run ignitevisorcmd in the
> node, it can't find any nodes when running "top"; it just shows an
> empty topology.
>
> Is this supported in K8S and Docker? Or is there some configuration that
> should be done?
>
> Thanks for any response!
>

-- 
-
Denis


Re: read-though tutorial for a big table

2020-08-07 Thread Denis Magda
Alex,

Please share a bit more details on what you’re struggling with. It sounds
like you are looking for a specific piece of advice rather than generic
performance suggestions.

Denis

On Thursday, August 6, 2020, Alex Panchenko 
wrote:

> Hello Vladimir,
>
> are there some key things you can share with us? Some checklist with the
> most important configuration params or things we need to review/check?
> anything would be helpful
>
> I've been playing with Ignite for the last few months, and performance is
> still low.
> I have to decide whether to switch from Ignite to another solution or
> improve the performance ASAP.
>
> Thanks
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


-- 
-
Denis


Re: Best first steps for performance tuning

2020-08-07 Thread Denis Magda
Devin,

Please consider watching this talk by Ivan Rakov, one of the main contributors
to the native persistence. Skip to 28 minutes 14 seconds for the performance
considerations: https://www.youtube.com/watch?v=f2ArcJPH4iU

Let us know if any piece of advice was helpful and we’ll put the knowledge
on paper then for better discoverability.

Denis

On Thursday, August 6, 2020, Evgenii Zhuravlev 
wrote:

> Hi Devin,
>
> Yes, you're right, the first step could be increasing the amount of
> off-heap memory used for data (the data region size). By default, Ignite
> uses 20% of available RAM.
>
> After that, I would recommend finding where the bottleneck is for your
> system - you can check CPU, disk and network to find it.
>
> Best Regards,
> Evgenii
>
> On Thu, Aug 6, 2020 at 12:49, Devin Bost wrote:
>
>> While looking at the docs, there are a lot of available parameters for
>> performance tuning.
>> We have several high-velocity Ignite operations, and Ignite is having
>> trouble keeping up. We are using native persistence, and I suspect we will
>> get more value from increasing the amount of memory used since I think the
>> default memory utilization is low if I remember correctly.
>>
>> We're wondering what the first few things should be for us to look at for
>> performance tuning and wondering if anyone has some guidance so we know
>> what to start with.
>>
>> Thanks,
>>
>> Devin G. Bost
>>
>

-- 
-
Denis
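
To make the data region sizing from the quoted reply concrete, here is a small arithmetic sketch. It assumes a hypothetical helper class RegionSizing, and the 60% share used in main is purely illustrative, not a recommendation from this thread. The 20% figure is the default mentioned above. The comment at the end shows where the result would be passed to DataRegionConfiguration.setMaxSize.

```java
/** Hypothetical helper: converts a share of RAM into a data region size in bytes. */
public class RegionSizing {
    /** Default share of RAM, as stated in the quoted reply (20%). */
    public static final double DEFAULT_FRACTION = 0.20;

    /** Returns the number of bytes that the given fraction of total RAM represents. */
    public static long regionBytes(long totalRamBytes, double fraction) {
        if (totalRamBytes <= 0 || fraction <= 0.0 || fraction >= 1.0)
            throw new IllegalArgumentException("Expected RAM > 0 and 0 < fraction < 1");

        return (long) (totalRamBytes * fraction);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        long ram = 64L * 1024 * mib; // Assumed host with 64 GiB of RAM.

        System.out.println("default region: " + regionBytes(ram, DEFAULT_FRACTION) / mib + " MiB");
        System.out.println("larger region:  " + regionBytes(ram, 0.60) / mib + " MiB");

        // With Ignite on the classpath, the chosen value would be applied as:
        //   new DataRegionConfiguration().setMaxSize(regionBytes(ram, 0.60));
    }
}
```

Whatever share is chosen, the Java heap and the OS still need their own memory on the same host, so the fraction has to leave room for them.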


Re: read-though tutorial for a big table

2020-08-07 Thread vtchernyi
Hi Alex,

I do not consider myself an Ignite guru; there are much more experienced people on this list. Summer happened, vacation and so forth. Please wait some time until my blog is ready.

I hope you can read Russian; my article [1] may be helpful in the meantime. It is the result of half a year of learning Ignite using the official doc sites, the user mailing list and StackOverflow. Denis Magda's post about building microservices and a 2-line comment on StackOverflow by Val Kulichenko were very helpful for me.

Hope that helps,
Vladimir

[1] https://m.habr.com/ru/post/472568/

16:27, August 7, 2020, Denis Magda :

Alex,

Please share a bit more details on what you’re struggling with. It sounds like you are looking for a specific piece of advice rather than generic performance suggestions.

Denis

On Thursday, August 6, 2020, Alex Panchenko  wrote:

Hello Vladimir,

are there some key things you can share with us? Some checklist with the
most important configuration params or things we need to review/check?
anything would be helpful

I've been playing with Ignite for the last few months, and performance is
still low.
I have to decide whether to switch from Ignite to another solution or
improve the performance ASAP.

Thanks



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
-- -Denis
-- Sent from the Yandex.Mail mobile app

Re: How to change backup number after cache is created?

2020-08-07 Thread Ilya Kasnacheev
Hello!

You cannot change the number of backups once the cache is created.

You need to create it with the correct value (perhaps one that depends on
the current environment).

Regards,
-- 
Ilya Kasnacheev
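
A minimal sketch of the "value that depends on the current env" idea. The environment variable IGNITE_CACHE_BACKUPS and the helper class BackupSettings are hypothetical names made up for illustration; they are not Ignite settings. The helper only resolves the number. The comment at the end shows where it would be handed to CacheConfiguration.setBackups before the cache is first created.

```java
import java.util.Map;

/** Hypothetical helper: picks the backup count for a cache before it is created. */
public class BackupSettings {
    /** Assumed variable name; not an Ignite setting. */
    public static final String ENV_VAR = "IGNITE_CACHE_BACKUPS";

    /**
     * Returns the backup count from the given environment map, or
     * defaultBackups when the variable is missing, not a number, or negative.
     */
    public static int resolveBackups(Map<String, String> env, int defaultBackups) {
        String raw = env.get(ENV_VAR);

        if (raw == null)
            return defaultBackups;

        try {
            int parsed = Integer.parseInt(raw.trim());

            return parsed >= 0 ? parsed : defaultBackups;
        }
        catch (NumberFormatException e) {
            return defaultBackups;
        }
    }

    public static void main(String[] args) {
        int backups = resolveBackups(System.getenv(), 0);

        System.out.println("backups=" + backups);

        // With Ignite on the classpath, the value would be applied at creation time:
        //   CacheConfiguration<Long, String> cfg = new CacheConfiguration<>("myCache");
        //   cfg.setBackups(backups);
        //   ignite.getOrCreateCache(cfg);
    }
}
```

This only helps for caches that do not exist yet. For a cache that is already created, getOrCreateCache returns the existing cache as it is, which matches the answer above.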


On Fri, Aug 7, 2020 at 13:04, Steven Zheng wrote:

> Hi community,
> I want to know how to modify the number of backups for an existing cache. In
> some cases, the cache starts with 0 backups, but in production we must
> increase that number.
>
> Best Regards,
> -
> Steven Zheng
> E-mail: closee...@gmail.com
>


Re: Enabling swapPath causes invoking shutdown hook

2020-08-07 Thread Denis Magda
Agree with Ilya that the performance should be comparable if you disable
the WAL of the Ignite persistence.

Anyway, swapping and Ignite persistence pursue different goals. Swapping
is one of the out-of-memory protection techniques: if you run out
of DRAM, the OS will start swapping Ignite memory pages in and out to
avoid a node outage. But the swap space is not a durable storage layer. If
you restart the cluster, all the swapped pages will be gone. Ignite
persistence, by contrast, is your durable disk tier that survives cluster
restarts and thus uses more sophisticated algorithms to ensure data
consistency and durability. Just select what suits you best. I share more
thoughts on this in this article:
https://www.gridgain.com/resources/blog/out-of-memory-apache-ignite-cluster-handling-techniques

-
Denis
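
The two options contrasted above can be sketched as a configuration fragment. This is only an illustration: the region names, the 16 GiB size, the /tmp/ignite-swap path, and the class name SwapVersusPersistence are assumptions, and the fragment needs the Ignite library on the classpath.

```java
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

/** Illustrative fragment: a swap-backed region versus a persistent region. */
public class SwapVersusPersistence {
    private static final long GIB = 1024L * 1024 * 1024;

    /** In-memory region that may spill to a swap file; contents are lost on restart. */
    public static DataRegionConfiguration swapRegion() {
        return new DataRegionConfiguration()
            .setName("swap_region")           // Assumed name.
            .setMaxSize(16 * GIB)             // Assumed size.
            .setSwapPath("/tmp/ignite-swap"); // Assumed path.
    }

    /** Region backed by Ignite native persistence; data survives cluster restarts. */
    public static DataRegionConfiguration persistentRegion() {
        return new DataRegionConfiguration()
            .setName("persistent_region")     // Assumed name.
            .setMaxSize(16 * GIB)             // Assumed size.
            .setPersistenceEnabled(true);
    }

    /** Builds a node configuration that uses one of the two regions as the default. */
    public static IgniteConfiguration configure(boolean durable) {
        DataStorageConfiguration storage = new DataStorageConfiguration()
            .setDefaultDataRegionConfiguration(durable ? persistentRegion() : swapRegion());

        return new IgniteConfiguration().setDataStorageConfiguration(storage);
    }
}
```

The fragment sets either swapPath or persistenceEnabled on a region, never both, following the "select what suits you best" advice above.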


On Thu, Aug 6, 2020 at 6:23 AM Ilya Kasnacheev 
wrote:

> Hello!
>
> I think the performance of swap space should be on par with persistence
> with disabled WAL.
>
> You can submit suggested updates to the documentation if you like.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> On Wed, Aug 5, 2020 at 06:00, 38797715 <38797...@qq.com> wrote:
>
>> Hi Ilya,
>>
>> If so, there are two ways to implement Ignite's swap space:
>> 1. maxSize > physical memory, which uses the swap mechanism of the
>> OS and can be tuned with *vm.swappiness*.
>> 2. Configure the *swapPath* property, which is implemented by Ignite
>> itself, is independent of the OS, and has no tuning parameters.
>> There's a choice between these two models, right? If so, I think there may
>> be several problems in the description in the documentation. I hope you
>> can check it again:
>> https://apacheignite.readme.io/docs/swap-space
>>
>> In our initial testing, the performance of swap space is much better
>> than native persistence, so I think this pattern is valuable in some
>> scenarios.
>> On 2020/8/4 at 10:16 PM, Ilya Kasnacheev wrote:
>>
>> Hello!
>>
>> From the docs:
>>
>> To avoid this situation with the swapping capabilities, you need to:
>>
>>    - Set maxSize = bigger_than_RAM_size, in which case the OS will
>>      take care of the swapping.
>>    - Enable swapping by setting the DataRegionConfiguration.swapPath
>>      property.
>>
>>
>> I actually think these are either-or. You should either do the first (and
>> configure OS swapping) or the second part.
>>
>> Having said that, I recommend setting proper Native Persistence instead.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> On Sat, Jul 25, 2020 at 04:49, 38797715 <38797...@qq.com> wrote:
>>
>>> Hi,
>>>
>>> https://apacheignite.readme.io/docs/swap-space
>>>
>>> According to the above document, if the physical memory is small, you
>>> can solve this problem by enabling the swap space. The specific method is to
>>> configure maxSize to a larger value (i.e. larger than the physical memory),
>>> and the swapPath property needs to be configured.
>>>
>>> But from the test results, the node is terminated.
>>>
>>> I think the correct result should be that even if the amount of data
>>> exceeds the physical memory, the node should still be able to run normally,
>>> but the data is exchanged to the disk.
>>>
>>> I want to know what parameters affect the behavior of this
>>> configuration? *vm.swappiness* or others?
>>> On 2020/7/24 at 9:55 PM, aealexsandrov wrote:
>>>
>>> Hi,
>>>
>>> Can you please clarify your expectations? Did you expect the JVM process to
>>> be killed instead of stopping gracefully? What are you trying to achieve?
>>>
>>> BR,
>>> Andrei
>>>
>>>
>>>
>>> --
>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>>
>>>


Re: Operation block on Cluster recovery/rebalance.

2020-08-07 Thread John Smith
Hi, any thoughts on this?

On Thu, 6 Aug 2020 at 23:33, John Smith  wrote:

> Here is another example where it blocks.
>
> SqlFieldsQuery query = new SqlFieldsQuery(
> "select * from my_table")
> .setArgs(providerId, carrierCode);
> query.setTimeout(1000, TimeUnit.MILLISECONDS);
>
> try (QueryCursor<List<?>> cursor = cache.query(query))
>
> cache.query just blocks even with the timeout set.
>
> Is there a way to timeout and at least have the application continue and
> respond with an appropriate message?
>
>
>
> On Thu, 6 Aug 2020 at 23:06, John Smith  wrote:
>
>> Hi, I'm running 2.7.0.
>>
>> When I reboot a node and it begins to rejoin the cluster, or the cluster
>> is not yet activated with a baseline topology, operations seem to block
>> forever, including operations that are supposed to return IgniteFuture, i.e.
>> putAsync, getAsync, etc. They just block until the cluster resolves its
>> state.
>>
>>
>>
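
One application-level workaround for the question above (how to time out and let the application respond) is to run the blocking call on a separate thread and bound the wait. This is a sketch using only the JDK, not an Ignite feature and not something confirmed in this thread. The class BoundedCall and the method callWithTimeout are hypothetical names, and the sleeping lambda stands in for a call such as cache.query(query).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: bounds how long the caller waits for a blocking call. */
public class BoundedCall {
    /** Daemon threads, so a stuck call does not keep the JVM alive. */
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-call");
        t.setDaemon(true);
        return t;
    });

    /**
     * Runs the task on a worker thread and waits at most timeoutMs for it.
     * Returns the fallback value if the task times out or fails.
     */
    public static <T> T callWithTimeout(Callable<T> task, long timeoutMs, T fallback) {
        Future<T> fut = POOL.submit(task);

        try {
            return fut.get(timeoutMs, TimeUnit.MILLISECONDS);
        }
        catch (TimeoutException e) {
            fut.cancel(true); // Interrupts the worker; the blocked call may ignore it.
            return fallback;
        }
        catch (ExecutionException e) {
            return fallback; // The task itself threw; a real application would log the cause.
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a blocking cache call that does not return in time.
        String slow = callWithTimeout(() -> { Thread.sleep(5_000); return "rows"; }, 200, "unavailable");
        String fast = callWithTimeout(() -> "rows", 200, "unavailable");

        System.out.println(slow + " / " + fast);
    }
}
```

The caller gets control back after the timeout, but the worker thread may stay blocked until the cluster resolves its state, so threads can pile up if the call is retried in a loop.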


Re: Operation block on Cluster recovery/rebalance.

2020-08-07 Thread Denis Magda
If I'm not mistaken, key-value operations (cache.get/put) and compute calls
fail with an exception if the cluster is deactivated. Do those fail on your
end?

As for the async and SQL operations, let's see what other community members
say.

-
Denis


On Fri, Aug 7, 2020 at 1:06 PM John Smith  wrote:

> Hi any thoughts on this?
>
> On Thu, 6 Aug 2020 at 23:33, John Smith  wrote:
>
>> Here is another example where it blocks.
>>
>> SqlFieldsQuery query = new SqlFieldsQuery(
>> "select * from my_table")
>> .setArgs(providerId, carrierCode);
>> query.setTimeout(1000, TimeUnit.MILLISECONDS);
>>
>> try (QueryCursor<List<?>> cursor = cache.query(query))
>>
>> cache.query just blocks even with the timeout set.
>>
>> Is there a way to timeout and at least have the application continue and
>> respond with an appropriate message?
>>
>>
>>
>> On Thu, 6 Aug 2020 at 23:06, John Smith  wrote:
>>
>>> Hi, I'm running 2.7.0.
>>>
>>> When I reboot a node and it begins to rejoin the cluster, or the cluster
>>> is not yet activated with a baseline topology, operations seem to block
>>> forever, including operations that are supposed to return IgniteFuture, i.e.
>>> putAsync, getAsync, etc. They just block until the cluster resolves its
>>> state.
>>>
>>>
>>>


Re: What does all partition owners have left the grid, partition data has been lost mean?

2020-08-07 Thread Denis Magda
John,

This page covers all the aspects of backups in Ignite (what they are, how
to configure them, etc.):
https://www.gridgain.com/docs/latest/developers-guide/configuring-caches/configuring-backups

As for the 5-node cluster with persistence, yes, all of those nodes have to
be added to the same baseline topology.

-
Denis


On Thu, Aug 6, 2020 at 8:26 AM John Smith  wrote:

> Ok if I have 5 nodes with persistence then all nodes need to be in
> baseline?
>
> Also what are the docs for backup to make sure I have it right?
>
>
> On Thu, 6 Aug 2020 at 10:08, Ilya Kasnacheev 
> wrote:
>
>> Hello!
>>
>> You are confusing baseline with backups here.
>>
>> You should have 1 backup to afford losing a node.
>>
>> You should have all data nodes in the baseline.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> On Wed, Aug 5, 2020 at 17:56, John Smith wrote:
>>
>>> I mean I have 3 nodes and the baseline is set to 3. Does it mean if I
>>> put 2 as baseline then I can lose at least 1? If I remove one node from
>>> baseline does it mean it will not store data?
>>>
>>> Or is it better to have 3 baseline nodes and add a 4th node? In that
>>> case if I still lose a baseline node will I still be able to do
>>> operations on the cache?
>>>
>>> On Wed, 5 Aug 2020 at 08:21, John Smith  wrote:
>>>
>>>> I have 3 nodes and baseline topology is 3 so if I lose 1 I guess it's
>>>> enough... Should it be 2?
>>>>
>>>> On Tue., Aug. 4, 2020, 10:57 a.m. Ilya Kasnacheev, <
>>>> ilya.kasnach...@gmail.com> wrote:

> Hello!
>
> What is your baseline topology at this moment? It means just that: you
> have lost enough nodes of your distributed grid that data is nowhere to be
> found now.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> On Mon, Aug 3, 2020 at 19:12, John Smith wrote:
>
>> I get the below exception on my client...
>>
>> #1 I rebooted the cache nodes error still continued.
>> #2 restarted the client node error went away.
>> #3 this seems to happen every few weeks.
>> #4 is there some sort of timeout values and retries I can put?
>> #5 cache operations seem to block when rebooting the nodes (I have 3
>> nodes). Is there a way not to block?
>>
>> javax.cache.CacheException: class org.apache.ignite.internal.processors.cache.CacheInvalidStateException: Failed to execute cache operation (all partition owners have left the grid, partition data has been lost) [cacheName=xx, part=273, key=16479796986]
>>     at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1337)
>>     at org.apache.ignite.internal.processors.cache.IgniteCacheFutureImpl.convertException(IgniteCacheFutureImpl.java:62)
>>     at org.apache.ignite.internal.util.future.IgniteFutureImpl.get(IgniteFutureImpl.java:157)
>>     at com.xx.common.vertx.ext.data.impl.IgniteCacheRepository.lambda$executeAsync$394d953f$1(IgniteCacheRepository.java:59)
>>     at org.apache.ignite.internal.util.future.AsyncFutureListener$1.run(AsyncFutureListener.java:53)
>>     at com.xx.common.vertx.ext.data.impl.VertxIgniteExecutorAdapter.lambda$execute$0(VertxIgniteExecutorAdapter.java:18)
>>     at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:369)
>>     at io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38)
>>     at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
>>     at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
>>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:497)
>>     at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>>     at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>>     at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.ignite.internal.processors.cache.CacheInvalidStateException: Failed to execute cache operation (all partition owners have left the grid, partition data has been lost) [cacheName=xx, part=273, key=16479796986]
>>     at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validatePartitionOperation(GridDhtTopologyFutureAdapter.java:161)
>>     at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:116)
>>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:417)
>>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture

Re: Operation block on Cluster recovery/rebalance.

2020-08-07 Thread John Smith
No, everything blocks... Also using 2.7.0 just in case.

The only time I get an exception is if the cluster is completely off; then I
get ClientDisconnectedException...

On Fri, 7 Aug 2020 at 18:52, Denis Magda  wrote:

> If I'm not mistaken, key-value operations (cache.get/put) and compute
> calls fail with an exception if the cluster is deactivated. Do those fail
> on your end?
>
> As for the async and SQL operations, let's see what other community
> members say.
>
> -
> Denis
>
>
> On Fri, Aug 7, 2020 at 1:06 PM John Smith  wrote:
>
>> Hi any thoughts on this?
>>
>> On Thu, 6 Aug 2020 at 23:33, John Smith  wrote:
>>
>>> Here is another example where it blocks.
>>>
>>> SqlFieldsQuery query = new SqlFieldsQuery(
>>> "select * from my_table")
>>> .setArgs(providerId, carrierCode);
>>> query.setTimeout(1000, TimeUnit.MILLISECONDS);
>>>
>>> try (QueryCursor<List<?>> cursor = cache.query(query))
>>>
>>> cache.query just blocks even with the timeout set.
>>>
>>> Is there a way to timeout and at least have the application continue and
>>> respond with an appropriate message?
>>>
>>>
>>>
>>> On Thu, 6 Aug 2020 at 23:06, John Smith  wrote:
>>>
>>>> Hi, I'm running 2.7.0.
>>>>
>>>> When I reboot a node and it begins to rejoin the cluster, or the cluster
>>>> is not yet activated with a baseline topology, operations seem to block
>>>> forever, including operations that are supposed to return IgniteFuture, i.e.
>>>> putAsync, getAsync, etc. They just block until the cluster resolves its
>>>> state.