Re: run ignitevisorcmd in k8s and docker?

2020-08-06 Thread Denis Magda
Hi there,

Visor CMD uses a legacy connection mode (a daemon node) that can be tricky
to deal with in K8S environments. Try starting it inside the same K8S
network as your cluster, using the same configuration with KubernetesIPFinder
that you used for the server nodes.

Alternatively, choose an alternative monitoring tool that is better suited
for K8S:
https://ignite.apache.org/features/management-and-monitoring.html
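For reference, a minimal Spring XML sketch of the configuration Denis describes, assuming Ignite 2.x with the ignite-kubernetes module on the classpath; the "ignite" namespace and "ignite-service" service name below are placeholders and must match whatever your server nodes use:

```xml
<!-- Sketch only: point Visor at the same Kubernetes discovery settings
     as the server nodes. Namespace/service name are placeholders. -->
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
  <property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
      <property name="ipFinder">
        <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder">
          <property name="namespace" value="ignite"/>
          <property name="serviceName" value="ignite-service"/>
        </bean>
      </property>
    </bean>
  </property>
</bean>
```

Visor can then be started in a pod inside the same namespace and connected with `open -cfg=<path-to-this-xml>`.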

-
Denis


On Thu, Aug 6, 2020 at 6:43 PM bbweb  wrote:

> Hi,
>
> we are running into a problem when we run ignitevisorcmd in a K8S and Docker
> environment. After we start a cluster in K8S and run ignitevisorcmd on a
> node, it can't find any nodes: running "top" just shows an empty topology.
>
> Is this supported in K8S and Docker? Or is there some configuration that
> should be done?
>
> Thanks for any response!
>


Re: Operation block on Cluster recovery/rebalance.

2020-08-06 Thread John Smith
Here is another example where it blocks.

SqlFieldsQuery query = new SqlFieldsQuery(
    "select * from my_table")
    .setArgs(providerId, carrierCode);
query.setTimeout(1000, TimeUnit.MILLISECONDS);

try (QueryCursor<List<?>> cursor = cache.query(query)) {
    // iterate the cursor...
}

cache.query just blocks even with the timeout set.

Is there a way to time out and at least have the application continue and
respond with an appropriate message?
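SqlFieldsQuery.setTimeout bounds query execution, but it does not help when the call blocks waiting for the cluster to become available. One generic client-side workaround (plain Java, not an Ignite API; the helper below is hypothetical) is to run the blocking call on another thread and give up after a deadline, so the application can respond with an error message:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class TimeoutWrapper {
    // Runs a potentially blocking call on a separate thread and gives up
    // after the given timeout, letting the caller respond with an error.
    static <T> T callWithTimeout(Supplier<T> blockingCall, long timeoutMs)
            throws TimeoutException {
        CompletableFuture<T> f = CompletableFuture.supplyAsync(blockingCall);
        try {
            return f.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } catch (TimeoutException e) {
            f.cancel(true); // best effort; the underlying call may keep running
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // A fast call completes normally.
        System.out.println(callWithTimeout(() -> "ok", 1000));
        // A slow call (stand-in for a blocked cache.query) times out,
        // and the application keeps going.
        try {
            callWithTimeout(() -> {
                try { Thread.sleep(5000); } catch (InterruptedException ie) { }
                return "late";
            }, 200);
        } catch (TimeoutException te) {
            System.out.println("timed out");
        }
    }
}
```

Note the caveat in the comment: cancelling the future does not stop the underlying operation, it only unblocks the caller.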



On Thu, 6 Aug 2020 at 23:06, John Smith  wrote:

> Hi, running 2.7.0.
>
> When I reboot a node and it begins to rejoin the cluster, or the cluster is
> not yet activated with baseline topology, operations seem to block forever,
> even operations that are supposed to return an IgniteFuture (e.g. putAsync,
> getAsync). They just block until the cluster resolves its state.
>
>
>


Operation block on Cluster recovery/rebalance.

2020-08-06 Thread John Smith
Hi, running 2.7.0.

When I reboot a node and it begins to rejoin the cluster, or the cluster is
not yet activated with baseline topology, operations seem to block forever,
even operations that are supposed to return an IgniteFuture (e.g. putAsync,
getAsync). They just block until the cluster resolves its state.


Re: Is there a way for client to lazy join the cluster?

2020-08-06 Thread John Smith
Ok I see...

On Thu, 6 Aug 2020 at 17:13, Evgenii Zhuravlev 
wrote:

> It should be handled on your application side. For example, you can
> initialize the Ignite instance in a separate thread and add a check in
> other API invocations that the instance has been initialized.
>
> Evgenii
>
> Thu, 6 Aug 2020 at 09:03, John Smith :
>
>> I'm testing failover scenarios and currently I have the full cluster shut
>> off. I would still like my application to continue working even if the
>> cache is not there...
>>
>> When my application starts...
>>
>> It calls Ignition.start(config)
>>
>> The application will not start until Ignition.start(config) finishes,
>> i.e. until I start the cluster back up.
>>
>


run ignitevisorcmd in k8s and docker?

2020-08-06 Thread bbweb
Hi,

we are running into a problem when we run ignitevisorcmd in a K8S and Docker
environment. After we start a cluster in K8S and run ignitevisorcmd on a
node, it can't find any nodes: running "top" just shows an empty topology.

Is this supported in K8S and Docker? Or is there some configuration that
should be done?

Thanks for any response!



Re: Call for presentations for ApacheCon North America 2020 now open

2020-08-06 Thread Denis Magda
Wesley,

Definitely follow the advice of Saikat and submit the talk to the IMC
Summit.

Also, we'll be happy to host you for an Ignite Virtual Meetup. Many
community folks would be glad to learn how you use Ignite ML in production.
Are you interested?
https://www.meetup.com/Apache-Ignite-Virtual-Meetup/

Denis

On Wednesday, August 5, 2020, Wesley Peng  wrote:

> Congrats. We could prepare a talk on "machine learning applications
> with Ignite", as we store feature-engineering data in Ignite for
> large-scale, fast access.
>
> regards.
>
>
> Saikat Maitra wrote:
>
>> Congrats!!!
>>
>> It looks like both of our talks are on same day, Tuesday, September 29th
>>
>> https://apachecon.com/acah2020/tracks/ignite.html
>>
>

-- 
-
Denis


Re: Is there a way for client to lazy join the cluster?

2020-08-06 Thread Evgenii Zhuravlev
It should be handled on your application side. For example, you can
initialize the Ignite instance in a separate thread and add a check in
other API invocations that the instance has been initialized.
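A minimal sketch of this pattern, with the blocking Ignition.start(config) call abstracted behind a Supplier (all names here are illustrative, not an Ignite API): the instance is started on a background thread, and the rest of the application checks readiness without blocking.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Initialize a possibly slow resource (e.g. an Ignite node) on a
// background thread; other code checks readiness instead of blocking.
public class LazyIgnite<T> {
    private final AtomicReference<T> instance = new AtomicReference<>();

    public LazyIgnite(Supplier<T> starter) {
        Thread t = new Thread(() -> instance.set(starter.get()), "ignite-init");
        t.setDaemon(true);
        t.start();
    }

    // Non-blocking readiness check used by other API invocations.
    public Optional<T> ifReady() {
        return Optional.ofNullable(instance.get());
    }

    public static void main(String[] args) throws Exception {
        LazyIgnite<String> ignite = new LazyIgnite<>(() -> {
            // Stand-in for Ignition.start(cfg), which blocks until the
            // cluster is reachable.
            try { Thread.sleep(100); } catch (InterruptedException ie) { }
            return "ignite-instance";
        });
        System.out.println(ignite.ifReady().orElse("not ready yet"));
        Thread.sleep(300);
        System.out.println(ignite.ifReady().orElse("not ready yet"));
    }
}
```

While the cluster is unreachable, `ifReady()` stays empty and the application can degrade gracefully instead of hanging in startup.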

Evgenii

Thu, 6 Aug 2020 at 09:03, John Smith :

> I'm testing failover scenarios and currently I have the full cluster shut
> off. I would still like my application to continue working even if the
> cache is not there...
>
> When my application starts...
>
> It calls Ignition.start(config)
>
> The application will not start until Ignition.start(config) finishes,
> i.e. until I start the cluster back up.
>


Re: Best first steps for performance tuning

2020-08-06 Thread Evgenii Zhuravlev
Hi Devin,

Yes, you're right: the first step could be to increase the amount of off-heap
memory used for data (the data region size). By default, Ignite uses 20% of
the available RAM.

After that, I would recommend finding where the bottleneck is in your
system: check CPU, disk, and network utilization.
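As an illustration (Spring XML, Ignite 2.x; the 8 GB figure is an arbitrary example, not a recommendation), the default data region size is raised via DataStorageConfiguration:

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
  <property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
      <property name="defaultDataRegionConfiguration">
        <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
          <!-- Keep native persistence on, per the original setup. -->
          <property name="persistenceEnabled" value="true"/>
          <!-- Example only: 8 GB of off-heap memory for the default region. -->
          <property name="maxSize" value="#{8L * 1024 * 1024 * 1024}"/>
        </bean>
      </property>
    </bean>
  </property>
</bean>
```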

Best Regards,
Evgenii

Thu, 6 Aug 2020 at 12:49, Devin Bost :

> Looking at the docs, I see a lot of available parameters for
> performance tuning.
> We have several high-velocity Ignite operations, and Ignite is having
> trouble keeping up. We are using native persistence, and I suspect we will
> get more value from increasing the amount of memory used since I think the
> default memory utilization is low if I remember correctly.
>
> We're wondering what the first few things should be for us to look at for
> performance tuning and wondering if anyone has some guidance so we know
> what to start with.
>
> Thanks,
>
> Devin G. Bost
>


Re: Ignite client node hangs while IgniteAtomicLong is created

2020-08-06 Thread Ilya Roublev
Hello Ilya,

Attached are two thread dumps; the second was taken 13 minutes after the
first one: threaddump.txt, threaddump2.txt.

The hanging occurs in the main thread (in fact, the same output appears in a
thread dump taken after 8 hours). The differences between the two thread
dumps are minor; one of them is as follows: in the first thread dump ... in
the second ...

As for a reproducer project, this is not an easy task, because it is
difficult to understand which factors may be treated as significant. Our
initial project is in general stable; the problem is that we have dozens of
builds on our build server every day, and only some of those builds fail. It
is very difficult to catch this situation: I had to launch 5 builds one after
another before it really occurred. It may also be that the situation requires
launching very specific Docker containers, each at a very specific time. We
cannot share our original project; all I can do is give you the parts of the
code that deal with Ignite. For example, the full code of the start method
from DbManager is as follows: ...

We also have logs for all containers of our app, including those for the
Ignite server nodes; I can provide them if you like.

Thank you very much for your help in advance.

My best regards,
Ilya



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Best first steps for performance tuning

2020-08-06 Thread Devin Bost
Looking at the docs, I see a lot of available parameters for
performance tuning.
We have several high-velocity Ignite operations, and Ignite is having
trouble keeping up. We are using native persistence, and I suspect we will
get more value from increasing the amount of memory used since I think the
default memory utilization is low if I remember correctly.

We're wondering what the first few things should be for us to look at for
performance tuning and wondering if anyone has some guidance so we know
what to start with.

Thanks,

Devin G. Bost


Re: Call for presentations for ApacheCon North America 2020 now open

2020-08-06 Thread Saikat Maitra
Hi,

I think this year's call for proposals for ApacheCon 2020 has ended, but you
can go ahead and submit a proposal for a talk at the In-Memory Computing
Summit:

 https://www.imcsummit.org/2020/virtual/call-for-speakers


Regards,
Saikat



On Wed, Aug 5, 2020 at 8:51 PM Wesley Peng  wrote:

> Congrats. We could prepare a talk on "machine learning applications
> with Ignite", as we store feature-engineering data in Ignite for
> large-scale, fast access.
>
> regards.
>
>
> Saikat Maitra wrote:
> > Congrats!!!
> >
> > It looks like both of our talks are on same day, Tuesday, September 29th
> >
> > https://apachecon.com/acah2020/tracks/ignite.html
>


Is there a way for client to lazy join the cluster?

2020-08-06 Thread John Smith
I'm testing failover scenarios and currently I have the full cluster shut
off. I would still like my application to continue working even if the
cache is not there...

When my application starts...

It calls Ignition.start(config)

The application will not start until Ignition.start(config) finishes, i.e.
until I start the cluster back up.


Re: What does all partition owners have left the grid, partition data has been lost mean?

2020-08-06 Thread John Smith
Ok, if I have 5 nodes with persistence, then all nodes need to be in the
baseline?

Also, where are the docs for backups, so I can make sure I have them right?


On Thu, 6 Aug 2020 at 10:08, Ilya Kasnacheev 
wrote:

> Hello!
>
> You are confusing baseline with backups here.
>
> You should have 1 backup to afford losing a node.
>
> You should have all data nodes in the baseline.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Wed, 5 Aug 2020 at 17:56, John Smith :
>
>> I mean I have 3 nodes and the baseline is set to 3. Does it mean if I put
>> 2 as baseline then I can lose at least 1? If I remove one node from
>> baseline does it mean it will not store data?
>>
>> Or is it better to have 3 baseline nodes and add a 4th node? In that case
>> if I still lose a baseline node will I still be able to do operations on
>> the cache?
>>
>> On Wed, 5 Aug 2020 at 08:21, John Smith  wrote:
>>
>>> I have 3 nodes and baseline topology is 3 so if I lose 1 I guess it's
>>> enough... Should it be 2?
>>>
>>> On Tue., Aug. 4, 2020, 10:57 a.m. Ilya Kasnacheev, <
>>> ilya.kasnach...@gmail.com> wrote:
>>>
 Hello!

 What is your baseline topology at this moment? It means just that: you
 have lost enough nodes of your distributed grid that data is nowhere to be
 found now.

 Regards,
 --
 Ilya Kasnacheev


 Mon, 3 Aug 2020 at 19:12, John Smith :

> I get the below exception on my client...
>
> #1 I rebooted the cache nodes error still continued.
> #2 restarted the client node error went away.
> #3 this seems to happen every few weeks.
> #4 is there some sort of timeout values and retries I can put?
> #5 cache operations seem to block when rebooting the nodes (I have 3
> nodes). Is there a way not to block?
>
> javax.cache.CacheException: class
> org.apache.ignite.internal.processors.cache.CacheInvalidStateException:
> Failed to execute cache operation (all partition owners have left the 
> grid,
> partition data has been lost) [cacheName=xx, part=273, 
> key=16479796986]
> at
> org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1337)
> at
> org.apache.ignite.internal.processors.cache.IgniteCacheFutureImpl.convertException(IgniteCacheFutureImpl.java:62)
> at
> org.apache.ignite.internal.util.future.IgniteFutureImpl.get(IgniteFutureImpl.java:157)
> at
> com.xx.common.vertx.ext.data.impl.IgniteCacheRepository.lambda$executeAsync$394d953f$1(IgniteCacheRepository.java:59)
> at
> org.apache.ignite.internal.util.future.AsyncFutureListener$1.run(AsyncFutureListener.java:53)
> at
> com.xx.common.vertx.ext.data.impl.VertxIgniteExecutorAdapter.lambda$execute$0(VertxIgniteExecutorAdapter.java:18)
> at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:369)
> at
> io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38)
> at
> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:497)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
> at
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.lang.Thread.run(Thread.java:748)
> Caused by:
> org.apache.ignite.internal.processors.cache.CacheInvalidStateException:
> Failed to execute cache operation (all partition owners have left the 
> grid,
> partition data has been lost) [cacheName=xx, part=273, 
> key=16479796986]
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validatePartitionOperation(GridDhtTopologyFutureAdapter.java:161)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:116)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:417)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$26.apply(GridDhtAtomicCache.java:1146)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$26.apply(GridDhtAtomicCache.java:1144)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.asyncOp(GridDhtAtomicCache.java:761)
> at
> 

Re: What does all partition owners have left the grid, partition data has been lost mean?

2020-08-06 Thread Ilya Kasnacheev
Hello!

You are confusing baseline with backups here.

You should have 1 backup to afford losing a node.

You should have all data nodes in the baseline.
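In configuration terms, one backup per partition looks like the sketch below (Spring XML; the cache name is a placeholder):

```xml
<bean class="org.apache.ignite.configuration.CacheConfiguration">
  <property name="name" value="myCache"/>
  <property name="cacheMode" value="PARTITIONED"/>
  <!-- One backup copy of each partition, so losing a single node
       does not lose partition data. -->
  <property name="backups" value="1"/>
</bean>
```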

Regards,
-- 
Ilya Kasnacheev


Wed, 5 Aug 2020 at 17:56, John Smith :

> I mean I have 3 nodes and the baseline is set to 3. Does it mean if I put
> 2 as baseline then I can lose at least 1? If I remove one node from
> baseline does it mean it will not store data?
>
> Or is it better to have 3 baseline nodes and add a 4th node? In that case
> if I still lose a baseline node will I still be able to do operations on
> the cache?
>
> On Wed, 5 Aug 2020 at 08:21, John Smith  wrote:
>
>> I have 3 nodes and baseline topology is 3 so if I lose 1 I guess it's
>> enough... Should it be 2?
>>
>> On Tue., Aug. 4, 2020, 10:57 a.m. Ilya Kasnacheev, <
>> ilya.kasnach...@gmail.com> wrote:
>>
>>> Hello!
>>>
>>> What is your baseline topology at this moment? It means just that: you
>>> have lost enough nodes of your distributed grid that data is nowhere to be
>>> found now.
>>>
>>> Regards,
>>> --
>>> Ilya Kasnacheev
>>>
>>>
>>> Mon, 3 Aug 2020 at 19:12, John Smith :
>>>
 I get the below exception on my client...

 #1 I rebooted the cache nodes error still continued.
 #2 restarted the client node error went away.
 #3 this seems to happen every few weeks.
 #4 is there some sort of timeout values and retries I can put?
 #5 cache operations seem to block when rebooting the nodes (I have 3
 nodes). Is there a way not to block?

 javax.cache.CacheException: class
 org.apache.ignite.internal.processors.cache.CacheInvalidStateException:
 Failed to execute cache operation (all partition owners have left the grid,
 partition data has been lost) [cacheName=xx, part=273, key=16479796986]
 at
 org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1337)
 at
 org.apache.ignite.internal.processors.cache.IgniteCacheFutureImpl.convertException(IgniteCacheFutureImpl.java:62)
 at
 org.apache.ignite.internal.util.future.IgniteFutureImpl.get(IgniteFutureImpl.java:157)
 at
 com.xx.common.vertx.ext.data.impl.IgniteCacheRepository.lambda$executeAsync$394d953f$1(IgniteCacheRepository.java:59)
 at
 org.apache.ignite.internal.util.future.AsyncFutureListener$1.run(AsyncFutureListener.java:53)
 at
 com.xx.common.vertx.ext.data.impl.VertxIgniteExecutorAdapter.lambda$execute$0(VertxIgniteExecutorAdapter.java:18)
 at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:369)
 at
 io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38)
 at
 io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
 at
 io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
 at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:497)
 at
 io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
 at
 io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
 at
 io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 at java.lang.Thread.run(Thread.java:748)
 Caused by:
 org.apache.ignite.internal.processors.cache.CacheInvalidStateException:
 Failed to execute cache operation (all partition owners have left the grid,
 partition data has been lost) [cacheName=xx, part=273, key=16479796986]
 at
 org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validatePartitionOperation(GridDhtTopologyFutureAdapter.java:161)
 at
 org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:116)
 at
 org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:417)
 at
 org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248)
 at
 org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$26.apply(GridDhtAtomicCache.java:1146)
 at
 org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$26.apply(GridDhtAtomicCache.java:1144)
 at
 org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.asyncOp(GridDhtAtomicCache.java:761)
 at
 org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1144)
 at
 org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.putAsync0(GridDhtAtomicCache.java:641)
 at
 

Re: Ignite 2.81. - NULL pointer exception

2020-08-06 Thread Ilya Kasnacheev
Hello!

This seems to be an assertion failure that we have not seen before. Please
tell us if you see a recurrence, and especially if you can link it to some
activity.

Regards,
-- 
Ilya Kasnacheev


Wed, 5 Aug 2020 at 10:18, Mahesh Renduchintala <
mahesh.renduchint...@aline-consulting.com>:

> Hi,
>
> we hit a null pointer exception on one of our servers. No major activity
> was happening when the server crashed.
>
> Please check the logs and see if there is any workaround we can use.
> We are in a production environment.
>
> regards
> mahesh
>
>


Re: Enabling swapPath causes invoking shutdown hook

2020-08-06 Thread Ilya Kasnacheev
Hello!

I think the performance of swap space should be on par with persistence
with disabled WAL.

You can submit suggested updates to the documentation if you like.
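For reference, a sketch of the swap-enabled data region discussed below (Spring XML; the region name, size, and path are placeholders):

```xml
<bean class="org.apache.ignite.configuration.DataRegionConfiguration">
  <property name="name" value="swap_region"/>
  <!-- maxSize deliberately larger than physical RAM; the overflow is
       backed by the memory-mapped file at swapPath. -->
  <property name="maxSize" value="#{16L * 1024 * 1024 * 1024}"/>
  <property name="swapPath" value="/tmp/ignite-swap"/>
</bean>
```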

Regards,
-- 
Ilya Kasnacheev


Wed, 5 Aug 2020 at 06:00, 38797715 <38797...@qq.com>:

> Hi Ilya,
>
> If so, there are two ways to implement Ignite's swap space:
> 1. Set maxSize > physical memory, which uses the swap mechanism of the OS
> and can be tuned via *vm.swappiness*.
> 2. Configure the *swapPath* property, which is implemented by Ignite
> itself, is independent of the OS, and has no tuning parameters.
> There is a choice between these two modes, right? Then I think there may
> be problems in the description in the document. I hope you can check it
> again:
> https://apacheignite.readme.io/docs/swap-space
>
> After our initial testing, the performance of swap space is much better
> than native persistence, so I think this mode is valuable in some
> scenarios.
> On 2020/8/4 10:16 PM, Ilya Kasnacheev wrote:
>
> Hello!
>
> From the docs:
>
> To avoid this situation with the swapping capabilities, you need to :
>
>- Set maxSize = bigger_ than_RAM_size, in which case, the OS will take
>care of the swapping.
>- Enable swapping by setting the DataRegionConfiguration.swapPath
> property.
>
>
> I actually think these are either-or. You should either do the first (and
> configure OS swapping) or the second part.
>
> Having said that, I recommend setting proper Native Persistence instead.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Sat, 25 Jul 2020 at 04:49, 38797715 <38797...@qq.com>:
>
>> Hi,
>>
>> https://apacheignite.readme.io/docs/swap-space
>>
>> According to the above document, if the physical memory is small, you can
>> solve this problem by enabling swap space. The specific method is to
>> configure maxSize to a value larger than the physical memory, and to
>> configure the swapPath property.
>>
>> But from the test results, the node is terminated.
>>
>> I think the correct result should be that even if the amount of data
>> exceeds the physical memory, the node should still be able to run
>> normally, with the data swapped out to disk.
>>
>> I want to know which parameters affect the behavior of this
>> configuration: *vm.swappiness* or others?
>> On 2020/7/24 9:55 PM, aealexsandrov wrote:
>>
>> Hi,
>>
>> Can you please clarify your expectations? Did you expect the JVM process
>> to be killed instead of gracefully stopping? What are you trying to achieve?
>>
>> BR,
>> Andrei
>>
>>
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>
>>


Re: read-though tutorial for a big table

2020-08-06 Thread Alex Panchenko
Hello Vladimir,

are there some key things you can share with us? Some checklist with the
most important configuration params, or things we need to review/check?
Anything would be helpful.

I've been playing with Ignite for the last few months, and performance is
still low. I have to decide whether to switch from Ignite to some other
solution or improve the performance ASAP.

Thanks



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: How to optimize the connection terminal in debugging mode?

2020-08-06 Thread vtchernyi
Hi,

I do my debugging on a single-node cluster made up of my local machine. I
create a special testConfig.xml to be used for debugging only and specify
only the local computer name in that file. In that case there is no
30-second timeout and debugging is fine.

Vladimir

9:45, 6 August 2020, 38797715 <38797...@qq.com>:
Hi,
In the development scenario, when using the native client to connect to the
server, if a single-step debugging pause lasts too long, the connection
between the client and the server may be broken. What parameters can be
used to tune the timeouts in this scenario to facilitate debugging?
-- Sent from the Yandex.Mail mobile app