Re: Designing Affinity Key for more locality

2021-04-26 Thread William.L
Came across this statement in the Data Partitioning documents:

"The affinity function determines the mapping between keys and partitions.
Each partition is identified by a number from a limited set (0 to 1023 by
default)."

Looks like there is no point in adding another layer of mapping unless I am
going for a smaller number of buckets than the 1024 default partitions.
Are there other ways in Ignite to get more locality for a subset of the data?
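If a smaller number were really the goal, I suppose I could just lower the
cache's partition count instead of adding my own mapping layer. A rough
sketch with the .NET API, purely for illustration (the cache name is a
placeholder, and I have not measured whether fewer partitions actually
helps locality):

using Apache.Ignite.Core.Cache.Affinity.Rendezvous;
using Apache.Ignite.Core.Cache.Configuration;

// Hypothetical cache config: fewer, larger partitions instead of an extra
// key -> bucket mapping layer.
var cacheCfg = new CacheConfiguration("users")
{
    AffinityFunction = new RendezvousAffinityFunction
    {
        Partitions = 256 // default is 1024
    }
};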
 



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Designing Affinity Key for more locality

2021-04-26 Thread William.L
Hi,

I am currently using user-id as the affinity key, and because it is a
uuid/string, the data is spread across all partitions in the cluster.
However, in my scenario I am writing and reading users that belong to the
same tenant/group, so it seems better to design for more read/write
locality within a partition.

One approach I am thinking about is to hash the tenant-id + user-id into
1024 integer values (think of it as logical buckets) which I will use as the
affinity key. This way I can still get the colocation of user data while
also getting more locality within a partition.

The question is whether there are any negative trade-offs in Ignite to using
this approach.

Note that I am also using Ignite SQL, so I plan to expose this integer as a
SQL field so that I can do colocated joins on users. Is this even necessary
if distributed joins are disabled?
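For concreteness, this is roughly the key shape I have in mind (a minimal
sketch using the Ignite.NET attributes, purely as an example; class and
field names are placeholders):

using System;
using System.Security.Cryptography;
using System.Text;
using Apache.Ignite.Core.Cache.Configuration;

// Hypothetical composite key: entries are still keyed per user, while
// partition assignment is driven by the logical bucket below.
public class UserKey
{
    [QuerySqlField]
    public string UserId { get; set; }

    // Logical bucket (0..1023) derived from tenant-id + user-id, exposed
    // to SQL so it can also be used for colocated joins.
    [QuerySqlField]
    [AffinityKeyMapped]
    public int Bucket { get; set; }

    // string.GetHashCode() is randomized per process on .NET Core, so a
    // stable hash keeps the bucket deterministic across nodes and restarts.
    public static int ComputeBucket(string tenantId, string userId)
    {
        using (var md5 = MD5.Create())
        {
            var hash = md5.ComputeHash(Encoding.UTF8.GetBytes(tenantId + ":" + userId));
            return (BitConverter.ToInt32(hash, 0) & int.MaxValue) % 1024;
        }
    }
}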

Thanks.






--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Poor Performance with Increasing Public Thread Pool Size

2021-04-26 Thread Raymond Wilson
Hi Adrian,

One thing I would be careful about in .NET Ignite clients is managing the
number of allocations. Your test suggests the compute func being sent to
the node contains a large number of objects (100,000 in the case of your
test scaffold).

Increasing the size of the public thread pool will increase the number of
concurrent tasks being executed, and with it the number of objects on the
heap. If you had 48 concurrent tasks at 100,000 objects each, that is
4.8 million objects.

You mention the runtime for the small thread count is already in the
minutes, so the GC may have promoted large numbers of objects into the
Gen 2 heap, which means every full collection must scan large numbers of
objects.

I would look at your GC pause lengths as an initial possibility. I would
also look at structuring the payload so that it does not cause large
numbers of allocations in the processing context.
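
As a quick first check, something like the sketch below around the compute
call (nothing Ignite-specific, just standard .NET counters; hook it into
whatever logging your tests already have) will show whether Gen 2
collections balloon as the pool size grows:

using System;
using System.Runtime;

// Dumps the GC mode and per-generation collection counts so that runs with
// different PublicThreadPoolSize values can be compared.
public static class GcDiagnostics
{
    public static void Dump(string label)
    {
        Console.WriteLine(
            "{0}: ServerGC={1}, LatencyMode={2}, Gen0={3}, Gen1={4}, Gen2={5}, Heap={6:N0} bytes",
            label,
            GCSettings.IsServerGC,
            GCSettings.LatencyMode,
            GC.CollectionCount(0),
            GC.CollectionCount(1),
            GC.CollectionCount(2),
            GC.GetTotalMemory(false));
    }
}

If the Gen 2 count climbs sharply between the 12-thread and 48-thread runs,
that points at allocation pressure rather than anything Ignite is doing.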

Beyond that, I would look at any locking your code may be performing in its
processing, and how individual compute tasks make use of the CPU/mem on the
server. C# Tasks and async programming may also be a good option to look at.

Cheers,
Raymond.


On Tue, Apr 27, 2021 at 8:36 AM Adrian Corman 
wrote:

> Hi Igniters,
>
>
>
> We are using the .net API for Ignite to distribute computing tasks
> (closures) to a cluster. We are having an issue involving the public thread
> pool size – when the tasks are sent to a machine with the default public
> thread pool size (48 processors, 48  threads in the case of our test
> machine), the jobs run 10x slower than if we set the public thread pool to
> a smaller number such as 6 or 12. This slowdown doesn’t seem to be related
> to any cache operations or IO – logging shows that even pure computational
> methods are slower overall.
>
>
>
> We also discovered that when we run two nodes on the test machine, each
> with half the thread pool size, the jobs run faster than with one node with
> the full pool size (two of 24 run faster than one of 48).
>
>
>
> Does anyone have any suggestions on how we can trace the source of this
> slowdown, or otherwise improve performance? I’ve included scaffolding for
> the types of tests we are running, to illustrate our issue.
>
>
>
> Thank you!
>
>
>
> public class PerfTest_Ignite
> {
>     private IIgnite ignite;
>
>     public void Test_PublicThreadPool_12()
>     {
>         ignite = Ignition.Start(new IgniteConfiguration
>         {
>             PublicThreadPoolSize = 12
>         });
>         string[] tasks = new string[100000];
>
>         var result = ignite.GetCompute().ApplyAsync(new LongRunning_DotNet_Task(), tasks);
>         // This method performs as expected
>     }
>
>     public void Test_PublicThreadPool_48()
>     {
>         ignite = Ignition.Start(new IgniteConfiguration
>         {
>             // PublicThreadPoolSize = 48. Do not set public thread pool size.
>             // (48 is the number of cores on this machine.)
>         });
>         string[] tasks = new string[100000];
>
>         var result = ignite.GetCompute().ApplyAsync(new LongRunning_DotNet_Task(), tasks);
>
>         // This method is 10 TIMES SLOWER?
>     }
> }
>
> public class LongRunning_DotNet_Task : IComputeFunc<string[], bool>
> {
>     public bool Invoke(string[] arg)
>     {
>         // do work that takes up to 5 minutes with 12 threads, but
>         // takes 50 minutes with 48 threads!
>         return true;
>     }
> }


-- 

Raymond Wilson
Solution Architect, Civil Construction Software Systems (CCSS)
11 Birmingham Drive | Christchurch, New Zealand
raymond_wil...@trimble.com




Re: Cannot start ignite nodes with shared memory - Ignite version 2.10.0

2021-04-26 Thread sarahsamji
Thank you, Ilya.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Poor Performance with Increasing Public Thread Pool Size

2021-04-26 Thread Adrian Corman
Hi Igniters,

We are using the .net API for Ignite to distribute computing tasks (closures) 
to a cluster. We are having an issue involving the public thread pool size - 
when the tasks are sent to a machine with the default public thread pool size 
(48 processors, 48  threads in the case of our test machine), the jobs run 10x 
slower than if we set the public thread pool to a smaller number such as 6 or 
12. This slowdown doesn't seem to be related to any cache operations or IO - 
logging shows that even pure computational methods are slower overall.

We also discovered that when we run two nodes on the test machine, each with 
half the thread pool size, the jobs run faster than with one node with the full 
pool size (two of 24 run faster than one of 48).

Does anyone have any suggestions on how we can trace the source of this 
slowdown, or otherwise improve performance? I've included scaffolding for the 
types of tests we are running, to illustrate our issue.

Thank you!

using Apache.Ignite.Core;
using Apache.Ignite.Core.Compute;

public class PerfTest_Ignite
{
    private IIgnite ignite;

    public void Test_PublicThreadPool_12()
    {
        ignite = Ignition.Start(new IgniteConfiguration
        {
            PublicThreadPoolSize = 12
        });
        string[] tasks = new string[100000];

        var result = ignite.GetCompute().ApplyAsync(new LongRunning_DotNet_Task(), tasks);
        // This method performs as expected
    }

    public void Test_PublicThreadPool_48()
    {
        ignite = Ignition.Start(new IgniteConfiguration
        {
            // PublicThreadPoolSize = 48. Do not set public thread pool size.
            // (48 is the number of cores on this machine.)
        });
        string[] tasks = new string[100000];

        var result = ignite.GetCompute().ApplyAsync(new LongRunning_DotNet_Task(), tasks);

        // This method is 10 TIMES SLOWER?
    }
}

public class LongRunning_DotNet_Task : IComputeFunc<string[], bool>
{
    public bool Invoke(string[] arg)
    {
        // do work that takes up to 5 minutes with 12 threads, but
        // takes 50 minutes with 48 threads!
        return true;
    }
}



Re: Cannot disable system view JMX exporter via configuration

2021-04-26 Thread tsipporah22
Thanks for the reply. As a workaround, I'm now adding the whitelist below to
my prometheus.yaml and it's working:
whitelistObjectNames:
  - "java.lang:type=OperatingSystem"
  - "java.lang:type=Threading"
  - "java.lang:type=Memory"

tsippo



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Webinar >> How to Use Spark With Apache Ignite for Big Data Processing

2021-04-26 Thread aealexsandrov
Hi Igniters!

I'm going to talk about Apache Ignite Spark integration and run live demo
examples during the following webinar:

https://www.gridgain.com/resources/webinars/how-use-spark-apache-ignite-big-data-processing

If you'd like to learn more about the existing API, take a look at some use
cases, or just ask a few questions, feel free to join the webinar.

BR,
Andrei



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/



rebalancing & K8

2021-04-26 Thread narges saleh
Hi folks

If I am deploying my Ignite cluster using AKS, is defining the auto-discovery
service sufficient?
I am following this link:
https://ignite.apache.org/docs/latest/installation/kubernetes/azure-deployment

Specifically, I am concerned about Ignite's node/partition rebalancing in
case of auto-scaling. When K8s adds or removes nodes and pods, meaning
Ignite nodes get added or removed, does rebalancing kick in properly? Do I
need to tune any parameters specifically for deployment into K8s? Do I need
to set up liveness probes?

thanks.


Re: Ignite 2.10. Performance tests in Azure

2021-04-26 Thread Stephen Darlington
The challenge is that by using large numbers of records in a single putAll 
method, you’re effectively creating a huge transaction. Transactions require 
distributed locks, which are expensive.

You’re right that batching can improve throughput (but not latency). That’s 
what the data streamer does. This is a blog showing a similar approach:

https://www.gridgain.com/resources/blog/how-fast-load-large-datasets-apache-ignite-using-key-value-api
 


(The code is Java but the approach should work for C++.)
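
To sketch the shape of the batching approach (a rough illustration in C#,
not the blog's code; the cache, key and value types are placeholders, and
the same pattern applies with the C++ PutAll API):

using System.Collections.Generic;
using System.Threading.Tasks;
using Apache.Ignite.Core.Cache;

public static class BatchLoader
{
    // Loads entries in fixed-size batches with several PutAllAsync calls in
    // flight at once, instead of one enormous PutAll. A real loader would
    // also cap the number of outstanding batches.
    public static async Task LoadAsync(
        ICache<long, string> cache,
        IEnumerable<KeyValuePair<long, string>> entries,
        int batchSize = 1000)
    {
        var batch = new Dictionary<long, string>(batchSize);
        var pending = new List<Task>();

        foreach (var kv in entries)
        {
            batch[kv.Key] = kv.Value;

            if (batch.Count == batchSize)
            {
                pending.Add(cache.PutAllAsync(batch));
                batch = new Dictionary<long, string>(batchSize);
            }
        }

        if (batch.Count > 0)
            pending.Add(cache.PutAllAsync(batch));

        await Task.WhenAll(pending);
    }
}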

> On 26 Apr 2021, at 09:03, jjimeno  wrote:
> 
> Hi,
> 
> I have the same feeling, but I think that shouldn't be the case. A small
> number of big batches should decrease the total latency while favoring
> total throughput. And, as Ilya said:
> 
> "In a distributed system, throughput will scale with cluster growth, but
> latency will be steady or become slightly worse."
> 
> the effects of scaling the cluster should be clearer with a few big batches
> rather than with a lot of tiny ones, at least in my understanding.
> 
> Unfortunately, Data Streamer is not yet supported in the C++ API, afaik.
> 
> 
> 
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/




Re: Ignite 2.10. Performance tests in Azure

2021-04-26 Thread jjimeno
Hi,

I have the same feeling, but I think that shouldn't be the case. A small
number of big batches should decrease the total latency while favoring
total throughput. And, as Ilya said:

"In a distributed system, throughput will scale with cluster growth, but
latency will be steady or become slightly worse."

the effects of scaling the cluster should be clearer with a few big batches
rather than with a lot of tiny ones, at least in my understanding.

Unfortunately, Data Streamer is not yet supported in the C++ API, afaik.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/