Re: Ignite 2.10. Performance tests in Azure

2021-05-04 Thread Stephen Darlington
> otherwise, why would we need specific transactional caches?

Transactional caches roll back the update in the event of failure; atomic caches 
do not.

> Regarding collocated computing:  We have to work in C++ and, since Ignite
> nodes use Java, I don't see how it is possible to send a C++ piece of code
> (like a lambda) so that it gets executed within the node... 

With C++ you’d need to deploy the code in advance.

You’re talking about peer-class loading, but that’s not a necessary part of 
colocated computing.

Regards,
Stephen

Re: Ignite 2.10. Performance tests in Azure

2021-05-01 Thread jjimeno
It's strange, as that's not what is stated in the documentation:

https://ignite.apache.org/docs/latest/key-value-api/basic-cache-operations#atomic-operations


otherwise, why would we need specific transactional caches?

Regarding collocated computing:  We have to work in C++ and, since Ignite
nodes use Java, I don't see how it is possible to send a C++ piece of code
(like a lambda) so that it gets executed within the node... 




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Ignite 2.10. Performance tests in Azure

2021-04-29 Thread Stephen Darlington
Neither getAll nor putAll is really designed for many thousands of 
reads/writes in a single operation. The whole operation (rather than the 
individual rows) is atomic.

For writes, you can use CacheAffinity to work out which node each record will 
be stored on and bulk push updates to the specific node (as the blog I posted 
previously suggests).

For reads, I’d expect to see scan and SQL queries. Or — better — using 
colocated compute to avoid copying the data over the network.

Ignite does scale horizontally but you have to use the right approach to get 
the best performance.

> On 29 Apr 2021, at 08:55, barttanghe  wrote:
> 
> Stephen,
> 
> I was under the impression that if the cache is atomic (which is one of the
> cases jjimeno tried), there are no transactions involved and that
> putAll in fact works on a row-by-row basis (some rows can fail).
> 
> So I don't understand what you mean by 'you're effectively creating a huge
> transaction'.
> Is this something internal to Ignite (so not in user space)?
> Could you help me understand this?
> 
> Besides that, what you explain is about putAll (the writing case), but
> getAll (the reading case) also seems to have reached its limits at only
> 16 nodes. Any ideas on that one?
> 
> Thanks!
> 
> Bart
> 
> 
> 




Re: Ignite 2.10. Performance tests in Azure

2021-04-29 Thread barttanghe
Stephen,

I was under the impression that if the cache is atomic (which is one of the
cases jjimeno tried), there are no transactions involved and that
putAll in fact works on a row-by-row basis (some rows can fail).

So I don't understand what you mean by 'you're effectively creating a huge
transaction'.
Is this something internal to Ignite (so not in user space)?
Could you help me understand this?

Besides that, what you explain is about putAll (the writing case), but
getAll (the reading case) also seems to have reached its limits at only
16 nodes. Any ideas on that one?

Thanks!

Bart





Re: Ignite 2.10. Performance tests in Azure

2021-04-26 Thread Stephen Darlington
The challenge is that by using large numbers of records in a single putAll 
method, you’re effectively creating a huge transaction. Transactions require 
distributed locks, which are expensive.

You’re right that batching can improve throughput (but not latency). That’s 
what the data streamer does. This is a blog showing a similar approach:

https://www.gridgain.com/resources/blog/how-fast-load-large-datasets-apache-ignite-using-key-value-api


(The code is Java but the approach should work for C++.)

> On 26 Apr 2021, at 09:03, jjimeno  wrote:
> 
> Hi,
> 
> I have the same feeling, but I think that shouldn't be the case.  A small
> number of big batches should decrease total latency while favoring
> total throughput. And, as Ilya said:
> 
> "In a distributed system, throughput will scale with cluster growth, but
> latency will be steady or become slightly worse."
> 
> the effects of scaling the cluster should be clearer using a few big batches
> rather than a lot of tiny ones, at least in my understanding.
> 
> Unfortunately, Data Streamer is not yet supported in the C++ API, afaik.
> 
> 
> 




Re: Ignite 2.10. Performance tests in Azure

2021-04-26 Thread jjimeno
Hi,

I have the same feeling, but I think that shouldn't be the case.  A small
number of big batches should decrease total latency while favoring
total throughput. And, as Ilya said:

"In a distributed system, throughput will scale with cluster growth, but
latency will be steady or become slightly worse."

the effects of scaling the cluster should be clearer using a few big batches
rather than a lot of tiny ones, at least in my understanding.

Unfortunately, Data Streamer is not yet supported in the C++ API, afaik.





Re: Ignite 2.10. Performance tests in Azure

2021-04-23 Thread Stephen Darlington
I would say that Ignite tends to work better with large numbers of small 
requests rather than a small number of big batches. 

The Java API has the Data Streamer API to help; I'm not sure if that's in the C++ 
API. But I think smaller batches would help.

I’ve not seen anyone request 18000 records in a single get command. Using 
colocated compute to avoid copying all those records over the network would 
improve performance. Or at least iterate over them using a scan or SQL query.

Regards,
Stephen

> On 23 Apr 2021, at 15:24, jjimeno  wrote:
> 
> Hello, and thanks for answering so quickly
> 
> Because, as you say, I should get a bigger throughput when increasing the
> number of nodes.  
> 
> Of course I can't get the best of Ignite with this configuration, but I
> would expect something similar to what I get when writing: time decreasing
> as nodes increase, until the single thread becomes the bottleneck.
> 
> Also, I wouldn't expect writing to be faster than reading.
> 
> Sorry, I don't have these values, but I'll try to repeat the tests to get
> them.
> 
> Josemari.
> 
> 
> 




Re: Ignite 2.10. Performance tests in Azure

2021-04-23 Thread jjimeno
Hello, and thanks for answering so quickly

Because, as you say, I should get a bigger throughput when increasing the
number of nodes.  

Of course I can't get the best of Ignite with this configuration, but I
would expect something similar to what I get when writing: time decreasing
as nodes increase, until the single thread becomes the bottleneck.

Also, I wouldn't expect writing to be faster than reading.

Sorry, I don't have these values, but I'll try to repeat the tests to get
them.

Josemari.





Re: Ignite 2.10. Performance tests in Azure

2021-04-23 Thread Ilya Kasnacheev
Hello!

Why do you expect it to scale if you only seem to run this in a single
thread?

In a distributed system, throughput will scale with cluster growth, but
latency will be steady or become slightly worse.

You need to run the same test with a sufficient number of threads, and
maybe use more than one client (VM and all) to drive enough load to
saturate the cluster.

What is the CPU usage during the test on server nodes, per cluster size?

Regards,
-- 
Ilya Kasnacheev


пт, 23 апр. 2021 г. в 10:59, jjimeno :

> Hello all,
>
> For our project we need a distributed database with transactional support,
> and Ignite is one of the options we are testing.
>
> Scalability is one of our must have, so we created an Ignite Kubernetes
> cluster in Azure to test it, but we found that the results were not what we
> expected.
>
> To rule out the problem being in our code or in using transactional caches, we
> created a small test program for writing/reading 1.8M keys of 528 bytes each
> (it represents one of our data types).
>
> As you can see in this graph, reading doesn't seem to scale, especially for
> the transactional cache, where having 4, 8 or 16 nodes in the cluster
> performs worse than having only 2:
> 
>
> While writing in atomic caches does... until 8 nodes, then it gets steady
> (no transactional times because of this):
> 
>
> Another strange thing is that, for atomic caches, reading seems to be slower
> than writing:
> 
>
> So, my questions are:
>   - Could I be doing something wrong that could lead to these results?
>   - How could it be possible to get worse reading timings in a 4/8/16 nodes
> cluster than in a 2 nodes cluster for a transactional cache?
>   - How could reading be slower than writing in atomic caches?
>
> These are the source code and configuration files we're using:
> Test.cpp
> 
> Order.h 
>
> node-configuration.xml
> <http://apache-ignite-users.70518.x6.nabble.com/file/t3059/node-configuration.xml>
>
>
> Best regards and thanks in advance!
>
>
>
>