Re: Optimum CacheConfiguration for cross cache join SQL queries

2017-07-06 Thread vkulichenko
iostream wrote
> Hi,
> 
> I have 2 caches, each with POJO values that look like -
> 
> public class Value1 {
> 
>   @QuerySqlField(index = true)
>   private Long a;
> 
>   @QuerySqlField(index = true)
>   private String b;
> 
> }
> 
> public class Value2 {
> 
>   @QuerySqlField(index = true)
>   private Long a;
> 
>   @QuerySqlField(index = true)
>   private String c;
> 
> }
> 
> I have to run the following SQL query on my Ignite Cluster -
> 
> SELECT * from Value1 v1, Value2 v2 WHERE v1.a = v2.a
> 
> To test the query performance, I configured my Ignite Cluster to have only
> 1 server node and used 4 clients nodes to run the query. Added 50,000
> Value1 objects and around 800,000 Value2 objects in the cache. I used a
> single node cluster in order to avoid using affinity collocation problems
> when performing cross cache joins. The average response time of the query
> was around 150 seconds! :(
> 
> Server node - 4 CPUs, 8 GB Heap
> Client node - 4 CPUs / client, 1 GB Heap / client
> 
> I am quite surprised to see the SQL perform poorly. I am posting my
> cacheConfiguration for your reference (same for both the caches).
> 
> cacheConfig.setAtomicityMode("transactional");
> cacheConfig.setCacheMode("partitioned");
> cacheConfig.setBackups(1);
> cacheConfig.setCopyOnRead(false);
> cacheConfig.setIndexedTypes(Long.class, Value1.class);
> 
> I am wondering if the query performed poorly because of my cache
> configuration or is Ignite not very efficient with cross cache joins
> (given that I used only 1 node to avoid non-collcation)
> 
> Any help is appreciated!
> 
> Thank you.

First of all, your configuration doesn't look valid. Atomicity mode and
cache mode are enums not strings, code you provided will not compile.

As for performance, this query doesn't have any condition, so it scans at
least one of the tables. For such queries you should apply proper
collocation between two tables and scale it out by adding more nodes and/or
setting CacheConfiguration#queryParallelism to value more than 1. This will
parallelize execution and make it faster.

In addition I would check the execution and verify that it scans the smaller
table. If not, you can set SqlFieldsQuery#enforceJoinOrder to true and
rewrite the query to use explicit joins in preferred order.

And finally, my recommendation would be to pick a query that is closer to to
realistic use case. Queries like provided here will never be very fast as
they imply a lot of scanning. Scaling out is the only way to effectively
increase performance for them.

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Optimum-CacheConfiguration-for-cross-cache-join-SQL-queries-tp14371p14429.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Optimum CacheConfiguration for cross cache join SQL queries

2017-07-06 Thread vkulichenko
Hi,

Please properly subscribe to the mailing list so that the community can
receive email notifications for your messages. To subscribe, send empty
email to user-subscr...@ignite.apache.org and follow simple instructions in
the reply.





--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Optimum-CacheConfiguration-for-cross-cache-join-SQL-queries-tp14371p14428.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.