Re: Cache Join
Hello! Do you really need such huge number of fields to join? Isn't it easier to introduce an affinity field? Anyway, you seem to create an index with a huge number of fields. Consider only keeping fields which are necessary for performing that join. You might even need to boil it down to some single field (I'm not sure how actual distributedJoins is implemented). If you could frame it as stand-alone project, somebody from community could try and tweak it until it works. Regards, -- Ilya Kasnacheev чт, 28 мар. 2019 г. в 06:23, Skollur : > Hello > > I am using apache ignite 2.7 version and trying to join three tables and My > query looks as below. All tables are configured as PARTIONED and also set > "distributedjoin=true" > == > SELECT > CAS.Group_Customer_ID as groupCustomerId, > CAS.Account_Number as accountNumber, > 'Custody Account' as productTitle, > CAS.Investment_Model as investmentModel, > sum(CAS.Market_Value) AS marketValue, > CSS.Category_Sequence as sequenceNumber, > sum(ASM.Prev_Month_Bal) as preMonthBal, > sum(ASM.Prev_Year_Bal) as preYearBal > FROM > "VwCustodyAccountPositionCpsCache".VW_CUSTODY_ACCOUNT_POSITION_CPS > CAS > INNER JOIN "IdbCpsSequenceCache".IDB_CPS_SEQUENCE CSS ON > CAS.Account_Type = CSS.Account_Type > AND CAS.As_Of_Date BETWEEN '2018-12-31' AND '2018-12-31' > AND CAS.Group_Customer_ID = 61 > INNER JOIN "InvestmentSummaryMonthlyCache".ASSET_SUMMARY_MONTHLY ASM ON > CAS.Group_Customer_ID = ASM.Group_Customer_ID > AND CAS.Account_Number = ASM.Account_Number > AND CAS.As_of_Date = ASM.Effective_Date > AND CAS.Account_Type = ASM.Account_Type > AND CAS.Product = ASM.Product > AND CAS.Asset_Class_Allocation = ASM.Asset_Class > group by > CAS.Group_Customer_ID, > CAS.Account_Number, > CAS.Investment_Model, > CSS.Category_Sequence > = > I have added index for each fields which are used in each cache table as > below. Example in InvestmentSummaryMonthlyCache:- > >ArrayList indexes = new ArrayList<>(); > QueryIndex index = new QueryIndex(); > index.setName("NonClustered_Index_Summary_Monthly"); > index.setIndexType(QueryIndexType.SORTED); > LinkedHashMap indFlds = new LinkedHashMap<>(); > indFlds.put("dwId", true); > indFlds.put("groupCustomerId", true); > indFlds.put("accountNumber", true); > indFlds.put("accountType", true); > indFlds.put("product", true); > indFlds.put("productType", true); > indFlds.put("subProduct", true); > indFlds.put("assetClass", true); > indFlds.put("effectiveDate", true); > index.setFields(indFlds); > indexes.add(index); > qryEntity.setIndexes(indexes); > qryEntities.add(qryEntity); > = > When I tried to execute query(I have all fields defined in index as above), > getting an error -> SQL Error [5]: javax.cache.CacheException: Failed > to > prepare distributed join query: join condition does not use index > [joinedCache=InvestmentSummaryMonthlyCache, What am I missing here? > > > plan=SELECT > CAS__Z0.GROUP_CUSTOMER_ID AS __C0_0, > CAS__Z0.ACCOUNT_NUMBER AS __C0_1, > 'Custody Account' AS __C0_2, > CAS__Z0.INVESTMENT_MODEL AS __C0_3, > SUM(CAS__Z0.MARKET_VALUE) AS __C0_4, > CSS__Z1.CATEGORY_SEQUENCE AS __C0_5, > SUM(ASM__Z2.PREV_MONTH_BAL) AS __C0_6, > SUM(ASM__Z2.PREV_YEAR_BAL) AS __C0_7 > FROM "VwCustodyAccountPositionCpsCache".VW_CUSTODY_ACCOUNT_POSITION_CPS > CAS__Z0 > /* > "VwCustodyAccountPositionCpsCache".VW_CUSTODY_ACCOUNT_POSITION_CPS.__SCAN_ > */ > /* WHERE (CAS__Z0.AS_OF_DATE <= DATE '2018-12-31') > AND ((CAS__Z0.GROUP_CUSTOMER_ID = 80061) > AND (CAS__Z0.AS_OF_DATE >= DATE '2018-12-31')) > */ > INNER JOIN "InvestmentSummaryMonthlyCache".ASSET_SUMMARY_MONTHLY ASM__Z2 > /* batched:broadcast > "InvestmentSummaryMonthlyCache".ASSET_SUMMARY_MONTHLY.__SCAN_ */ > ON 1=1 > /* WHERE (CAS__Z0.ACCOUNT_NUMBER = ASM__Z2.ACCOUNT_NUMBER) > AND ((CAS__Z0.GROUP_CUSTOMER_ID = ASM__Z2.GROUP_CUSTOMER_ID) > AND ((CAS__Z0.AS_OF_DATE = ASM__Z2.EFFECTIVE_DATE) > AND ((CAS__Z0.ACCOUNT_TYPE = ASM__Z2.ACCOUNT_TYPE) > AND ((CAS__Z0.ASSET_CLASS_ALLOCATION = ASM__Z2.ASSET_CLASS) > AND (CAS__Z0.PRODUCT = ASM__Z2.PRODUCT) > */ > INNER JOIN "IdbCpsSequenceCache".IDB_CPS_SEQUENCE CSS__Z1 > /* batched:broadcast "IdbCpsSequenceCache".NONCLUSTERED_INDEX_LC: > ACCOUNT_TYPE = CAS__Z0.ACCOUNT_TYPE */ > ON 1=1 > WHERE ((CAS__Z0.GROUP_CUSTOMER_ID = 80061) > AND ((CAS__Z0.ACCOUNT_TYPE = CSS__Z1.ACCOUNT_TYPE) > AND ((CAS__Z0.AS_OF_DATE >= DATE '2018-12-31') > AND (CAS__Z0.AS_OF_DATE <= DATE
Cache Join
Hello I am using apache ignite 2.7 version and trying to join three tables and My query looks as below. All tables are configured as PARTIONED and also set "distributedjoin=true" == SELECT CAS.Group_Customer_ID as groupCustomerId, CAS.Account_Number as accountNumber, 'Custody Account' as productTitle, CAS.Investment_Model as investmentModel, sum(CAS.Market_Value) AS marketValue, CSS.Category_Sequence as sequenceNumber, sum(ASM.Prev_Month_Bal) as preMonthBal, sum(ASM.Prev_Year_Bal) as preYearBal FROM "VwCustodyAccountPositionCpsCache".VW_CUSTODY_ACCOUNT_POSITION_CPS CAS INNER JOIN "IdbCpsSequenceCache".IDB_CPS_SEQUENCE CSS ON CAS.Account_Type = CSS.Account_Type AND CAS.As_Of_Date BETWEEN '2018-12-31' AND '2018-12-31' AND CAS.Group_Customer_ID = 61 INNER JOIN "InvestmentSummaryMonthlyCache".ASSET_SUMMARY_MONTHLY ASM ON CAS.Group_Customer_ID = ASM.Group_Customer_ID AND CAS.Account_Number = ASM.Account_Number AND CAS.As_of_Date = ASM.Effective_Date AND CAS.Account_Type = ASM.Account_Type AND CAS.Product = ASM.Product AND CAS.Asset_Class_Allocation = ASM.Asset_Class group by CAS.Group_Customer_ID, CAS.Account_Number, CAS.Investment_Model, CSS.Category_Sequence = I have added index for each fields which are used in each cache table as below. Example in InvestmentSummaryMonthlyCache:- ArrayList indexes = new ArrayList<>(); QueryIndex index = new QueryIndex(); index.setName("NonClustered_Index_Summary_Monthly"); index.setIndexType(QueryIndexType.SORTED); LinkedHashMap indFlds = new LinkedHashMap<>(); indFlds.put("dwId", true); indFlds.put("groupCustomerId", true); indFlds.put("accountNumber", true); indFlds.put("accountType", true); indFlds.put("product", true); indFlds.put("productType", true); indFlds.put("subProduct", true); indFlds.put("assetClass", true); indFlds.put("effectiveDate", true); index.setFields(indFlds); indexes.add(index); qryEntity.setIndexes(indexes); qryEntities.add(qryEntity); = When I tried to execute query(I have all fields defined in index as above), getting an error -> SQL Error [5]: javax.cache.CacheException: Failed to prepare distributed join query: join condition does not use index [joinedCache=InvestmentSummaryMonthlyCache, What am I missing here? plan=SELECT CAS__Z0.GROUP_CUSTOMER_ID AS __C0_0, CAS__Z0.ACCOUNT_NUMBER AS __C0_1, 'Custody Account' AS __C0_2, CAS__Z0.INVESTMENT_MODEL AS __C0_3, SUM(CAS__Z0.MARKET_VALUE) AS __C0_4, CSS__Z1.CATEGORY_SEQUENCE AS __C0_5, SUM(ASM__Z2.PREV_MONTH_BAL) AS __C0_6, SUM(ASM__Z2.PREV_YEAR_BAL) AS __C0_7 FROM "VwCustodyAccountPositionCpsCache".VW_CUSTODY_ACCOUNT_POSITION_CPS CAS__Z0 /* "VwCustodyAccountPositionCpsCache".VW_CUSTODY_ACCOUNT_POSITION_CPS.__SCAN_ */ /* WHERE (CAS__Z0.AS_OF_DATE <= DATE '2018-12-31') AND ((CAS__Z0.GROUP_CUSTOMER_ID = 80061) AND (CAS__Z0.AS_OF_DATE >= DATE '2018-12-31')) */ INNER JOIN "InvestmentSummaryMonthlyCache".ASSET_SUMMARY_MONTHLY ASM__Z2 /* batched:broadcast "InvestmentSummaryMonthlyCache".ASSET_SUMMARY_MONTHLY.__SCAN_ */ ON 1=1 /* WHERE (CAS__Z0.ACCOUNT_NUMBER = ASM__Z2.ACCOUNT_NUMBER) AND ((CAS__Z0.GROUP_CUSTOMER_ID = ASM__Z2.GROUP_CUSTOMER_ID) AND ((CAS__Z0.AS_OF_DATE = ASM__Z2.EFFECTIVE_DATE) AND ((CAS__Z0.ACCOUNT_TYPE = ASM__Z2.ACCOUNT_TYPE) AND ((CAS__Z0.ASSET_CLASS_ALLOCATION = ASM__Z2.ASSET_CLASS) AND (CAS__Z0.PRODUCT = ASM__Z2.PRODUCT) */ INNER JOIN "IdbCpsSequenceCache".IDB_CPS_SEQUENCE CSS__Z1 /* batched:broadcast "IdbCpsSequenceCache".NONCLUSTERED_INDEX_LC: ACCOUNT_TYPE = CAS__Z0.ACCOUNT_TYPE */ ON 1=1 WHERE ((CAS__Z0.GROUP_CUSTOMER_ID = 80061) AND ((CAS__Z0.ACCOUNT_TYPE = CSS__Z1.ACCOUNT_TYPE) AND ((CAS__Z0.AS_OF_DATE >= DATE '2018-12-31') AND (CAS__Z0.AS_OF_DATE <= DATE '2018-12-31' AND ((CAS__Z0.ASSET_CLASS_ALLOCATION = ASM__Z2.ASSET_CLASS) AND ((CAS__Z0.PRODUCT = ASM__Z2.PRODUCT) AND ((CAS__Z0.ACCOUNT_TYPE = ASM__Z2.ACCOUNT_TYPE) AND ((CAS__Z0.AS_OF_DATE = ASM__Z2.EFFECTIVE_DATE) AND ((CAS__Z0.GROUP_CUSTOMER_ID = ASM__Z2.GROUP_CUSTOMER_ID) AND (CAS__Z0.ACCOUNT_NUMBER = ASM__Z2.ACCOUNT_NUMBER)) GROUP BY CAS__Z0.GROUP_CUSTOMER_ID, CAS__Z0.ACCOUNT_NUMBER, CAS__Z0.INVESTMENT_MODEL, CSS__Z1.CATEGORY_SEQUENCE] SQL Error [5]: javax.cache.CacheException: Failed to prepare distributed join query: join condition does not use index [joinedCache=InvestmentSummaryMonthlyCache, plan=SELECT CAS__Z0.GROUP_CUSTOMER_ID AS __C0_0, CAS__Z0.ACCOUNT_NUMBER AS __C0_1,
Re: Poor cross cache join SQL performance (v2.0.0)
Hi, As well as for regular DB, each group index relates only to one entity/table, so there is no such thing like cross cache group index. Kind regards, Alex. On Wed, Jul 12, 2017 at 3:22 PM, iostream [via Apache Ignite Users] < ml+s70518n14719...@n6.nabble.com> wrote: > Hi, > > Apologies for the delayed response. > > Implementing appropriate group indexes solved the problem for me. > > One question - can I create a group index across caches? Or group index > can be created only within one cache? > > Thanks! :) > > -- > If you reply to this email, your message will be added to the discussion > below: > http://apache-ignite-users.70518.x6.nabble.com/Poor-cross-cache-join-SQL- > performance-v2-0-0-tp14373p14719.html > To start a new topic under Apache Ignite Users, email > ml+s70518n1...@n6.nabble.com > To unsubscribe from Apache Ignite Users, click here > <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code=1=YWxleGFuZGVyLmZlZG90b2ZmQGdtYWlsLmNvbXwxfC0xMzYxNTU0NTg=> > . > NAML > <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=macro_viewer=instant_html%21nabble%3Aemail.naml=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> > -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Poor-cross-cache-join-SQL-performance-v2-0-0-tp14373p14721.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Poor cross cache join SQL performance (v2.0.0)
Hi, Apologies for the delayed response. Implementing appropriate group indexes solved the problem for me. One question - can I create a group index across caches? Or group index can be created only within one cache? Thanks! :) -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Poor-cross-cache-join-SQL-performance-v2-0-0-tp14373p14719.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Poor cross cache join SQL performance (v2.0.0)
Hi, It would be great to take a look on a corresponding JFR. Kind regards, Alex 6 июля 2017 г. 11:55 AM пользователь "afedotov" < alexander.fedot...@gmail.com> написал: Could you please share the reproducer? Kind regards, Alex 6 июля 2017 г. 11:18 AM пользователь "iostream [via Apache Ignite Users]" <[hidden email] <http:///user/SendEmail.jtp?type=node=14384=0>> написал: > Hi Alex, > > Yes Value2.class was already added to indextypes. > > I am only reading the value which is why copyOnRead is set to "false". > > -- > If you reply to this email, your message will be added to the discussion > below: > http://apache-ignite-users.70518.x6.nabble.com/Poor-cross- > cache-join-SQL-performance-v2-0-0-tp14373p14382.html > To start a new topic under Apache Ignite Users, email [hidden email] > <http:///user/SendEmail.jtp?type=node=14384=1> > To unsubscribe from Apache Ignite Users, click here. > NAML > <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=macro_viewer=instant_html%21nabble%3Aemail.naml=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> > -- View this message in context: Re: Poor cross cache join SQL performance (v2.0.0) <http://apache-ignite-users.70518.x6.nabble.com/Poor-cross-cache-join-SQL-performance-v2-0-0-tp14373p14384.html> Sent from the Apache Ignite Users mailing list archive <http://apache-ignite-users.70518.x6.nabble.com/> at Nabble.com.
Re: Optimum CacheConfiguration for cross cache join SQL queries
iostream wrote > Hi, > > I have 2 caches, each with POJO values that look like - > > public class Value1 { > > @QuerySqlField(index = true) > private Long a; > > @QuerySqlField(index = true) > private String b; > > } > > public class Value2 { > > @QuerySqlField(index = true) > private Long a; > > @QuerySqlField(index = true) > private String c; > > } > > I have to run the following SQL query on my Ignite Cluster - > > SELECT * from Value1 v1, Value2 v2 WHERE v1.a = v2.a > > To test the query performance, I configured my Ignite Cluster to have only > 1 server node and used 4 clients nodes to run the query. Added 50,000 > Value1 objects and around 800,000 Value2 objects in the cache. I used a > single node cluster in order to avoid using affinity collocation problems > when performing cross cache joins. The average response time of the query > was around 150 seconds! :( > > Server node - 4 CPUs, 8 GB Heap > Client node - 4 CPUs / client, 1 GB Heap / client > > I am quite surprised to see the SQL perform poorly. I am posting my > cacheConfiguration for your reference (same for both the caches). > > cacheConfig.setAtomicityMode("transactional"); > cacheConfig.setCacheMode("partitioned"); > cacheConfig.setBackups(1); > cacheConfig.setCopyOnRead(false); > cacheConfig.setIndexedTypes(Long.class, Value1.class); > > I am wondering if the query performed poorly because of my cache > configuration or is Ignite not very efficient with cross cache joins > (given that I used only 1 node to avoid non-collcation) > > Any help is appreciated! > > Thank you. First of all, your configuration doesn't look valid. Atomicity mode and cache mode are enums not strings, code you provided will not compile. As for performance, this query doesn't have any condition, so it scans at least one of the tables. For such queries you should apply proper collocation between two tables and scale it out by adding more nodes and/or setting CacheConfiguration#queryParallelism to value more than 1. This will parallelize execution and make it faster. In addition I would check the execution and verify that it scans the smaller table. If not, you can set SqlFieldsQuery#enforceJoinOrder to true and rewrite the query to use explicit joins in preferred order. And finally, my recommendation would be to pick a query that is closer to to realistic use case. Queries like provided here will never be very fast as they imply a lot of scanning. Scaling out is the only way to effectively increase performance for them. -Val -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Optimum-CacheConfiguration-for-cross-cache-join-SQL-queries-tp14371p14429.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Optimum CacheConfiguration for cross cache join SQL queries
Hi, Please properly subscribe to the mailing list so that the community can receive email notifications for your messages. To subscribe, send empty email to user-subscr...@ignite.apache.org and follow simple instructions in the reply. -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Optimum-CacheConfiguration-for-cross-cache-join-SQL-queries-tp14371p14428.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Poor cross cache join SQL performance (v2.0.0)
Hi Alex, Yes Value2.class was already added to indextypes. I am only reading the value which is why copyOnRead is set to "false". -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Poor-cross-cache-join-SQL-performance-v2-0-0-tp14373p14382.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Poor cross cache join SQL performance (v2.0.0)
Hi, Add Value2 to the list of the indexed classes (setIndexedTypes). What is the reason for setting copyOnRead(false)? If the problem still stands after the fix, please check GC or take a JFR. Kind regards, Alex 6 июля 2017 г. 7:29 AM пользователь "iostream [via Apache Ignite Users]" < ml+s70518n14373...@n6.nabble.com> написал: Hi, I have 2 caches, each with POJO values that look like - public class Value1 { @QuerySqlField(index = true) private Long a; @QuerySqlField(index = true) private String b; } public class Value2 { @QuerySqlField(index = true) private Long a; @QuerySqlField(index = true) private String c; } I have to run the following SQL query on my Ignite Cluster - SELECT * from Value1 v1, Value2 v2 WHERE v1.a = v2.a To test the query performance, I configured my Ignite Cluster to have only 1 server node and used 4 clients nodes to run the query. Added 50,000 Value1 objects and around 800,000 Value2 objects in the cache. I used a single node cluster in order to avoid using affinity collocation problems when performing cross cache joins. The average response time of the query was around 150 seconds! :( Server node - 4 CPUs, 8 GB Heap Client node - 4 CPUs / client, 1 GB Heap / client I am quite surprised to see the SQL perform poorly. I am posting my cacheConfiguration for your reference (same for both the caches). cacheConfig.setAtomicityMode("transactional"); cacheConfig.setCacheMode("partitioned"); cacheConfig.setBackups(1); cacheConfig.setCopyOnRead(false); cacheConfig.setIndexedTypes(Long.class, Value1.class); I am wondering if the query performed poorly because of my cache configuration or is Ignite not very efficient with cross cache joins (given that I used only 1 node to avoid non-collcation) Any help is appreciated! Thank you. -- If you reply to this email, your message will be added to the discussion below: http://apache-ignite-users.70518.x6.nabble.com/Poor-cross-cache-join-SQL- performance-v2-0-0-tp14373.html To start a new topic under Apache Ignite Users, email ml+s70518n1...@n6.nabble.com To unsubscribe from Apache Ignite Users, click here <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code=1=YWxleGFuZGVyLmZlZG90b2ZmQGdtYWlsLmNvbXwxfC0xMzYxNTU0NTg=> . NAML <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=macro_viewer=instant_html%21nabble%3Aemail.naml=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Poor-cross-cache-join-SQL-performance-v2-0-0-tp14373p14379.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.