If you are only doing colocated joins, then there will be no runtime
overhead incurred by collecting distributed rows (colocated joins mean do
not try to join data that might be distributed across nodes), so there
might not be much difference in runtimes.

The difference between different affinity keys, if any, will be seen in the
results returned, and unless there's a significant difference in row count,
it will be difficult to affect runtime performance using this strategy.

On Tue, Jan 10, 2023, 13:32 Айсина Роза Мунеровна <[email protected]>
wrote:

> Hola!
>
> We want to optimize our SQL queries that make collocated JOINs on several
> tables (about ~8 tables).
>
> Some tables have column “product_id” on which these tables are joined.
> Business meaning is that the result are features for each product_id.
>
> So based on documentation we set “*product_id*” as affinity key
> so that all data required for join will be located on the same node.
> (Column “*product_id*” is always part of the primary key)
>
> But! After this we made experiments:
> - put affinity key to other part of primary key (for example, if primary
> key is "(product_id, store_id)", then affinity key is “store_id”);
> - didn't specify affinity key at all.
>
> The problem is that all our load testing results didn’t changed!
>
> So the question - is there any way to make more advanced *EXPLAIN*,
> that will show partition shuffling (if it happens) or data collocation?
> Some debug tool for this problem. Like query plan in Spark.
>
>
> Information about our setup:
> - Ignite cluster on 5 VMs;
> - all tables are partitioned or replicated;
> - all tables are created with DDL SQL and all interactions are made *only*
> through SQL API;
> - DDL example:
>
> CREATE TABLE IF NOT EXISTS PUBLIC.ProductFeatures
> (
>     product_id INT PRIMARY KEY,
>     total_cnt_orders_with_sku INT
> )
> WITH "CACHE_NAME=PUBLIC_ProductFeatures,
> KEY_TYPE=io.sbmt.ProductFeaturesKey,
> VALUE_TYPE=io.sbmt.ProductFeaturesValue, AFFINITY_KEY=product_id,
> TEMPLATE=PARTITIONED, BACKUPS=1
>
> - our main SQL query:
>
> SELECT
>     ProductFeatures.product_id,
>     ProductFeatures.total_cnt_orders_with_sku,
>     StoreProductFeatures.price,
>     UserProductFeaturesOrder.num_prev_orders_with_sku,
>     ...
> FROM ProductFeatures
> LEFT JOIN StoreProductFeatures
>   ON ProductFeatures.product_id = StoreProductFeatures.product_id
>   AND StoreProductFeatures.store_id = {store_id}
> ... (more joins)
> CROSS JOIN UserFeaturesDiscount
> WHERE UserFeaturesDiscount.user_id = {user_id}
>   AND ProductFeatures.product_id IN {skus}
>   …
>
> Looking forward for some help.
>
> *--*
>
> *Роза Айсина*
>
> Старший разработчик ПО
>
> *СберМаркет* | Доставка из любимых магазинов
>
>
>
> Email: [email protected]
>
> Mob:
>
> Web: sbermarket.ru
>
> App: iOS
> <https://apps.apple.com/ru/app/%D1%81%D0%B1%D0%B5%D1%80%D0%BC%D0%B0%D1%80%D0%BA%D0%B5%D1%82-%D0%B4%D0%BE%D1%81%D1%82%D0%B0%D0%B2%D0%BA%D0%B0-%D0%BF%D1%80%D0%BE%D0%B4%D1%83%D0%BA%D1%82%D0%BE%D0%B2/id1166642457>
> и Android
> <https://play.google.com/store/apps/details?id=ru.instamart&hl=en&gl=ru>
>
>
>
> *УВЕДОМЛЕНИЕ О КОНФИДЕНЦИАЛЬНОСТИ:* это электронное сообщение и любые
> документы, приложенные к нему, содержат конфиденциальную информацию.
> Настоящим уведомляем Вас о том, что, если это сообщение не предназначено
> Вам, использование, копирование, распространение информации, содержащейся в
> настоящем сообщении, а также осуществление любых действий на основе этой
> информации, строго запрещено. Если Вы получили это сообщение по ошибке,
> пожалуйста, сообщите об этом отправителю по электронной почте и удалите это
> сообщение.
> *CONFIDENTIALITY NOTICE:* This email and any files attached to it are
> confidential. If you are not the intended recipient you are notified that
> using, copying, distributing or taking any action in reliance on the
> contents of this information is strictly prohibited. If you have received
> this email in error please notify the sender and delete this email.
>

Reply via email to