Hello Druid community,

Ben Krug from Imply pointed me to this mailing list for my question about Druid joins. We have the following Druid join query that may trigger a bug in Druid:

> WITH DIM AS (
>   SELECT api_client_id, title
>   FROM inline_dimension_api_clients_1 AS API_CLIENTS
> ),
> FACTS AS (
>   SELECT api_client_id, COUNT(*) as api_client_count
>   FROM inline_data AS ORDERS
>   WHERE ORDERS.__time >= TIMESTAMP '2021-06-10 00:00:00'
>     AND ORDERS.__time < TIMESTAMP '2021-06-18 00:00:00'
>     AND ORDERS.shop_id = 25248974
>   GROUP BY 1
> )
> SELECT DIM.title, FACTS.api_client_id, FACTS.api_client_count
> FROM FACTS
> LEFT JOIN DIM ON FACTS.api_client_id = DIM.api_client_id

The "api_client_id" field is of type `long` in both the "inline_data" and "inline_dimension_api_clients_1" datasources. However, when the join runs, the makeLongProcessor method is called and throws an "UnsupportedOperationException" because "index.keyType()" is string in MapIndex.

I then found that Gian Merlino has a PR that fixes the issue. I have validated in my local Druid cluster that this fix works for our case. The fix is not included in Druid v0.21.1.

I have the following questions:

1. Why is the index key type `string` rather than `long` for my subquery? Is it implicitly converted to `string` for a performance benefit?
2. When will you publish a new Druid release? Will the fix be part of the next release?

Thank you,
Jason Chen

Jason (Jianbin) Chen
Senior Data Developer
p: +1 2066608351 | e: jason.c...@shopify.com
a: 234 Laurier Ave W Ottawa, ON K1N 5X8