[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19360: --- Resolution: Fixed Fix Version/s: 3.2.0 4.0.0 Status: Resolved (was: Patch Available) > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Fix For: 4.0.0, 3.2.0 > > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch, HIVE-19360.4.patch, HIVE-19360.5.patch, > HIVE-19360.6.patch, HIVE-19360.7.patch, HIVE-19360.8.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, >
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19360: --- Attachment: HIVE-19360.8.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch, HIVE-19360.4.patch, HIVE-19360.5.patch, > HIVE-19360.6.patch, HIVE-19360.7.patch, HIVE-19360.8.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, >
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19360: --- Attachment: HIVE-19360.7.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch, HIVE-19360.4.patch, HIVE-19360.5.patch, > HIVE-19360.6.patch, HIVE-19360.7.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_count
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19360: --- Attachment: (was: HIVE-19360.6.patch) > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch, HIVE-19360.4.patch, HIVE-19360.5.patch, HIVE-19360.6.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, >
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19360: --- Attachment: HIVE-19360.6.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch, HIVE-19360.4.patch, HIVE-19360.5.patch, HIVE-19360.6.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, >
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19360: --- Attachment: HIVE-19360.6.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch, HIVE-19360.4.patch, HIVE-19360.5.patch, HIVE-19360.6.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, >
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19360: --- Attachment: HIVE-19360.5.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch, HIVE-19360.4.patch, HIVE-19360.5.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, > t9.c_login, >
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19360: --- Attachment: HIVE-19360.4.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch, HIVE-19360.4.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, > t9.c_login, >
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-19360: --- Attachment: HIVE-19360.3.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, > t9.c_login, > t9.c_email_address) t16 > INNER JOIN ( >
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-19360: --- Attachment: (was: HIVE-19360.3.patch) > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, > t9.c_login, > t9.c_email_address) t16 > INNER JOIN ( >
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-19360: --- Attachment: HIVE-19360.3.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, > HIVE-19360.3.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, > t9.c_login, > t9.c_email_address) t16 > INNER JOIN ( >
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-19360: --- Attachment: HIVE-19360.2.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, > t9.c_login, > t9.c_email_address) t16 > INNER JOIN ( > (SELECT
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-19360: --- Attachment: (was: HIVE-19360.1.patch) > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: HIVE-19360.1.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, > t9.c_login, > t9.c_email_address) t16 > INNER JOIN ( > (SELECT t18.c_cus
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-19360: --- Status: Patch Available (was: Open) > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: HIVE-19360.1.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, > t9.c_login, > t9.c_email_address) t16 > INNER JOIN ( > (SELECT t18.c_customer
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-19360: --- Attachment: HIVE-19360.1.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: HIVE-19360.1.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, > t9.c_login, > t9.c_email_address) t16 > INNER JOIN ( > (SELECT t18.c_customer_id $f
[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object
[ https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-19360: --- Attachment: HIVE-19360.1.patch > CBO: Add an "optimizedSQL" to QueryPlan object > --- > > Key: HIVE-19360 > URL: https://issues.apache.org/jira/browse/HIVE-19360 > Project: Hive > Issue Type: Improvement > Components: CBO, Diagnosability >Affects Versions: 3.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: HIVE-19360.1.patch > > > Calcite RelNodes can be converted back into SQL (as the new JDBC storage > handler does), which allows Hive to print out the post CBO plan as a SQL > query instead of having to guess the join orders from the subsequent Tez plan. > The query generated might not be always valid SQL at this point, but is a > world ahead of DAG plans in readability. > Eg. tpc-ds Query4 CTEs gets expanded to > {code} > SELECT t16.$f3 customer_preferred_cust_flag > FROM > (SELECT t0.c_customer_id $f0, >SUM((t2.ws_ext_list_price - > t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / > CAST(2 AS DECIMAL(10, 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t0 >INNER JOIN ( > (SELECT ws_sold_date_sk, > ws_bill_customer_sk, > ws_ext_discount_amt, > ws_ext_sales_price, > ws_ext_wholesale_cost, > ws_ext_list_price > FROM default.web_sales > WHERE ws_bill_customer_sk IS NOT NULL > AND ws_sold_date_sk IS NOT NULL) t2 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = > t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk >GROUP BY t0.c_customer_id, > t0.c_first_name, > t0.c_last_name, > t0.c_preferred_cust_flag, > t0.c_birth_country, > t0.c_login, > t0.c_email_address) t7 > INNER JOIN ( > (SELECT t9.c_customer_id $f0, >t9.c_preferred_cust_flag $f3, > > SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - > t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, > 0))) $f8 >FROM > (SELECT c_customer_sk, > c_customer_id, > c_first_name, > c_last_name, > c_preferred_cust_flag, > c_birth_country, > c_login, > c_email_address > FROM default.customer > WHERE c_customer_sk IS NOT NULL > AND c_customer_id IS NOT NULL) t9 >INNER JOIN ( > (SELECT ss_sold_date_sk, > ss_customer_sk, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price > FROM default.store_sales > WHERE ss_customer_sk IS NOT NULL > AND ss_sold_date_sk IS NOT NULL) t11 >INNER JOIN > (SELECT d_date_sk, > CAST(2002 AS INTEGER) d_year > FROM default.date_dim > WHERE d_year = 2002 > AND d_date_sk IS NOT NULL) t13 ON > t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk >GROUP BY t9.c_customer_id, > t9.c_first_name, > t9.c_last_name, > t9.c_preferred_cust_flag, > t9.c_birth_country, > t9.c_login, > t9.c_email_address) t16 > INNER JOIN ( > (SELECT t18.c_customer_id $f