[ 
https://issues.apache.org/jira/browse/OPTIQ-399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated OPTIQ-399:
------------------------------
    Description: 
If a predicate is of the from {{p AND q1 OR p AND q2}} it would help to 
factorize out the AND factor {{p}}. Sometimes {{p}} contains predicates from 
just one side of a join and can be pushed down, whereas the whole predicate 
references columns from both sides.

Converting to DNF is not sufficient, and in the worst case exponentially 
increases the size of the expression.

In the following query, factorization will allow us to push down 
{{store.s_store_sk = store_sales.ss_store_sk AND store_sales.ss_sold_date_sk = 
date_dim.d_date_sk}}.

{code:sql}
SELECT avg(ss_quantity) ,
       avg(ss_ext_sales_price) ,
       avg(ss_ext_wholesale_cost) ,
       sum(ss_ext_wholesale_cost)
FROM store_sales ,
     store ,
     customer_demographics ,
     household_demographics ,
     customer_address ,
     date_dim
WHERE store.s_store_sk = store_sales.ss_store_sk
  AND store_sales.ss_sold_date_sk = date_dim.d_date_sk
  AND date_dim.d_year = 2001
  AND((store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
       AND customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
       AND customer_demographics.cd_marital_status = 'M'
       AND customer_demographics.cd_education_status = '4 yr Degree'
       AND store_sales.ss_sales_price BETWEEN 100.00 AND 150.00
       AND household_demographics.hd_dep_count = 3)
      OR (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
          AND customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
          AND customer_demographics.cd_marital_status = 'D'
          AND customer_demographics.cd_education_status = 'Primary'
          AND store_sales.ss_sales_price BETWEEN 50.00 AND 100.00
          AND household_demographics.hd_dep_count = 1 )
      OR (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
          AND customer_demographics.cd_demo_sk = ss_cdemo_sk
          AND customer_demographics.cd_marital_status = 'U'
          AND customer_demographics.cd_education_status = 'Advanced Degree'
          AND store_sales.ss_sales_price BETWEEN 150.00 AND 200.00
          AND household_demographics.hd_dep_count = 1))
  AND((store_sales.ss_addr_sk = customer_address.ca_address_sk
       AND customer_address.ca_country = 'United States'
       AND customer_address.ca_state IN ('KY',
                                         'GA',
                                         'NM')
       AND store_sales.ss_net_profit BETWEEN 100 AND 200)
      OR (store_sales.ss_addr_sk = customer_address.ca_address_sk
          AND customer_address.ca_country = 'United States'
          AND customer_address.ca_state IN ('MT',
                                            'OR',
                                            'IN')
          AND store_sales.ss_net_profit BETWEEN 150 AND 300)
      OR (store_sales.ss_addr_sk = customer_address.ca_address_sk
          AND customer_address.ca_country = 'United States'
          AND customer_address.ca_state IN ('WI',
                                            'MO',
                                            'WV')
          AND store_sales.ss_net_profit BETWEEN 50 AND 250)) ;
{code}

  was:
{code:sql}
SELECT avg(ss_quantity) ,
       avg(ss_ext_sales_price) ,
       avg(ss_ext_wholesale_cost) ,
       sum(ss_ext_wholesale_cost)
FROM store_sales ,
     store ,
     customer_demographics ,
     household_demographics ,
     customer_address ,
     date_dim
WHERE store.s_store_sk = store_sales.ss_store_sk
  AND store_sales.ss_sold_date_sk = date_dim.d_date_sk
  AND date_dim.d_year = 2001
  AND((store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
       AND customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
       AND customer_demographics.cd_marital_status = 'M'
       AND customer_demographics.cd_education_status = '4 yr Degree'
       AND store_sales.ss_sales_price BETWEEN 100.00 AND 150.00
       AND household_demographics.hd_dep_count = 3)
      OR (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
          AND customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
          AND customer_demographics.cd_marital_status = 'D'
          AND customer_demographics.cd_education_status = 'Primary'
          AND store_sales.ss_sales_price BETWEEN 50.00 AND 100.00
          AND household_demographics.hd_dep_count = 1 )
      OR (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
          AND customer_demographics.cd_demo_sk = ss_cdemo_sk
          AND customer_demographics.cd_marital_status = 'U'
          AND customer_demographics.cd_education_status = 'Advanced Degree'
          AND store_sales.ss_sales_price BETWEEN 150.00 AND 200.00
          AND household_demographics.hd_dep_count = 1))
  AND((store_sales.ss_addr_sk = customer_address.ca_address_sk
       AND customer_address.ca_country = 'United States'
       AND customer_address.ca_state IN ('KY',
                                         'GA',
                                         'NM')
       AND store_sales.ss_net_profit BETWEEN 100 AND 200)
      OR (store_sales.ss_addr_sk = customer_address.ca_address_sk
          AND customer_address.ca_country = 'United States'
          AND customer_address.ca_state IN ('MT',
                                            'OR',
                                            'IN')
          AND store_sales.ss_net_profit BETWEEN 150 AND 300)
      OR (store_sales.ss_addr_sk = customer_address.ca_address_sk
          AND customer_address.ca_country = 'United States'
          AND customer_address.ca_state IN ('WI',
                                            'MO',
                                            'WV')
          AND store_sales.ss_net_profit BETWEEN 50 AND 250)) ;
{code}


> Factorize common AND factors out of OR predicates
> -------------------------------------------------
>
>                 Key: OPTIQ-399
>                 URL: https://issues.apache.org/jira/browse/OPTIQ-399
>             Project: Optiq
>          Issue Type: Bug
>            Reporter: Laljo John Pullokkaran
>            Assignee: Laljo John Pullokkaran
>
> If a predicate is of the from {{p AND q1 OR p AND q2}} it would help to 
> factorize out the AND factor {{p}}. Sometimes {{p}} contains predicates from 
> just one side of a join and can be pushed down, whereas the whole predicate 
> references columns from both sides.
> Converting to DNF is not sufficient, and in the worst case exponentially 
> increases the size of the expression.
> In the following query, factorization will allow us to push down 
> {{store.s_store_sk = store_sales.ss_store_sk AND store_sales.ss_sold_date_sk 
> = date_dim.d_date_sk}}.
> {code:sql}
> SELECT avg(ss_quantity) ,
>        avg(ss_ext_sales_price) ,
>        avg(ss_ext_wholesale_cost) ,
>        sum(ss_ext_wholesale_cost)
> FROM store_sales ,
>      store ,
>      customer_demographics ,
>      household_demographics ,
>      customer_address ,
>      date_dim
> WHERE store.s_store_sk = store_sales.ss_store_sk
>   AND store_sales.ss_sold_date_sk = date_dim.d_date_sk
>   AND date_dim.d_year = 2001
>   AND((store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
>        AND customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
>        AND customer_demographics.cd_marital_status = 'M'
>        AND customer_demographics.cd_education_status = '4 yr Degree'
>        AND store_sales.ss_sales_price BETWEEN 100.00 AND 150.00
>        AND household_demographics.hd_dep_count = 3)
>       OR (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
>           AND customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
>           AND customer_demographics.cd_marital_status = 'D'
>           AND customer_demographics.cd_education_status = 'Primary'
>           AND store_sales.ss_sales_price BETWEEN 50.00 AND 100.00
>           AND household_demographics.hd_dep_count = 1 )
>       OR (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
>           AND customer_demographics.cd_demo_sk = ss_cdemo_sk
>           AND customer_demographics.cd_marital_status = 'U'
>           AND customer_demographics.cd_education_status = 'Advanced Degree'
>           AND store_sales.ss_sales_price BETWEEN 150.00 AND 200.00
>           AND household_demographics.hd_dep_count = 1))
>   AND((store_sales.ss_addr_sk = customer_address.ca_address_sk
>        AND customer_address.ca_country = 'United States'
>        AND customer_address.ca_state IN ('KY',
>                                          'GA',
>                                          'NM')
>        AND store_sales.ss_net_profit BETWEEN 100 AND 200)
>       OR (store_sales.ss_addr_sk = customer_address.ca_address_sk
>           AND customer_address.ca_country = 'United States'
>           AND customer_address.ca_state IN ('MT',
>                                             'OR',
>                                             'IN')
>           AND store_sales.ss_net_profit BETWEEN 150 AND 300)
>       OR (store_sales.ss_addr_sk = customer_address.ca_address_sk
>           AND customer_address.ca_country = 'United States'
>           AND customer_address.ca_state IN ('WI',
>                                             'MO',
>                                             'WV')
>           AND store_sales.ss_net_profit BETWEEN 50 AND 250)) ;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to