[ 
https://issues.apache.org/jira/browse/CALCITE-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903418#comment-15903418
 ] 

Nishant Bangarwa edited comment on CALCITE-1334 at 3/9/17 5:19 PM:
-------------------------------------------------------------------

Looks like this is still an issue. With the latest trunk, the query sent to Druid for 
DruidAdapterIT.testFilterTimestamp is as follows. Notice the interval being sent to 
Druid; I expect it to be "1997-04-01/1997-06-30":
{code}
{
  "queryType": "select",
  "dataSource": "foodmart",
  "descending": false,
  "intervals": [
    "1900-01-09T00:00:00.000/2992-01-10T00:00:00.000"
  ],
  "dimensions": [
    "product_id",
    "brand_name",
    "product_name",
    "SKU",
    "SRP",
    "gross_weight",
    "net_weight",
    "recyclable_package",
    "low_fat",
    "units_per_case",
    "cases_per_pallet",
    "shelf_width",
    "shelf_height",
    "shelf_depth",
    "product_class_id",
    "product_subcategory",
    "product_category",
    "product_department",
    "product_family",
    "customer_id",
    "account_num",
    "lname",
    "fname",
    "mi",
    "address1",
    "address2",
    "address3",
    "address4",
    "city",
    "state_province",
    "postal_code",
    "country",
    "customer_region_id",
    "phone1",
    "phone2",
    "birthdate",
    "marital_status",
    "yearly_income",
    "gender",
    "total_children",
    "num_children_at_home",
    "education",
    "date_accnt_opened",
    "member_card",
    "occupation",
    "houseowner",
    "num_cars_owned",
    "fullname",
    "promotion_id",
    "promotion_district_id",
    "promotion_name",
    "media_type",
    "cost",
    "start_date",
    "end_date",
    "store_id",
    "store_type",
    "region_id",
    "store_name",
    "store_number",
    "store_street_address",
    "store_city",
    "store_state",
    "store_postal_code",
    "store_country",
    "store_manager",
    "store_phone",
    "store_fax",
    "first_opened_date",
    "last_remodel_date",
    "store_sqft",
    "grocery_sqft",
    "frozen_sqft",
    "meat_sqft",
    "coffee_bar",
    "video_store",
    "salad_bar",
    "prepared_food",
    "florist",
    "time_id",
    "the_day",
    "the_month",
    "the_year",
    "day_of_month",
    "week_of_year",
    "month_of_year",
    "quarter",
    "fiscal_period"
  ],
  "metrics": [
    "unit_sales",
    "store_sales",
    "store_cost"
  ],
  "granularity": "all",
  "pagingSpec": {
    "threshold": 16384,
    "fromNext": true
  },
  "context": {
    "druid.query.fetch": false
  }
}
{code}
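For comparison, if the filter were pushed down correctly, the {{intervals}} field of the same query (all other fields unchanged) should look roughly like this — assuming Druid's usual millisecond-precision interval format:

```json
"intervals": [
  "1997-04-01T00:00:00.000/1997-06-30T00:00:00.000"
]
```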



> Convert predicates on EXTRACT function calls into date ranges
> -------------------------------------------------------------
>
>                 Key: CALCITE-1334
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1334
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Julian Hyde
>              Labels: druid
>             Fix For: 1.9.0
>
>
> We would like to convert predicates on date dimension columns into date 
> ranges. This is particularly useful for Druid, which has a single timestamp 
> column.
> Consider the case of a materialized view
> {code}
> SELECT sales.*, product.*, time_by_day.*
> FROM sales
> JOIN product USING (product_id)
> JOIN time_by_day USING (time_id)
> {code}
> that corresponds to a Druid table
> {noformat}
> sales_product_time(
>   product_id int not null,
>   time_id int not null,
>   units int not null,
>   the_year int not null,
>   the_quarter int not null,
>   the_month int not null,
>   the_timestamp timestamp not null,
>   product_name varchar(20) not null)
> {noformat}
> And suppose we have the following check constraints:
> * {{CHECK the_year = EXTRACT(YEAR FROM the_timestamp)}}
> * {{CHECK the_month = EXTRACT(MONTH FROM the_timestamp)}}
> Given a query
> {code}
> SELECT product_id, count(*)
> FROM sales
> JOIN product USING (product_id)
> JOIN time_by_day USING (time_id)
> WHERE the_year = 2016
> AND the_month IN (4, 5, 6)
> {code}
> we would like to transform it into the following query to be run against 
> Druid:
> {code}
> SELECT product_id, count(*)
> FROM sales_product_time
> WHERE the_timestamp BETWEEN '2016-04-01' AND '2016-06-30'
> {code}
> Druid can handle timestamp ranges (or disjoint sets of ranges) very 
> efficiently.
> I believe we can write a rule that knows the check constraints and also knows 
> the properties of the {{EXTRACT}} function:
> 1. Apply check constraints to convert {{WHERE year = ...}} to {{WHERE 
> EXTRACT(YEAR FROM the_timestamp) = ...}}, etc.
> 2. {{EXTRACT(YEAR FROM ...)}} is monotonic, therefore we can deduce the range 
> of the_timestamp values such that {{EXTRACT(YEAR FROM the_timestamp)}} 
> returns 2016.
> 3. Then we need to use the fact that {{EXTRACT(MONTH FROM the_timestamp)}} is 
> monotonic if {{the_timestamp}} is bounded within a particular year.
> 4. And we need to merge month ranges somehow.
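The range arithmetic behind steps 2–4 can be sketched as a small standalone illustration. This is hypothetical helper code using {{java.time}}, not Calcite's actual RexNode-based rule: it derives the half-open timestamp range implied by {{EXTRACT(YEAR ...) = 2016}}, narrows it per month for {{EXTRACT(MONTH ...) IN (4, 5, 6)}}, and merges the contiguous month ranges into one interval.

```java
import java.time.LocalDate;

public class ExtractToRange {
  /** Step 2: EXTRACT(YEAR ...) is monotonic, so "year = y" maps to
   * the half-open date range [y-01-01, (y+1)-01-01). */
  static LocalDate[] yearRange(int year) {
    return new LocalDate[] {LocalDate.of(year, 1, 1), LocalDate.of(year + 1, 1, 1)};
  }

  /** Step 3: within a fixed year, EXTRACT(MONTH ...) is monotonic,
   * so "month = m" maps to [y-m-01, start of next month). */
  static LocalDate[] monthRange(int year, int month) {
    LocalDate start = LocalDate.of(year, month, 1);
    return new LocalDate[] {start, start.plusMonths(1)};
  }

  /** Step 4: merge contiguous month ranges (months assumed sorted).
   * Non-contiguous months would need a set of ranges instead. */
  static LocalDate[] mergedRange(int year, int... months) {
    LocalDate start = monthRange(year, months[0])[0];
    LocalDate end = monthRange(year, months[0])[1];
    for (int i = 1; i < months.length; i++) {
      LocalDate[] r = monthRange(year, months[i]);
      if (!r[0].equals(end)) {
        throw new IllegalArgumentException("non-contiguous months");
      }
      end = r[1];
    }
    return new LocalDate[] {start, end};
  }

  public static void main(String[] args) {
    // year = 2016 AND month IN (4, 5, 6)
    LocalDate[] r = mergedRange(2016, 4, 5, 6);
    // [2016-04-01, 2016-07-01), i.e. BETWEEN '2016-04-01' AND '2016-06-30'
    System.out.println(r[0] + "/" + r[1]);
  }
}
```

Note the half-open convention: the merged range ends at the first instant excluded (2016-07-01), which corresponds to the inclusive {{BETWEEN ... AND '2016-06-30'}} in the rewritten query above.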



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
