[jira] [Created] (CALCITE-1591) Druid adapter: Use "groupBy" query with extractionFn for time dimension

Julian Hyde (JIRA) Wed, 18 Jan 2017 14:26:25 -0800

Julian Hyde created CALCITE-1591:
------------------------------------

             Summary: Druid adapter: Use "groupBy" query with extractionFn for 
time dimension
                 Key: CALCITE-1591
                 URL: https://issues.apache.org/jira/browse/CALCITE-1591
             Project: Calcite
          Issue Type: Bug
            Reporter: Julian Hyde
            Assignee: Julian Hyde



For queries that aggregate on the time dimension, or on a function of it such 
as {{FLOOR(__time TO DAY)}}, as of the fix for CALCITE-1579 we generate a 
"groupBy" query that does not sort or apply a limit. It would be better (in 
the sense that Druid does more of the work and Hive does less) if we used an 
extractionFn to create a dimension that we can sort on.
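To make the idea concrete, here is a minimal sketch of how a time unit such as DAY could be turned into an extraction dimension of the shape used in the example queries in this issue. The helper name {{time_extraction_dimension}} and its parameters are hypothetical, chosen for illustration only; this is not code from the Calcite Druid adapter.

```python
import json


def time_extraction_dimension(unit, output_name):
    """Build a Druid extraction dimension spec over the __time column.

    Hypothetical helper (not part of Calcite). `unit` is a time unit such
    as "DAY" or "YEAR"; `output_name` is the name of the resulting
    dimension, which a limitSpec could then sort on.
    """
    return {
        "type": "extraction",
        "dimension": "__time",
        "outputName": output_name,
        "extractionFn": {
            "type": "timeFormat",
            # Upper-cased to match the spelling used in the example queries.
            "granularity": unit.upper(),
        },
    }


if __name__ == "__main__":
    # Dimension corresponding to FLOOR(__time TO DAY).
    print(json.dumps(time_extraction_dimension("day", "day"), indent=2))
```

Because the extracted value is an ordinary dimension with its own output name, it can appear in the "dimensions" list alongside regular columns, and the query granularity can stay at "all".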

In CALCITE-1578, [~nishantbangarwa] gives the following example query:

{code}
{
  "queryType": "groupBy",
  "dataSource": "druid_tpcds_ss_sold_time_subset",
  "granularity": "ALL",
  "dimensions": [
    "i_brand_id",
    {
      "type" : "extraction",
      "dimension" : "__time",
      "outputName" :  "year",
      "extractionFn" : {
        "type" : "timeFormat",
        "granularity" : "YEAR"
      }
    }
  ],
  "limitSpec": {
    "type": "default",
    "limit": 10,
    "columns": [
      {
        "dimension": "$f3",
        "direction": "ascending"
      }
    ]
  },
  "aggregations": [
    {
      "type": "longMax",
      "name": "$f2",
      "fieldName": "ss_quantity"
    },
    {
      "type": "doubleSum",
      "name": "$f3",
      "fieldName": "ss_wholesale_cost"
    }
  ],
  "intervals": [
    "1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"
  ]
}
{code}

and for {{DruidAdapterIT.testGroupByDaySortDescLimit}}, [~bslim] suggests:

{code}
{
  "queryType": "groupBy",
  "dataSource": "foodmart",
  "granularity": "all",
  "dimensions": [
    "brand_name",
    {
      "type": "extraction",
      "dimension": "__time",
      "outputName": "day",
      "extractionFn": {
        "type": "timeFormat",
        "granularity": "DAY"
      }
    }
  ],
  "aggregations": [
    {
      "type": "longSum",
      "name": "S",
      "fieldName": "unit_sales"
    }
  ],
  "limitSpec": {
    "type": "default",
    "limit": 30,
    "columns": [
      {
        "dimension": "S",
        "direction": "ascending"
      }
    ]
  }
}
{code}
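As a rough sketch of how such a query could be assembled, the following builds a "groupBy" query whose sort and limit are pushed into Druid via a limitSpec. The function {{group_by_query}} and all of its parameter names are assumptions made for illustration and do not correspond to any Calcite API. The {{descending}} flag is included because the test name mentions a descending sort, whereas the query suggested above specifies "ascending"; the default reproduces the query as written.

```python
import json


def group_by_query(data_source, dimensions, aggregations, sort_column,
                   limit, descending=False, intervals=None):
    """Assemble a Druid groupBy query with sort and limit in a limitSpec.

    Hypothetical helper for illustration only. `dimensions` and
    `aggregations` are lists in Druid's JSON form; `sort_column` names
    the dimension or aggregation to order by.
    """
    query = {
        "queryType": "groupBy",
        "dataSource": data_source,
        "granularity": "all",
        "dimensions": dimensions,
        "aggregations": aggregations,
        "limitSpec": {
            "type": "default",
            "limit": limit,
            "columns": [{
                "dimension": sort_column,
                "direction": "descending" if descending else "ascending",
            }],
        },
    }
    # The second example above has no "intervals" entry, so it is optional here.
    if intervals is not None:
        query["intervals"] = intervals
    return query


if __name__ == "__main__":
    day = {
        "type": "extraction",
        "dimension": "__time",
        "outputName": "day",
        "extractionFn": {"type": "timeFormat", "granularity": "DAY"},
    }
    total = {"type": "longSum", "name": "S", "fieldName": "unit_sales"}
    query = group_by_query("foodmart", ["brand_name", day], [total], "S", 30)
    print(json.dumps(query, indent=2))
```

Called as in the usage section, this yields the same JSON content as the foodmart query above; passing {{descending=True}} changes only the "direction" entry.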



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)