Julian Hyde created CALCITE-1591:
------------------------------------

             Summary: Druid adapter: Use "groupBy" query with extractionFn for time dimension
                 Key: CALCITE-1591
                 URL: https://issues.apache.org/jira/browse/CALCITE-1591
             Project: Calcite
          Issue Type: Bug
            Reporter: Julian Hyde
            Assignee: Julian Hyde

For queries that aggregate on the time dimension, or on a function of it such as {{FLOOR(__time TO DAY)}}, as of the fix for CALCITE-1579 we generate a "groupBy" query that neither sorts nor applies a limit. It would be better (in the sense that Druid does more of the work and Calcite does less) if we used an extractionFn to create a dimension that we can sort on.

In CALCITE-1578, [~nishantbangarwa] gives the following example query:

{code}
{
  "queryType": "groupBy",
  "dataSource": "druid_tpcds_ss_sold_time_subset",
  "granularity": "ALL",
  "dimensions": [
    "i_brand_id",
    {
      "type" : "extraction",
      "dimension" : "__time",
      "outputName" : "year",
      "extractionFn" : {
        "type" : "timeFormat",
        "granularity" : "YEAR"
      }
    }
  ],
  "limitSpec": {
    "type": "default",
    "limit": 10,
    "columns": [
      {
        "dimension": "$f3",
        "direction": "ascending"
      }
    ]
  },
  "aggregations": [
    {
      "type": "longMax",
      "name": "$f2",
      "fieldName": "ss_quantity"
    },
    {
      "type": "doubleSum",
      "name": "$f3",
      "fieldName": "ss_wholesale_cost"
    }
  ],
  "intervals": [
    "1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"
  ]
}
{code}

and for {{DruidAdapterIT.testGroupByDaySortDescLimit}}, [~bslim] suggests

{code}
{
  "queryType": "groupBy",
  "dataSource": "foodmart",
  "granularity": "all",
  "dimensions": [
    "brand_name",
    {
      "type": "extraction",
      "dimension": "__time",
      "outputName": "day",
      "extractionFn": {
        "type": "timeFormat",
        "granularity": "DAY"
      }
    }
  ],
  "aggregations": [
    {
      "type": "longSum",
      "name": "S",
      "fieldName": "unit_sales"
    }
  ],
  "limitSpec": {
    "type": "default",
    "limit": 30,
    "columns": [
      {
        "dimension": "S",
        "direction": "ascending"
      }
    ]
  }
}
{code}
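
To make the shape shared by the two queries above easier to see, here is a minimal Python sketch that assembles the same kind of "groupBy" JSON. It is only an illustration, not the adapter's code: the function name {{group_by_with_time_extraction}} and its parameters are hypothetical, and the wide-open interval is simply borrowed from the first example. The point it shows is that the time dimension becomes an ordinary extraction dimension, so the query granularity can stay "all" and the limitSpec can sort and limit across all rows.

```python
import json


def group_by_with_time_extraction(data_source, dimensions, time_output_name,
                                  time_granularity, aggregations,
                                  sort_column, direction, limit, intervals):
    """Build a Druid "groupBy" query as a dict (hypothetical helper).

    The __time column is exposed as an extraction dimension bucketed by
    time_granularity, mirroring the example queries in this issue.
    """
    time_dimension = {
        "type": "extraction",
        "dimension": "__time",
        "outputName": time_output_name,
        "extractionFn": {
            "type": "timeFormat",
            "granularity": time_granularity,
        },
    }
    return {
        "queryType": "groupBy",
        "dataSource": data_source,
        # Bucketing is done by the extraction dimension, so the query
        # granularity itself stays "all".
        "granularity": "all",
        "dimensions": list(dimensions) + [time_dimension],
        "limitSpec": {
            "type": "default",
            "limit": limit,
            "columns": [{"dimension": sort_column, "direction": direction}],
        },
        "aggregations": list(aggregations),
        "intervals": list(intervals),
    }


if __name__ == "__main__":
    # Values taken from the second example above; the interval is an
    # assumption copied from the first example.
    query = group_by_with_time_extraction(
        data_source="foodmart",
        dimensions=["brand_name"],
        time_output_name="day",
        time_granularity="DAY",
        aggregations=[
            {"type": "longSum", "name": "S", "fieldName": "unit_sales"},
        ],
        sort_column="S",
        direction="ascending",
        limit=30,
        intervals=["1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"],
    )
    print(json.dumps(query, indent=2))
```

Because the sort column can be any output name, whether an aggregation such as {{S}} or the extracted time dimension itself, the same structure would cover ordering by day as well as ordering by a measure.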