Julian Hyde created CALCITE-3131:
------------------------------------

             Summary: In LatticeSuggester, record whether columns are used as 
"dimensions" or "measures"
                 Key: CALCITE-3131
                 URL: https://issues.apache.org/jira/browse/CALCITE-3131
             Project: Calcite
          Issue Type: Bug
            Reporter: Julian Hyde
             Fix For: 1.21.0


In {{LatticeSuggester}}, record whether columns are used as "dimensions" or 
"measures" in the query set. And add a method 
{{Lattice.isAlwaysMeasure(Column)}}, populated from these use counts.

To understand the difference between dimensions and measures, consider the SQL 
queries
{code}
select deptno,
  sum(salary + commission) as remuneration,
  sum(salary) as salary
from Emp
group by first_name || ' ' || last_name, deptno;

select min(first_name || ' ' || last_name)
from Emp
where deptno = 10;
{code}

There are two expressions:
* {{salary + commission}} is only used as an argument two an aggregate 
function, i.e. a measure.
* {{first_name || ' ' || last_name}} is used as a dimension in the first query 
(in {{GROUP BY}}, but in {{WHERE}} or {{SELECT}} or {{ORDER BY}} would also 
count), and as a measure in the second.

When applied to a collection of queries, {{LatticeSuggester}} should record 
each individual use of a column or derived column (expression), and whether it 
is a usage as a measure. Thus {{salary + commission}} has [true], and 
{{first_name || ' ' || last_name}} has [false, true].

In future we might store further information about uses. For example, if a 
column is used with aliases "expr$1", "profit", "profit" then we could go with 
the majority decision and call it "profit". Currently we go with the alias the 
first time it is used.

When the {{LatticeSuggester}} builds a {{Lattice}} it passes on the use counts, 
and these power a new method {{Lattice.isAlwaysMeasure(Column)}}. A {{Lattice}} 
that is made directly, not induced from a set of queries, will have an empty 
use map. (It is a minor burden for these lattices, but it is outweighed by the 
convenience when an existing lattice needs to be merged with a new set of 
queries in an incremental {{LatticeSuggester}} run.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to