Apologies for the delay in replying This makes things clear and seems obvious now that you point it out. Thank you, Justin and Julian =)
Let me ask another question (if I may) that I am struggling to phrase easily. So with GraphQL, you might have a query like: - "Get houses" - "For each house get the user that lives in the house - "And for each user get their list of todos" The result has to come back such that it's a single object for each row ============== { houses: [{ address: "123 Main Street", users: [{ name: "Joe", todos: [{ description: "Take out trash" }] }] } >From a SQL perspective, the logical equivalent would be something like: ============== SELECT house.address, (somehow nest users + double-nest todos under user) FROM house JOIN user ON user.house_id = house.id todos ON todos.user_id = user.id WHERE house.id = 1 I'm not familiar enough with SQL to have figured out a way to make this kind of query using operators that are supported across most of the DB's Calcite has adapters for. Currently what I have done instead, on a tip from Gopalakrishna Holla from LinkedIn Coral team who has built GraphQL-on-Calcite, was to break up the query into individual parts and then do the join in-memory: SELECT ... FROM users; SELECT ... FROM todos WHERE todos.user_id IN (ResultSet from prev response); However, the way I am doing this seems like it's probably very inefficient. Because I do a series of nested loops to add the key to each object in the parent ResultSet row: https://github.com/GavinRay97/GraphQLCalcite/blob/648e0ac4f6810a3c360d13a03e6597c33406de4b/src/main/kotlin/TableDataFetcher.kt#L135-L153 Is there some better way of doing this? I would be eternally grateful for any advice. On Thu, Feb 10, 2022 at 3:27 PM Julian Hyde <jhyde.apa...@gmail.com> wrote: > Yes, if you want to do multiple layers of aggregation, use CTEs (WITH) or > nested sub-queries. For example, the following is I believe valid standard > SQL, and actually computes something useful: > > WITH q1 AS > (SELECT deptno, job, AVG(sal) AS avg_sal > FROM emp > GROUP BY deptno, job) > WITH q2 AS > (SELECT deptno, AVG(avg_sal) AS avg_avg_sal > FROM q1 > GROUP BY deptno) > SELECT AVG(avg_avg_sal) > FROM q2 > GROUP BY () > > (You can omit the “GROUP BY ()” line, but I think it makes things clearer.) > > Julian > > > > > On Feb 10, 2022, at 12:17 PM, Justin Swanhart <greenl...@gmail.com> > wrote: > > > > I wish you could unsend emails :) Answering my own question, no, because > > that would return three rows with the average :D > > > > On Thu, Feb 10, 2022 at 3:16 PM Justin Swanhart <greenl...@gmail.com> > wrote: > > > >> Just out of curiosity, is the second level aggregation using AVG in a > >> window context? It the frame is the whole table and it aggregates over > it? > >> > >> On Thu, Feb 10, 2022 at 3:12 PM Justin Swanhart <greenl...@gmail.com> > >> wrote: > >> > >>> That is really neat about Oracle. > >>> > >>> The alternative in general is to use a subquery: > >>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno; > >>> becomes > >>> select avg(the_avg) > >>> from (select avg(sal) from emp group b deptno) an_alias; > >>> > >>> or > >>> > >>> with the_cte as (select avg(sal) x from emp group by deptno) > >>> select avg(x) from the_cte; > >>> > >>> On Thu, Feb 10, 2022 at 3:03 PM Julian Hyde <jhyde.apa...@gmail.com> > >>> wrote: > >>> > >>>> Some databases, e.g. Oracle, allow TWO levels of nesting: > >>>> > >>>> SELECT avg(sal) FROM emp GROUP BY deptno; > >>>> > >>>> AVG(SAL) > >>>> ======== > >>>> 1,566.67 > >>>> 2,175.00 > >>>> 2,916.65 > >>>> > >>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno; > >>>> > >>>> AVG(SUM(SAL)) > >>>> ============= > >>>> 9,675 > >>>> > >>>> The first level aggregates by department (returning 3 records), and > the > >>>> second level computes the grand total (returning 1 record). But that > is an > >>>> exceptional case. > >>>> > >>>> Generally, any expression in the SELECT or HAVING clause of an > aggregate > >>>> query is either ‘before’ or ‘after’ aggregation. Consider > >>>> > >>>> SELECT t.x + 1 AS a, 2 + SUM(t.y + 3) AS b > >>>> FROM t > >>>> GROUP BY t.x > >>>> > >>>> The expressions “t.y” and “t.y + 3” occur before aggregation; “t.x”, > >>>> “t.x + 1”, “SUM(t.y + 3)” and “2 + SUM(t.y + 3)” occur after > aggregation. > >>>> SQL semantics rely heavily on this stratification. Allowing an extra > level > >>>> of aggregation would mess it all up. > >>>> > >>>> Julian > >>>> > >>>> > >>>> > >>>>> On Feb 10, 2022, at 9:45 AM, Justin Swanhart <greenl...@gmail.com> > >>>> wrote: > >>>>> > >>>>> This is a SQL limitation. > >>>>> > >>>>> mysql> select sum(1); > >>>>> +--------+ > >>>>> | sum(1) | > >>>>> +--------+ > >>>>> | 1 | > >>>>> +--------+ > >>>>> 1 row in set (0.00 sec) > >>>>> > >>>>> mysql> select sum(sum(1)); > >>>>> ERROR 1111 (HY000): Invalid use of group function > >>>>> > >>>>> On Thu, Feb 10, 2022 at 12:39 PM Gavin Ray <ray.gavi...@gmail.com> > >>>> wrote: > >>>>> > >>>>>> Went to test this query out and found that it can't be performed: > >>>>>> > >>>>>> SELECT > >>>>>> JSON_OBJECT( > >>>>>> KEY 'users' > >>>>>> VALUE JSON_ARRAYAGG( > >>>>>> JSON_OBJECT( > >>>>>> KEY 'name' VALUE "users"."name", > >>>>>> KEY 'todos' VALUE JSON_ARRAYAGG( > >>>>>> JSON_OBJECT( > >>>>>> KEY 'description' VALUE "todos"."description" > >>>>>> ) > >>>>>> ) > >>>>>> ) > >>>>>> ) > >>>>>> ) > >>>>>> FROM > >>>>>> "users" > >>>>>> LEFT OUTER JOIN > >>>>>> "todos" ON "users"."id" = "todos"."user_id"; > >>>>>> > >>>>>> Checking the source, seems this is a blanket policy, not a > >>>>>> datasource-specific thing. > >>>>>> From a functional perspective, it doesn't feel like it's much > >>>> different > >>>>>> from JOINs > >>>>>> But I don't understand relational theory or DB functionality in the > >>>> least, > >>>>>> so I'm not fit to judge. > >>>>>> > >>>>>> Just curious why Calcite doesn't allow this > >>>>>> > >>>> > >>>> > >