walterddr commented on pull request #7678: URL: https://github.com/apache/pinot/pull/7678#issuecomment-959411835
Hmm, I think we are drifting away from SQL semantics here.

1. First of all, `DISTINCT` is not a function. It is not the same as

```
SELECT C1 AS ALIAS_C1, C2 AS ALIAS_C2, ADD(alias_c1, alias_c2)
```

because it doesn't produce a projection/transformation column for each input row.

2. The following SQL

```
select CAST(runs AS string) as a, CAST(num AS int) as b, DISTINCT(a, b) from baseballStats
```

is only valid because `DISTINCT` is modeled as a function. Given that precondition, the result cannot be `a,b,a,b`, because you cannot have the first `a,b` modeled as non-distinct columns and the second `a,b` as distinct columns. It simply isn't valid relational algebra. Imagine you have `SELECT a, DISTINCT(b) FROM table`: which `a` value should I return if there are 2 rows with the same `b` value?

The fundamental problem here is that `DISTINCT` should not be modeled as a function.
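To make the ambiguity concrete, here is a minimal sketch. It runs standard SQL through SQLite via Python's `sqlite3` module, not through Pinot, and the table `t` and its rows are made up purely for illustration. It shows that `SELECT DISTINCT a, b` deduplicates whole `(a, b)` rows, and that a hypothetical `SELECT a, DISTINCT(b)` would have more than one candidate `a` for the same `b`.

```python
import sqlite3

# Hypothetical table and rows, chosen only to illustrate the point.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("x", 1), ("y", 1), ("x", 1), ("z", 2)],
)

# In standard SQL, DISTINCT qualifies the whole select list:
# duplicates are removed per (a, b) row, not per column.
distinct_rows = conn.execute(
    "SELECT DISTINCT a, b FROM t ORDER BY a, b"
).fetchall()
print(distinct_rows)

# Collect, for each b, every a that appears with it. If DISTINCT applied
# to b alone, one of these would have to be picked arbitrarily.
candidates = {}
for a, b in conn.execute("SELECT DISTINCT a, b FROM t ORDER BY b, a"):
    candidates.setdefault(b, []).append(a)
print(candidates)
```

With this data, `b = 1` appears alongside both `x` and `y`, so there is no single well-defined `a` to return for it unless `DISTINCT` covers the full row.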
