walterddr commented on pull request #7678:
URL: https://github.com/apache/pinot/pull/7678#issuecomment-959411835


   Hmm, I think we are drifting away from standard SQL semantics here.
   
   1. First of all, DISTINCT is not a function. It is not the same as a transform such as ADD in
   ```
   SELECT C1 AS ALIAS_C1, C2 AS ALIAS_C2, ADD(alias_c1, alias_c2)
   ```
   because DISTINCT doesn't produce a projection/transformation column for
   each input row.
   
   2. The following SQL
   ```
   select CAST(runs AS string) as a, CAST(num AS int) as b, DISTINCT(a, b) from baseballStats
   ```
   is only valid because DISTINCT is modeled as a function. Even given that
   precondition, the result cannot be `a,b,a,b`: you cannot have the first
   `a,b` modeled as non-distinct columns and the second `a,b` as distinct
   columns. That simply isn't valid relational algebra. Imagine you have
   `SELECT a, DISTINCT(b) FROM table`: which `a` value should I return if
   there are 2 rows with the same `b` value?
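
To make the two points above concrete, here is a minimal sketch in plain Python. The rows are hypothetical (they are not taken from `baseballStats`), and the code only models the semantics being discussed; it is not how Pinot evaluates queries.

```python
# Hypothetical input rows as (a, b) pairs; illustration only.
rows = [("x", 1), ("y", 1), ("z", 2)]

# A per-row transform (like ADD or CAST) yields exactly one output per input row.
projected = [(a, b, a + str(b)) for a, b in rows]

# DISTINCT over the whole tuple (a, b) is well defined: it dedups complete rows,
# so the output row count can differ from the input row count.
distinct_rows = sorted(set(rows))

# "SELECT a, DISTINCT(b)": collect the candidate a values for each distinct b.
candidates = {}
for a, b in rows:
    candidates.setdefault(b, []).append(a)

# Any b with more than one candidate a has no single correct a to return.
ambiguous = {b: vals for b, vals in candidates.items() if len(vals) > 1}
```

With these rows, `b = 1` maps to both `"x"` and `"y"`, so a result that pairs a non-distinct `a` with a distinct `b` has to pick one of them arbitrarily.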
   
   The fundamental problem here is that DISTINCT should not be modeled as a
   function.
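
For comparison, standard SQL treats DISTINCT as a modifier on the whole SELECT list, not as a function call. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in engine, so it says nothing about Pinot's parser or planner. The schema and rows are assumptions made up for the example; only the table and column names are borrowed from the query above, and SQLite's `TEXT`/`INTEGER` stand in for `string`/`int`.

```python
import sqlite3

# In-memory SQLite database; the schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE baseballStats (runs INTEGER, num TEXT)")
conn.executemany(
    "INSERT INTO baseballStats VALUES (?, ?)",
    [(3, "7"), (3, "7"), (5, "7")],
)

# DISTINCT applies to the complete projected row (a, b), after the casts.
result = conn.execute(
    "SELECT DISTINCT CAST(runs AS TEXT) AS a, CAST(num AS INTEGER) AS b "
    "FROM baseballStats ORDER BY a, b"
).fetchall()
conn.close()
```

Here the two identical input rows collapse into one output row, and every output column belongs to the same deduplicated tuple, so the `a,b,a,b` question never arises.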




