kgyrtkirk commented on code in PR #16741: URL: https://github.com/apache/druid/pull/16741#discussion_r1689390267
########## processing/src/main/java/org/apache/druid/query/operator/window/WindowFrame.java: ########## @@ -21,159 +21,192 @@ import com.fasterxml.jackson.annotation.JsonCreator; import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonSubTypes; +import com.fasterxml.jackson.annotation.JsonTypeInfo; +import com.google.common.collect.ImmutableList; import com.google.common.collect.Lists; -import org.apache.druid.query.operator.ColumnWithDirection; +import org.apache.druid.annotations.SubclassesMustOverrideEqualsAndHashCode; + +import javax.annotation.Nullable; -import java.util.Collections; import java.util.List; import java.util.Objects; -import java.util.stream.Collectors; -public class WindowFrame +@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "type") +@JsonSubTypes(value = { + @JsonSubTypes.Type(name = "rows", value = WindowFrame.Rows.class), + @JsonSubTypes.Type(name = "groups", value = WindowFrame.Groups.class), Review Comment: I'll try to cover all aspects of your questions :) > why did RANGE change to groups for a `RANGE` if both endpoints are `UNBOUNDED` or `CURRENT ROW` there is no difference between `RANGE` and `GROUPS` (we support only these right now) > Is this because only group by queries are supported currently? it has no connection to that - `groups` is a frame evaluation mode; which groups rows with the same value together I always find a different doc about this...today I've found [this](https://www.jooq.org/doc/latest/manual/sql-building/column-expressions/window-functions/window-frame/) ; but its also in the standard. > I naively would expect the syntax present in the query the user wrote to match what appears here, but that doesn't seem to be the case with this change. Is this confusing? I don't feel like its confusing - as the resulting plan may be different from what the user supplied already: filters and join conditions may have changed; predicates could be pushed and window frame specs may change * I don't know if this would be optimized; but consider that the user gives: `SUM() OVER (PARTITION BY X ORDER BY Y,Z RANGE BETWEEN UNBOUNDED PRECEEDING AND UNBOUNDED PROCEEDING)` * that's a fully unbounded window...do we need to do the `ORDER`? if not...then it could be executed as `SUM() OVER (PARTITION BY X)` > How will this make a future refactor easier? We will be able to add `range` later along with the supporting algo enhancements > also, do we even support using groups in SQL syntax? Although the SQL standard has it - the Calcite layer doesn't accept `GROUPS` [today](https://github.com/apache/calcite/blob/1947312e4ae792aa6ba6b4c48a2d868712d41dc5/core/src/main/codegen/templates/Parser.jj#L2767-L2769) - only `RANGE` and `ROWS` are allowed. With this change we will naturally expose the support of `groups` in the native layer - and native api users may even go beyond and use all features of `groups`. When the Calcite support will arrive for `GROUPS` we will already have everything prepared to enable full support for it. > it seems like maybe we don't and there are no tests, is this also confusing? I'm confused. As we recognize edge cases of `RANGE` as `GROUPS` - the `sqlTest` plans contain `rows` and `groups`. For all `RANGE` frames we can't reliably execute correctly there is an exception explaining it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org