[ https://issues.apache.org/jira/browse/SPARK-25841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17109313#comment-17109313 ]
Shyam commented on SPARK-25841:
-------------------------------

[~rxin] Is this fixed in the latest version, i.e. 2.4.3? Does this issue still persist?

> Redesign window function rangeBetween API
> -----------------------------------------
>
> Key: SPARK-25841
> URL: https://issues.apache.org/jira/browse/SPARK-25841
> Project: Spark
> Issue Type: Umbrella
> Components: SQL
> Affects Versions: 2.3.2, 2.4.0
> Reporter: Reynold Xin
> Assignee: Reynold Xin
> Priority: Major
>
> As I was reviewing the Spark API changes for 2.4, I found that through
> organic, ad-hoc evolution the current API for window functions in Scala is
> pretty bad.
>
> To illustrate the problem, we have two rangeBetween functions in the Window
> class:
>
> {code:java}
> class Window {
>   def unboundedPreceding: Long
>   ...
>   def rangeBetween(start: Long, end: Long): WindowSpec
>   def rangeBetween(start: Column, end: Column): WindowSpec
> }
> {code}
>
> The Column version of rangeBetween was added in Spark 2.3 because the
> previous version (Long) could only support integral values and not time
> intervals. Then, in order to support specifying unboundedPreceding in the
> rangeBetween(Column, Column) API, we added an unboundedPreceding that
> returns a Column in functions.scala.
>
> There are a few issues I have with the API:
>
> 1. To the end user, this can be just super confusing. Why are there two
> unboundedPreceding functions, in different classes, that are named the same
> but return different types?
>
> 2. Using Column as the parameter signature implies this can be an actual
> Column, but in practice rangeBetween can only accept literal values.
>
> 3.
> We added the new APIs to support intervals, but they don't actually work,
> because in the implementation we try to validate that the start is less than
> the end. Calendar interval types are not comparable, so as a result we throw
> a type mismatch exception at runtime:
> scala.MatchError: CalendarIntervalType (of class org.apache.spark.sql.types.CalendarIntervalType$)
>
> 4. In order to make intervals work, users need to create an interval using
> CalendarInterval, which is an internal class that has no documentation and no
> stable API.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
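The overload confusion described in item 1 of the quoted issue can be sketched in plain Scala without a Spark dependency. ToyWindow, ToyFunctions, and ToyColumn below are hypothetical stand-ins (not Spark classes) that mirror the shape of the criticized API: two same-named boundary markers, living in different objects, returning different types, each feeding a different rangeBetween overload.

```scala
// Hypothetical stand-in for org.apache.spark.sql.Column.
final case class ToyColumn(expr: String)

// Mirrors the shape of org.apache.spark.sql.expressions.Window:
// a Long-typed unboundedPreceding plus two same-named rangeBetween overloads.
object ToyWindow {
  val unboundedPreceding: Long = Long.MinValue

  def rangeBetween(start: Long, end: Long): String =
    s"RANGE BETWEEN $start AND $end"

  def rangeBetween(start: ToyColumn, end: ToyColumn): String =
    s"RANGE BETWEEN ${start.expr} AND ${end.expr}"
}

// Mirrors the shape of org.apache.spark.sql.functions:
// a second unboundedPreceding with the same name but a Column-like type.
object ToyFunctions {
  def unboundedPreceding: ToyColumn = ToyColumn("unboundedPreceding")
}

object Demo extends App {
  // Same conceptual boundary, two unrelated spellings the user must keep straight:
  println(ToyWindow.rangeBetween(ToyWindow.unboundedPreceding, 0L))
  println(ToyWindow.rangeBetween(ToyFunctions.unboundedPreceding, ToyColumn("currentRow")))
}
```

Which overload a call resolves to depends entirely on which unboundedPreceding the user happened to import, which is exactly the ambiguity the issue argues the redesign should eliminate.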