[ https://issues.apache.org/jira/browse/SPARK-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Herman van Hovell closed SPARK-8816.
------------------------------------
    Resolution: Won't Fix

Closing as won't fix since this has not come up as a performance problem.

> Improve Performance of Unbounded Following Window Frame Processing
> ------------------------------------------------------------------
>
>                 Key: SPARK-8816
>                 URL: https://issues.apache.org/jira/browse/SPARK-8816
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 1.4.0
>            Reporter: Herman van Hovell
>            Priority: Minor
>
> The performance of Unbounded Following frames in both the current and the
> proposed (SPARK-8638) implementation of Window Functions is quite bad:
> O(N*(N-1)/2).
> A solution to this is to process such frames in reverse. This would
> effectively reduce the complexity to O(N). The problem with this approach
> is that it assumes that AggregateExpressions are commutative. Most are, but
> some are actually order-based: FIRST/LAST. There are two solutions for this:
> * Only allow commutative aggregates to be processed in reverse order. In
> practice this would mean that a whitelist containing all allowed aggregates
> is used, e.g. Sum, Average, Min, Max, ...
> * Add functionality to WindowFunction, or even better AggregateExpression,
> which would allow us to get the reverse operator from the expression, and
> use this reverse for processing, for example:
> {noformat}
> case class Max(child: Expression) extends AggregateExpression with Ordered {
>   ...
>
>   def reverse: Min = Min(child)
> }
> {noformat}
> The impact and extensibility of the first option are lower than those of the
> second.
> We might also want to assess how often such frames are used. It might not be
> a problem at all.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
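The reverse-processing idea in the ticket can be illustrated with a small standalone sketch (plain Python, not Spark code; the function names and the generic `combine`/`zero` parameters are hypothetical and only stand in for an aggregate's merge operation and initial value). For an UNBOUNDED FOLLOWING frame, row i aggregates rows i..N-1; recomputing that suffix for every row costs O(N*(N-1)/2), while a single reverse pass carrying a running aggregate costs O(N) — but, as the ticket notes, the reverse pass is only valid for commutative aggregates such as SUM, MIN, or MAX:

```python
def unbounded_following_naive(values, combine, zero):
    """O(N^2): re-aggregate the whole suffix frame for every row."""
    out = []
    for i in range(len(values)):
        acc = zero
        for v in values[i:]:
            acc = combine(acc, v)
        out.append(acc)
    return out

def unbounded_following_reversed(values, combine, zero):
    """O(N): walk the partition in reverse, carrying one running aggregate.
    Only correct for commutative aggregates (SUM, MIN, MAX, ...), which is
    exactly the restriction the ticket's whitelist option encodes."""
    acc = zero
    out = []
    for v in reversed(values):
        acc = combine(acc, v)
        out.append(acc)
    out.reverse()  # results were produced last-row-first
    return out

rows = [3, 1, 4, 1, 5]
# Both strategies agree for a commutative aggregate like SUM:
print(unbounded_following_naive(rows, lambda a, b: a + b, 0))     # [14, 11, 10, 6, 5]
print(unbounded_following_reversed(rows, lambda a, b: a + b, 0))  # [14, 11, 10, 6, 5]
```

An order-based aggregate such as FIRST would break here: the reverse pass sees the frame's rows in the opposite order, which is why the ticket's second option asks each expression for a dedicated reverse operator instead.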