David Anderson created FLINK-22894:
--------------------------------------

             Summary: Window Top-N should allow n=1
                 Key: FLINK-22894
                 URL: https://issues.apache.org/jira/browse/FLINK-22894
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / Runtime
    Affects Versions: 1.13.1
            Reporter: David Anderson


I tried to reimplement the Hourly Tips exercise from the DataStream training 
using Flink SQL. The objective of this exercise is to find the one taxi driver 
who earned the most in tips during each hour, and report that driver's driverId 
and the sum of their tips. 

This can be expressed as a window top-n query, where n=1, as in

{{SELECT *}}
{{FROM (}}
{{  SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY sumOfTips DESC) AS rownum}}
{{  FROM (}}
{{    SELECT driverId, window_start, window_end, SUM(tip) AS sumOfTips}}
{{    FROM TABLE(}}
{{      TUMBLE(TABLE fares, DESCRIPTOR(startTime), INTERVAL '1' HOUR))}}
{{    GROUP BY driverId, window_start, window_end}}
{{  )}}
{{) WHERE rownum = 1;}}

 

This fails because the {{WindowRankOperatorBuilder}} insists on
{{rankEnd > 1}}. So, in other words, while it is possible to report the top 2
drivers, or the driver in 2nd place, it's not possible to report only the top
driver.

This appears to be an off-by-one error in the range checking.
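
To make the suspected off-by-one concrete, here is a minimal sketch of the two range checks. It is not the actual {{WindowRankOperatorBuilder}} source: the class and method names are hypothetical, and the strict condition is taken only from the behavior described above.

```java
// Hypothetical illustration only; not copied from the Flink code base.
final class RankRangeCheckSketch {

    // Strict check as described in this report: rejects rankEnd == 1,
    // so a top-1 query (rownum = 1) cannot pass validation.
    static boolean acceptsStrict(long rankEnd) {
        return rankEnd > 1;
    }

    // Inclusive check: also admits rankEnd == 1, while still rejecting
    // non-positive values.
    static boolean acceptsInclusive(long rankEnd) {
        return rankEnd >= 1;
    }

    public static void main(String[] args) {
        // Compare both checks around the boundary value.
        for (long rankEnd = 0; rankEnd <= 2; rankEnd++) {
            System.out.println("rankEnd=" + rankEnd
                    + " strict=" + acceptsStrict(rankEnd)
                    + " inclusive=" + acceptsInclusive(rankEnd));
        }
    }
}
```

Under this assumption, relaxing the comparison from {{>}} to {{>=}} would be enough to allow n=1 without admitting invalid values such as 0.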


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
