[ https://issues.apache.org/jira/browse/BEAM-7844?focusedWorklogId=288038&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-288038 ]
ASF GitHub Bot logged work on BEAM-7844:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Aug/19 16:23
            Start Date: 02/Aug/19 16:23
    Worklog Time Spent: 10m
      Work Description: akedin commented on pull request #9198: [BEAM-7844] Row rate window estimation
URL: https://github.com/apache/beam/pull/9198#discussion_r310202909

 ##########
 File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java
 ##########
 @@ -82,9 +82,44 @@ public BeamAggregationRel(
      this.windowFieldIndex = windowFieldIndex;
    }

 +  private NodeStats computeWindowingCostEffect(NodeStats inputStat) {
 +    if (windowFn == null) {
 +      return inputStat;
 +    }
 +    WindowFn w = windowFn;
 +    double multiplicationFactor = 1;
 +    if (w instanceof SlidingWindows) {
 +      multiplicationFactor =
 +          ((double) ((SlidingWindows) w).getSize().getStandardSeconds())
 +              / ((SlidingWindows) w).getPeriod().getStandardSeconds();
 +    }
 +
 +    return NodeStats.create(
 +        inputStat.getRowCount() * multiplicationFactor,
 +        inputStat.getRate() * multiplicationFactor,
 +        BeamIOSourceRel.CONSTANT_WINDOW_SIZE);
 +  }
 +
    @Override
    public NodeStats estimateNodeStats(RelMetadataQuery mq) {
 -    return NodeStats.create(mq.getRowCount(this));
 +
 +    NodeStats inputEstimate = BeamSqlRelUtils.getNodeStats(this.input, mq);
 +
 +    inputEstimate = computeWindowingCostEffect(inputEstimate);
 +
 +    NodeStats estimate;
 +    int groupCount = groupSet.cardinality() - (windowFn == null ? 0 : 1);
 +    if (groupCount == 0) {

 Review comment:
   nit: I think a ternary operator makes it easier to read (less noise, fewer assignments, less nesting), but up to you:

   ```java
   return (groupCount != 0)
       ? inputEstimate.multiply(1.0 - Math.pow(.5, groupCount))
       : NodeStats.create(
           Math.min(inputEstimate.getRowCount(), 1d),
           inputEstimate.getRate() / inputEstimate.getWindow(),
           1d);
   ```
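For reference, the row-count arithmetic discussed in this comment can be sketched in isolation. This is a minimal illustration, not the Beam implementation: the class `WindowCostSketch` and its methods `slidingFactor` and `estimateRows` are hypothetical names, and `java.time.Duration` stands in for the duration type used in the diff. It only mirrors the two heuristics visible above: a sliding window multiplies the input by size / period, and grouping on `groupCount` keys keeps a fraction 1 - 0.5^groupCount of the rows (or at most one row when there are no grouping keys).

```java
import java.time.Duration;

// Hypothetical sketch of the estimation arithmetic; not part of Beam.
final class WindowCostSketch {

  // With sliding windows, each element falls into size / period windows,
  // so the estimated row count (and rate) grows by that factor.
  public static double slidingFactor(Duration size, Duration period) {
    if (period.isZero() || period.isNegative()) {
      throw new IllegalArgumentException("period must be positive");
    }
    return ((double) size.getSeconds()) / period.getSeconds();
  }

  // Row-count part of the suggested ternary: a grouped aggregation keeps
  // a fraction 1 - 0.5^groupCount of the input rows; a global aggregation
  // (no grouping keys) produces at most one row.
  public static double estimateRows(double inputRows, int groupCount) {
    return (groupCount != 0)
        ? inputRows * (1.0 - Math.pow(.5, groupCount))
        : Math.min(inputRows, 1d);
  }

  public static void main(String[] args) {
    // Assumed example values: 10-minute windows sliding every 5 minutes.
    double factor = slidingFactor(Duration.ofMinutes(10), Duration.ofMinutes(5));
    double windowedRows = 1000.0 * factor;
    System.out.println("factor = " + factor);
    System.out.println("grouped on 2 keys = " + estimateRows(windowedRows, 2));
    System.out.println("global aggregate = " + estimateRows(windowedRows, 0));
  }
}
```

With these assumed inputs the factor is 2, so 1000 input rows become 2000 windowed rows; grouping on two keys then keeps 75% of them, and a global aggregate collapses to a single row.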

Issue Time Tracking
-------------------

    Worklog Id:     (was: 288038)
    Time Spent: 4h  (was: 3h 50m)

> Custom MetadataHandler for RowRateWindow
> ----------------------------------------
>
>                 Key: BEAM-7844
>                 URL: https://issues.apache.org/jira/browse/BEAM-7844
>             Project: Beam
>          Issue Type: New Feature
>          Components: dsl-sql
>            Reporter: Alireza Samadianzakaria
>            Assignee: Alireza Samadianzakaria
>            Priority: Major
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> Calcite has a getRowCount method and a handler for that. However, we need
> rate and window as well. This is a requirement for Beam-7777