[ https://issues.apache.org/jira/browse/DRILL-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984041#comment-13984041 ]

ASF GitHub Bot commented on DRILL-556:
--------------------------------------

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/incubator-drill/pull/56#discussion_r12080233
  
    --- Diff: exec/java-exec/src/main/codegen/templates/AggrTypeFunctions3.java ---
    @@ -0,0 +1,128 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +<@pp.dropOutputFile />
    +
    +
    +
    +<#list aggrtypes2.aggrtypes as aggrtype>
    +<#if aggrtype.className != "Avg">
    +<@pp.changeOutputFile name="/org/apache/drill/exec/expr/fn/impl/gaggr/${aggrtype.className}Functions.java" />
    +
    +<#include "/@includes/license.ftl" />
    +
    +<#-- A utility class that is used to generate java code for aggr functions such as stddev, variance -->
    +
    +/*
    + * This class is automatically generated from AggrTypeFunctions2.tdd using FreeMarker.
    + */
    +
    +package org.apache.drill.exec.expr.fn.impl.gaggr;
    +
    +import org.apache.drill.exec.expr.DrillAggFunc;
    +import org.apache.drill.exec.expr.annotations.FunctionTemplate;
    +import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
    +import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
    +import org.apache.drill.exec.expr.annotations.Output;
    +import org.apache.drill.exec.expr.annotations.Param;
    +import org.apache.drill.exec.expr.annotations.Workspace;
    +import org.apache.drill.exec.expr.holders.*;
    +import org.apache.drill.exec.record.RecordBatch;
    +
    +@SuppressWarnings("unused")
    +
    +public class ${aggrtype.className}Functions {
    +   static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(${aggrtype.className}Functions.class);
    +
    +<#list aggrtype.types as type>
    +
    +<#if aggrtype.aliasName == "">
    +@FunctionTemplate(name = "${aggrtype.funcName}", scope = FunctionTemplate.FunctionScope.POINT_AGGREGATE)
    +<#else>
    +@FunctionTemplate(names = {"${aggrtype.funcName}", "${aggrtype.aliasName}"}, scope = FunctionTemplate.FunctionScope.POINT_AGGREGATE)
    +</#if>
    +
    +public static class ${type.inputType}${aggrtype.className} implements DrillAggFunc{
    +
    +  @Param ${type.inputType}Holder in;
    +  @Workspace ${type.movingAverageType}Holder avg;
    +  @Workspace ${type.movingDeviationType}Holder dev;
    +  @Workspace ${type.countRunningType}Holder count;
    +  @Output ${type.outputType}Holder out;
    +
    +  public void setup(RecordBatch b) {
    +   avg = new ${type.movingAverageType}Holder();
    +    dev = new ${type.movingDeviationType}Holder();
    +    count = new ${type.countRunningType}Holder();
    +
    +    // Initialize the workspace variables
    +    avg.value = 0;
    +    dev.value = 0;
    +    count.value = 1;
    +  }
    +
    +  @Override
    +  public void add() {
    +   <#if type.inputType?starts_with("Nullable")>
    +     sout: {
    +     if (in.isSet == 0) {
    +      // processing nullable input and the value is null, so don't do anything...
    +      break sout;
    +     }
    +   </#if>
    +
    +    // Welford's approach to compute standard deviation
    --- End diff --
    
    Welford's method does the computation online (streaming) and it looks
    simple, so I am wondering: is there a catch? It computes the average each
    time a row is processed, as opposed to doing it once at the end, so we
    would have to see how it performs.
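
    For reference, a minimal sketch of Welford's update in plain Java. This is
    not the code from the patch (the diff above is cut off before the update
    itself); the class name WelfordSketch and its methods are hypothetical and
    use no Drill holders. It shows where the per-row cost comes from: each row
    adds one subtraction, one division and one multiply, in exchange for never
    forming a large sum of squares.

```java
// Hypothetical, standalone illustration of Welford's online algorithm.
// Not taken from the DRILL-556 patch; names are assumptions for this sketch.
public class WelfordSketch {
    private long count = 0;     // number of non-null values seen so far
    private double mean = 0.0;  // running mean of the values seen so far
    private double m2 = 0.0;    // running sum of squared deviations from the mean

    // Called once per row; this is the per-row work the comment asks about.
    public void add(double x) {
        count++;
        double delta = x - mean;   // deviation from the old mean
        mean += delta / count;     // one division per row
        m2 += delta * (x - mean);  // uses both the old and the new mean
    }

    // Population variance: divide by n.
    public double variancePop() {
        return count > 0 ? m2 / count : Double.NaN;
    }

    // Sample variance: divide by n - 1.
    public double varianceSamp() {
        return count > 1 ? m2 / (count - 1) : Double.NaN;
    }

    public double stddevPop() {
        return Math.sqrt(variancePop());
    }

    public double stddevSamp() {
        return Math.sqrt(varianceSamp());
    }
}
```

    The alternative of accumulating sum(x) and sum(x*x) and dividing once at
    the end does less work per row, but it can lose precision when the values
    are large relative to their spread, so the choice is a speed versus
    numerical stability trade-off that would need to be measured.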


> Implement aggregate functions to compute standard deviation, variance
> ---------------------------------------------------------------------
>
>                 Key: DRILL-556
>                 URL: https://issues.apache.org/jira/browse/DRILL-556
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Mehant Baid
>            Assignee: Mehant Baid
>         Attachments: DRILL-556.patch
>
>
> Following are the aggregate functions to be added as part of this JIRA
> stddev()
> stddev_samp()
> stddev_pop()
> variance()
> var_samp()
> var_pop()



--
This message was sent by Atlassian JIRA
(v6.2#6252)
