[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994788#comment-15994788
 ] 

ASF GitHub Bot commented on FLINK-4604:
---

Github user ex00 commented on the issue:

https://github.com/apache/flink/pull/3260
  
Thanks @twalthr! 


> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994784#comment-15994784
 ] 

ASF GitHub Bot commented on FLINK-4604:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3260


> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993043#comment-15993043
 ] 

ASF GitHub Bot commented on FLINK-4604:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/3260
  
Thanks for the update @ex00. I will go through your changes and merge this 
tomorrow.


> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2017-04-21 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978405#comment-15978405
 ] 

Anton Mushin commented on FLINK-4604:
-

Hi [~twalthr],
Could you look my PR for this issue, please?

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2017-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969972#comment-15969972
 ] 

ASF GitHub Bot commented on FLINK-4604:
---

Github user ex00 commented on the issue:

https://github.com/apache/flink/pull/3260
  
Hello, I updated PR for calcite 1.12.


> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2017-03-26 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942740#comment-15942740
 ] 

Anton Mushin commented on FLINK-4604:
-

Hi everyone,
Calcite 1.12 has been release: 
http://calcite.apache.org/news/2017/03/24/release-1.12.0/
I will try update my PR with considering what CALCITE-1621 included in Calcite 
1.12

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2017-02-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851394#comment-15851394
 ] 

ASF GitHub Bot commented on FLINK-4604:
---

GitHub user ex00 opened a pull request:

https://github.com/apache/flink/pull/3260

[FLINK-4604] Add support for standard deviation/variance

add rule for reduce standard deviation/variance functions

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [x] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ex00/flink FLINK-4604

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3260.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3260


commit 38d99c4c6ccc8e8b1dbd2f83d9cef4eae3494f00
Author: Anton Mushin 
Date:   2017-02-03T10:06:49Z

[FLINK-4604] Add support for standard deviation/variance

add rule for reduce standard deviation/variance functions




> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2017-01-11 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15817633#comment-15817633
 ] 

Anton Mushin commented on FLINK-4604:
-

Hi [~twalthr].
Do you suggest use *Decorrelator class for Flink rules?

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2017-01-10 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15815229#comment-15815229
 ] 

Timo Walther commented on FLINK-4604:
-

Any news on this [~anmu]? You could solve this issue temporarily by doing it 
similar to FLINK-5144.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-11-22 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15686845#comment-15686845
 ] 

Anton Mushin commented on FLINK-4604:
-

what if do try return the type of source column instead null?

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-11-15 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15667697#comment-15667697
 ] 

Timo Walther commented on FLINK-4604:
-

I looked into the problem. If you take a look at 
{{AggregateReduceFunctionsRule}} (e.g. line 354), you see that the null literal 
is created with no type/the {{NULL}} type. We currently do not support this 
type. Either we replace this and similar lines with 
{{rexBuilder.makeNullLiteral(sumZeroRef.getType().getSqlTypeName())}} to give 
it a type or we create a new type but I don't know how we want to represent it 
so far.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-11-15 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666872#comment-15666872
 ] 

Timo Walther commented on FLINK-4604:
-

I think the missing type for {{NULL}} is a Calcite issue. Usually, a {{NULL}} 
in a {{CASE(=($f2, 0), null, EXPR$1)}} should have the same type as {{EXPR$1}}. 
I will have a look at it today.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-31 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15621663#comment-15621663
 ] 

Anton Mushin commented on FLINK-4604:
-

bq. We can later remove these commits once the original PR got merged.
I'm already merge this PR as patch
bq. Regarding the Type NULL issue, are you still working on the branch you 
posted previously in this thread: BRANCH? 
Yes, I working in this 
[branch|https://github.com/ex00/flink/compare/master...ex00:FLINK-4604].
bq.I assume you get the exception when running one of tests, right?
Yes, I have 2 tests: 
{{org.apache.flink.api.scala.batch.sql.AggregationsITCase#testStddevPopAggregateWithOtherAggreagteSUM0}}
 and 
{{org.apache.flink.api.scala.batch.sql.AggregationsITCase#testStddevPopAggregateWithOtherAggreagteSUM}}.
 
{{testStddevPopAggregateWithOtherAggreagteSUM0}} is passing
{{testStddevPopAggregateWithOtherAggreagteSUM}} is failing

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-28 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615730#comment-15615730
 ] 

Fabian Hueske commented on FLINK-4604:
--

Hi [~anmu], thanks for digging into this.

Regarding the sqrt/power function, I guess you are aware of FLINK-4743 and PR 
[#2686|https://github.com/apache/flink/pull/2686].
So this issue should be fixed rather soon. In case this is blocking you, I'd 
rebase your code on top of the PR branch. We can later remove these commits 
once the original PR got merged.

Regarding the {{Type NULL}} issue, are you still working on the branch you 
posted previously in this thread: 
[BRANCH|https://github.com/ex00/flink/compare/master...ex00:FLINK-4604]? I 
assume you get the exception when running one of tests, right?
I will have a look and try to figure out what is going wrong there.


> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-28 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615302#comment-15615302
 ] 

Anton Mushin commented on FLINK-4604:
-

For correct calculate standard deviation/variance function need
 impliment support sqrt/power function for all data types.  

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-28 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615280#comment-15615280
 ] 

Anton Mushin commented on FLINK-4604:
-

I found rootcause

problems occur only in query with SUM, if in query replace SUM to $SUM0 (or any 
aggregate function) then query works.
At time execute {{AggregateReduceFunctionsRule#onMatch}} SUM must replace to 
$SUM0 
[here|https://github.com/apache/calcite/blob/0938c7b6d767e3242874d87a30d9112512d9243a/core/src/main/java/org/apache/calcite/rel/rules/AggregateReduceFunctionsRule.java#L338]
But it is not happening because the types in FLINK is nullable. ( in my case)

I do not know how to solve this problem yet, please let me know if you have 
idea how resolve this problem.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-25 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15607653#comment-15607653
 ] 

Anton Mushin commented on FLINK-4604:
-

Hello [~fhueske], thanks for your answer!
I added support {{SqlKind.SUM0}} and implement custom {{Sum0Aggregate}}, but 
I'm still get
{noformat}
DataSetCalc(select=[CAST(POWER(/(-($f0, /(*($f1, $f1), $f2)), $f2), 0.5)) AS 
EXPR$0, CASE(=($f2, 0), null, EXPR$1) AS EXPR$1])
  DataSetAggregate(select=[SUM($f6) AS $f0, SUM(_1) AS $f1, COUNT(_1) AS $f2, 
$SUM0(_1) AS EXPR$1])
DataSetCalc(select=[_1, _2, _3, _4, _5, _6, *(_1, _1) AS $f6])
  DataSetScan(table=[[_DataSetTable_0]])

org.apache.flink.api.table.TableException: Type NULL is not supported. Null 
values must have a supported type.

at 
org.apache.flink.api.table.FlinkTypeFactory$.toTypeInfo(FlinkTypeFactory.scala:138)
at 
org.apache.flink.api.table.codegen.CodeGenerator.visitLiteral(CodeGenerator.scala:553)
{noformat}
and I don't understand where I getting {{CASE(=($f2, 0), null, EXPR$1) AS 
EXPR$1])}}

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-21 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15595581#comment-15595581
 ] 

Fabian Hueske commented on FLINK-4604:
--

Hi [~anmu],

I think we can implement and add support for {{SqlKind.SUM0}} as part of this 
issue.
For that you have to implement a custom {{Sum0Aggregate}} which extends 
{{SumAggregate}} and overrides the {{prepare()}} method such that it does not 
initialize the aggregate with {{null}} but with {{0}} if the value is {{null}}.

Next you have to fix the {{AggregateUtil}} and separate 
{{SqlSumEmptyIsZeroAggFunction}} from {{SqlSumAggFunction}} and initialize the 
new {{Sum0Aggregate}} and also allow for {{case SqlKind.SUM0 => true}} in 
{{DataSetAggregateRule}}.

If I did not forget a place to add SUM0 support, that should do the trick.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-17 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15581442#comment-15581442
 ] 

Anton Mushin commented on FLINK-4604:
-

Do you talking about new issue for support 
{{SqlKind.SUM0/SqlSumEmptyIsZeroAggFunction}}? or is it issue exist already? or 
do support {{SqlKind.SUM0}} in this issue?


> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-14 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15575320#comment-15575320
 ] 

Timo Walther commented on FLINK-4604:
-

The problem is that the rule translates not only into {{SUM}} and {{COUNT}} but 
also into {{SqlKind.SUM0}}/{{SqlSumEmptyIsZeroAggFunction}} which is not yet 
supported. This means we need to adapt the {{AggregateUtil}} to support this 
aggregation first. If I use the normal sum aggregator for it, I get the "Type 
NULL is not supported" exception. This is bug actually a null within a {{CASE}} 
expression should have a type.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-13 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15572115#comment-15572115
 ] 

Anton Mushin commented on FLINK-4604:
-

Sorry, I will try explain my problem.
I have two tests
{code:title=test1}
@Test
  def testStddevPopAggregate(): Unit = {
val env = ExecutionEnvironment.getExecutionEnvironment
val tEnv = TableEnvironment.getTableEnvironment(env, config)
val ds = env.fromElements(
  (1: Byte, 1 : Short, 1, 1L, 1F, 1D),
  (2: Byte, 2 : Short, 2, 2L, 2F, 2D)).toTable(tEnv)
tEnv.registerTable("myTable", ds)
val columns = Array("_1","_2","_3","_4","_5","_6")

val sqlQuery = getSelectQuery("STDDEV_POP(?)")(columns,"myTable")
//val sqlExpectedQuery = getSelectQuery("SQRT((SUM(? * ?) - SUM(?) * SUM(?) 
/ COUNT(?)) / COUNT(?))")(columns,"myTable")

val actualResult = tEnv.sql(sqlQuery).toDataSet[Row].collect()
//val expectedResult = 
tEnv.sql(sqlExpectedQuery).toDataSet[Row].collect().toString.replaceAll("Buffer\\(|\\)",
 "")
val expectedResult = "0,0,0,0,0.5,0.5"
TestBaseUtils.compareOrderedResultAsText(actualResult.asJava, 
expectedResult)
  }
{code}
{code:title=test2}
@Test
  def testStddevPopAggregateWithOtherAggreagte(): Unit = {
val localconf = config
localconf.setNullCheck(true)
val env = ExecutionEnvironment.getExecutionEnvironment
val tEnv = TableEnvironment.getTableEnvironment(env, localconf)

val ds = env.fromElements(
  (1: Byte, 1 : Short, 1, 1L, 1F, 1D),
  (2: Byte, 2 : Short, 2, 2L, 2F, 2D)).toTable(tEnv)
tEnv.registerTable("myTable", ds)
val columns = Array("_1","_2","_3","_4","_5","_6")

val sqlQuery = getSelectQuery("STDDEV_POP(?), avg(?), sum(?), max(?), 
min(?), count(?)")(columns,"myTable")

val sqlExpectedQuery = getSelectQuery("SQRT((SUM(? * ?) - SUM(?) * SUM(?) / 
COUNT(?)) / COUNT(?))" +
  "avg(?),sum(?),max(?),min(?),count(?)")(columns,"myTable")

val actualResult = tEnv.sql(sqlQuery).toDataSet[Row].collect()
val expectedResult = 
tEnv.sql(sqlExpectedQuery).toDataSet[Row].collect().toString.replaceAll("Buffer\\(|\\)",
 "")
//val expectedResult = 
"0.0,1,3,2,1,2,0.0,1,3,2,1,2,0.0,1,3,2,1,2,0.0,1,3,2,1,2,0.5,1.5,3.0,2.0,1.0,2,0.5,1.5,3.0,2.0,1.0,2"
TestBaseUtils.compareOrderedResultAsText(actualResult.asJava, 
expectedResult)
  }
{code}
[Actual code|https://github.com/ex00/flink/compare/master...ex00:FLINK-4604] 
for running tests.
First test is passed and second isn't.
I have Calcite ST:
{noformat}
org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node 
[rel#7:Subset#1.DATASET.[]] could not be implemented; planner state:

Root: rel#7:Subset#1.DATASET.[]
Original rel:

Sets:
Set#0, type: RecordType(TINYINT _1, SMALLINT _2, INTEGER _3, BIGINT _4, FLOAT 
_5, DOUBLE _6)
rel#4:Subset#0.NONE.[], best=null, importance=0.81
rel#0:LogicalTableScan.NONE.[](table=[_DataSetTable_0]), 
rowcount=1000.0, cumulative cost={inf}
rel#26:Subset#0.DATASET.[], best=rel#25, importance=0.405
rel#25:DataSetScan.DATASET.[](table=[_DataSetTable_0]), 
rowcount=1000.0, cumulative cost={1000.0 rows, 1000.0 cpu, 0.0 io}
rel#28:Subset#0.ENUMERABLE.[], best=rel#27, importance=0.405

rel#27:EnumerableTableScan.ENUMERABLE.[](table=[_DataSetTable_0]), 
rowcount=1000.0, cumulative cost={1000.0 rows, 1001.0 cpu, 0.0 io}
Set#1, type: RecordType(TINYINT EXPR$0, TINYINT EXPR$1, TINYINT EXPR$2, TINYINT 
EXPR$3, TINYINT EXPR$4, BIGINT EXPR$5, SMALLINT EXPR$6, SMALLINT EXPR$7, 
SMALLINT EXPR$8, SMALLINT EXPR$9, SMALLINT EXPR$10, BIGINT EXPR$11, INTEGER 
EXPR$12, INTEGER EXPR$13, INTEGER EXPR$14, INTEGER EXPR$15, INTEGER EXPR$16, 
BIGINT EXPR$17, BIGINT EXPR$18, BIGINT EXPR$19, BIGINT EXPR$20, BIGINT EXPR$21, 
BIGINT EXPR$22, BIGINT EXPR$23, FLOAT EXPR$24, FLOAT EXPR$25, FLOAT EXPR$26, 
FLOAT EXPR$27, FLOAT EXPR$28, BIGINT EXPR$29, DOUBLE EXPR$30, DOUBLE EXPR$31, 
DOUBLE EXPR$32, DOUBLE EXPR$33, DOUBLE EXPR$34, BIGINT EXPR$35)
rel#6:Subset#1.NONE.[], best=null, importance=0.9

rel#5:LogicalAggregate.NONE.[](input=rel#4:Subset#0.NONE.[],group={},EXPR$0=STDDEV_POP($0),EXPR$1=AVG($0),EXPR$2=SUM($0),EXPR$3=MAX($0),EXPR$4=MIN($0),EXPR$5=COUNT($0),EXPR$6=STDDEV_POP($1),EXPR$7=AVG($1),EXPR$8=SUM($1),EXPR$9=MAX($1),EXPR$10=MIN($1),EXPR$11=COUNT($1),EXPR$12=STDDEV_POP($2),EXPR$13=AVG($2),EXPR$14=SUM($2),EXPR$15=MAX($2),EXPR$16=MIN($2),EXPR$17=COUNT($2),EXPR$18=STDDEV_POP($3),EXPR$19=AVG($3),EXPR$20=SUM($3),EXPR$21=MAX($3),EXPR$22=MIN($3),EXPR$23=COUNT($3),EXPR$24=STDDEV_POP($4),EXPR$25=AVG($4),EXPR$26=SUM($4),EXPR$27=MAX($4),EXPR$28=MIN($4),EXPR$29=COUNT($4),EXPR$30=STDDEV_POP($5),EXPR$31=AVG($5),EXPR$32=SUM($5),EXPR$33=MAX($5),EXPR$34=MIN($5),EXPR$35=COUNT($5)),
 rowcount=100.0, cumulative cost={inf}

rel#15:LogicalProject.NONE.[](input=rel#14:Subset#3.NONE.[],EXPR$0=CAST(POWER(/

[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-13 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571502#comment-15571502
 ] 

Anton Mushin commented on FLINK-4604:
-

I tried use code like as your code and I got exception:
{noformat}
org.apache.flink.api.table.TableException: Cannot generate a valid execution 
plan for the given query: 

LogicalAggregate(group=[{}], EXPR$0=[STDDEV_POP($0)], EXPR$1=[STDDEV_POP($1)], 
EXPR$2=[STDDEV_POP($2)], EXPR$3=[STDDEV_POP($3)], EXPR$4=[STDDEV_POP($4)], 
EXPR$5=[STDDEV_POP($5)])
  LogicalTableScan(table=[[_DataSetTable_0]])

This exception indicates that the query uses an unsupported SQL feature.
Please check the documentation for the set of currently supported SQL features.

at 
org.apache.flink.api.table.BatchTableEnvironment.translate(BatchTableEnvironment.scala:257)
at 
org.apache.flink.api.scala.table.BatchTableEnvironment.toDataSet(BatchTableEnvironment.scala:139)
at 
org.apache.flink.api.scala.table.TableConversions.toDataSet(TableConversions.scala:41)
at 
org.apache.flink.api.scala.batch.sql.AggregationsITCase.testStddevPopAggregate(AggregationsITCase.scala:282)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runners.Suite.runChild(Suite.java:127)
at org.junit.runners.Suite.runChild(Suite.java:26)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
at 
com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:234)
at 
com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
{noformat}
You can look my 
[code|https://github.com/ex00/flink/commit/470f19d44bbf11217de06d72e25cdeec7f406f9a]

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want 

[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-12 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569021#comment-15569021
 ] 

Timo Walther commented on FLINK-4604:
-

You cannot use {{!AggregateReduceFunctionsRule.INSTANCE.matches(call)}} as it 
is always false for sums. See 
{{AggregateReduceFunctionsRule.containsAvgStddevVarCall}}.

What do you mean with "something went wrong"? What about this:

{code}
val supported = agg.getAggCallList.map(_.getAggregation.getKind).forall {
  case SqlKind.SUM => true
  case SqlKind.MIN => true
  case SqlKind.MAX => true
  case _ => false
}

!distinctAggs && !groupSets && !agg.indicator && supported
{code}

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-12 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568664#comment-15568664
 ] 

Anton Mushin commented on FLINK-4604:
-

I tried check function in 
{{org.apache.flink.api.table.plan.rules.dataSet.DataSetAggregateRule#matches}}, 
but something went wrong :)
I did so
{code:title=org.apache.flink.api.table.plan.rules.dataSet.DataSetAggregateRule}
override def matches(call: RelOptRuleCall): Boolean = {
val agg: LogicalAggregate = call.rel(0).asInstanceOf[LogicalAggregate]

// check if we have distinct aggregates
val distinctAggs = agg.getAggCallList.exists(_.isDistinct)
if (distinctAggs) {
  throw new TableException("DISTINCT aggregates are currently not 
supported.")
}

// check if we have grouping sets
val groupSets = agg.getGroupSets.size() != 1 || agg.getGroupSets.get(0) != 
agg.getGroupSet
if (groupSets || agg.indicator) {
  throw new TableException("GROUPING SETS are currently not supported.")
}

(!distinctAggs && !groupSets && !agg.indicator) && 
!AggregateReduceFunctionsRule.INSTANCE.matches(call)
  }
{code}
And I got next plan and exception:
{noformat}
DataSetCalc(select=[CAST(/(-(CASE(=($f1, 0), null, $f0), /(*(CASE(=($f3, 0), 
null, $f2), CASE(=($f3, 0), null, $f2)), $f3)), CASE(=($f3, 1), null, -($f3, 
1 AS $f0, CAST(/(-(CASE(=($f5, 0), null, $f4), /(*(CASE(=($f7, 0), null, 
$f6), CASE(=($f7, 0), null, $f6)), $f7)), CASE(=($f7, 1), null, -($f7, 1 AS 
$f1, CAST(/(-(CASE(=($f9, 0), null, $f8), /(*(CASE(=($f11, 0), null, $f10), 
CASE(=($f11, 0), null, $f10)), $f11)), CASE(=($f11, 1), null, -($f11, 1 AS 
$f2, CAST(/(-(CASE(=($f13, 0), null, $f12), /(*(CASE(=($f15, 0), null, $f14), 
CASE(=($f15, 0), null, $f14)), $f15)), CASE(=($f15, 1), null, -($f15, 1 AS 
$f3, CAST(/(-(CASE(=($f17, 0), null, $f16), /(*(CASE(=($f19, 0), null, $f18), 
CASE(=($f19, 0), null, $f18)), $f19)), CASE(=($f19, 1), null, -($f19, 1 AS 
$f4, CAST(/(-(CASE(=($f21, 0), null, $f20), /(*(CASE(=($f23, 0), null, $f22), 
CASE(=($f23, 0), null, $f22)), $f23)), CASE(=($f23, 1), null, -($f23, 1 AS 
$f5])
  DataSetAggregate(select=[$SUM0($f6) AS $f0, COUNT($f6) AS $f1, $SUM0(_1) AS 
$f2, COUNT(_1) AS $f3, $SUM0($f7) AS $f4, COUNT($f7) AS $f5, $SUM0(_2) AS $f6, 
COUNT(_2) AS $f7, $SUM0($f8) AS $f8, COUNT($f8) AS $f9, $SUM0(_3) AS $f10, 
COUNT(_3) AS $f11, $SUM0($f9) AS $f12, COUNT($f9) AS $f13, $SUM0(_4) AS $f14, 
COUNT(_4) AS $f15, $SUM0($f10) AS $f16, COUNT($f10) AS $f17, $SUM0(_5) AS $f18, 
COUNT(_5) AS $f19, $SUM0($f11) AS $f20, COUNT($f11) AS $f21, $SUM0(_6) AS $f22, 
COUNT(_6) AS $f23])
DataSetCalc(select=[_1, _2, _3, _4, _5, _6])
  DataSetScan(table=[[_DataSetTable_0]])
{noformat}
{noformat}
org.apache.flink.api.table.TableException: Type NULL is not supported. Null 
values must have a supported type.

at 
org.apache.flink.api.table.FlinkTypeFactory$.toTypeInfo(FlinkTypeFactory.scala:128)
at 
org.apache.flink.api.table.codegen.CodeGenerator.visitLiteral(CodeGenerator.scala:553)
at 
org.apache.flink.api.table.codegen.CodeGenerator.visitLiteral(CodeGenerator.scala:56)
at org.apache.calcite.rex.RexLiteral.accept(RexLiteral.java:658)
at 
org.apache.flink.api.table.codegen.CodeGenerator$$anonfun$14.apply(CodeGenerator.scala:675)
at 
org.apache.flink.api.table.codegen.CodeGenerator$$anonfun$14.apply(CodeGenerator.scala:675)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at 
org.apache.flink.api.table.codegen.CodeGenerator.visitCall(CodeGenerator.scala:675)
at 
org.apache.flink.api.table.codegen.CodeGenerator.visitCall(CodeGenerator.scala:56)
at org.apache.calcite.rex.RexCall.accept(RexCall.java:108)
at 
org.apache.flink.api.table.codegen.CodeGenerator$$anonfun$14.apply(CodeGenerator.scala:675)
at 
org.apache.flink.api.table.codegen.CodeGenerator$$anonfun$14.apply(CodeGenerator.scala:675)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(Ite

[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-10 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561815#comment-15561815
 ] 

Timo Walther commented on FLINK-4604:
-

Yes, I mean {{matches}}. {{DataSetAggregate}} does not support those 
aggregations so the rule should never match.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-10 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561783#comment-15561783
 ] 

Anton Mushin commented on FLINK-4604:
-

Do you mean check in 
{{org.apache.flink.api.table.plan.rules.dataSet.DataSetAggregateRule#matches}}? 
or convert to other {{RelNode}} in 
{{org.apache.flink.api.table.plan.rules.dataSet.DataSetAggregateRule#convert}}

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-07 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556725#comment-15556725
 ] 

Timo Walther commented on FLINK-4604:
-

I think a even nicer solution would be to check the aggregation function type 
already in `DataSetAggregateRule`, so that the rule never matches in those 
cases.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-07 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15554768#comment-15554768
 ] 

Anton Mushin commented on FLINK-4604:
-

I changed method 
{{org.apache.flink.api.table.plan.nodes.dataset.DataSetAggregate#computeSelfCost}}
it is impermanent implementation for examle
{code}
override def computeSelfCost (planner: RelOptPlanner, metadata: 
RelMetadataQuery): RelOptCost = {
val child = this.getInput
val rowCnt = metadata.getRowCount(child)
val rowSize = this.estimateRowSize(child.getRowType)
val aggCnt = this.namedAggregates.size
var resultCost = planner.getCostFactory.makeCost(rowCnt, rowCnt * aggCnt, 
rowCnt * rowSize)
this.namedAggregates.foreach(x=>{
  x.getKey.getAggregation.getKind match {
case SqlKind.STDDEV_POP =>
  resultCost = resultCost.plus(planner.getCostFactory.makeCost(rowCnt, 
rowCnt * aggCnt, rowCnt * rowSize))
case SqlKind.STDDEV_SAMP =>
  resultCost = resultCost.plus(planner.getCostFactory.makeCost(rowCnt, 
rowCnt * aggCnt, rowCnt * rowSize))
case SqlKind.VAR_SAMP =>
  resultCost = resultCost.plus(planner.getCostFactory.makeCost(rowCnt, 
rowCnt * aggCnt, rowCnt * rowSize))
case SqlKind.VAR_POP =>
  resultCost = resultCost.plus(planner.getCostFactory.makeCost(rowCnt, 
rowCnt * aggCnt, rowCnt * rowSize))
case default => None
  }
})
resultCost
  }
{code}
and i got next plan:
{noformat}
DataSetCalc(select=[CAST(POWER(/(-($f0, /(*($f1, $f1), $f2)), $f2), 0.5)) AS 
$f0, CAST(POWER(/(-($f3, /(*($f1, $f1), $f2)), CASE(=($f2, 1), null, -($f2, 
1))), 0.5)) AS $f1, CAST(/(-($f4, /(*($f1, $f1), $f2)), CASE(=($f2, 1), null, 
-($f2, 1 AS $f2, CAST(/(-($f5, /(*($f1, $f1), $f2)), $f2)) AS $f3])
  DataSetAggregate(select=[SUM($f6) AS $f0, SUM(_1) AS $f1, COUNT(_1) AS $f2, 
SUM($f7) AS $f3, SUM($f8) AS $f4, SUM($f9) AS $f5])
DataSetCalc(select=[_1, _2, _3, _4, _5, _6])
  DataSetScan(table=[[_DataSetTable_0]])

LogicalAggregate(group=[{}], EXPR$0=[STDDEV_POP($0)], EXPR$1=[STDDEV_SAMP($0)], 
EXPR$2=[VAR_SAMP($0)], EXPR$3=[VAR_POP($0)])
  LogicalProject(_1=[$0])
LogicalTableScan(table=[[_DataSetTable_0]])
{noformat}

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-07 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15554766#comment-15554766
 ] 

Timo Walther commented on FLINK-4604:
-

Yes, you are right. We have to make the costs for STDDEV_POP, etc. very high. 
So if one of those aggregation functions is contained, than 
{{planner.getCostFactory.makeInfiniteCost()}}.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-07 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15554574#comment-15554574
 ] 

Anton Mushin commented on FLINK-4604:
-

I'm think will probably worth consider types of aggregate functions in 
{{gorg.apache.flink.api.table.plan.nodes.dataset.DataSetAggregate#computeSelfCost}}
What do you think about it?

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-07 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15554418#comment-15554418
 ] 

Anton Mushin commented on FLINK-4604:
-

Yes, of course. link to commit: 
https://github.com/ex00/flink/commit/8f802db568ce271bfb1597f7c2a29cc0d00f55f6

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-06 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15552086#comment-15552086
 ] 

Timo Walther commented on FLINK-4604:
-

Maybe the cost function of {{DataSetAggregate}} is faulty. Could you post a 
link to your branch? I will have a look at it.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
> Attachments: 1.jpg
>
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-06 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15551943#comment-15551943
 ] 

Timo Walther commented on FLINK-4604:
-

Ok, it seems that the {{AggregateReduceFunctionsRule}} does work correctly. It 
would be great if you could figure out why. Can you check if 
{{AggregateReduceFunctionsRule.matches}} is called? Maybe we have to adapt the 
{{RelOptRuleOperand}} condition or Calcite has a problem that needs to be 
reported on the Calcite mailing list.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-06 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15551768#comment-15551768
 ] 

Anton Mushin commented on FLINK-4604:
-

I'm use {{RelOptUtil.toString(relNode)}} for getting plan
{code:title=org.apache.flink.api.table.BatchTableEnvironment#translate}
 ..
val dataSetPlan = try {
  optProgram.run(getPlanner, decorPlan, flinkOutputProps)
}
catch {
  ..
}

print(s"\n${RelOptUtil.toString(dataSetPlan)}\n${RelOptUtil.toString(relNode)}")
dataSetPlan match {
  case node: DataSetRel =>
node.translateToPlan(
  this,
  Some(tpe.asInstanceOf[TypeInformation[Any]])
).asInstanceOf[DataSet[A]]
  case _ => ???
}
  }
{code}
I getting in the output
{noformat}
DataSetAggregate(select=[STDDEV_POP(_1) AS EXPR$0, STDDEV_SAMP(_1) AS EXPR$1, 
VAR_SAMP(_1) AS EXPR$2, VAR_POP(_1) AS EXPR$3])
  DataSetScan(table=[[_DataSetTable_0]])

LogicalAggregate(group=[{}], EXPR$0=[STDDEV_POP($0)], EXPR$1=[STDDEV_SAMP($0)], 
EXPR$2=[VAR_SAMP($0)], EXPR$3=[VAR_POP($0)])
  LogicalProject(_1=[$0])
LogicalTableScan(table=[[_DataSetTable_0]])
{noformat}

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-06 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15551706#comment-15551706
 ] 

Timo Walther commented on FLINK-4604:
-

Your implementation looks good so far. The output of 
{{BatchTableEnvironment#translate}} should actually be a tree of 
{{DataSetAggregate}}, {{DataSetCalc}} nodes. Are you using {{explain}} to get 
the plan above? It might only print the logical plan. You can have a look at 
{{ExpressionReductionTest}} how you can get the optimized plan. This will be 
easier when https://github.com/apache/flink/pull/2595 is merged.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-06 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15551505#comment-15551505
 ] 

Anton Mushin commented on FLINK-4604:
-

I have some problem.
I added rule in 
{{org.apache.flink.api.table.plan.rules.FlinkRuleSets#DATASET_OPT_RULES}}
{code}
.
   // aggregate union rule
AggregateUnionAggregateRule.INSTANCE,

AggregateReduceFunctionsRule.INSTANCE,
..
{code}
I create case classes in 
{{org.apache.flink.api.table.expressions.aggregations.scala}} look like as
{code}
case class StddevPop(child: Expression) extends Aggregation {
  override def toString = s"stddev_pop($child)"

  override private[flink] def toAggCall(name: String)(implicit relBuilder: 
RelBuilder): AggCall = {
relBuilder.aggregateCall(SqlStdOperatorTable.STDDEV_POP, false, null, name, 
child.toRexNode)
  }

  override private[flink] def resultType = child.resultType

  override private[flink] def validateInput =
TypeCheckUtils.assertNumericExpr(child.resultType, "stddev_pop")
}
{code}

and described new functions in  
{{org.apache.flink.api.table.validate.FunctionCatalog#builtInFunctions}}
{code}
..
"sum" -> classOf[Sum],
"stddev_pop" -> classOf[StddevPop],
"stddev_samp" -> classOf[StddevSamp],
"var_pop" -> classOf[VarPop],
"var_samp" -> classOf[VarSamp],
...
{code}
and in 
{{org.apache.flink.api.table.validate.BasicOperatorTable#builtInSqlOperators}}
{code}
.
// AGGREGATE OPERATORS

SqlStdOperatorTable.AVG,
SqlStdOperatorTable.STDDEV_POP,
SqlStdOperatorTable.STDDEV_SAMP,
SqlStdOperatorTable.VAR_POP,
SqlStdOperatorTable.VAR_SAMP,

{code}
Also I described functions in 
{{org.apache.flink.api.table.expressions.ExpressionParser}} lool like as
{code}
lazy val STDDEV_POP: Keyword = Keyword("stddev_pop")
.
lazy val suffixStddevPop: PackratParser[Expression] =
composite <~ "." ~ STDDEV_POP ~ opt("()") ^^ { e => StddevPop(e) }

..
lazy val prefixStddevPop: PackratParser[Expression] =
STDDEV_POP ~ "(" ~> expression <~ ")" ^^ { e => StddevPop(e) }

// and it added in ExpressionParser#functionIdent, ExpressionParser#suffixed, 
ExpressionParser#prefixed as for other aggregate functions 
{code}
and I added them into {{org.apache.flink.api.scala.table.expressionDsl.scala}}
{code}
..
def stddev_pop = StddevPop(expr)
  def stddev_samp = StddevSamp(expr)
  def var_pop = VarPop(expr)
  def var_samp = VarSamp(expr)
{code}


Then I'm execute next sql query: {{SELECT 
STDDEV_POP(_1),STDDEV_SAMP(_1),VAR_SAMP(_1),VAR_POP(_1) FROM table}}
In {{org.apache.flink.api.table.BatchTableEnvironment#translate}} I getting 
dataset plan as
{noformat}
LogicalAggregate(group=[{}], EXPR$0=[STDDEV_POP($0)], EXPR$1=[STDDEV_SAMP($0)], 
EXPR$2=[VAR_SAMP($0)], EXPR$3=[VAR_POP($0)])
  LogicalProject(_1=[$0])
LogicalTableScan(table=[[_DataSetTable_0]])
{noformat}

When 
{{org.apache.flink.api.table.plan.nodes.dataset.DataSetAggregate#translateToPlan}}
 occur then {{org.apache.flink.api.table.runtime.aggregate.AvgAggregate}} is 
called for each new function and becouse of it avg calculate for each function
Could you suggest what I might miss?

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-10-04 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15544874#comment-15544874
 ] 

Timo Walther commented on FLINK-4604:
-

Hi [~anmu],
1. Yes, the test should look like the one in your example. Since this issue 
does not implement new runtime functions, it ok to just test some cases to 
check proper logical translation. We will soon provide a test base so that we 
can just test the logical translation and don't have to run an entire Flink 
program. IMHO one large test for all function is also ok. You also don't need 
to test every data type.

2. Yes this rule has to be added to the DataSet rules. But we don't need new 
runtime functions. Calcite translates all aggregates to SUM/COUNT expressions 
which are already supported in 
{{org.apache.flink.api.table.runtime.aggregate}}. In order to have those 
functions also in the Table API you need to add them to 
{{expressionDsl.scala}}, {{ExpressionParser}}/{{FunctionCatalog}} and create 
expression case classes in {{aggregations.scala}}.

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-09-30 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536178#comment-15536178
 ] 

Anton Mushin commented on FLINK-4604:
-

[~twalthr],thanks!
I have any question about this issue.
1. Am I correct understand in general the test should look like as
{code:java}
public void testNewAggregationFunctions() throws Exception {
//set env 
String sqlQuery = "SELECT 
AVG(x),STDDEV_POP(x),STDDEV_SAMP(x),VAR_POP(x),VAR_SAMP(x) FROM table";
Table result = tableEnv.sql(sqlQuery);
/*
AVG(x) = SUM(x) / COUNT(x)
STDDEV_POP(x) = SQRT( (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) 
/ COUNT(x))
STDDEV_SAMP(x)= SQRT((SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) 
/ CASE COUNT(x) WHEN 1 THEN NULL ELSE COUNT(x) - 1 END)
VAR_POP(x)= (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x))/ COUNT(x)
VAR_SAMP(x) = (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x))/ CASE 
COUNT(x) WHEN 1 THEN NULL ELSE COUNT(x) - 1 END
 */
String sqlQuery1 = "SELECT SUM(x)/COUNT(x), " +
"SQRT( (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / 
COUNT(x)), " +
"SQRT( (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / CASE 
COUNT(x) WHEN 1 THEN NULL ELSE COUNT(x) - 1 END), " +
"(SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / COUNT(x), 
" +
"(SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / CASE 
COUNT(x) WHEN 1 THEN NULL ELSE COUNT(x) - 1 END " +
"FROM table";
Table result1 = tableEnv.sql(sqlQuery1);

DataSet resultSet = tableEnv.toDataSet(result, Row.class);
List results = resultSet.collect();
DataSet expectedResultSet = tableEnv.toDataSet(result1, 
Row.class);
List expectedResults = expectedResultSet.collect();
compareResult(results, expected);
}
{code}
+some single test for each new function? 

2. For support AggregateReduceFunctionsRule I should add to 
org.apache.flink.api.table.plan.rules.FlinkRuleSets 
AggregateReduceFunctionsRule.INSTANCE and implement logic 
STDDEV_POP,STDDEV_SAMP,VAR_POP,VAR_SAMP in package 
org.apache.flink.api.table.runtime.aggregate like for AvgAggregate, isn't it?
And define new aggregate functions in other places where it is necessary, for 
example: org.apache.flink.api.table.validate.FunctionCatalog

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-09-30 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535940#comment-15535940
 ] 

Timo Walther commented on FLINK-4604:
-

[~anmu] yes sure. I assigned the issue to you. If you need any help, just ask. 
Thanks!

> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4604) Add support for standard deviation/variance

2016-09-30 Thread Anton Mushin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535709#comment-15535709
 ] 

Anton Mushin commented on FLINK-4604:
-

Hello.
Could I start to resolve this ticket? 


> Add support for standard deviation/variance
> ---
>
> Key: FLINK-4604
> URL: https://issues.apache.org/jira/browse/FLINK-4604
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>
> Calcite's {{AggregateReduceFunctionsRule}} can convert SQL {{AVG, STDDEV_POP, 
> STDDEV_SAMP, VAR_POP, VAR_SAMP}} to sum/count functions. We should add, test 
> and document this rule. 
> If we also want to add this aggregates to Table API is up for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)