[jira] [Commented] (FLINK-2142) GSoC project: Exact and Approximate Statistics for Data Streams and Windows

2016-05-16 Thread Gabor Gevay (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284435#comment-15284435
 ] 

Gabor Gevay commented on FLINK-2142:


This proposal was based on the old (pre-0.10) windowing API. I'm now taking it 
apart by converting sub-tasks into stand-alone issues (FLINK-2148, FLINK-2147) 
and by modifying or closing those sub-tasks that no longer make sense in the 
current streaming API. I will add the label `approximate` to the issues that 
concern approximate calculations.

Note: The main reason I abandoned this project last summer is that the 
streaming API was changing a lot at that time, so it seemed better to postpone 
this work.

> GSoC project: Exact and Approximate Statistics for Data Streams and Windows
> ---
>
> Key: FLINK-2142
> URL: https://issues.apache.org/jira/browse/FLINK-2142
> Project: Flink
> Issue Type: New Feature
> Components: Streaming
> Reporter: Gabor Gevay
> Assignee: Gabor Gevay
> Priority: Minor
> Labels: gsoc2015, statistics, streaming
>
> The goal of this project is to implement basic statistics of data streams and 
> windows (like average, median, variance, correlation, etc.) in a 
> computationally efficient manner. This involves designing custom PreReducers.
> The exact calculation of some statistics (e.g., frequencies or the number of 
> distinct elements) would require memory proportional to the number of 
> elements in the input (the window or the entire stream). However, there are 
> efficient algorithms and data structures using less memory for calculating 
> the same statistics only approximately, with user-specified error bounds.
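
To illustrate the trade-off described in the issue text above, here is a
minimal sketch of one well-known approximate structure, the Count-Min sketch,
which estimates element frequencies in memory that depends only on the chosen
error bounds. It is not part of the original issue and is not Flink code: the
class name CountMinSketch, its methods, and the seed parameter are hypothetical
names chosen for this example, and the issue itself does not say which
algorithms were planned.

```java
import java.util.Random;

/**
 * Count-Min sketch: approximate item frequencies in memory that depends only
 * on the error bounds, not on the number of elements seen.
 * Illustrative only; hypothetical class, not tied to any Flink API.
 */
final class CountMinSketch {
    private static final long PRIME = 2147483647L; // 2^31 - 1, a Mersenne prime

    private final int width;       // counters per row: ceil(e / epsilon)
    private final int depth;       // number of rows:   ceil(ln(1 / delta))
    private final long[][] counts;
    private final long[] a;        // per-row hash coefficients, in [1, PRIME - 1]
    private final long[] b;        // per-row hash offsets, in [0, PRIME - 1]
    private long total;            // number of elements added so far

    CountMinSketch(double epsilon, double delta, long seed) {
        if (!(epsilon > 0 && epsilon < 1) || !(delta > 0 && delta < 1)) {
            throw new IllegalArgumentException("epsilon and delta must be in (0, 1)");
        }
        width = (int) Math.ceil(Math.E / epsilon);
        depth = (int) Math.ceil(Math.log(1.0 / delta));
        counts = new long[depth][width];
        a = new long[depth];
        b = new long[depth];
        Random rnd = new Random(seed);
        for (int i = 0; i < depth; i++) {
            a[i] = 1 + (long) rnd.nextInt((int) PRIME - 1);
            b[i] = rnd.nextInt((int) PRIME);
        }
    }

    /** Maps an item to a counter index in the given row: ((a*x + b) mod p) mod width. */
    private int bucket(int row, Object item) {
        long x = item.hashCode() & 0x7fffffffL; // non-negative 31-bit value
        return (int) (((a[row] * x + b[row]) % PRIME) % width);
    }

    /** Records one occurrence of the item. */
    void add(Object item) {
        for (int row = 0; row < depth; row++) {
            counts[row][bucket(row, item)]++;
        }
        total++;
    }

    /** Returns an estimate that is never below the true count of the item. */
    long estimate(Object item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, counts[row][bucket(row, item)]);
        }
        return min;
    }

    long total() {
        return total;
    }
}
```

With epsilon = 0.01 and delta = 0.01 the structure holds 5 rows of 272
counters regardless of how many elements are added. Under the usual analysis
(which assumes the row hashes behave like pairwise-independent functions of
the item), an estimate exceeds the true count by more than epsilon * total()
with probability at most delta. Hash collisions can only inflate a counter,
so the estimate never undercounts.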



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2142) GSoC project: Exact and Approximate Statistics for Data Streams and Windows

2016-05-16 Thread Gabor Gevay (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284474#comment-15284474
 ] 

Gabor Gevay commented on FLINK-2142:


(I've also broken off FLINK-2144.)



[jira] [Commented] (FLINK-2142) GSoC project: Exact and Approximate Statistics for Data Streams and Windows

2015-06-03 Thread Márton Balassi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570785#comment-14570785
 ] 

Márton Balassi commented on FLINK-2142:
---

Thanks for adding the tickets to track your progress, [~ggevay].



[jira] [Commented] (FLINK-2142) GSoC project: Exact and Approximate Statistics for Data Streams and Windows

2021-07-29 Thread Gábor Gévay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389797#comment-17389797
 ] 

Gábor Gévay commented on FLINK-2142:


In the meantime, a paper on this topic has been published:
http://www.vldb.org/pvldb/vol14/p1818-poepsel-lemaitre.pdf

> GSoC project: Exact and Approximate Statistics for Data Streams and Windows
> ---
>
> Key: FLINK-2142
> URL: https://issues.apache.org/jira/browse/FLINK-2142
> Project: Flink
> Issue Type: New Feature
> Components: API / DataStream
> Reporter: Gábor Gévay
> Assignee: Gábor Gévay
> Priority: Not a Priority
> Labels: gsoc2015, stale-assigned, statistics, streaming



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-2142) GSoC project: Exact and Approximate Statistics for Data Streams and Windows

2021-04-14 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321764#comment-17321764
 ] 

Flink Jira Bot commented on FLINK-2142:
---

This issue and all of its sub-tasks have not been updated for 180 days, so it 
has been labeled "stale-minor". If you are still affected by this bug or are 
still interested in this issue, please give an update and remove the label. In 
7 days the issue will be closed automatically.

> GSoC project: Exact and Approximate Statistics for Data Streams and Windows
> ---
>
> Key: FLINK-2142
> URL: https://issues.apache.org/jira/browse/FLINK-2142
> Project: Flink
> Issue Type: New Feature
> Components: API / DataStream
> Reporter: Gábor Gévay
> Assignee: Gábor Gévay
> Priority: Minor
> Labels: gsoc2015, stale-minor, statistics, streaming


[jira] [Commented] (FLINK-2142) GSoC project: Exact and Approximate Statistics for Data Streams and Windows

2021-04-22 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17329573#comment-17329573
 ] 

Flink Jira Bot commented on FLINK-2142:
---

This issue is assigned but has not received an update in 7 days, so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign yourself so that someone else may work on it. In 7 days the issue will 
be automatically unassigned.

> GSoC project: Exact and Approximate Statistics for Data Streams and Windows
> ---
>
> Key: FLINK-2142
> URL: https://issues.apache.org/jira/browse/FLINK-2142
> Project: Flink
> Issue Type: New Feature
> Components: API / DataStream
> Reporter: Gábor Gévay
> Assignee: Gábor Gévay
> Priority: Minor
> Labels: gsoc2015, stale-assigned, stale-minor, statistics, streaming