[jira] [Commented] (FLINK-16517) Add a long running WordCount example

2020-05-27 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117788#comment-17117788
 ] 

Robert Metzger commented on FLINK-16517:


I agree with Till that {{WindowJoin}} (and also {{TopSpeedWindowing.jar}}) are 
two examples of self-contained streaming jobs that are already part of the 
examples collection.
The review and maintenance overhead of also making this example long-running is 
just not justified.

I propose to close this ticket as "Won't fix" and close the accompanying pull 
request.


> Add a long running WordCount example
> 
>
> Key: FLINK-16517
> URL: https://issues.apache.org/jira/browse/FLINK-16517
> Project: Flink
> Issue Type: Improvement
> Components: Examples
> Reporter: Ethan Li
> Assignee: Ethan Li
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> As far as I know, Flink doesn't have a long-running WordCount example for 
> users to start with or to run simple tests against.
> The closest one is SocketWindowWordCount, but it requires setting up a server 
> (nc -l ), which is not hard but still tedious for simple use cases, and it 
> requires human input for the job to actually run.
> I propose to add a new WordCount example, or modify the current one, to use a 
> SourceFunction that randomly generates input data from a set of sentences, so 
> the WordCount job can run forever. The generation rate will be configurable.
> This would be the easiest way to start a long-running Flink job. It would be 
> useful for new users who want to start using Flink quickly and for developers 
> who want to test Flink easily.
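
The source proposed in the description above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from the pull request: the class name, the sample sentences, the seed parameter, and the `maxSentences` limit are all hypothetical, and a `Consumer<String>` stands in for Flink's `SourceContext`. In an actual Flink job the loop would presumably live in `SourceFunction#run` and emit through `SourceContext#collect`, with `SourceFunction#cancel` clearing the flag.

```java
import java.util.Random;
import java.util.function.Consumer;

/** Plain-Java stand-in for a source that emits random sentences at a configurable rate. */
class RandomSentenceGenerator {
    // Hypothetical sample data; the ticket does not say which sentences would be used.
    static final String[] SENTENCES = {
        "to be or not to be",
        "the quick brown fox jumps over the lazy dog",
        "hello streaming world"
    };

    private final long sentencesPerSecond;
    private final Random random;
    // volatile so that a cancel() from another thread is seen by the emitting loop
    private volatile boolean running = true;

    RandomSentenceGenerator(long sentencesPerSecond, long seed) {
        if (sentencesPerSecond <= 0) {
            throw new IllegalArgumentException("sentencesPerSecond must be positive");
        }
        this.sentencesPerSecond = sentencesPerSecond;
        this.random = new Random(seed);
    }

    /**
     * Emits random sentences until cancel() is called, the thread is interrupted,
     * or maxSentences have been emitted. A negative maxSentences means no limit.
     */
    void run(Consumer<String> out, long maxSentences) {
        // Integer division: rates above 1000 per second result in no pause at all.
        long pauseMillis = 1000L / sentencesPerSecond;
        long emitted = 0;
        while (running && (maxSentences < 0 || emitted < maxSentences)) {
            out.accept(SENTENCES[random.nextInt(SENTENCES.length)]);
            emitted++;
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop emitting.
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    /** Asks the emitting loop to stop after the current iteration. */
    void cancel() {
        running = false;
    }
}
```

For example, `new RandomSentenceGenerator(2, 42L).run(System.out::println, -1)` would print roughly two sentences per second until `cancel()` is called from another thread. The `maxSentences` limit exists only so that the sketch can be exercised in a bounded way.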



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16517) Add a long running WordCount example

2020-03-19 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062489#comment-17062489
 ] 

Till Rohrmann commented on FLINK-16517:
---

Isn't {{WindowJoin}} already simple enough? It also comes with an infinite 
source.



[jira] [Commented] (FLINK-16517) Add a long running WordCount example

2020-03-17 Thread Ethan Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061327#comment-17061327
 ] 

Ethan Li commented on FLINK-16517:
--

Thanks [~aljoscha]. I put up a pull request to add the new unbounded source. 

I did not deal with FileProcessingMode in this PR because it seems to require 
some changes in 
[readTextFile|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java#L1085],
 and that might impact other components/code. I am willing to file a separate 
issue/PR if you think it makes sense to do so.

 

 



[jira] [Commented] (FLINK-16517) Add a long running WordCount example

2020-03-11 Thread Aljoscha Krettek (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056734#comment-17056734
 ] 

Aljoscha Krettek commented on FLINK-16517:
--

I think you might be right. We can change the streaming WordCount example to 
read files with {{FileProcessingMode.PROCESS_CONTINUOUSLY}}, which would make 
it more stream-y, and also to use an unbounded source for the built-in data 
when no input path is given.



[jira] [Commented] (FLINK-16517) Add a long running WordCount example

2020-03-10 Thread Ethan Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056571#comment-17056571
 ] 

Ethan Li commented on FLINK-16517:
--

[~aljoscha] Thanks for the link. I feel like it's still not simple enough for 
beginners. I am looking for a very simple example so beginners can focus on 
getting their first Flink job running. The WordCount example is like a 
hello-world program for streaming. 



[jira] [Commented] (FLINK-16517) Add a long running WordCount example

2020-03-10 Thread Aljoscha Krettek (Jira)


[jira] [Commented] (FLINK-16517) Add a long running WordCount example

2020-03-09 Thread Ethan Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055515#comment-17055515
 ] 

Ethan Li commented on FLINK-16517:
--

I will put up a pull request if you think this makes sense.
