[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2019-10-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=323586=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323586
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 04/Oct/19 17:15
Start Date: 04/Oct/19 17:15
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #9728: [BEAM-5707] 
Modify Flink streaming impulse function to include a counter
URL: https://github.com/apache/beam/pull/9728
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 323586)
Time Spent: 8h  (was: 7h 50m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Micah Wylde
>Priority: Minor
> Fix For: 2.9.0
>
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2019-10-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=323338=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323338
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 04/Oct/19 11:25
Start Date: 04/Oct/19 11:25
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9728: [BEAM-5707] Modify 
Flink streaming impulse function to include a counter
URL: https://github.com/apache/beam/pull/9728
 
 
   Flink's streaming impulse function is used for load tests which used to be
   impossible to realize with timers. This extends the functionality to send a
   per-partition counter with the impulse, instead of just an empty byte array.
   
   I've also filed https://jira.apache.org/jira/browse/BEAM-8353 to start the
   process of removing the proprietary source.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=157130=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-157130
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 22/Oct/18 19:47
Start Date: 22/Oct/18 19:47
Worklog Time Spent: 10m 
  Work Description: mwylde commented on issue #6774: [BEAM-5707] Fix 
':beam-sdks-python:docs' target
URL: https://github.com/apache/beam/pull/6774#issuecomment-431902850
 
 
   Thanks for the fix!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 157130)
Time Spent: 7h 40m  (was: 7.5h)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Micah Wylde
>Priority: Minor
> Fix For: 2.9.0
>
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=156996=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-156996
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 22/Oct/18 19:21
Start Date: 22/Oct/18 19:21
Worklog Time Spent: 10m 
  Work Description: kennknowles closed pull request #6774: [BEAM-5707] Fix 
':beam-sdks-python:docs' target
URL: https://github.com/apache/beam/pull/6774
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py 
b/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py
index 5be728e2f89..dc5277c356e 100644
--- a/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py
+++ b/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py
@@ -64,7 +64,6 @@ def set_message_count(self, message_count):
 self.config["message_count"] = message_count
 return self
 
-  @staticmethod
   @PTransform.register_urn("flink:transform:streaming_impulse:v1", None)
   def from_runner_api_parameter(spec_parameter, _unused_context):
 config = json.loads(spec_parameter)


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 156996)
Time Spent: 7.5h  (was: 7h 20m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Micah Wylde
>Priority: Minor
> Fix For: 2.9.0
>
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=156995=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-156995
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 22/Oct/18 19:21
Start Date: 22/Oct/18 19:21
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #6774: [BEAM-5707] Fix 
':beam-sdks-python:docs' target
URL: https://github.com/apache/beam/pull/6774#issuecomment-431869261
 
 
   Yes, I retract my comment. It is really the architecture of the code here 
that is a bit wonky. The `register_urn` method applies `staticmethod` as part 
of the decorator's logic.
   
   I also confirmed the fix works.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 156995)
Time Spent: 7h 20m  (was: 7h 10m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Micah Wylde
>Priority: Minor
> Fix For: 2.9.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=156993=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-156993
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 22/Oct/18 19:20
Start Date: 22/Oct/18 19:20
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #6774: [BEAM-5707] Fix 
':beam-sdks-python:docs' target
URL: https://github.com/apache/beam/pull/6774#issuecomment-431867755
 
 
   Yes, I have verified it works. If you look at the other custom transforms 
which use `@PTransform.register_urn` you see they don't use `@staticmethod`. 
Also, `@PTransform.register_urn` itself is annotated with `@classmethod`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 156993)
Time Spent: 7h 10m  (was: 7h)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Micah Wylde
>Priority: Minor
> Fix For: 2.9.0
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=156949=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-156949
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 22/Oct/18 18:52
Start Date: 22/Oct/18 18:52
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #6774: [BEAM-5707] Fix 
':beam-sdks-python:docs' target
URL: https://github.com/apache/beam/pull/6774#issuecomment-431836336
 
 
   CC @mwylde @tweise


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 156949)
Time Spent: 6h 50m  (was: 6h 40m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Micah Wylde
>Priority: Minor
> Fix For: 2.9.0
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=155969=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155969
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 18/Oct/18 16:04
Start Date: 18/Oct/18 16:04
Worklog Time Spent: 10m 
  Work Description: tweise closed pull request #6737: [BEAM-5707] Add 
support for options to flink_streaming_impulse.py
URL: https://github.com/apache/beam/pull/6737
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/examples/flink/flink_streaming_impulse.py 
b/sdks/python/apache_beam/examples/flink/flink_streaming_impulse.py
index 23ad8f25d4e..0cfaf5d1422 100644
--- a/sdks/python/apache_beam/examples/flink/flink_streaming_impulse.py
+++ b/sdks/python/apache_beam/examples/flink/flink_streaming_impulse.py
@@ -24,6 +24,7 @@
 
 import argparse
 import logging
+import sys
 
 import apache_beam as beam
 import apache_beam.transforms.window as window
@@ -57,15 +58,26 @@ def run(argv=None):
 args.extend(argv)
 
   parser = argparse.ArgumentParser()
-  _, pipeline_args = parser.parse_known_args(args)
+  parser.add_argument('--count',
+  dest='count',
+  default=0,
+  help='Number of triggers to generate '
+   '(0 means emit forever).')
+  parser.add_argument('--interval_ms',
+  dest='interval_ms',
+  default=500,
+  help='Interval between records per parallel '
+   'Flink subtask.')
+
+  known_args, pipeline_args = parser.parse_known_args(args)
 
   pipeline_options = PipelineOptions(pipeline_args)
 
   p = beam.Pipeline(options=pipeline_options)
 
   messages = (p | FlinkStreamingImpulseSource()
-  .set_message_count(1)
-  .set_interval_ms(500))
+  .set_message_count(known_args.count)
+  .set_interval_ms(known_args.interval_ms))
 
   _ = (messages | 'decode' >> beam.Map(lambda x: ('', 1))
| 'window' >> beam.WindowInto(window.GlobalWindows(),
@@ -83,4 +95,4 @@ def run(argv=None):
 
 if __name__ == '__main__':
   logging.getLogger().setLevel(logging.INFO)
-  run()
+  run(sys.argv[1:])


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155969)
Time Spent: 6h 20m  (was: 6h 10m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Micah Wylde
>Priority: Minor
> Fix For: 2.9.0
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=155810=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155810
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 18/Oct/18 10:11
Start Date: 18/Oct/18 10:11
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #6737: 
[BEAM-5707] Add support for options to flink_streaming_impulse.py
URL: https://github.com/apache/beam/pull/6737#discussion_r226245828
 
 

 ##
 File path: sdks/python/apache_beam/examples/flink/flink_streaming_impulse.py
 ##
 @@ -57,15 +58,24 @@ def run(argv=None):
 args.extend(argv)
 
   parser = argparse.ArgumentParser()
-  _, pipeline_args = parser.parse_known_args(args)
+  parser.add_argument('--count',
+  dest='message_count',
+  default=1,
 
 Review comment:
   +1 If I remember correctly, `0` is used to emit forever.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155810)
Time Spent: 6h 10m  (was: 6h)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Micah Wylde
>Priority: Minor
> Fix For: 2.9.0
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=155790=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155790
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 18/Oct/18 09:19
Start Date: 18/Oct/18 09:19
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #6737: 
[BEAM-5707] Add support for options to flink_streaming_impulse.py
URL: https://github.com/apache/beam/pull/6737#discussion_r226230366
 
 

 ##
 File path: sdks/python/apache_beam/examples/flink/flink_streaming_impulse.py
 ##
 @@ -57,15 +58,24 @@ def run(argv=None):
 args.extend(argv)
 
   parser = argparse.ArgumentParser()
-  _, pipeline_args = parser.parse_known_args(args)
+  parser.add_argument('--count',
+  dest='message_count',
+  default=1,
+  help='Number of records to generate.')
+  parser.add_argument('--interval_ms',
+  dest='interval_ms',
+  default=500,
+  help='Interval between records.')
+
+  known_args, pipeline_args = parser.parse_known_args(args)
 
   pipeline_options = PipelineOptions(pipeline_args)
 
   p = beam.Pipeline(options=pipeline_options)
 
   messages = (p | FlinkStreamingImpulseSource()
-  .set_message_count(1)
-  .set_interval_ms(500))
+  .set_message_count(known_args.message_count)
 
 Review comment:
   I didn't have time to comment on the original PR, but the pythonic way to do 
this would be to let the constructor take keywords arguments rather than have 
builder functions. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155790)
Time Spent: 6h  (was: 5h 50m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Micah Wylde
>Priority: Minor
> Fix For: 2.9.0
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=155789=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155789
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 18/Oct/18 09:19
Start Date: 18/Oct/18 09:19
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #6737: 
[BEAM-5707] Add support for options to flink_streaming_impulse.py
URL: https://github.com/apache/beam/pull/6737#discussion_r226229174
 
 

 ##
 File path: sdks/python/apache_beam/examples/flink/flink_streaming_impulse.py
 ##
 @@ -57,15 +58,24 @@ def run(argv=None):
 args.extend(argv)
 
   parser = argparse.ArgumentParser()
-  _, pipeline_args = parser.parse_known_args(args)
+  parser.add_argument('--count',
+  dest='message_count',
+  default=1,
 
 Review comment:
   Should the default perhaps be maxint, i.e. never quit?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155789)
Time Spent: 5h 50m  (was: 5h 40m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Micah Wylde
>Priority: Minor
> Fix For: 2.9.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=155171=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155171
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:39
Start Date: 16/Oct/18 22:39
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #6637: [BEAM-5707] Add a 
periodic, streaming impulse source for Flink portable pipelines
URL: https://github.com/apache/beam/pull/6637#issuecomment-430425870
 
 
   Very cool. Thanks Micah!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155171)
Time Spent: 5h 10m  (was: 5h)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=155172=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155172
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:39
Start Date: 16/Oct/18 22:39
Worklog Time Spent: 10m 
  Work Description: pabloem closed pull request #6637: [BEAM-5707] Add a 
periodic, streaming impulse source for Flink portable pipelines
URL: https://github.com/apache/beam/pull/6637
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
index 42b9c1114a7..2b276f404c7 100644
--- 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
@@ -17,6 +17,9 @@
  */
 package org.apache.beam.runners.flink;
 
+import com.fasterxml.jackson.databind.JsonNode;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.auto.service.AutoService;
 import com.google.common.collect.BiMap;
 import com.google.common.collect.HashMultiset;
 import com.google.common.collect.ImmutableMap;
@@ -34,6 +37,7 @@
 import java.util.TreeMap;
 import org.apache.beam.model.pipeline.v1.RunnerApi;
 import org.apache.beam.runners.core.SystemReduceFn;
+import org.apache.beam.runners.core.construction.NativeTransforms;
 import org.apache.beam.runners.core.construction.PTransformTranslation;
 import org.apache.beam.runners.core.construction.RehydratedComponents;
 import org.apache.beam.runners.core.construction.RunnerPCollectionView;
@@ -52,6 +56,7 @@
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.SingletonKeyedWorkItemCoder;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.WindowDoFnOperator;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.WorkItemKeySelector;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.StreamingImpulseSource;
 import org.apache.beam.runners.fnexecution.provisioning.JobInfo;
 import org.apache.beam.runners.fnexecution.wire.WireCoders;
 import org.apache.beam.sdk.coders.ByteArrayCoder;
@@ -156,6 +161,9 @@ public StreamExecutionEnvironment getExecutionEnvironment() 
{
 void translate(String id, RunnerApi.Pipeline pipeline, T t);
   }
 
+  private static final String STREAMING_IMPULSE_TRANSFORM_URN =
+  "flink:transform:streaming_impulse:v1";
+
   private final Map>
   urnToTransformTranslator;
 
@@ -165,6 +173,7 @@ public StreamExecutionEnvironment getExecutionEnvironment() 
{
 translatorMap.put(PTransformTranslation.FLATTEN_TRANSFORM_URN, 
this::translateFlatten);
 translatorMap.put(PTransformTranslation.GROUP_BY_KEY_TRANSFORM_URN, 
this::translateGroupByKey);
 translatorMap.put(PTransformTranslation.IMPULSE_TRANSFORM_URN, 
this::translateImpulse);
+translatorMap.put(STREAMING_IMPULSE_TRANSFORM_URN, 
this::translateStreamingImpulse);
 translatorMap.put(
 PTransformTranslation.ASSIGN_WINDOWS_TRANSFORM_URN, 
this::translateAssignWindows);
 translatorMap.put(ExecutableStage.URN, this::translateExecutableStage);
@@ -403,6 +412,40 @@ private void translateImpulse(
 
context.addDataStream(Iterables.getOnlyElement(pTransform.getOutputsMap().values()),
 source);
   }
 
+  /** Predicate to determine whether a URN is a Flink native transform. */
+  @AutoService(NativeTransforms.IsNativeTransform.class)
+  public static class IsFlinkNativeTransform implements 
NativeTransforms.IsNativeTransform {
+@Override
+public boolean test(RunnerApi.PTransform pTransform) {
+  return STREAMING_IMPULSE_TRANSFORM_URN.equals(
+  PTransformTranslation.urnForTransformOrNull(pTransform));
+}
+  }
+
+  private void translateStreamingImpulse(
+  String id, RunnerApi.Pipeline pipeline, StreamingTranslationContext 
context) {
+RunnerApi.PTransform pTransform = 
pipeline.getComponents().getTransformsOrThrow(id);
+
+ObjectMapper objectMapper = new ObjectMapper();
+
+int intervalMillis;
+int messageCount;
+try {
+  JsonNode config = 
objectMapper.readTree(pTransform.getSpec().getPayload().toByteArray());
+  intervalMillis = config.path("interval_ms").asInt(100);
+  messageCount = config.path("message_count").asInt(0);
+} catch (IOException e) {
+  throw new RuntimeException("Failed to parse configuration for streaming 
impulse", e);
+}
+
+

[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=155152=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155152
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:10
Start Date: 16/Oct/18 22:10
Worklog Time Spent: 10m 
  Work Description: mwylde commented on issue #6637: [BEAM-5707] Add a 
periodic, streaming impulse source for Flink portable pipelines
URL: https://github.com/apache/beam/pull/6637#issuecomment-430419035
 
 
   Run Python PreCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155152)
Time Spent: 4h 50m  (was: 4h 40m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154920=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154920
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 15:16
Start Date: 16/Oct/18 15:16
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #6637: [BEAM-5707] Add a 
periodic, streaming impulse source for Flink portable pipelines
URL: https://github.com/apache/beam/pull/6637#issuecomment-430278236
 
 
   Checkstyle still needs to be fixed before merging.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154920)
Time Spent: 4.5h  (was: 4h 20m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154883=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154883
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 14:14
Start Date: 16/Oct/18 14:14
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r225557997
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/StreamingImpulseSource.java
 ##
 @@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation.wrappers.streaming.io;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import org.apache.beam.sdk.util.WindowedValue;
+import 
org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A streaming source that periodically produces an empty byte array. This is 
mostly useful for
+ * debugging, or for triggering periodic behavior in a portable pipeline.
+ */
+public class StreamingImpulseSource extends 
RichParallelSourceFunction> {
+  private static final Logger LOG = 
LoggerFactory.getLogger(StreamingImpulseSource.class);
+
+  private final AtomicBoolean cancelled = new AtomicBoolean(false);
+  private long count = 0;
+  private final int intervalMillis;
+  private final int messageCount;
+
+  public StreamingImpulseSource(int intervalMillis, int messageCount) {
+this.intervalMillis = intervalMillis;
+this.messageCount = messageCount;
+  }
+
+  @Override
+  public void run(SourceContext> ctx) {
+// in order to produce messageCount messages across all parallel subtasks, 
we divide by
+// the total number of subtasks
+int subtaskCount = messageCount / 
getRuntimeContext().getNumberOfParallelSubtasks();
 
 Review comment:
   It would be good if we can run this "forever", without upper bound. I think 
that should be the default when the user does not specify a count.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154883)
Time Spent: 4h 20m  (was: 4h 10m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154879=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154879
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 14:03
Start Date: 16/Oct/18 14:03
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r225553384
 
 

 ##
 File path: sdks/python/apache_beam/examples/flink/flink_streaming_impulse.py
 ##
 @@ -0,0 +1,78 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""A streaming workflow that uses a synthetic streaming source.
+
+This can only be used with the Flink portable runner.
 
 Review comment:
   It might be useful for others to have this as a ready to run example for 
profiling?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154879)
Time Spent: 4h 10m  (was: 4h)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154878=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154878
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 14:01
Start Date: 16/Oct/18 14:01
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6637: [BEAM-5707] Add a 
periodic, streaming impulse source for Flink portable pipelines
URL: https://github.com/apache/beam/pull/6637#issuecomment-430249085
 
 
   @mxm from the comments it looks like you are now ok with the changes for 
this round (per review status you still request changes)? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154878)
Time Spent: 4h  (was: 3h 50m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154690=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154690
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 09:50
Start Date: 16/Oct/18 09:50
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r225468978
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/StreamingImpulseSource.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation.wrappers.streaming.io;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import org.apache.beam.sdk.util.WindowedValue;
+import 
org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A streaming source that periodically produces an empty byte array. This is 
mostly useful
+ * for debugging, or for triggering periodic behavior in a portable pipeline.
+ */
+public class StreamingImpulseSource extends 
RichParallelSourceFunction> {
 
 Review comment:
   You are now calculating the total number of messages emitted but you kept 
the source parallel. That's fine. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154690)
Time Spent: 3h 50m  (was: 3h 40m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154566=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154566
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 01:27
Start Date: 16/Oct/18 01:27
Worklog Time Spent: 10m 
  Work Description: mwylde commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r225368350
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/StreamingImpulseSource.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation.wrappers.streaming.io;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import org.apache.beam.sdk.util.WindowedValue;
+import 
org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A streaming source that periodically produces an empty byte array. This is 
mostly useful
+ * for debugging, or for triggering periodic behavior in a portable pipeline.
+ */
+public class StreamingImpulseSource extends 
RichParallelSourceFunction> {
 
 Review comment:
   I've taken Thomas' suggestion to divide the message count by the 
parallelism, so that now the messageCount is the total number produced across 
all tasks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154566)
Time Spent: 3.5h  (was: 3h 20m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154567=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154567
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 01:27
Start Date: 16/Oct/18 01:27
Worklog Time Spent: 10m 
  Work Description: mwylde commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r225368350
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/StreamingImpulseSource.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation.wrappers.streaming.io;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import org.apache.beam.sdk.util.WindowedValue;
+import 
org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A streaming source that periodically produces an empty byte array. This is 
mostly useful
+ * for debugging, or for triggering periodic behavior in a portable pipeline.
+ */
+public class StreamingImpulseSource extends 
RichParallelSourceFunction> {
 
 Review comment:
   I've taken Thomas' suggestion to divide the message count by the 
parallelism, so that now the messageCount is the total number produced across 
all subtasks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154567)
Time Spent: 3h 40m  (was: 3.5h)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154286=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154286
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 15/Oct/18 13:08
Start Date: 15/Oct/18 13:08
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r225158353
 
 

 ##
 File path: sdks/python/apache_beam/examples/flink/flink_streaming_impulse.py
 ##
 @@ -0,0 +1,78 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""A streaming workflow that uses a synthetic streaming source.
+
+This can only be used with the Flink portable runner.
 
 Review comment:
   Not a blocker though, we can do this in a follow-up.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154286)
Time Spent: 2h 40m  (was: 2.5h)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154285=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154285
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 15/Oct/18 13:07
Start Date: 15/Oct/18 13:07
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r225152300
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/StreamingImpulseSource.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation.wrappers.streaming.io;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import org.apache.beam.sdk.util.WindowedValue;
+import 
org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A streaming source that periodically produces an empty byte array. This is 
mostly useful
+ * for debugging, or for triggering periodic behavior in a portable pipeline.
+ */
+public class StreamingImpulseSource extends 
RichParallelSourceFunction> {
 
 Review comment:
   Note that this source does not behave like the `Impulse` transform because 
it emits empty byte arrays for all its parallel instances.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154285)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154283=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154283
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 15/Oct/18 13:07
Start Date: 15/Oct/18 13:07
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r225153744
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/StreamingImpulseSource.java
 ##
 @@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation.wrappers.streaming.io;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+import org.apache.beam.sdk.util.WindowedValue;
+import 
org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A streaming source that periodically produces an empty byte array. This is 
mostly useful
+ * for debugging, or for triggering periodic behavior in a portable pipeline.
+ */
+public class StreamingImpulseSource extends 
RichParallelSourceFunction> {
 
 Review comment:
   If we care about the exact number of messages emitted (`messageCount`), we 
should consider making this a regular `SourceFunction`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154283)
Time Spent: 2.5h  (was: 2h 20m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154284=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154284
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 15/Oct/18 13:07
Start Date: 15/Oct/18 13:07
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r225158157
 
 

 ##
 File path: sdks/python/apache_beam/examples/flink/flink_streaming_impulse.py
 ##
 @@ -0,0 +1,78 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""A streaming workflow that uses a synthetic streaming source.
+
+This can only be used with the Flink portable runner.
 
 Review comment:
   I wonder how meaningful this example is here? It would be nice to convert 
this into a streaming test case as part of the portable runner tests, e.g. add 
tests to `flink_runner_test.py`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154284)
Time Spent: 2.5h  (was: 2h 20m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)