[jira] [Created] (BEAM-10251) TestStream needs to register transform id

2020-06-12 Thread Andrew Crites (Jira)
Andrew Crites created BEAM-10251:


 Summary: TestStream needs to register transform id
 Key: BEAM-10251
 URL: https://issues.apache.org/jira/browse/BEAM-10251
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: Andrew Crites
Assignee: Andrew Crites


For the Dataflow runner, we also need to add the transform id to the TestStream 
step that is created, so that other information about the operation can be 
looked up in the pipeline proto. This needs to be done for both Java and 
Python.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (BEAM-9382) TestStreamTranscriptTest relies on non-deterministic behavior

2020-06-11 Thread Andrew Crites (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Crites closed BEAM-9382.
---
Fix Version/s: 2.21.0
   Resolution: Fixed

> TestStreamTranscriptTest relies on non-deterministic behavior
> -
>
> Key: BEAM-9382
> URL: https://issues.apache.org/jira/browse/BEAM-9382
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Andrew Crites
>Priority: P3
> Fix For: 2.21.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The test discarding_early_fixed uses an early trigger Count(2) and then 
> inserts 3 elements, assuming all 3 will get emitted in the early pane. 
> However, runners do not have to follow this behavior. Instead, they could 
> emit the first two elements seen and then buffer the third until something 
> else comes in. We should change this test to only insert 2 elements so that 
> all runners will behave the same.





[jira] [Closed] (BEAM-8587) Add TestStream support for Dataflow runner

2020-06-11 Thread Andrew Crites (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Crites closed BEAM-8587.
---
Fix Version/s: 2.21.0
   Resolution: Fixed

> Add TestStream support for Dataflow runner
> --
>
> Key: BEAM-8587
> URL: https://issues.apache.org/jira/browse/BEAM-8587
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, testing
>Reporter: Andrew Crites
>Priority: P2
> Fix For: 2.21.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> TestStream support is needed to test features like late data and 
> processing-time triggers on the local Dataflow runner.





[jira] [Closed] (BEAM-8800) Add advance_watermark_to_infinity to end of TestStreams

2020-06-11 Thread Andrew Crites (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Crites closed BEAM-8800.
---
Fix Version/s: 2.21.0
   Resolution: Fixed

> Add advance_watermark_to_infinity to end of TestStreams
> ---
>
> Key: BEAM-8800
> URL: https://issues.apache.org/jira/browse/BEAM-8800
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Andrew Crites
>Priority: P3
> Fix For: 2.21.0
>
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> TestStream should end with advance_watermark_to_infinity so that the 
> pipeline will finish. This worked without it on the direct runner, but it 
> is required for TestStream on the Dataflow runner.





[jira] [Closed] (BEAM-9718) Update documentation for windowed value coder

2020-04-08 Thread Andrew Crites (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Crites closed BEAM-9718.
---

> Update documentation for windowed value coder
> -
>
> Key: BEAM-9718
> URL: https://issues.apache.org/jira/browse/BEAM-9718
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Andrew Crites
>Assignee: Andrew Crites
>Priority: Minor
> Fix For: 2.21.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The comments in beam_runner_api.proto describing WINDOWED_VALUE coder need 
> filling in.





[jira] [Created] (BEAM-9718) Update documentation for windowed value coder

2020-04-07 Thread Andrew Crites (Jira)
Andrew Crites created BEAM-9718:
---

 Summary: Update documentation for windowed value coder
 Key: BEAM-9718
 URL: https://issues.apache.org/jira/browse/BEAM-9718
 Project: Beam
  Issue Type: Improvement
  Components: beam-model
Reporter: Andrew Crites
Assignee: Andrew Crites


The comments in beam_runner_api.proto describing WINDOWED_VALUE coder need 
filling in.





[jira] [Created] (BEAM-9624) Combine operation should support only converting to accumulators

2020-03-27 Thread Andrew Crites (Jira)
Andrew Crites created BEAM-9624:
---

 Summary: Combine operation should support only converting to 
accumulators
 Key: BEAM-9624
 URL: https://issues.apache.org/jira/browse/BEAM-9624
 Project: Beam
  Issue Type: Improvement
  Components: runner-core
Reporter: Andrew Crites
Assignee: Andrew Crites


For streaming pipelines, we want to be able to lift the combiner into the 
MergeBuckets without having to also do a PartialGroupByKey before the shuffle. 
We don't want to do the PGBK since it could cause non-deterministic results 
when used with some triggers.

We propose adding a new URN for doing just the convert to accumulators step and 
adding support for it in Java/Python/Go.
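
A rough illustration of the proposed step, as a plain-Python sketch rather than Beam SDK code. `MeanCombiner` and `convert_to_accumulators` are hypothetical names introduced here, and no URN is implied. Each input value becomes its own single-element accumulator, with no pre-grouping by key before the shuffle:

```python
from typing import Iterable, Iterator, Tuple

Accumulator = Tuple[float, int]  # (running sum, element count)


class MeanCombiner:
    """Hypothetical combiner with the usual four combine phases."""

    def create_accumulator(self) -> Accumulator:
        return (0.0, 0)

    def add_input(self, acc: Accumulator, value: float) -> Accumulator:
        return (acc[0] + value, acc[1] + 1)

    def merge_accumulators(self, accs: Iterable[Accumulator]) -> Accumulator:
        total, count = 0.0, 0
        for acc_sum, acc_count in accs:
            total += acc_sum
            count += acc_count
        return (total, count)

    def extract_output(self, acc: Accumulator) -> float:
        return acc[0] / acc[1] if acc[1] else float("nan")


def convert_to_accumulators(
    combiner: MeanCombiner, keyed_values: Iterable[Tuple[str, float]]
) -> Iterator[Tuple[str, Accumulator]]:
    """Map each (key, value) to (key, accumulator), one element at a time.

    Unlike a partial group-by-key, nothing is buffered or pre-combined here,
    so the output does not depend on how elements happen to be bundled.
    """
    for key, value in keyed_values:
        yield key, combiner.add_input(combiner.create_accumulator(), value)
```

The merge and extract phases can then run after the shuffle exactly as they would on pre-combined accumulators.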





[jira] [Created] (BEAM-9382) TestStreamTranscriptTest relies on non-deterministic behavior

2020-02-25 Thread Andrew Crites (Jira)
Andrew Crites created BEAM-9382:
---

 Summary: TestStreamTranscriptTest relies on non-deterministic 
behavior
 Key: BEAM-9382
 URL: https://issues.apache.org/jira/browse/BEAM-9382
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Andrew Crites
Assignee: Andrew Crites


The test discarding_early_fixed uses an early trigger Count(2) and then inserts 
3 elements, assuming all 3 will get emitted in the early pane. However, runners 
do not have to follow this behavior. Instead, they could emit the first two 
elements seen and then buffer the third until something else comes in. We 
should change this test to only insert 2 elements so that all runners will 
behave the same.
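
The two behaviors can be sketched in plain Python. This is a simplified model of early-pane firing only, not Beam code, and both function names are hypothetical:

```python
from typing import List


def fire_after_bundle(elements: List[str], count: int) -> List[List[str]]:
    """Check the trigger once, after the whole bundle has been buffered."""
    return [list(elements)] if len(elements) >= count else []


def fire_per_element(elements: List[str], count: int) -> List[List[str]]:
    """Check the trigger after every element; leftovers stay buffered."""
    panes: List[List[str]] = []
    buffer: List[str] = []
    for element in elements:
        buffer.append(element)
        if len(buffer) >= count:
            panes.append(buffer)
            buffer = []
    return panes
```

With three elements and a count of 2, the first strategy emits one pane of three elements while the second emits one pane of two and keeps the third buffered. With exactly two elements both strategies emit the same single pane, which is why the smaller input makes the test deterministic.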





[jira] [Commented] (BEAM-9293) Python direct runner doesn't emit empty pane when it should

2020-02-11 Thread Andrew Crites (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17034763#comment-17034763
 ] 

Andrew Crites commented on BEAM-9293:
-

And actually, there should be an empty pane emitted for [0, 15) at time 315 too 
since the default trigger has late=Repeat(Count(1)) so the 'late' element will 
get emitted right away.

> Python direct runner doesn't emit empty pane when it should
> ---
>
> Key: BEAM-9293
> URL: https://issues.apache.org/jira/browse/BEAM-9293
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Andrew Crites
>Priority: Minor
>
> In test_stream_test.py, the test test_gbk_execution_no_triggers has a 
> positive allowed_lateness. So the two windows [15, 30) and [300, 315) that do 
> not have late data should have an empty pane emitted when the window finishes 
> (after allowed lateness), since the default ClosingBehavior in Python is 
> EMIT_ALWAYS (see transforms/core.py).





[jira] [Created] (BEAM-9293) Python direct runner doesn't emit empty pane when it should

2020-02-11 Thread Andrew Crites (Jira)
Andrew Crites created BEAM-9293:
---

 Summary: Python direct runner doesn't emit empty pane when it 
should
 Key: BEAM-9293
 URL: https://issues.apache.org/jira/browse/BEAM-9293
 Project: Beam
  Issue Type: Bug
  Components: runner-direct
Reporter: Andrew Crites


In test_stream_test.py, the test test_gbk_execution_no_triggers has a 
positive allowed_lateness. So the two windows [15, 30) and [300, 315) that do 
not have late data should have an empty pane emitted when the window finishes 
(after allowed lateness), since the default ClosingBehavior in Python is 
EMIT_ALWAYS (see transforms/core.py).
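
The expected behavior can be modeled in a few lines of plain Python. This is a deliberately simplified sketch, not the direct runner's logic: `final_panes` is a hypothetical helper, and `pending` stands for data that is still unemitted when each window expires.

```python
from typing import Dict, List, Tuple

Window = Tuple[int, int]  # half-open interval [start, end)


def final_panes(
    pending: Dict[Window, List[str]], emit_always: bool
) -> Dict[Window, List[str]]:
    """Panes emitted when each window expires (end plus allowed lateness).

    With emit_always, a window with nothing pending still gets an empty
    final pane; otherwise only windows with pending data emit one.
    """
    return {
        window: list(values)
        for window, values in pending.items()
        if emit_always or values
    }
```

Under this model, the two windows without late data each produce an empty final pane when `emit_always` is true and produce nothing when it is false.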





[jira] [Created] (BEAM-9126) TestStreamTest.testDiscarding mode fails for Dataflow since Dataflow TestStream only updates watermarks at 1s resolution.

2020-01-15 Thread Andrew Crites (Jira)
Andrew Crites created BEAM-9126:
---

 Summary: TestStreamTest.testDiscarding mode fails for Dataflow 
since Dataflow TestStream only updates watermarks at 1s resolution.
 Key: BEAM-9126
 URL: https://issues.apache.org/jira/browse/BEAM-9126
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Andrew Crites
Assignee: Andrew Crites


The test advances the watermark to 1001ms, but Dataflow will only report the 
watermark as having advanced to 1000ms. The test only needs to advance to the 
end of the window anyway, so there is no problem with changing it to 1000ms.
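
A small arithmetic sketch of the mismatch, assuming the reported watermark is rounded down to the 1s resolution and the window is [0, 1000) ms. Both are assumptions made for illustration, and the function names are hypothetical:

```python
def reported_watermark_ms(watermark_ms: int, resolution_ms: int = 1000) -> int:
    """Round a watermark down to the reporting resolution."""
    return (watermark_ms // resolution_ms) * resolution_ms


def window_closed(watermark_ms: int, window_end_ms: int) -> bool:
    """A window [start, end) is complete once the watermark reaches its end."""
    return watermark_ms >= window_end_ms
```

Under these assumptions, advancing to 1001ms is reported as 1000ms, so an expectation of 1001ms cannot be met, while advancing to exactly 1000ms is reported unchanged and still closes the window.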





[jira] [Commented] (BEAM-8587) Add TestStream support for Dataflow runner

2019-11-26 Thread Andrew Crites (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982919#comment-16982919
 ] 

Andrew Crites commented on BEAM-8587:
-

I'll add support to Java shortly. No plans on Go any time soon.

> Add TestStream support for Dataflow runner
> --
>
> Key: BEAM-8587
> URL: https://issues.apache.org/jira/browse/BEAM-8587
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, testing
>Reporter: Andrew Crites
>Assignee: Andrew Crites
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> TestStream support is needed to test features like late data and 
> processing-time triggers on the local Dataflow runner.





[jira] [Assigned] (BEAM-8800) Add advance_watermark_to_infinity to end of TestStreams

2019-11-21 Thread Andrew Crites (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Crites reassigned BEAM-8800:
---

Assignee: Andrew Crites

> Add advance_watermark_to_infinity to end of TestStreams
> ---
>
> Key: BEAM-8800
> URL: https://issues.apache.org/jira/browse/BEAM-8800
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Andrew Crites
>Assignee: Andrew Crites
>Priority: Minor
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> TestStream should end with advance_watermark_to_infinity so that the 
> pipeline will finish. This worked without it on the direct runner, but it 
> is required for TestStream on the Dataflow runner.





[jira] [Created] (BEAM-8800) Add advance_watermark_to_infinity to end of TestStreams

2019-11-21 Thread Andrew Crites (Jira)
Andrew Crites created BEAM-8800:
---

 Summary: Add advance_watermark_to_infinity to end of TestStreams
 Key: BEAM-8800
 URL: https://issues.apache.org/jira/browse/BEAM-8800
 Project: Beam
  Issue Type: Improvement
  Components: testing
Reporter: Andrew Crites


TestStream should end with advance_watermark_to_infinity so that the pipeline 
will finish. This worked without it on the direct runner, but it is required 
for TestStream on the Dataflow runner.
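
The idea can be modeled on a plain list of events. This is a sketch only, not the TestStream implementation: the event tuples, `ensure_terminated`, and `pipeline_finishes` are hypothetical, and `math.inf` stands in for the maximum timestamp.

```python
import math
from typing import List, Tuple

Event = Tuple[str, object]  # ("element", value) or ("watermark", timestamp)
INFINITY = math.inf  # stand-in for the maximum timestamp


def ensure_terminated(events: List[Event]) -> List[Event]:
    """Return a copy of events ending with a watermark advance to infinity."""
    if events and events[-1] == ("watermark", INFINITY):
        return list(events)
    return list(events) + [("watermark", INFINITY)]


def pipeline_finishes(events: List[Event]) -> bool:
    """In this model, a pipeline finishes only once the watermark is infinite."""
    return any(kind == "watermark" and ts == INFINITY for kind, ts in events)
```

Appending the final event automatically keeps existing streams unchanged apart from the guaranteed termination, and applying the helper twice has no further effect.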





[jira] [Assigned] (BEAM-8587) Add TestStream support for Dataflow runner

2019-11-08 Thread Andrew Crites (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Crites reassigned BEAM-8587:
---

Assignee: Andrew Crites

> Add TestStream support for Dataflow runner
> --
>
> Key: BEAM-8587
> URL: https://issues.apache.org/jira/browse/BEAM-8587
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, testing
>Reporter: Andrew Crites
>Assignee: Andrew Crites
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> TestStream support is needed to test features like late data and 
> processing-time triggers on the local Dataflow runner.





[jira] [Created] (BEAM-8587) Add TestStream support for Dataflow runner

2019-11-07 Thread Andrew Crites (Jira)
Andrew Crites created BEAM-8587:
---

 Summary: Add TestStream support for Dataflow runner
 Key: BEAM-8587
 URL: https://issues.apache.org/jira/browse/BEAM-8587
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow, testing
Reporter: Andrew Crites


TestStream support is needed to test features like late data and 
processing-time triggers on the local Dataflow runner.





[jira] [Closed] (BEAM-8465) Increase visibility of beam_on_flume_group

2019-10-23 Thread Andrew Crites (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Crites closed BEAM-8465.
---
Fix Version/s: Not applicable
   Resolution: Invalid

Oops. Don't need to change this file here.

> Increase visibility of beam_on_flume_group
> --
>
> Key: BEAM-8465
> URL: https://issues.apache.org/jira/browse/BEAM-8465
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Andrew Crites
>Priority: Trivial
> Fix For: Not applicable
>
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> We need to add //dist_proc/dax/public/io to packages in beam_on_flume_group 
> since that's where TestStream source protos will live.





[jira] [Created] (BEAM-8465) Increase visibility of beam_on_flume_group

2019-10-23 Thread Andrew Crites (Jira)
Andrew Crites created BEAM-8465:
---

 Summary: Increase visibility of beam_on_flume_group
 Key: BEAM-8465
 URL: https://issues.apache.org/jira/browse/BEAM-8465
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: Andrew Crites


We need to add //dist_proc/dax/public/io to packages in beam_on_flume_group 
since that's where TestStream source protos will live.


