[jira] [Resolved] (BEAM-9499) test_multi_triggered_gbk_side_input is failing on head

2020-06-01 Thread Ruoyun Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-9499.

Fix Version/s: 2.21.0
   Resolution: Fixed

> test_multi_triggered_gbk_side_input is failing on head
> --
>
> Key: BEAM-9499
> URL: https://issues.apache.org/jira/browse/BEAM-9499
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ruoyun Huang
>Priority: P2
>  Labels: stale-assigned
> Fix For: 2.21.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> test_multi_triggered_gbk_side_input is failing after it was fixed to run on 
> Dataflow runner. Earlier it was always running on DirectRunner.
>  Example failure: 
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/6004/testReport/junit/apache_beam.transforms.sideinputs_test/SideInputsTest/test_multi_triggered_gbk_side_input_7/]
> Error:
> h3. Error Message
> 'list' object has no attribute 'proto'
> >> begin captured logging <<
> apache_beam.options.pipeline_options: 
> WARNING: --region not set; will default to us-central1. Future releases of 
> Beam will require the user to set --region explicitly, or else have a default 
> set via the gcloud tool.
> {{[https://cloud.google.com/compute/docs/regions-zones]}}
> root: DEBUG: Unhandled type_constraint: Union[]
> root: DEBUG: Unhandled type_constraint: Union[]
> root: DEBUG: Unhandled type_constraint: Union[]
> apache_beam.runners.runner: ERROR: Error while visiting Main windowInto
> >> end captured logging <<
> h3. Stacktrace
>   File "/usr/lib/python3.6/unittest/case.py", line 59, in testPartExecutor
>     yield
>   File "/usr/lib/python3.6/unittest/case.py", line 605, in run
>     testMethod()
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/transforms/sideinputs_test.py", line 406, in test_multi_triggered_gbk_side_input
>     p.run()
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/testing/test_pipeline.py", line 112, in run
>     False if self.not_use_test_runner_api else test_runner_api))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/pipeline.py", line 495, in run
>     self._options).run(False)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/pipeline.py", line 508, in run
>     return self.runner.run_pipeline(self, self._options)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py", line 57, in run_pipeline
>     self).run_pipeline(pipeline, options)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 536, in run_pipeline
>     self.visit_transforms(pipeline, options)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/runner.py", line 224, in visit_transforms
>     pipeline.visit(RunVisitor(self))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/pipeline.py", line 545, in visit
>     self._root_transform().visit(visitor, self, visited)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/pipeline.py", line 1033, in visit
>     part.visit(visitor, pipeline, visited)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/pipeline.py", line 1036, in visit
>     visitor.visit_transform(self)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/runner.py", line 219, in visit_transform
>     self.runner.run_transform(transform_node, options)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/runner.py", line 246, in run_transform
>     return m(transform_node, options)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 957, in run_ParDo
>     PropertyNames.STEP_NAME: input_step.proto.name,
> 'list' object has no attribute 'proto'
> >> begin captured logging <<
> apache_beam.options.pipeline_options: 
> WARNING: --region not set; will default to us-central1. Future releases of 
> Beam will require the user to set --region explicitly, or else have a default 
> set via the gcloud tool.
> 

[jira] [Resolved] (BEAM-8645) TimestampCombiner incorrect in beam python

2020-06-01 Thread Ruoyun Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-8645.

Fix Version/s: 2.21.0
   Resolution: Fixed

> TimestampCombiner incorrect in beam python
> --
>
> Key: BEAM-8645
> URL: https://issues.apache.org/jira/browse/BEAM-8645
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: P2
>  Labels: stale-assigned
> Fix For: 2.21.0
>
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> When we combine TimestampedValue elements: 
> {code:python}
> main_stream = (p
>     | 'main TestStream' >> TestStream()
>         .add_elements([window.TimestampedValue(('k', 100), 0)])
>         .add_elements([window.TimestampedValue(('k', 400), 9)])
>         .advance_watermark_to_infinity()
>     | 'main windowInto' >> beam.WindowInto(
>         window.FixedWindows(10),
>         timestamp_combiner=TimestampCombiner.OUTPUT_AT_LATEST)
>     | 'Combine' >> beam.CombinePerKey(sum))
> {code}
> The expected timestamps are:
> LATEST:        (('k', 500), Timestamp(9)),
> EARLIEST:      (('k', 500), Timestamp(0)),
> END_OF_WINDOW: (('k', 500), Timestamp(10)),
> But the current Python streaming implementation gives the following results: 
> LATEST:        (('k', 500), Timestamp(10)),
> EARLIEST:      (('k', 500), Timestamp(10)),
> END_OF_WINDOW: (('k', 500), Timestamp(9.)),
> More details and discussions:
> https://lists.apache.org/thread.html/d3af1f2f84a2e59a747196039eae77812b78a991f0f293c717e5f4e1@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml

2020-06-01 Thread Ruoyun Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang reassigned BEAM-6446:
--

Assignee: (was: Ruoyun Huang)

> Clean up suppression rules in checkstyle suppressions.xml
> -
>
> Key: BEAM-6446
> URL: https://issues.apache.org/jira/browse/BEAM-6446
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Priority: P3
>  Labels: stale-assigned
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When violations are addressed, clean up suppression rules in checkstyle 
> suppressions.xml





[jira] [Work started] (BEAM-9286) Create validation tests for metrics based on MonitoringInfo if applicable

2020-06-01 Thread Ruoyun Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-9286 started by Ruoyun Huang.
--
> Create validation tests for metrics based on MonitoringInfo if applicable
> -
>
> Key: BEAM-9286
> URL: https://issues.apache.org/jira/browse/BEAM-9286
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: P3
>  Labels: stale-assigned
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Create dedicated validation runner tests for metrics (those based on 
> MonitoringInfo). 
>  
>  





[jira] [Resolved] (BEAM-9286) Create validation tests for metrics based on MonitoringInfo if applicable

2020-06-01 Thread Ruoyun Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-9286.

Fix Version/s: 2.21.0
   Resolution: Fixed

> Create validation tests for metrics based on MonitoringInfo if applicable
> -
>
> Key: BEAM-9286
> URL: https://issues.apache.org/jira/browse/BEAM-9286
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: P3
>  Labels: stale-assigned
> Fix For: 2.21.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Create dedicated validation runner tests for metrics (those based on 
> MonitoringInfo). 
>  
>  





[jira] [Work started] (BEAM-9703) Create py validations runner test for metrics

2020-05-02 Thread Ruoyun Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-9703 started by Ruoyun Huang.
--
> Create py validations runner test for metrics
> -
>
> Key: BEAM-9703
> URL: https://issues.apache.org/jira/browse/BEAM-9703
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Some of the metrics are not covered by a dedicated validation runner test. 
> Would like to create these if needed. 





[jira] [Resolved] (BEAM-9703) Create py validations runner test for metrics

2020-05-02 Thread Ruoyun Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-9703.

Fix Version/s: 3.0.0
   Resolution: Fixed

> Create py validations runner test for metrics
> -
>
> Key: BEAM-9703
> URL: https://issues.apache.org/jira/browse/BEAM-9703
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: 3.0.0
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Some of the metrics are not covered by a dedicated validation runner test. 
> Would like to create these if needed. 





[jira] [Assigned] (BEAM-9703) Create py validations runner test for metrics

2020-04-05 Thread Ruoyun Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang reassigned BEAM-9703:
--

Assignee: Ruoyun Huang

> Create py validations runner test for metrics
> -
>
> Key: BEAM-9703
> URL: https://issues.apache.org/jira/browse/BEAM-9703
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>
> Some of the metrics are not covered by a dedicated validation runner test. 
> Would like to create these if needed. 





[jira] [Created] (BEAM-9703) Create py validations runner test for metrics

2020-04-05 Thread Ruoyun Huang (Jira)
Ruoyun Huang created BEAM-9703:
--

 Summary: Create py validations runner test for metrics
 Key: BEAM-9703
 URL: https://issues.apache.org/jira/browse/BEAM-9703
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Ruoyun Huang


Some of the metrics are not covered by a dedicated validation runner test. 
Would like to create these if needed. 





[jira] [Created] (BEAM-9488) Python SDK sending unexpected MonitoringInfo

2020-03-11 Thread Ruoyun Huang (Jira)
Ruoyun Huang created BEAM-9488:
--

 Summary: Python SDK sending unexpected MonitoringInfo
 Key: BEAM-9488
 URL: https://issues.apache.org/jira/browse/BEAM-9488
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Ruoyun Huang


The element_count metric is supposed to be tied to PCollection ids, but 
inspecting what the Python SDK sends shows MonitoringInfo entries that are 
labeled with PTransforms instead. 

[Double-checked the job graph; these seem to be redundant, i.e. the 
corresponding PCollection does have its own MonitoringInfo reported.]

This is likely a bug. 

Proof: 

urn: "beam:metric:element_count:v1"
type: "beam:metrics:sum_int_64"
metric {
  counter_data {
    int64_value: 1
  }
}
labels {
  key: "PTRANSFORM"
  value: "start/MaybeReshuffle/Reshuffle/RemoveRandomKeys-ptransform-85"
}
labels {
  key: "TAG"
  value: "None"
}
timestamp {
  seconds: 1583949073
  nanos: 842402935
}
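
The check described in this report can be sketched in a few lines. This is only an illustration that uses plain dictionaries in place of the MonitoringInfo proto; the `PCOLLECTION` label key, the `pcollection-1` id, and the helper name `unexpected_element_counts` are assumptions, while the urn and the `PTRANSFORM` and `TAG` labels are taken from the dump above.

```python
ELEMENT_COUNT_URN = "beam:metric:element_count:v1"


def unexpected_element_counts(monitoring_infos):
    """Return element_count entries that are not keyed by a PCollection.

    Each entry is a plain dict with "urn" and "labels" keys, standing in
    for a MonitoringInfo proto.
    """
    return [
        info for info in monitoring_infos
        if info["urn"] == ELEMENT_COUNT_URN
        and "PCOLLECTION" not in info["labels"]  # assumed label key
    ]


reported = [
    {   # shaped like the entry shown in this report
        "urn": ELEMENT_COUNT_URN,
        "labels": {
            "PTRANSFORM":
                "start/MaybeReshuffle/Reshuffle/RemoveRandomKeys-ptransform-85",
            "TAG": "None",
        },
    },
    {   # hypothetical well-formed entry keyed by a PCollection id
        "urn": ELEMENT_COUNT_URN,
        "labels": {"PCOLLECTION": "pcollection-1"},
    },
]

for info in unexpected_element_counts(reported):
    print("unexpected labels:", sorted(info["labels"]))
```

A filter like this could be run over the MonitoringInfo entries captured from a bundle to list the redundant PTransform-labeled counters.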





[jira] [Created] (BEAM-9286) Create validation tests for metrics based on MonitoringInfo if applicable

2020-02-10 Thread Ruoyun Huang (Jira)
Ruoyun Huang created BEAM-9286:
--

 Summary: Create validation tests for metrics based on 
MonitoringInfo if applicable
 Key: BEAM-9286
 URL: https://issues.apache.org/jira/browse/BEAM-9286
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-harness
Reporter: Ruoyun Huang
Assignee: Ruoyun Huang


Create dedicated validation runner tests for metrics (those based on 
MonitoringInfo). 

 

 





[jira] [Updated] (BEAM-9286) Create validation tests for metrics based on MonitoringInfo if applicable

2020-02-10 Thread Ruoyun Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang updated BEAM-9286:
---
Status: Open  (was: Triage Needed)

> Create validation tests for metrics based on MonitoringInfo if applicable
> -
>
> Key: BEAM-9286
> URL: https://issues.apache.org/jira/browse/BEAM-9286
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>
> Create dedicated validation runner tests for metrics (those based on 
> MonitoringInfo). 
>  
>  





[jira] [Commented] (BEAM-8645) TimestampCombiner incorrect in beam python

2019-12-09 Thread Ruoyun Huang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991829#comment-16991829
 ] 

Ruoyun Huang commented on BEAM-8645:


The issue was supposed to be fixed by Robert. There have been several Jira 
issues and PRs tracking it (one of them being PR#10035). 

 

Do we have more information on what is happening and why it was reopened? 
Thanks. 

> TimestampCombiner incorrect in beam python
> --
>
> Key: BEAM-8645
> URL: https://issues.apache.org/jira/browse/BEAM-8645
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> When we combine TimestampedValue elements: 
> {code:python}
> main_stream = (p
>     | 'main TestStream' >> TestStream()
>         .add_elements([window.TimestampedValue(('k', 100), 0)])
>         .add_elements([window.TimestampedValue(('k', 400), 9)])
>         .advance_watermark_to_infinity()
>     | 'main windowInto' >> beam.WindowInto(
>         window.FixedWindows(10),
>         timestamp_combiner=TimestampCombiner.OUTPUT_AT_LATEST)
>     | 'Combine' >> beam.CombinePerKey(sum))
> {code}
> The expected timestamps are:
> LATEST:        (('k', 500), Timestamp(9)),
> EARLIEST:      (('k', 500), Timestamp(0)),
> END_OF_WINDOW: (('k', 500), Timestamp(10)),
> But the current Python streaming implementation gives the following results: 
> LATEST:        (('k', 500), Timestamp(10)),
> EARLIEST:      (('k', 500), Timestamp(10)),
> END_OF_WINDOW: (('k', 500), Timestamp(9.)),
> More details and discussions:
> https://lists.apache.org/thread.html/d3af1f2f84a2e59a747196039eae77812b78a991f0f293c717e5f4e1@%3Cdev.beam.apache.org%3E





[jira] [Commented] (BEAM-8645) TimestampCombiner incorrect in beam python

2019-11-13 Thread Ruoyun Huang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973828#comment-16973828
 ] 

Ruoyun Huang commented on BEAM-8645:


Duplicate of BEAM-8657.

> TimestampCombiner incorrect in beam python
> --
>
> Key: BEAM-8645
> URL: https://issues.apache.org/jira/browse/BEAM-8645
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When we combine TimestampedValue elements: 
> {code:python}
> main_stream = (p
>     | 'main TestStream' >> TestStream()
>         .add_elements([window.TimestampedValue(('k', 100), 0)])
>         .add_elements([window.TimestampedValue(('k', 400), 9)])
>         .advance_watermark_to_infinity()
>     | 'main windowInto' >> beam.WindowInto(
>         window.FixedWindows(10),
>         timestamp_combiner=TimestampCombiner.OUTPUT_AT_LATEST)
>     | 'Combine' >> beam.CombinePerKey(sum))
> {code}
> The expected timestamps are:
> LATEST:        (('k', 500), Timestamp(9)),
> EARLIEST:      (('k', 500), Timestamp(0)),
> END_OF_WINDOW: (('k', 500), Timestamp(10)),
> But the current Python streaming implementation gives the following results: 
> LATEST:        (('k', 500), Timestamp(10)),
> EARLIEST:      (('k', 500), Timestamp(10)),
> END_OF_WINDOW: (('k', 500), Timestamp(9.)),
> More details and discussions:
> https://lists.apache.org/thread.html/d3af1f2f84a2e59a747196039eae77812b78a991f0f293c717e5f4e1@%3Cdev.beam.apache.org%3E





[jira] [Commented] (BEAM-8645) TimestampCombiner incorrect in beam python

2019-11-12 Thread Ruoyun Huang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972921#comment-16972921
 ] 

Ruoyun Huang commented on BEAM-8645:


Created PR to show how this happens:  
[https://github.com/apache/beam/pull/10081] 

> TimestampCombiner incorrect in beam python
> --
>
> Key: BEAM-8645
> URL: https://issues.apache.org/jira/browse/BEAM-8645
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When we combine TimestampedValue elements: 
> {code:python}
> main_stream = (p
>     | 'main TestStream' >> TestStream()
>         .add_elements([window.TimestampedValue(('k', 100), 0)])
>         .add_elements([window.TimestampedValue(('k', 400), 9)])
>         .advance_watermark_to_infinity()
>     | 'main windowInto' >> beam.WindowInto(
>         window.FixedWindows(10),
>         timestamp_combiner=TimestampCombiner.OUTPUT_AT_LATEST)
>     | 'Combine' >> beam.CombinePerKey(sum))
> {code}
> The expected timestamps are:
> LATEST:        (('k', 500), Timestamp(9)),
> EARLIEST:      (('k', 500), Timestamp(0)),
> END_OF_WINDOW: (('k', 500), Timestamp(10)),
> But the current Python streaming implementation gives the following results: 
> LATEST:        (('k', 500), Timestamp(10)),
> EARLIEST:      (('k', 500), Timestamp(10)),
> END_OF_WINDOW: (('k', 500), Timestamp(9.)),
> More details and discussions:
> https://lists.apache.org/thread.html/d3af1f2f84a2e59a747196039eae77812b78a991f0f293c717e5f4e1@%3Cdev.beam.apache.org%3E





[jira] [Created] (BEAM-8645) TimestampCombiner incorrect in beam python

2019-11-12 Thread Ruoyun Huang (Jira)
Ruoyun Huang created BEAM-8645:
--

 Summary: TimestampCombiner incorrect in beam python
 Key: BEAM-8645
 URL: https://issues.apache.org/jira/browse/BEAM-8645
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Ruoyun Huang


When we combine TimestampedValue elements: 
{code:python}
main_stream = (p
    | 'main TestStream' >> TestStream()
        .add_elements([window.TimestampedValue(('k', 100), 0)])
        .add_elements([window.TimestampedValue(('k', 400), 9)])
        .advance_watermark_to_infinity()
    | 'main windowInto' >> beam.WindowInto(
        window.FixedWindows(10),
        timestamp_combiner=TimestampCombiner.OUTPUT_AT_LATEST)
    | 'Combine' >> beam.CombinePerKey(sum))
{code}

The expected timestamps are:
LATEST:        (('k', 500), Timestamp(9)),
EARLIEST:      (('k', 500), Timestamp(0)),
END_OF_WINDOW: (('k', 500), Timestamp(10)),


But the current Python streaming implementation gives the following results: 
LATEST:        (('k', 500), Timestamp(10)),
EARLIEST:      (('k', 500), Timestamp(10)),
END_OF_WINDOW: (('k', 500), Timestamp(9.)),


More details and discussions:

https://lists.apache.org/thread.html/d3af1f2f84a2e59a747196039eae77812b78a991f0f293c717e5f4e1@%3Cdev.beam.apache.org%3E
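
The expected values listed in this report can be reproduced with a small stand-alone sketch. It does not use Beam; `Combiner`, `fixed_window` and `expected_output_timestamp` are hypothetical names introduced here for illustration. The END_OF_WINDOW case follows this issue's stated expectation of reporting the window end (10); a real runner may instead report the window's maximum timestamp, just before that boundary.

```python
from enum import Enum


class Combiner(Enum):
    """Hypothetical stand-in for the TimestampCombiner options in the report."""
    LATEST = "latest"
    EARLIEST = "earliest"
    END_OF_WINDOW = "end_of_window"


def fixed_window(timestamp, size):
    """Return (start, end) of the fixed window of `size` containing `timestamp`."""
    start = timestamp - (timestamp % size)
    return start, start + size


def expected_output_timestamp(element_timestamps, size, combiner):
    """Expected timestamp of the combined output for elements of one window."""
    windows = {fixed_window(t, size) for t in element_timestamps}
    if len(windows) != 1:
        raise ValueError("all elements must fall into the same fixed window")
    _, end = next(iter(windows))
    if combiner is Combiner.LATEST:
        return max(element_timestamps)
    if combiner is Combiner.EARLIEST:
        return min(element_timestamps)
    # Assumption: the issue lists the window end itself as the expected value.
    return end


timestamps = [0, 9]  # timestamps of the two TimestampedValue elements above
for combiner in Combiner:
    print(combiner.name, expected_output_timestamp(timestamps, 10, combiner))
```

Comparing these values with the actual pipeline output for each combiner makes the mismatch described in the report easy to spot.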





[jira] [Created] (BEAM-7331) Missing util function for late pane in java PAssert

2019-05-15 Thread Ruoyun Huang (JIRA)
Ruoyun Huang created BEAM-7331:
--

 Summary: Missing util function for late pane in java PAssert 
 Key: BEAM-7331
 URL: https://issues.apache.org/jira/browse/BEAM-7331
 Project: Beam
  Issue Type: Improvement
  Components: testing
Reporter: Ruoyun Huang
Assignee: Ruoyun Huang


Coming from a user's question: 
[https://stackoverflow.com/questions/56132551/apache-beam-teststream-finalpane-not-firing-as-expected]

There are util functions for all types of Panes, except for LatePane.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml

2019-03-20 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-6446 started by Ruoyun Huang.
--
> Clean up suppression rules in checkstyle suppressions.xml
> -
>
> Key: BEAM-6446
> URL: https://issues.apache.org/jira/browse/BEAM-6446
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>  Labels: triaged
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When violations are addressed, clean up suppression rules in checkstyle 
> suppressions.xml





[jira] [Resolved] (BEAM-6403) Improve checkstyle rules on javadoc comments

2019-03-18 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-6403.

   Resolution: Fixed
Fix Version/s: 2.11.0

> Improve checkstyle rules on javadoc comments
> 
>
> Key: BEAM-6403
> URL: https://issues.apache.org/jira/browse/BEAM-6403
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>  Labels: triaged
> Fix For: 2.11.0
>
>
> Make checkstyle check comments on non-trivial public methods. 
>  
> discussions:  
> https://lists.apache.org/thread.html/819a68f69940e60cb820370df90ce15cecd289493b28149e1df1719e@%3Cdev.beam.apache.org%3E
>  





[jira] [Resolved] (BEAM-2917) ULR support for portable user state

2019-03-18 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-2917.

   Resolution: Won't Fix
Fix Version/s: 2.12.0

No new features are being added to the Java ULR.

> ULR support for portable user state
> ---
>
> Key: BEAM-2917
> URL: https://issues.apache.org/jira/browse/BEAM-2917
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core, runner-direct
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Minor
>  Labels: portability, triaged
> Fix For: 2.12.0
>
>






[jira] [Work stopped] (BEAM-2928) ULR support for portable side input

2019-03-18 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-2928 stopped by Ruoyun Huang.
--
> ULR support for portable side input
> ---
>
> Key: BEAM-2928
> URL: https://issues.apache.org/jira/browse/BEAM-2928
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, runner-direct
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Labels: portability, triaged
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Get side inputs working on the ULR. Since the ULR code is based on the direct 
> runner code there should already be some code that could be reused, but new 
> code will need to be written where side inputs would interact with 
> portability.





[jira] [Resolved] (BEAM-2928) ULR support for portable side input

2019-03-18 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-2928.

   Resolution: Won't Fix
Fix Version/s: 2.12.0

The community consensus was to stop implementing new features for ULR. 

> ULR support for portable side input
> ---
>
> Key: BEAM-2928
> URL: https://issues.apache.org/jira/browse/BEAM-2928
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, runner-direct
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Labels: portability, triaged
> Fix For: 2.12.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Get side inputs working on the ULR. Since the ULR code is based on the direct 
> runner code there should already be some code that could be reused, but new 
> code will need to be written where side inputs would interact with 
> portability.





[jira] [Resolved] (BEAM-6504) Integration of Portability sideInput into Dataflow

2019-03-15 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-6504.

   Resolution: Implemented
Fix Version/s: 2.12.0

> Integration of Portability sideInput into Dataflow
> 
>
> Key: BEAM-6504
> URL: https://issues.apache.org/jira/browse/BEAM-6504
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Major
>  Labels: triaged
> Fix For: 2.12.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> The underlying Fn API support was done in BEAM-2929; this Jira integrates 
> everything into Dataflow. 
>  
> 1) Introduce a sideInputHandler for Dataflow. 
> 2) Wire the handler into the Dataflow runner (i.e. ProcessRemoteBundleOperation).





[jira] [Work started] (BEAM-6504) Integration of Portability sideInput into Dataflow

2019-03-15 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-6504 started by Ruoyun Huang.
--
> Integration of Portability sideInput into Dataflow
> 
>
> Key: BEAM-6504
> URL: https://issues.apache.org/jira/browse/BEAM-6504
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Major
>  Labels: triaged
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> The underlying Fn API support was done in BEAM-2929; this Jira integrates 
> everything into Dataflow. 
>  
> 1) Introduce a sideInputHandler for Dataflow. 
> 2) Wire the handler into the Dataflow runner (i.e. ProcessRemoteBundleOperation).





[jira] [Commented] (BEAM-6530) Strange character on website (contact page)

2019-02-19 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16772419#comment-16772419
 ] 

Ruoyun Huang commented on BEAM-6530:


+ [~melap] to check if this is expected? 

 

 

> Strange character on website (contact page)
> ---
>
> Key: BEAM-6530
> URL: https://issues.apache.org/jira/browse/BEAM-6530
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Ruoyun Huang
>Priority: Minor
> Attachments: Screen Shot 2019-01-28 at 6.22.11 PM.png
>
>
> See the attached screenshot. 
>  
> Looks like an HTML error somewhere. Looking at the code, though, I don't see 
> any strange redundant characters: 
> [https://github.com/apache/beam/blob/master/website/src/community/contact-us.md]
>  
> Someone who knows more about how the web pages are organized might want to 
> take a look. 





[jira] [Updated] (BEAM-6530) Strange character on website (contact page)

2019-01-28 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang updated BEAM-6530:
---
Description: 
See the attached screenshot. 

 

Looks like an HTML error somewhere. Looking at the code, though, I don't see 
any strange redundant characters: 
[https://github.com/apache/beam/blob/master/website/src/community/contact-us.md]

 

Someone who knows more about how the web pages are organized might want to 
take a look. 

  was:
See screenshot: 
[https://drive.google.com/open?id=1hCUDzc4hpTzJjR0ydoH1aqWyB35PJVu_]

 

Looks like an HTML error somewhere. Looking at the code, though, I don't see 
any strange redundant characters: 
[https://github.com/apache/beam/blob/master/website/src/community/contact-us.md]

 

Someone who knows more about how the web pages are organized might want to 
take a look. 


> Strange character on website (contact page)
> ---
>
> Key: BEAM-6530
> URL: https://issues.apache.org/jira/browse/BEAM-6530
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Ruoyun Huang
>Assignee: Melissa Pashniak
>Priority: Minor
> Attachments: Screen Shot 2019-01-28 at 6.22.11 PM.png
>
>
> See the attached screenshot. 
>  
> Looks like an HTML error somewhere. Looking at the code, though, I don't see 
> any strange redundant characters: 
> [https://github.com/apache/beam/blob/master/website/src/community/contact-us.md]
>  
> Someone who knows more about how the web pages are organized might want to 
> take a look. 





[jira] [Updated] (BEAM-6530) Strange character on website (contact page)

2019-01-28 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang updated BEAM-6530:
---
Attachment: Screen Shot 2019-01-28 at 6.22.11 PM.png

> Strange character on website (contact page)
> ---
>
> Key: BEAM-6530
> URL: https://issues.apache.org/jira/browse/BEAM-6530
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Ruoyun Huang
>Assignee: Melissa Pashniak
>Priority: Minor
> Attachments: Screen Shot 2019-01-28 at 6.22.11 PM.png
>
>
> See screenshot: 
> [https://drive.google.com/open?id=1hCUDzc4hpTzJjR0ydoH1aqWyB35PJVu_]
>  
> Looks like an HTML error somewhere. Looking at the code, though, I don't see 
> any strange redundant characters: 
> [https://github.com/apache/beam/blob/master/website/src/community/contact-us.md]
>  
> Someone who knows more about how the web pages are organized might want to 
> take a look. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-6530) Strange character on website (contact page)

2019-01-28 Thread Ruoyun Huang (JIRA)
Ruoyun Huang created BEAM-6530:
--

 Summary: Strange character on website (contact page)
 Key: BEAM-6530
 URL: https://issues.apache.org/jira/browse/BEAM-6530
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Ruoyun Huang
Assignee: Melissa Pashniak


See screenshot: 
[https://drive.google.com/open?id=1hCUDzc4hpTzJjR0ydoH1aqWyB35PJVu_]

 

Looks like an HTML error somewhere. Looking at the code, though, I don't see 
any strange redundant characters: 
[https://github.com/apache/beam/blob/master/website/src/community/contact-us.md]

 

Someone who knows more about how the web pages are organized might want to 
take a look. 





[jira] [Created] (BEAM-6504) Integration of Portability sideInput into Dataflow

2019-01-24 Thread Ruoyun Huang (JIRA)
Ruoyun Huang created BEAM-6504:
--

 Summary: Integration of Portability sideInput into Dataflow
 Key: BEAM-6504
 URL: https://issues.apache.org/jira/browse/BEAM-6504
 Project: Beam
  Issue Type: New Feature
  Components: runner-dataflow
Reporter: Ruoyun Huang
Assignee: Ruoyun Huang


The underlying Fn API support was done in BEAM-2929; this Jira integrates 
everything into Dataflow. 

 

1) Introduce a sideInputHandler for Dataflow. 

2) Wire the handler into the Dataflow runner (i.e. ProcessRemoteBundleOperation).





[jira] [Commented] (BEAM-6354) Hanging BoundedReadFromUnboundedSourceTest#testTimeBound and SplittableDoFnTest#testLateData

2019-01-18 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746906#comment-16746906
 ] 

Ruoyun Huang commented on BEAM-6354:


If I understand correctly, my input is bounded, so this may not be relevant. My 
code looks like this: 

final PCollectionView<Integer> view = pipeline.apply("Create47", 
Create.of(47)).apply(View.asSingleton());

 

I traced the code path based on your suggestion. I am 90% sure that the output 
is empty because the trigger never fired (the evidence being that onTrigger() 
is never executed).

> Hanging BoundedReadFromUnboundedSourceTest#testTimeBound and 
> SplittableDoFnTest#testLateData
> 
>
> Key: BEAM-6354
> URL: https://issues.apache.org/jira/browse/BEAM-6354
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Gleb Kanterov
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: 2.10.0
>
>
> It seems that they have a similar root cause because both of them use 
> unbounded streams.





[jira] [Commented] (BEAM-6339) In certain cases, spotlessJava fails to work

2019-01-16 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744360#comment-16744360
 ] 

Ruoyun Huang commented on BEAM-6339:


I didn't keep my change. :(

 

As I remember, all I did was remove existing comments (in a few randomly 
picked files), nothing too creative. I just tried again but can no longer 
reproduce it. I suspect it only gets triggered when the violation is in 
certain files.  

 

I will keep an eye on this and bring it to broader attention if I see the build 
failure again. 

 

 

> In certain cases, spotlessJava fails to work
> 
>
> Key: BEAM-6339
> URL: https://issues.apache.org/jira/browse/BEAM-6339
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Trivial
> Fix For: Not applicable
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The following error occurs when new code meets certain criteria: 
>  
> > Task :beam-runners-google-cloud-dataflow-java:spotlessJava FAILED
>  
> FAILURE: Build failed with an exception.
>  * What went wrong:
> Execution failed for task 
> ':beam-runners-google-cloud-dataflow-java:spotlessJava'.
> > You have a misbehaving rule which can't make up its mind.
>  This means that spotlessCheck will fail even after spotlessApply has run.
>  
>  This is a bug in a formatting rule, not Spotless itself, but Spotless can
>  work around this bug and generate helpful bug reports for the broken rule
>  if you add 'paddedCell()' to your build.gradle as such: 
>  
>  spotless {
>      format 'someFormat', {
>          ...
>          paddedCell()
>      }
>  }
>  
>  The next time you run spotlessCheck, it will put helpful bug reports into
>  'runners/google-cloud-dataflow-java/build/spotless-diagnose-java', and 
> spotlessApply
>  and spotlessCheck will be self-consistent from here on out.
>  
>  For details see 
> [https://github.com/diffplug/spotless/blob/master/PADDEDCELL.md]





[jira] [Created] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml

2019-01-15 Thread Ruoyun Huang (JIRA)
Ruoyun Huang created BEAM-6446:
--

 Summary: Clean up suppression rules in checkstyle suppressions.xml
 Key: BEAM-6446
 URL: https://issues.apache.org/jira/browse/BEAM-6446
 Project: Beam
  Issue Type: Improvement
  Components: build-system
Reporter: Ruoyun Huang
Assignee: Ruoyun Huang


Once the violations are addressed, clean up the suppression rules in 
checkstyle's suppressions.xml.





[jira] [Work started] (BEAM-6403) Improve checkstyle rules on javadoc comments

2019-01-11 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-6403 started by Ruoyun Huang.
--
> Improve checkstyle rules on javadoc comments
> 
>
> Key: BEAM-6403
> URL: https://issues.apache.org/jira/browse/BEAM-6403
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>
> Make checkstyle check comments on non-trivial public methods. 
>  
> discussions:  
> https://lists.apache.org/thread.html/819a68f69940e60cb820370df90ce15cecd289493b28149e1df1719e@%3Cdev.beam.apache.org%3E
>  





[jira] [Closed] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-09 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang closed BEAM-6184.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Closed] (BEAM-6339) In certain cases, spotlessJava fails to work

2019-01-09 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang closed BEAM-6339.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> In certain cases, spotlessJava fails to work
> 
>
> Key: BEAM-6339
> URL: https://issues.apache.org/jira/browse/BEAM-6339
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Trivial
> Fix For: Not applicable
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The following error occurs when new code meets certain criteria: 
>  
> > Task :beam-runners-google-cloud-dataflow-java:spotlessJava FAILED
>  
> FAILURE: Build failed with an exception.
>  * What went wrong:
> Execution failed for task 
> ':beam-runners-google-cloud-dataflow-java:spotlessJava'.
> > You have a misbehaving rule which can't make up its mind.
>  This means that spotlessCheck will fail even after spotlessApply has run.
>  
>  This is a bug in a formatting rule, not Spotless itself, but Spotless can
>  work around this bug and generate helpful bug reports for the broken rule
>  if you add 'paddedCell()' to your build.gradle as such: 
>  
>  spotless {
>      format 'someFormat', {
>          ...
>          paddedCell()
>      }
>  }
>  
>  The next time you run spotlessCheck, it will put helpful bug reports into
>  'runners/google-cloud-dataflow-java/build/spotless-diagnose-java', and 
> spotlessApply
>  and spotlessCheck will be self-consistent from here on out.
>  
>  For details see 
> [https://github.com/diffplug/spotless/blob/master/PADDEDCELL.md]





[jira] [Created] (BEAM-6339) In certain cases, spotlessJava fails to work

2019-01-02 Thread Ruoyun Huang (JIRA)
Ruoyun Huang created BEAM-6339:
--

 Summary: In certain cases, spotlessJava fails to work
 Key: BEAM-6339
 URL: https://issues.apache.org/jira/browse/BEAM-6339
 Project: Beam
  Issue Type: Improvement
  Components: build-system
Reporter: Ruoyun Huang
Assignee: Ruoyun Huang


The following error occurs when new code meets certain criteria: 

 

> Task :beam-runners-google-cloud-dataflow-java:spotlessJava FAILED

 

FAILURE: Build failed with an exception.
 * What went wrong:
Execution failed for task 
':beam-runners-google-cloud-dataflow-java:spotlessJava'.
> You have a misbehaving rule which can't make up its mind.
 This means that spotlessCheck will fail even after spotlessApply has run.
 
 This is a bug in a formatting rule, not Spotless itself, but Spotless can
 work around this bug and generate helpful bug reports for the broken rule
 if you add 'paddedCell()' to your build.gradle as such: 
 
 spotless {
     format 'someFormat', {
         ...
         paddedCell()
     }
 }
 
 The next time you run spotlessCheck, it will put helpful bug reports into
 'runners/google-cloud-dataflow-java/build/spotless-diagnose-java', and 
spotlessApply
 and spotlessCheck will be self-consistent from here on out.
 
 For details see 
[https://github.com/diffplug/spotless/blob/master/PADDEDCELL.md]





[jira] [Resolved] (BEAM-5448) Support running user pipelines with the Java Reference Runner in Python.

2018-12-13 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-5448.

   Resolution: Fixed
Fix Version/s: 2.7.1

> Support running user pipelines with the Java Reference Runner in Python.
> 
>
> Key: BEAM-5448
> URL: https://issues.apache.org/jira/browse/BEAM-5448
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-direct
>Reporter: Daniel Oliveira
>Assignee: Ruoyun Huang
>Priority: Major
> Fix For: 2.7.1
>
>
> In order to aid testing, devs should be able to write pipelines and then 
> easily run them with the ULR. This task is for creating the build rules 
> needed to accomplish this for pipelines using the Python SDK.





[jira] [Work started] (BEAM-2928) ULR support for portable side input

2018-12-10 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-2928 started by Ruoyun Huang.
--
> ULR support for portable side input
> ---
>
> Key: BEAM-2928
> URL: https://issues.apache.org/jira/browse/BEAM-2928
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, runner-direct
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Labels: portability
>
> Get side inputs working on the ULR. Since the ULR code is based on the direct 
> runner code there should already be some code that could be reused, but new 
> code will need to be written where side inputs would interact with 
> portability.





[jira] [Created] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2018-12-05 Thread Ruoyun Huang (JIRA)
Ruoyun Huang created BEAM-6184:
--

 Summary: PortableRunner dependency missed in wordcount example 
maven artifact
 Key: BEAM-6184
 URL: https://issues.apache.org/jira/browse/BEAM-6184
 Project: Beam
  Issue Type: Improvement
  Components: build-system
Reporter: Ruoyun Huang
Assignee: Ruoyun Huang


 

 

more context: 
https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Resolved] (BEAM-5430) Adjust implementation of CombineGroupedValues runner to use CombineFn.apply

2018-11-20 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-5430.

   Resolution: Fixed
Fix Version/s: Not applicable

> Adjust implementation of CombineGroupedValues runner to use CombineFn.apply
> ---
>
> Key: BEAM-5430
> URL: https://issues.apache.org/jira/browse/BEAM-5430
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Reporter: Daniel Oliveira
>Assignee: Ruoyun Huang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The [implementation for the runner of 
> Combine.GroupedValues|https://github.com/apache/beam/blob/bdd0081b49f8e7df6733dc8e8bc90dda3efc6621/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/CombineRunners.java#L217]
>  in the Java SDK essentially re-implements what is already implemented in 
> [CombineFn.apply|https://github.com/apache/beam/blob/bdd0081b49f8e7df6733dc8e8bc90dda3efc6621/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Combine.java#L348].
> The implementation should instead just call CombineFn.apply for simplicity 
> and code reuse.
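
The refactoring described above can be illustrated with a small sketch. This is 
plain Python rather than the Java harness code, and it does not use Beam at all: 
SumCombineFn and combine_grouped_values are hypothetical names chosen for the 
example. The point is only that the runner step delegates to apply() instead of 
repeating the accumulator loop itself.

```python
class SumCombineFn:
    """Hypothetical combine function with the usual accumulator lifecycle."""

    def create_accumulator(self):
        return 0

    def add_input(self, accumulator, element):
        return accumulator + element

    def extract_output(self, accumulator):
        return accumulator

    def apply(self, elements):
        """Runs the whole lifecycle over an iterable of input values."""
        accumulator = self.create_accumulator()
        for element in elements:
            accumulator = self.add_input(accumulator, element)
        return self.extract_output(accumulator)


def combine_grouped_values(combine_fn, keyed_groups):
    """Hypothetical runner step: one apply() call per (key, values) pair.

    The accumulator loop lives only in apply(), so it is not duplicated here.
    """
    return [(key, combine_fn.apply(values)) for key, values in keyed_groups]
```

With this shape, any change to the combining logic needs to be made in one 
place only, which is the code-reuse argument made in the issue.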





[jira] [Work started] (BEAM-5448) Support running user pipelines with the Java Reference Runner in Python.

2018-11-20 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-5448 started by Ruoyun Huang.
--
> Support running user pipelines with the Java Reference Runner in Python.
> 
>
> Key: BEAM-5448
> URL: https://issues.apache.org/jira/browse/BEAM-5448
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-direct
>Reporter: Daniel Oliveira
>Assignee: Ruoyun Huang
>Priority: Major
>
> In order to aid testing, devs should be able to write pipelines and then 
> easily run them with the ULR. This task is for creating the build rules 
> needed to accomplish this for pipelines using the Python SDK.





[jira] [Comment Edited] (BEAM-5448) Support running user pipelines with the Java Reference Runner in Python.

2018-11-20 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693884#comment-16693884
 ] 

Ruoyun Huang edited comment on BEAM-5448 at 11/20/18 11:12 PM:
---

Work done so far:

 # Updated the cwiki page on how to run python-portability-on-java-ULR (link: 
https://cwiki.apache.org/confluence/display/BEAM/Usage+Guide)
 # Side-by-side comparison with Java-SDK-on-ULR's behavior (link: 
https://bit.ly/2qui6Ot)
 # [Optional] For validation purposes, shall we check in a variant of the 
Python wordCount that does not use TextIO? 

A caveat remains: because of how the ULR is implemented, it currently supports 
only pipelines without IO output. The Java Reference Runner in Python becomes 
complete only when the ULR supports more features. 


was (Author: ruoyun):
Work done so far: 

 # Updated the cwiki page on how to run python-portability-on-java-ULR.
 # Side-by-side comparison with Java-SDK-on-ULR's behavior. 
 # [Optional] For validation purposes, shall we check in a variant of the 
Python wordCount that does not use TextIO? 

A caveat remains: because of how the ULR is implemented, it currently supports 
only pipelines without IO output. The Java Reference Runner in Python becomes 
complete only when the ULR supports more features. 

> Support running user pipelines with the Java Reference Runner in Python.
> 
>
> Key: BEAM-5448
> URL: https://issues.apache.org/jira/browse/BEAM-5448
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-direct
>Reporter: Daniel Oliveira
>Assignee: Ruoyun Huang
>Priority: Major
>
> In order to aid testing, devs should be able to write pipelines and then 
> easily run them with the ULR. This task is for creating the build rules 
> needed to accomplish this for pipelines using the Python SDK.





[jira] [Commented] (BEAM-5448) Support running user pipelines with the Java Reference Runner in Python.

2018-11-20 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693884#comment-16693884
 ] 

Ruoyun Huang commented on BEAM-5448:


Work done so far: 

 # Updated the cwiki page on how to run python-portability-on-java-ULR.
 # Side-by-side comparison with Java-SDK-on-ULR's behavior. 
 # [Optional] For validation purposes, shall we check in a variant of the 
Python wordCount that does not use TextIO? 

A caveat remains: because of how the ULR is implemented, it currently supports 
only pipelines without IO output. The Java Reference Runner in Python becomes 
complete only when the ULR supports more features. 

> Support running user pipelines with the Java Reference Runner in Python.
> 
>
> Key: BEAM-5448
> URL: https://issues.apache.org/jira/browse/BEAM-5448
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-direct
>Reporter: Daniel Oliveira
>Assignee: Ruoyun Huang
>Priority: Major
>
> In order to aid testing, devs should be able to write pipelines and then 
> easily run them with the ULR. This task is for creating the build rules 
> needed to accomplish this for pipelines using the Python SDK.





[jira] [Resolved] (BEAM-6068) Wordcount example fails to read from gcs shakespeare text file

2018-11-14 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-6068.

Resolution: Not A Problem

Changed the resolution to Not A Problem. See the fix steps in the previous comment. 

> Wordcount example fails to read from gcs shakespeare text file
> -
>
> Key: BEAM-6068
> URL: https://issues.apache.org/jira/browse/BEAM-6068
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Mark Liu
>Priority: Major
> Fix For: 2.9.0
>
>
> Symptom: 
> In a synced-to-head repo, the following command fails:
> python -m apache_beam.examples.wordcount   --input 
> gs://dataflow-samples/shakespeare/kinglear.txt   --output gs://$USER-test/tmp 
>   --runner DataflowRunner   --project google.com:clouddfe   --temp_location 
> gs://$USER-test/temp-it   --experiment beam_fn_api   --sdk_location 
> dist/apache-beam-2.9.0.dev0.tar.gz
>  
> The error message is: 
> File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
>  "__main__", fname, loader, pkg_name)
>  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
>  exec code in run_globals
>  File 
> "/usr/local/google/home/ruoyun/projects/beam2/sdks/python/apache_beam/examples/wordcount.py",
>  line 136, in <module>
>  run()
>  File 
> "/usr/local/google/home/ruoyun/projects/beam2/sdks/python/apache_beam/examples/wordcount.py",
>  line 90, in run
>  lines = p | 'read' >> ReadFromText(known_args.input)
>  File "apache_beam/io/textio.py", line 524, in __init__
>  skip_header_lines=skip_header_lines)
>  File "apache_beam/io/textio.py", line 119, in __init__
>  validate=validate)
>  File "apache_beam/io/filebasedsource.py", line 121, in __init__
>  self._validate()
>  File "apache_beam/options/value_provider.py", line 137, in _f
>  return fnc(self, *args, **kwargs)
>  File "apache_beam/io/filebasedsource.py", line 178, in _validate
>  match_result = FileSystems.match([pattern], limits=[1])[0]
>  File "apache_beam/io/filesystems.py", line 187, in match
>  return filesystem.match(patterns, limits)
>  File "apache_beam/io/filesystem.py", line 705, in match
>  raise BeamIOError("Match operation failed", exceptions)
> apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions 
> \{'gs://dataflow-samples/shakespeare/kinglear.txt': TypeError("__init__() got 
> an unexpected keyword argument 'response_encoding'",)}
>  
>  
> However, a similar command succeeds after reverting to the 2.8 release and 
> rebuilding everything: 
> python -m apache_beam.examples.wordcount   
> --input=gs://dataflow-samples/shakespeare/kinglear.txt  
> --output=gs://test-$USER/portable/   --runner DataflowRunner --project 
> $GCP_PROJECT  --staging_location gs://test-$USER/staging_wc --temp_location 
> gs://test-$USER/tmp \ --sdk_location=./dist/apache-beam-2.8.0.dev0.tar.gz
>  
>  
>  
>  





[jira] [Commented] (BEAM-6068) Wordcount example fails to read from gcs shakespeare text file

2018-11-14 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687292#comment-16687292
 ] 

Ruoyun Huang commented on BEAM-6068:


It now works on head as well after doing the following: 
 
 1. Create a new virtualenv.
 2. pip install grpcio-tools==1.3.5
 (this is suggested for development purposes only)
 3. python setup.py sdist
 4. pip install dist/apache-beam-2.9.0.dev0.tar.gz[gcp]
 5. Run the same command. 
 
 
It looks like my py2 virtualenv was messed up somehow. One thing that still 
puzzles me is why rebuilding the 2.8 release works. 
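
One plausible way a stale environment produces this kind of TypeError is a 
caller written against a newer signature running with an older copy of a 
dependency still installed. The sketch below only illustrates that mechanism. 
OldClient, NewClient and open_with_encoding are hypothetical stand-ins, not 
Beam or GCS client code, and nothing here confirms which package was actually 
out of date.

```python
class OldClient:
    """Hypothetical older release: the constructor takes no response_encoding."""

    def __init__(self, url):
        self.url = url


class NewClient:
    """Hypothetical newer release: the constructor accepts response_encoding."""

    def __init__(self, url, response_encoding=None):
        self.url = url
        self.response_encoding = response_encoding


def open_with_encoding(client_cls, url):
    """Caller written against the newer signature.

    Returns "ok" on success, or the TypeError message if the installed
    class does not know the keyword argument.
    """
    try:
        client_cls(url, response_encoding="utf-8")
        return "ok"
    except TypeError as err:
        return str(err)
```

Calling open_with_encoding(OldClient, ...) returns a message about an 
unexpected keyword argument 'response_encoding', which is why recreating the 
virtualenv and reinstalling the dependencies can make the error disappear.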
 
Thanks Mark! 
 

> Wordcount example fails to read from gcs shakespeare text file
> -
>
> Key: BEAM-6068
> URL: https://issues.apache.org/jira/browse/BEAM-6068
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Mark Liu
>Priority: Major
> Fix For: 2.9.0
>
>
> Symptom: 
> In a synced-to-head repo, the following command fails:
> python -m apache_beam.examples.wordcount   --input 
> gs://dataflow-samples/shakespeare/kinglear.txt   --output gs://$USER-test/tmp 
>   --runner DataflowRunner   --project google.com:clouddfe   --temp_location 
> gs://$USER-test/temp-it   --experiment beam_fn_api   --sdk_location 
> dist/apache-beam-2.9.0.dev0.tar.gz
>  
> The error message is: 
> File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
>  "__main__", fname, loader, pkg_name)
>  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
>  exec code in run_globals
>  File 
> "/usr/local/google/home/ruoyun/projects/beam2/sdks/python/apache_beam/examples/wordcount.py",
>  line 136, in <module>
>  run()
>  File 
> "/usr/local/google/home/ruoyun/projects/beam2/sdks/python/apache_beam/examples/wordcount.py",
>  line 90, in run
>  lines = p | 'read' >> ReadFromText(known_args.input)
>  File "apache_beam/io/textio.py", line 524, in __init__
>  skip_header_lines=skip_header_lines)
>  File "apache_beam/io/textio.py", line 119, in __init__
>  validate=validate)
>  File "apache_beam/io/filebasedsource.py", line 121, in __init__
>  self._validate()
>  File "apache_beam/options/value_provider.py", line 137, in _f
>  return fnc(self, *args, **kwargs)
>  File "apache_beam/io/filebasedsource.py", line 178, in _validate
>  match_result = FileSystems.match([pattern], limits=[1])[0]
>  File "apache_beam/io/filesystems.py", line 187, in match
>  return filesystem.match(patterns, limits)
>  File "apache_beam/io/filesystem.py", line 705, in match
>  raise BeamIOError("Match operation failed", exceptions)
> apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions 
> \{'gs://dataflow-samples/shakespeare/kinglear.txt': TypeError("__init__() got 
> an unexpected keyword argument 'response_encoding'",)}
>  
>  
> However, a similar command succeeds after reverting to the 2.8 release and 
> rebuilding everything: 
> python -m apache_beam.examples.wordcount   
> --input=gs://dataflow-samples/shakespeare/kinglear.txt  
> --output=gs://test-$USER/portable/   --runner DataflowRunner --project 
> $GCP_PROJECT  --staging_location gs://test-$USER/staging_wc --temp_location 
> gs://test-$USER/tmp \ --sdk_location=./dist/apache-beam-2.8.0.dev0.tar.gz
>  
>  
>  
>  





[jira] [Created] (BEAM-6068) Wordcount example fails to read from gcs shakespeare text file

2018-11-14 Thread Ruoyun Huang (JIRA)
Ruoyun Huang created BEAM-6068:
--

 Summary: Wordcount example fails to read from gcs shakespeare text 
file
 Key: BEAM-6068
 URL: https://issues.apache.org/jira/browse/BEAM-6068
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: Ruoyun Huang
Assignee: Ahmet Altay


Symptom: 

In a synced-to-head repo, the following command fails:

python -m apache_beam.examples.wordcount \
  --input gs://dataflow-samples/shakespeare/kinglear.txt \
  --output gs://$USER-test/tmp \
  --runner DataflowRunner \
  --project google.com:clouddfe \
  --temp_location gs://$USER-test/temp-it \
  --experiment beam_fn_api \
  --sdk_location dist/apache-beam-2.9.0.dev0.tar.gz

 

The error message is: 

  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/google/home/ruoyun/projects/beam2/sdks/python/apache_beam/examples/wordcount.py", line 136, in <module>
    run()
  File "/usr/local/google/home/ruoyun/projects/beam2/sdks/python/apache_beam/examples/wordcount.py", line 90, in run
    lines = p | 'read' >> ReadFromText(known_args.input)
  File "apache_beam/io/textio.py", line 524, in __init__
    skip_header_lines=skip_header_lines)
  File "apache_beam/io/textio.py", line 119, in __init__
    validate=validate)
  File "apache_beam/io/filebasedsource.py", line 121, in __init__
    self._validate()
  File "apache_beam/options/value_provider.py", line 137, in _f
    return fnc(self, *args, **kwargs)
  File "apache_beam/io/filebasedsource.py", line 178, in _validate
    match_result = FileSystems.match([pattern], limits=[1])[0]
  File "apache_beam/io/filesystems.py", line 187, in match
    return filesystem.match(patterns, limits)
  File "apache_beam/io/filesystem.py", line 705, in match
    raise BeamIOError("Match operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions 
{'gs://dataflow-samples/shakespeare/kinglear.txt': TypeError("__init__() got 
an unexpected keyword argument 'response_encoding'",)}

 

 

However, a similar command succeeds after reverting to the 2.8 release and 
rebuilding everything: 

python -m apache_beam.examples.wordcount \
  --input=gs://dataflow-samples/shakespeare/kinglear.txt \
  --output=gs://test-$USER/portable/ \
  --runner DataflowRunner \
  --project $GCP_PROJECT \
  --staging_location gs://test-$USER/staging_wc \
  --temp_location gs://test-$USER/tmp \
  --sdk_location=./dist/apache-beam-2.8.0.dev0.tar.gz

 

 

 

 





[jira] [Commented] (BEAM-5448) Support running user pipelines with the Java Reference Runner in Python.

2018-11-12 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684206#comment-16684206
 ] 

Ruoyun Huang commented on BEAM-5448:


Notes on the status quo and what we do:  https://bit.ly/2qui6Ot

> Support running user pipelines with the Java Reference Runner in Python.
> 
>
> Key: BEAM-5448
> URL: https://issues.apache.org/jira/browse/BEAM-5448
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-direct
>Reporter: Daniel Oliveira
>Assignee: Ruoyun Huang
>Priority: Major
>
> In order to aid testing, devs should be able to write pipelines and then 
> easily run them with the ULR. This task is for creating the build rules 
> needed to accomplish this for pipelines using the Python SDK.





[jira] [Comment Edited] (BEAM-5448) Support running user pipelines with the Java Reference Runner in Python.

2018-11-12 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684206#comment-16684206
 ] 

Ruoyun Huang edited comment on BEAM-5448 at 11/12/18 6:20 PM:
--

Notes on the status quo and what we do:  
[Link.|https://docs.google.com/document/d/1S86saZqiDaE_M5wxO0zOQ_rwC6QHv7sp1BmGTm0dLNE/edit#]


was (Author: ruoyun):
Notes on the status quo and what we do:  https://bit.ly/2qui6Ot

> Support running user pipelines with the Java Reference Runner in Python.
> 
>
> Key: BEAM-5448
> URL: https://issues.apache.org/jira/browse/BEAM-5448
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-direct
>Reporter: Daniel Oliveira
>Assignee: Ruoyun Huang
>Priority: Major
>
> In order to aid testing, devs should be able to write pipelines and then 
> easily run them with the ULR. This task is for creating the build rules 
> needed to accomplish this for pipelines using the Python SDK.





[jira] [Commented] (BEAM-5931) Rollback PR/6899

2018-11-05 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676260#comment-16676260
 ] 

Ruoyun Huang commented on BEAM-5931:


+  [~chamikara], [~szewinho], [~kasiak], [~dariusz.aniszewski]

Added area experts (according to this page: 
[https://cwiki.apache.org/confluence/display/BEAM/Works+in+Progress]).

 

Folks, I would like to ask for suggestions on the easiest/best way to fix 
PerformanceTests_TextIOIT (more details in my original question: 
[https://bit.ly/2qui6Ot]). Thanks a lot, everyone. 

> Rollback PR/6899
> 
>
> Key: BEAM-5931
> URL: https://issues.apache.org/jira/browse/BEAM-5931
> Project: Beam
>  Issue Type: Task
>  Components: beam-model, runner-dataflow
>Reporter: Luke Cwik
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> To rollback this change, one must either:
> 1) Update nexmark / perf test framework to use Dataflow worker jar.
> This requires adding the 
> {code}
>  "--dataflowWorkerJar=${dataflowWorkerJar}",
>  "--workerHarnessContainerImage=",
> {code}
> when running the tests.
> OR
> 2) Update the dataflow worker image with code that contains the rollback of 
> PR/6899 and then rollback PR/6899 in Github with the updated Dataflow worker 
> image.
> #1 is preferable since we will no longer have tests running that don't use a 
> Dataflow worker jar built from Github HEAD.
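
As a rough illustration of option 1 above, the two flags quoted in the issue 
could be appended to a test's pipeline arguments. This is only a sketch in 
plain Python: with_worker_jar_flags is a hypothetical helper, and the 
list-of-strings argument shape and the example values are assumptions, not the 
way the nexmark or perf test framework is actually wired.

```python
def with_worker_jar_flags(base_args, dataflow_worker_jar):
    """Returns a copy of base_args with the two flags from option 1 appended.

    The flag names come from the issue text; the helper itself is hypothetical.
    The container image flag is deliberately left empty, as in the issue.
    """
    return list(base_args) + [
        "--dataflowWorkerJar=" + dataflow_worker_jar,
        "--workerHarnessContainerImage=",
    ]


# Example with placeholder values (both are assumptions for illustration).
example_args = with_worker_jar_flags(
    ["--project=example-project"], "/tmp/dataflow-worker.jar")
```

The helper copies the input list so that the shared base arguments are not 
mutated when several test configurations are built from them.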





[jira] [Commented] (BEAM-5931) Rollback PR/6899

2018-11-05 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675632#comment-16675632
 ] 

Ruoyun Huang commented on BEAM-5931:


I would like to ping this thread about my question (https://bit.ly/2qui6Ot) on 
the dev mailing list. 

I saw that Lukasz Gajowy recently made significant changes to these Jenkins 
files. Would you please share your workflow for debugging and test-running 
Jenkins files? I have been using command strings in the PR, but that is very 
inefficient. I would really appreciate any suggestions on the best way to do this. 

 

Thanks. 

 

> Rollback PR/6899
> 
>
> Key: BEAM-5931
> URL: https://issues.apache.org/jira/browse/BEAM-5931
> Project: Beam
>  Issue Type: Task
>  Components: beam-model, runner-dataflow
>Reporter: Luke Cwik
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> To rollback this change, one must either:
> 1) Update nexmark / perf test framework to use Dataflow worker jar.
> This requires adding the 
> {code}
>  "--dataflowWorkerJar=${dataflowWorkerJar}",
>  "--workerHarnessContainerImage=",
> {code}
> when running the tests.
> OR
> 2) Update the dataflow worker image with code that contains the rollback of 
> PR/6899 and then rollback PR/6899 in Github with the updated Dataflow worker 
> image.
> #1 is preferable since we will no longer have tests running that don't use a 
> Dataflow worker jar built from Github HEAD.





[jira] [Commented] (BEAM-5931) Rollback PR/6899

2018-11-01 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671945#comment-16671945
 ] 

Ruoyun Huang commented on BEAM-5931:


I am looking into the Nexmark one.

The original PR mentioned the failing target beam_PerformanceTests_TextIOIT; however, I 
cannot find a file with this name under the folder 
"[beam|https://github.com/apache/beam]/[.test-infra|https://github.com/apache/beam/tree/master/.test-infra]/*jenkins*/". 
Can someone point me to where this is? 

> Rollback PR/6899
> 
>
> Key: BEAM-5931
> URL: https://issues.apache.org/jira/browse/BEAM-5931
> Project: Beam
>  Issue Type: Task
>  Components: beam-model, runner-dataflow
>Reporter: Luke Cwik
>Assignee: Ruoyun Huang
>Priority: Major
>
> To rollback this change, one must either:
> 1) Update nexmark / perf test framework to use Dataflow worker jar.
> This requires adding the 
> {code}
>  "--dataflowWorkerJar=${dataflowWorkerJar}",
>  "--workerHarnessContainerImage=",
> {code}
> when running the tests.
> OR
> 2) Update the dataflow worker image with code that contains the rollback of 
> PR/6899 and then rollback PR/6899 in Github with the updated Dataflow worker 
> image.
> #1 is preferable since we will no longer have tests running that don't use a 
> Dataflow worker jar built from Github HEAD.





[jira] [Resolved] (BEAM-5703) Migrate Python streaming and portable integration tests to use a staged dataflow worker jar

2018-10-26 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-5703.

   Resolution: Fixed
Fix Version/s: Not applicable

> Migrate Python streaming and portable integration tests to use a staged 
> dataflow worker jar
> ---
>
> Key: BEAM-5703
> URL: https://issues.apache.org/jira/browse/BEAM-5703
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (BEAM-5745) Util test on annotations fails

2018-10-26 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-5745.

   Resolution: Fixed
Fix Version/s: Not applicable

> Util test on annotations fails 
> ---
>
> Key: BEAM-5745
> URL: https://issues.apache.org/jira/browse/BEAM-5745
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/ruoyun/projects/beam/sdks/python/apache_beam/utils/annotations_test.py",
>  line 142, in test_frequency
>     label_check_list=[])
>   File 
> "/usr/local/google/home/ruoyun/projects/beam/sdks/python/apache_beam/utils/annotations_test.py",
>  line 149, in check_annotation
>     self.assertIn(fnc_name + ' is ' + annotation_type, 
> str(warning[-1].message))
> AssertionError: 'fnc2_test_annotate_frequency is experimental' not found in 
> 'fnc_test_annotate_frequency is experimental.'
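
The assertion in the traceback inspects the last recorded warning. The sketch below shows one way such a check can be written so that every call records its own warning. It is an illustration under stated assumptions, not the code in annotations_test.py: the `experimental` decorator and `last_warning_message` helper are hypothetical stand-ins, and the use of the 'always' filter is an assumption of this example, not a statement about the root cause or the actual fix.

```python
import functools
import warnings


def experimental(fnc):
    """Hypothetical stand-in for an 'experimental' annotation."""
    @functools.wraps(fnc)
    def wrapper(*args, **kwargs):
        warnings.warn(
            '%s is experimental.' % fnc.__name__, FutureWarning, stacklevel=2)
        return fnc(*args, **kwargs)
    return wrapper


@experimental
def fnc_test_annotate_frequency():
    return 'first'


@experimental
def fnc2_test_annotate_frequency():
    return 'second'


def last_warning_message(calls):
    """Run each callable and return the text of the last warning recorded."""
    with warnings.catch_warnings(record=True) as caught:
        # 'always' stops the warnings machinery from suppressing repeats,
        # so each call contributes an entry to `caught`.
        warnings.simplefilter('always')
        for call in calls:
            call()
    return str(caught[-1].message)


if __name__ == '__main__':
    print(last_warning_message(
        [fnc_test_annotate_frequency, fnc2_test_annotate_frequency]))
```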





[jira] [Resolved] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-24 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-5637.

   Resolution: Fixed
Fix Version/s: Not applicable

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Python jobs. That requires a change to the Python 
> boot code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/python/container/boot.go#L104





[jira] [Resolved] (BEAM-5793) Python sdk docs target fails in flink_streaming_impulse_source

2018-10-22 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang resolved BEAM-5793.

   Resolution: Fixed
Fix Version/s: 2.8.0

> Python sdk docs target fails in flink_streaming_impulse_source
> --
>
> Key: BEAM-5793
> URL: https://issues.apache.org/jira/browse/BEAM-5793
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Micah Wylde
>Priority: Minor
> Fix For: 2.8.0
>
>
> gradle scan result: [https://gradle.com/s/w4icffibxs72a]
> Error message: 
> projects/beam2/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py:docstring
>  of 
> apache_beam.io.flink.flink_streaming_impulse_source.FlinkStreamingImpulseSource.from_runner_api_parameter:11:Unexpected
>  indentation.
>  
> Root cause seems to be that there are two annotations for a single function. 
> And sphinx does not like that. 





[jira] [Commented] (BEAM-5793) Python sdk docs target fails in flink_streaming_impulse_source

2018-10-22 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659166#comment-16659166
 ] 

Ruoyun Huang commented on BEAM-5793:


FYI: it was fixed by removing the staticmethod annotation in 
[https://github.com/apache/beam/pull/6774]. 

> Python sdk docs target fails in flink_streaming_impulse_source
> --
>
> Key: BEAM-5793
> URL: https://issues.apache.org/jira/browse/BEAM-5793
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Micah Wylde
>Priority: Minor
>
> gradle scan result: [https://gradle.com/s/w4icffibxs72a]
> Error message: 
> projects/beam2/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py:docstring
>  of 
> apache_beam.io.flink.flink_streaming_impulse_source.FlinkStreamingImpulseSource.from_runner_api_parameter:11:Unexpected
>  indentation.
>  
> Root cause seems to be that there are two annotations for a single function. 
> And sphinx does not like that. 





[jira] [Commented] (BEAM-5793) Python sdk docs target fails in flink_streaming_impulse_source

2018-10-19 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657606#comment-16657606
 ] 

Ruoyun Huang commented on BEAM-5793:


Update:

I created a testing PR, [https://github.com/apache/beam/pull/6759], and verified 
two things: 

1) The `docs` build target is indeed triggered on Jenkins (previously I didn't 
notice that there is a 'full log' option). 

2) It passes on Jenkins: 
[Link|https://builds.apache.org/job/beam_PreCommit_Python_Commit/1985/] 

I verified that both my desktop and Jenkins use the same version of Sphinx (1.6.5). 

Thanks, Micah, for the discussion. Given this result, it is more likely that 
some setting is messed up on my desktop (and on Scott's as well). :( 

Instead of digging into this further, for now I will pause and apply a 
manual fix when building my PRs.

> Python sdk docs target fails in flink_streaming_impulse_source
> --
>
> Key: BEAM-5793
> URL: https://issues.apache.org/jira/browse/BEAM-5793
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Micah Wylde
>Priority: Minor
>
> gradle scan result: [https://gradle.com/s/w4icffibxs72a]
> Error message: 
> projects/beam2/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py:docstring
>  of 
> apache_beam.io.flink.flink_streaming_impulse_source.FlinkStreamingImpulseSource.from_runner_api_parameter:11:Unexpected
>  indentation.
>  
> Root cause seems to be that there are two annotations for a single function. 
> And sphinx does not like that. 





[jira] [Comment Edited] (BEAM-5793) Python sdk docs target fails in flink_streaming_impulse_source

2018-10-19 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657525#comment-16657525
 ] 

Ruoyun Huang edited comment on BEAM-5793 at 10/19/18 10:50 PM:
---

The weird thing is that, so far, the Jenkins continuous build hasn't caught this. 

Micah, that is a good idea. Let me create a testing PR (making sure it is synced 
to head) and put it on Jenkins to see what happens. 


was (Author: ruoyun):
AFAIK, it is not on Jenkins. (And I think it should be.) 

Micah, that is a good idea. Let me create a testing PR and put it on Jenkins 
to see what happens. 

> Python sdk docs target fails in flink_streaming_impulse_source
> --
>
> Key: BEAM-5793
> URL: https://issues.apache.org/jira/browse/BEAM-5793
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Micah Wylde
>Priority: Minor
>
> gradle scan result: [https://gradle.com/s/w4icffibxs72a]
> Error message: 
> projects/beam2/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py:docstring
>  of 
> apache_beam.io.flink.flink_streaming_impulse_source.FlinkStreamingImpulseSource.from_runner_api_parameter:11:Unexpected
>  indentation.
>  
> Root cause seems to be that there are two annotations for a single function. 
> And sphinx does not like that. 





[jira] [Commented] (BEAM-5793) Python sdk docs target fails in flink_streaming_impulse_source

2018-10-19 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657525#comment-16657525
 ] 

Ruoyun Huang commented on BEAM-5793:


AFAIK, it is not on Jenkins. (And I think it should be.) 

Micah, that is a good idea. Let me create a testing PR and put it on Jenkins 
to see what happens. 

> Python sdk docs target fails in flink_streaming_impulse_source
> --
>
> Key: BEAM-5793
> URL: https://issues.apache.org/jira/browse/BEAM-5793
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Micah Wylde
>Priority: Minor
>
> gradle scan result: [https://gradle.com/s/w4icffibxs72a]
> Error message: 
> projects/beam2/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py:docstring
>  of 
> apache_beam.io.flink.flink_streaming_impulse_source.FlinkStreamingImpulseSource.from_runner_api_parameter:11:Unexpected
>  indentation.
>  
> Root cause seems to be that there are two annotations for a single function. 
> And sphinx does not like that. 





[jira] [Commented] (BEAM-5793) Python sdk docs target fails in flink_streaming_impulse_source

2018-10-19 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657495#comment-16657495
 ] 

Ruoyun Huang commented on BEAM-5793:


I tend to think this is a real issue as well. 

I tried several things and am still not sure about the root cause, but one thing 
is certain: if we remove either of the annotations, the warning goes away. 

By the way, back to the implementation choice itself: looking at what this function 
does, I don't understand why it has to be a static method. 

> Python sdk docs target fails in flink_streaming_impulse_source
> --
>
> Key: BEAM-5793
> URL: https://issues.apache.org/jira/browse/BEAM-5793
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Micah Wylde
>Priority: Minor
>
> gradle scan result: [https://gradle.com/s/w4icffibxs72a]
> Error message: 
> projects/beam2/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py:docstring
>  of 
> apache_beam.io.flink.flink_streaming_impulse_source.FlinkStreamingImpulseSource.from_runner_api_parameter:11:Unexpected
>  indentation.
>  
> Root cause seems to be that there are two annotations for a single function. 
> And sphinx does not like that. 





[jira] [Commented] (BEAM-5793) Python sdk docs target fails in flink_streaming_impulse_source

2018-10-19 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657200#comment-16657200
 ] 

Ruoyun Huang commented on BEAM-5793:


No, I didn't do anything special. I just synced to head and then ran the gradle 
command. 

I am on Linux with Python 2.7.13. Let me try on a Mac as well. 

> Python sdk docs target fails in flink_streaming_impulse_source
> --
>
> Key: BEAM-5793
> URL: https://issues.apache.org/jira/browse/BEAM-5793
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Micah Wylde
>Priority: Minor
>
> gradle scan result: [https://gradle.com/s/w4icffibxs72a]
> Error message: 
> projects/beam2/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py:docstring
>  of 
> apache_beam.io.flink.flink_streaming_impulse_source.FlinkStreamingImpulseSource.from_runner_api_parameter:11:Unexpected
>  indentation.
>  
> Root cause seems to be that there are two annotations for a single function. 
> And sphinx does not like that. 





[jira] [Created] (BEAM-5793) Python sdk docs target fails in flink_streaming_impulse_source

2018-10-19 Thread Ruoyun Huang (JIRA)
Ruoyun Huang created BEAM-5793:
--

 Summary: Python sdk docs target fails in 
flink_streaming_impulse_source
 Key: BEAM-5793
 URL: https://issues.apache.org/jira/browse/BEAM-5793
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: Ruoyun Huang
Assignee: Ahmet Altay


gradle scan result: [https://gradle.com/s/w4icffibxs72a]

Error message: 
projects/beam2/sdks/python/apache_beam/io/flink/flink_streaming_impulse_source.py:docstring
 of 
apache_beam.io.flink.flink_streaming_impulse_source.FlinkStreamingImpulseSource.from_runner_api_parameter:11:Unexpected
 indentation.

 

Root cause seems to be that there are two annotations for a single function. 
And sphinx does not like that. 





[jira] [Updated] (BEAM-5754) beam-sdks-java-io-xml:test target fails in 2.7.0 release

2018-10-15 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoyun Huang updated BEAM-5754:
---
Description: 
A community member reported (in a Slack thread) that this test target fails 
in 2.7.0. 

I verified it on my computer as well and got the same error. 

I am not sure how serious it is; I am creating this Jira just to bring the issue to 
attention. It should be easy to fix. 

 

Error message: 

> Task :beam-sdks-java-core:compileJava
/usr/local/google/home/ruoyun/projects/beam2/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java:1544:
 warning: [UnnecessaryParentheses] Unnecessary use of grouping parentheses
 if (!(PipelineRunner.class.isAssignableFrom(runnerClass))) {
 ^
 (see https://errorprone.info/bugpattern/UnnecessaryParentheses)
 Did you mean 'if (!PipelineRunner.class.isAssignableFrom(runnerClass)) {'?
error: warnings found and -Werror specified
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: 
/usr/local/google/home/ruoyun/projects/beam2/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java
 uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 error
1 warning

  was:
some community member reports (in a slack thread) that this test target fails 
in 2.7.0. 

I verified on my computer as well. Same error.  

Not sure how serious it is, just to create this Jira to bring the issue to 
attention.  Should be easy to fix. 


> beam-sdks-java-io-xml:test target fails in 2.7.0 release
> 
>
> Key: BEAM-5754
> URL: https://issues.apache.org/jira/browse/BEAM-5754
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
> Environment: java
>Reporter: Ruoyun Huang
>Assignee: Charles Chen
>Priority: Minor
>
> some community member reports (in a slack thread) that this test target fails 
> in 2.7.0. 
> I verified on my computer as well. Same error.  
> Not sure how serious it is, just to create this Jira to bring the issue to 
> attention.  Should be easy to fix. 
>  
> Error message: 
> > Task :beam-sdks-java-core:compileJava
> /usr/local/google/home/ruoyun/projects/beam2/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java:1544:
>  warning: [UnnecessaryParentheses] Unnecessary use of grouping parentheses
>  if (!(PipelineRunner.class.isAssignableFrom(runnerClass))) {
>  ^
>  (see https://errorprone.info/bugpattern/UnnecessaryParentheses)
>  Did you mean 'if (!PipelineRunner.class.isAssignableFrom(runnerClass)) {'?
> error: warnings found and -Werror specified
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> Note: 
> /usr/local/google/home/ruoyun/projects/beam2/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java
>  uses unchecked or unsafe operations.
> Note: Recompile with -Xlint:unchecked for details.
> 1 error
> 1 warning





[jira] [Work started] (BEAM-5745) Util test on annotations fails

2018-10-14 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-5745 started by Ruoyun Huang.
--
> Util test on annotations fails 
> ---
>
> Key: BEAM-5745
> URL: https://issues.apache.org/jira/browse/BEAM-5745
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/ruoyun/projects/beam/sdks/python/apache_beam/utils/annotations_test.py",
>  line 142, in test_frequency
>     label_check_list=[])
>   File 
> "/usr/local/google/home/ruoyun/projects/beam/sdks/python/apache_beam/utils/annotations_test.py",
>  line 149, in check_annotation
>     self.assertIn(fnc_name + ' is ' + annotation_type, 
> str(warning[-1].message))
> AssertionError: 'fnc2_test_annotate_frequency is experimental' not found in 
> 'fnc_test_annotate_frequency is experimental.'


