[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=410854&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410854
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 27/Mar/20 05:50
Start Date: 27/Mar/20 05:50
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #11246: [BEAM-9136]Add 
licenses for dependencies for Go
URL: https://github.com/apache/beam/pull/11246#issuecomment-604822387
 
 
   R: @alanmyrvold, @robertwb 
   Cc: @tvalentyn 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410854)
Time Spent: 9h  (was: 8h 50m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410850&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410850
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 27/Mar/20 05:33
Start Date: 27/Mar/20 05:33
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11184: [BEAM-4374] Update 
protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#issuecomment-604823688
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 410850)
Time Spent: 34h 10m  (was: 34h)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 34h 10m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata





[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=410849&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410849
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 27/Mar/20 05:28
Start Date: 27/Mar/20 05:28
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #11246: [BEAM-9136]Add 
licenses for dependencies for Go
URL: https://github.com/apache/beam/pull/11246#issuecomment-604822387
 
 
   R: @alanmyrvold, @robertwb 
 



Issue Time Tracking
---

Worklog Id: (was: 410849)
Time Spent: 8h 50m  (was: 8h 40m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.





[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=410848&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410848
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 27/Mar/20 05:28
Start Date: 27/Mar/20 05:28
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #11246: 
[BEAM-9136]Add licenses for dependencies for Go
URL: https://github.com/apache/beam/pull/11246#discussion_r399042837
 
 

 ##
 File path: sdks/go/container/license_script.sh
 ##
 @@ -0,0 +1,25 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+output_dir=third_party_licenses
+# remove output_dir if existing
+if [ -d "$output_dir" ]; then rm -rf $output_dir; fi
+
+# get go-licenses and run
+go get github.com/google/go-licenses
+$GOPATH/bin/go-licenses save "github.com/apache/beam/sdks/go/pkg/beam/" --save_path="$output_dir"
 
 Review comment:
   This line returns a `not found` error when run on Jenkins. When I tested on 
my machine, `$GOPATH/bin/go-licenses` worked.
   I tried `go-licenses`, `$GOPATH/bin/go-licenses`, and 
`/usr/bin/go/bin/go-licenses`, but none of them worked. All returned a 
`not found` error. 
[log](https://builds.apache.org/job/beam_PreCommit_Go_Commit/6020/console)
   How do we run a Go package within Jenkins? Can we use the same script both 
locally and on Jenkins? And in case users want to customize it, it should be 
able to run on users' machines as well.
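
One possible cause, offered as an assumption rather than a diagnosis from the linked log, is that `GOPATH` is unset or different on the Jenkins agent, so `$GOPATH/bin/go-licenses` expands to a path that does not exist. `go get` places binaries in `$GOBIN` when it is set and otherwise in the `bin` directory under `GOPATH`, and `go env GOPATH` reports the effective value even when the variable is not exported. The Python sketch below shows that lookup order. `gopath_from_go_env`, `candidate_tool_paths`, and `find_tool` are hypothetical helper names, not part of Beam or go-licenses.

```python
import os
import shutil
import subprocess


def gopath_from_go_env():
    """Ask the Go toolchain for the effective GOPATH; '' if `go` is unavailable."""
    try:
        result = subprocess.run(
            ["go", "env", "GOPATH"], capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return ""
    return result.stdout.strip()


def candidate_tool_paths(name, environ, gopath_lookup=gopath_from_go_env):
    """List where a `go get`-installed binary could live, in priority order."""
    candidates = []
    gobin = environ.get("GOBIN", "")
    if gobin:
        candidates.append(os.path.join(gobin, name))
    # Fall back to the toolchain's answer when GOPATH is not exported.
    gopath = environ.get("GOPATH", "") or gopath_lookup()
    # GOPATH may hold several entries separated by os.pathsep.
    for entry in gopath.split(os.pathsep):
        if entry:
            candidates.append(os.path.join(entry, "bin", name))
    return candidates


def find_tool(name, environ=os.environ, gopath_lookup=gopath_from_go_env):
    """Return an executable path for `name`, or None if it cannot be found."""
    on_path = shutil.which(name, path=environ.get("PATH", ""))
    if on_path:
        return on_path
    for path in candidate_tool_paths(name, environ, gopath_lookup):
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None
```

The same idea could be applied inside the shell script by resolving the directory with `go env GOPATH` instead of relying on `$GOPATH` being exported in the Jenkins environment.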
 



Issue Time Tracking
---

Worklog Id: (was: 410848)
Time Spent: 8h 40m  (was: 8.5h)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.





[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=410839&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410839
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 27/Mar/20 04:53
Start Date: 27/Mar/20 04:53
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #11199: [BEAM-9562] Update 
Timer encoding with respect of dynamic timers
URL: https://github.com/apache/beam/pull/11199#issuecomment-604814424
 
 
   Most of the Python and Java SDK work is done. The remaining work is the 
Java runner hookup.
 



Issue Time Tracking
---

Worklog Id: (was: 410839)
Time Spent: 5h 40m  (was: 5.5h)

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=410831&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410831
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 27/Mar/20 04:32
Start Date: 27/Mar/20 04:32
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #11246: 
[BEAM-9136]Add licenses for dependencies for Go
URL: https://github.com/apache/beam/pull/11246
 
 

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410829&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410829
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 27/Mar/20 04:26
Start Date: 27/Mar/20 04:26
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11184: [BEAM-4374] Update 
protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#issuecomment-604808223
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 410829)
Time Spent: 34h  (was: 33h 50m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 34h
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata





[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=410826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410826
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 27/Mar/20 04:21
Start Date: 27/Mar/20 04:21
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #11216: [BEAM-9562] 
Remove TimerSpec from Proto
URL: https://github.com/apache/beam/pull/11216
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 410826)
Time Spent: 5.5h  (was: 5h 20m)

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=410825&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410825
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 27/Mar/20 04:20
Start Date: 27/Mar/20 04:20
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #11216: [BEAM-9562] Remove 
TimerSpec from Proto
URL: https://github.com/apache/beam/pull/11216#issuecomment-604806948
 
 
   All tests passed. Going to merge it. Thanks for your help!
 



Issue Time Tracking
---

Worklog Id: (was: 410825)
Time Spent: 5h 20m  (was: 5h 10m)

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8603) Add Python SqlTransform MVP

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?focusedWorklogId=410817&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410817
 ]

ASF GitHub Bot logged work on BEAM-8603:


Author: ASF GitHub Bot
Created on: 27/Mar/20 03:30
Start Date: 27/Mar/20 03:30
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #10055: [BEAM-8603] Add 
Python SqlTransform
URL: https://github.com/apache/beam/pull/10055#discussion_r399015980
 
 

 ##
 File path: sdks/python/apache_beam/transforms/sql_test.py
 ##
 @@ -0,0 +1,109 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Tests for transforms that use the SQL Expansion service."""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import logging
+import typing
+import unittest
+
+from nose.plugins.attrib import attr
+from past.builtins import unicode
+
+import apache_beam as beam
+from apache_beam import coders
+from apache_beam.options.pipeline_options import DebugOptions
+from apache_beam.options.pipeline_options import StandardOptions
+from apache_beam.testing.test_pipeline import TestPipeline
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import equal_to
+from apache_beam.transforms.sql import SqlTransform
+from apache_beam.utils import subprocess_server
+
+SimpleRow = typing.NamedTuple(
+    "SimpleRow", [("int", int), ("str", unicode), ("flt", float)])
+coders.registry.register_coder(SimpleRow, coders.RowCoder)
+
+
+@attr('UsesSqlExpansionService')
+@unittest.skipIf(
+    TestPipeline().get_pipeline_options().view_as(StandardOptions).runner is
+    None,
+    "Must be run with a runner that supports cross-language transforms")
+class SqlTransformTest(unittest.TestCase):
+  """Tests that exercise the cross-language SqlTransform (implemented in java).
+
+  Note this test must be executed with pipeline options that run jobs on a local
+  job server. The easiest way to accomplish this is to run the
+  `validatesCrossLanguageRunnerPythonUsingSql` gradle target for a particular
+  job server, which will start the runner and job server for you. For example,
+  `:runners:flink:1.10:job-server:validatesCrossLanguageRunnerPythonUsingSql` to
+  test on Flink 1.10.
+
+  Alternatively, you may be able to iterate faster if you run the tests directly
+  using a runner like `FlinkRunner`, which starts its own job server, but you'll
+  need to spin up a local flink cluster:
+    $ pip install -e './sdks/python[gcp,test]'
+    $ python ./sdks/python/setup.py nosetests \\
+        --tests apache_beam.transforms.sql_test \\
+        --test-pipeline-options="--runner=FlinkRunner \\
+                                 --flink_version=1.10 \\
+                                 --flink_master=localhost:8081"
+  """
+  @staticmethod
+  def make_test_pipeline():
+    path_to_jar = subprocess_server.JavaJarServer.path_to_beam_jar(
+        ":sdks:java:extensions:sql:expansion-service:shadowJar")
+    test_pipeline = TestPipeline()
+    test_pipeline.get_pipeline_options().view_as(DebugOptions).experiments = [
+        'jar_packages=' + path_to_jar
 
 Review comment:
   We can remove `jar_packages` flag when BEAM-9238 is done.
 



Issue Time Tracking
---

Worklog Id: (was: 410817)
Time Spent: 5h 10m  (was: 5h)

> Add Python SqlTransform MVP
> ---
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8751) Beam Dependency Update Request: com.google.apis:google-api-services-cloudresourcemanager

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8751?focusedWorklogId=410816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410816
 ]

ASF GitHub Bot logged work on BEAM-8751:


Author: ASF GitHub Bot
Created on: 27/Mar/20 03:28
Start Date: 27/Mar/20 03:28
Worklog Time Spent: 10m 
  Work Description: suztomo commented on issue #11208: [BEAM-8751] 
google-api-client 1.30.9
URL: https://github.com/apache/beam/pull/11208#issuecomment-604795796
 
 
   R: @lukecwik 
   22 successful checks!
   
   @aaltay Thank you!
 



Issue Time Tracking
---

Worklog Id: (was: 410816)
Time Spent: 2h  (was: 1h 50m)

> Beam Dependency Update Request: 
> com.google.apis:google-api-services-cloudresourcemanager
> 
>
> Key: BEAM-8751
> URL: https://issues.apache.org/jira/browse/BEAM-8751
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
>  - 2019-11-19 21:04:41.938497 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191018-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:09:51.401493 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191115-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:09:00.761817 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191115-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:09:01.384571 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191206-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:04:31.850871 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191206-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:08:07.241510 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191206-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-13 12:08:00.916536 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191206-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 

[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=410815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410815
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 27/Mar/20 03:27
Start Date: 27/Mar/20 03:27
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #11216: [BEAM-9562] Remove 
TimerSpec from Proto
URL: https://github.com/apache/beam/pull/11216#issuecomment-604795592
 
 
   Run Java Flink PortableValidatesRunner Streaming
 



Issue Time Tracking
---

Worklog Id: (was: 410815)
Time Spent: 5h 10m  (was: 5h)

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-4150) Standardize use of PCollection coder proto attribute

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4150?focusedWorklogId=410796&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410796
 ]

ASF GitHub Bot logged work on BEAM-4150:


Author: ASF GitHub Bot
Created on: 27/Mar/20 02:54
Start Date: 27/Mar/20 02:54
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11222: [BEAM-4150] Don't 
window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604788204
 
 
   Run PythonDocker PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 410796)
Time Spent: 9h  (was: 8h 50m)

> Standardize use of PCollection coder proto attribute
> 
>
> Key: BEAM-4150
> URL: https://issues.apache.org/jira/browse/BEAM-4150
> Project: Beam
>  Issue Type: Task
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Luke Cwik
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> In some places it's expected to be a WindowedCoder, in others the raw 
> ElementCoder. We should use the same convention (decided in discussion to be 
> the raw ElementCoder) everywhere. The WindowCoder can be pulled out of the 
> attached windowing strategy, and the input/output ports should specify the 
> encoding directly rather than read the adjacent PCollection coder fields. 





[jira] [Work logged] (BEAM-9434) Performance improvements processing a large number of Avro files in S3+Spark

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9434?focusedWorklogId=410790&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410790
 ]

ASF GitHub Bot logged work on BEAM-9434:


Author: ASF GitHub Bot
Created on: 27/Mar/20 02:40
Start Date: 27/Mar/20 02:40
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11037: [BEAM-9434] 
performance improvements reading many Avro files in S3
URL: https://github.com/apache/beam/pull/11037#issuecomment-604785112
 
 
   Sorry about the long delay but **Reshuffle** should produce as many 
partitions as the runner thinks is optimal. It is effectively a 
**redistribute** operation.
   
   It looks like the spark translation is copying the number of partitions from 
the upstream transform for the reshuffle translation and in your case this is 
likely 1. 
   Translation: 
https://github.com/apache/beam/blob/f5a4a5afcd9425c0ddb9ec9c70067a5d5c0bc769/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/TransformTranslator.java#L681
   Copying partitions:
   
https://github.com/apache/beam/blob/f5a4a5afcd9425c0ddb9ec9c70067a5d5c0bc769/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/GroupCombineFunctions.java#L191
   
   @iemejia Shouldn't we be using a much larger value for partitions, e.g. the 
number of nodes?
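
To illustrate the concern, here is a plain-Python model (not the Spark or Beam API; the function names are hypothetical) contrasting a reshuffle that copies the upstream partition count with one that redistributes elements over a chosen number of partitions. When the upstream has a single partition, the first approach keeps all the work in one partition.

```python
def redistribute(partitions, num_partitions):
    """Spread every element round-robin across `num_partitions` new partitions."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    result = [[] for _ in range(num_partitions)]
    index = 0
    for partition in partitions:
        for element in partition:
            result[index % num_partitions].append(element)
            index += 1
    return result


def reshuffle_copying_upstream(partitions):
    """Model of the behaviour described above: reuse the upstream partition count."""
    return redistribute(partitions, max(1, len(partitions)))


# A single upstream partition holding six (made-up) file names.
upstream = [["f%d.avro" % i for i in range(6)]]

same_count = reshuffle_copying_upstream(upstream)  # still one partition
spread = redistribute(upstream, 3)                 # three partitions of two files
```

In this model `same_count` has one partition containing all six names, while `spread` has three partitions, which is the effect a larger partition count (for example, one tied to the number of nodes) would aim for.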
   
 



Issue Time Tracking
---

Worklog Id: (was: 410790)
Time Spent: 2h 50m  (was: 2h 40m)

> Performance improvements processing a large number of Avro files in S3+Spark
> 
>
> Key: BEAM-9434
> URL: https://issues.apache.org/jira/browse/BEAM-9434
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-aws, sdk-java-core
>Affects Versions: 2.19.0
>Reporter: Emiliano Capoccia
>Assignee: Emiliano Capoccia
>Priority: Minor
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> There is a performance issue when processing a large number of small Avro 
> files in Spark on K8S (tens of thousands or more).
> The recommended way of reading a pattern of Avro files in Beam is by means of:
>  
> {code:java}
> PCollection<AvroGenClass> records = p.apply(AvroIO.read(AvroGenClass.class)
> .from("s3://my-bucket/path-to/*.avro").withHintMatchesManyFiles());
> {code}
> However, with many small files, the above causes the entire read to take 
> place in a single task/node, which is considerably slow and does not scale.
> Omitting the hint is not a viable option, as it results in too many tasks 
> being spawned, leaving the cluster busy coordinating tiny tasks with high 
> overhead.
> There are a few workarounds on the internet, which mainly revolve around 
> compacting the input files before processing, so that a smaller number of 
> bulky files is processed in parallel.
>  





[jira] [Work logged] (BEAM-9399) Possible deadlock between DataflowWorkerLoggingHandler and overridden System.err PrintStream

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9399?focusedWorklogId=410776=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410776
 ]

ASF GitHub Bot logged work on BEAM-9399:


Author: ASF GitHub Bot
Created on: 27/Mar/20 02:24
Start Date: 27/Mar/20 02:24
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #11096: [BEAM-9399] Change 
the redirection of System.err to be a custom PrintStream
URL: https://github.com/apache/beam/pull/11096#issuecomment-604781282
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 410776)
Time Spent: 3h 20m  (was: 3h 10m)

> Possible deadlock between DataflowWorkerLoggingHandler and overridden 
> System.err PrintStream
> 
>
> Key: BEAM-9399
> URL: https://issues.apache.org/jira/browse/BEAM-9399
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Sam Whittle
>Assignee: Sam Whittle
>Priority: Minor
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> When an exception is encountered in DataflowWorkerLoggingHandler the 
> ErrorManager is used to log the exception.  ErrorManager uses System.err 
> which is overridden to be a PrintStream that writes back into 
> DataflowWorkerLoggingHandler.
> This has the lock ordering DataflowWorkerLoggingHandler -> PrintStream.
> Other logging of System.err has the inverse lock ordering 
> PrintStream->DataflowWorkerLoggingHandler so there is potential for deadlock.
> This is one known cause of the inversion, but any other System.err logs from 
> inside DataflowWorkerLoggingHandler could cause the same issue.
> Proposed fix is to address low-hanging fruit of having ErrorManager output to 
> the original System.err.  A full fix would be to improve our override of 
> System.err to a PrintStream that can detect the locking inversion or possibly 
> we could use the PrintStream mutex in both cases.
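
The "low-hanging fruit" fix described above can be sketched in plain Java as follows. This is an illustration under assumptions, not the Dataflow worker's code: `ErrorReporter` is a hypothetical stand-in for the handler's error reporting path, and the two in-memory sinks stand in for the real stderr and for the redirected stream that feeds the logging handler. The point shown is that a stream captured before `System.setErr` is called keeps writing to the original destination, so reporting an error never re-enters the redirected stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class OriginalErrSketch {

    /** Hypothetical error reporter that writes to a stream captured before any redirection. */
    static final class ErrorReporter {
        private final PrintStream original;

        ErrorReporter(PrintStream original) {
            this.original = original;
        }

        void report(String message) {
            // Uses the captured stream, not the current System.err,
            // so it never takes the redirected stream's lock.
            original.println(message);
        }
    }

    /** Returns {text seen by the original stream, text seen by the redirected stream}. */
    static String[] demo() {
        PrintStream saved = System.err;
        ByteArrayOutputStream originalSink = new ByteArrayOutputStream();
        ByteArrayOutputStream redirectedSink = new ByteArrayOutputStream();
        try {
            // Stand-in for the process's real stderr.
            System.setErr(new PrintStream(originalSink, true));
            ErrorReporter reporter = new ErrorReporter(System.err);

            // Stand-in for the worker overriding System.err to feed the logging handler.
            System.setErr(new PrintStream(redirectedSink, true));

            reporter.report("handler failure");
            System.err.println("user log line");
        } finally {
            // Always restore the stream that was installed before the demo ran.
            System.setErr(saved);
        }
        return new String[] {originalSink.toString(), redirectedSink.toString()};
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println("original: " + seen[0].trim());
        System.out.println("redirected: " + seen[1].trim());
    }
}
```

This only removes one known source of the inversion. As the description notes, any other write to the overridden System.err from inside the handler would still take the locks in the opposite order.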





[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410775=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410775
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 27/Mar/20 02:23
Start Date: 27/Mar/20 02:23
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11184: [BEAM-4374] Update 
protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#issuecomment-604780915
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 410775)
Time Spent: 33h 50m  (was: 33h 40m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 33h 50m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata





[jira] [Work logged] (BEAM-8932) Expose complete Cloud Pub/Sub messages through PubsubIO API

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8932?focusedWorklogId=410774=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410774
 ]

ASF GitHub Bot logged work on BEAM-8932:


Author: ASF GitHub Bot
Created on: 27/Mar/20 02:22
Start Date: 27/Mar/20 02:22
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10478: [BEAM-8932][Cleanup] 
Extract PubsubBoundedWriter from PubsubIO
URL: https://github.com/apache/beam/pull/10478#issuecomment-604780730
 
 
   Run Java PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 410774)
Time Spent: 17h 20m  (was: 17h 10m)

> Expose complete Cloud Pub/Sub messages through PubsubIO API
> ---
>
> Key: BEAM-8932
> URL: https://issues.apache.org/jira/browse/BEAM-8932
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: Major
>  Time Spent: 17h 20m
>  Remaining Estimate: 0h
>
> The PubsubIO API only exposes a subset of the fields in the underlying 
> PubsubMessage protocol buffer. To accommodate future feature changes as well 
> as for greater compatibility with code using the Cloud Pub/Sub APIs, a method 
> to read and write these protocol messages should be exposed.





[jira] [Work logged] (BEAM-8648) Euphoria: Deprecate OutputHints from public API

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8648?focusedWorklogId=410773=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410773
 ]

ASF GitHub Bot logged work on BEAM-8648:


Author: ASF GitHub Bot
Created on: 27/Mar/20 02:21
Start Date: 27/Mar/20 02:21
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10084: [BEAM-8648] Deprecate 
OutputHints from Euphoria API.
URL: https://github.com/apache/beam/pull/10084#issuecomment-604780643
 
 
   What should be the next action here? Should we close this PR? Is it ready to 
be merged?
 



Issue Time Tracking
---

Worklog Id: (was: 410773)
Time Spent: 1h 40m  (was: 1.5h)

> Euphoria: Deprecate OutputHints from public API
> ---
>
> Key: BEAM-8648
> URL: https://issues.apache.org/jira/browse/BEAM-8648
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-euphoria
>Reporter: David Morávek
>Assignee: David Morávek
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Deprecate OutputHints as they are no longer used during translation.





[jira] [Commented] (BEAM-9620) textio (and fileio in general) takes too long to estimate sizes of large globs

2020-03-26 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068204#comment-17068204
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-9620:
-

Though it might make the source not work in the way it's implemented today. We 
rely on estimate_size() to perform initial splitting at workers which has to 
work for the source to work. If we time limit, we have to make sure that 
splitting/reading is not affected.

> textio (and fileio in general) takes too long to estimate sizes of large globs
> --
>
> Key: BEAM-9620
> URL: https://issues.apache.org/jira/browse/BEAM-9620
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
>
> As a workaround we could introduce a way to not perform size estimation when 
> reading large globs. For example Java SDK has withHintMatchesManyFiles() 
> option.
>  
> [https://github.com/apache/beam/blob/850e8469de798d45ec535fe90cb2dc5dbda4974a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java#L371]
>  
> Additionally, it seems that we repeat the size estimation when the same 
> PCollection read from a file-based source is applied to multiple PTransforms.
>  
> See the following for more details.
> [https://stackoverflow.com/questions/60874942/avoid-recomputing-size-of-all-cloud-storage-files-in-gcsio-beam-python-sdk]
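
One way to avoid the repeated estimation mentioned above is to memoize the result per file pattern. The sketch below is plain Java, not the Python SDK's `estimate_size()` and not an existing Beam API: the class name, the `ToLongFunction` standing in for the expensive glob scan, the example pattern, and the size of 1024 are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.ToLongFunction;

class CachedSizeEstimatorSketch {
    // One cached estimate per file pattern.
    private final Map<String, Long> cache = new ConcurrentHashMap<>();
    // Hypothetical stand-in for the expensive scan over every file matching the glob.
    private final ToLongFunction<String> expensiveEstimate;

    CachedSizeEstimatorSketch(ToLongFunction<String> expensiveEstimate) {
        this.expensiveEstimate = expensiveEstimate;
    }

    long estimateSize(String filePattern) {
        // The expensive function runs only the first time a pattern is seen.
        return cache.computeIfAbsent(filePattern, p -> expensiveEstimate.applyAsLong(p));
    }

    /** Estimates the same (made-up) pattern twice and reports how often the scan ran. */
    static int demoCalls() {
        AtomicInteger calls = new AtomicInteger();
        CachedSizeEstimatorSketch estimator = new CachedSizeEstimatorSketch(pattern -> {
            calls.incrementAndGet();
            return 1024L;
        });
        estimator.estimateSize("gs://bucket/path/*.txt");
        estimator.estimateSize("gs://bucket/path/*.txt");
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("underlying estimates performed: " + demoCalls());
    }
}
```

Caching addresses only the repetition. It does not shorten the first estimate, and, as the comment above points out, skipping or time-limiting the estimate needs care because initial splitting depends on it.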





[jira] [Commented] (BEAM-9620) textio (and fileio in general) takes too long to estimate sizes of large globs

2020-03-26 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068203#comment-17068203
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-9620:
-

Yeah, that makes sense.

> textio (and fileio in general) takes too long to estimate sizes of large globs
> --
>
> Key: BEAM-9620
> URL: https://issues.apache.org/jira/browse/BEAM-9620
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major


[jira] [Work logged] (BEAM-9444) Shall we use GCP Libraries BOM to specify Google-related library versions?

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9444?focusedWorklogId=410759=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410759
 ]

ASF GitHub Bot logged work on BEAM-9444:


Author: ASF GitHub Bot
Created on: 27/Mar/20 01:44
Start Date: 27/Mar/20 01:44
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #11156: [BEAM-9444] Use GCP 
Libraries BOM for Google Cloud Dependencies
URL: https://github.com/apache/beam/pull/11156#issuecomment-604772021
 
 
   Run SQL Postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 410759)
Time Spent: 10h 20m  (was: 10h 10m)

> Shall we use GCP Libraries BOM to specify Google-related library versions?
> --
>
> Key: BEAM-9444
> URL: https://issues.apache.org/jira/browse/BEAM-9444
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Attachments: Screen Shot 2020-03-13 at 13.33.01.png, Screen Shot 
> 2020-03-17 at 16.01.16.png
>
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> Shall we use GCP Libraries BOM to specify Google-related library versions?
>   
>  I've been working on Beam's dependency upgrades for the past few months. I 
> think it's time to consider a long-term solution that keeps the libraries 
> up-to-date with little maintenance effort. To achieve that, I propose that 
> Beam use the GCP Libraries BOM to set the Google-related library versions, 
> rather than trying to make changes in each of ~30 Google libraries.
>   
> h1. Background
> A BOM is a pom.xml that provides dependencyManagement to importing projects.
>   
>  The GCP Libraries BOM is a BOM that includes many Google Cloud related 
> libraries + gRPC + protobuf. We (Google Cloud Java Diamond Dependency team) 
> maintain the BOM so that the libraries in the set are compatible with each other.
>   
> h1. Implementation
> Notes on obstacles.
> h2. BeamModulePlugin's "force" does not take BOM into account (thus fails)
> Setting {{forcedModules}} via the version resolution strategy misbehaves. This causes
> {noformat}
> A problem occurred evaluating project ':sdks:java:extensions:sql'. 
> Could not resolve all dependencies for configuration 
> ':sdks:java:extensions:sql:fmppTemplates'.
> Invalid format: 'com.google.cloud:google-cloud-core'. Group, name and version 
> cannot be empty. Correct example: 'org.gradle:gradle-core:1.0'{noformat}
> !Screen Shot 2020-03-13 at 13.33.01.png|width=489,height=287! 
>   
> h2. :sdks:java:maven-archetypes:examples needs the version of 
> google-http-client
> The task requires the version for the library:
> {code:java}
> 'google-http-client.version': 
> dependencies.create(project.library.java.google_http_client).getVersion(),
> {code}
> This would generate a NullPointerException. To run gradlew without the 
> subproject:
>   
> {code:java}
> ./gradlew -p sdks/java check -x :sdks:java:maven-archetypes:examples:check
> {code}
> h1. Problem in Gradle-generated pom files
> The generated Maven artifact POM has invalid data due to the BOM change. For 
> example my locally installed 
> {{~/.m2/repository/org/apache/beam/beam-sdks-java-io-google-cloud-platform/2.21.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.21.0-SNAPSHOT.pom}}
>  had the following problems.
> h2. The GCP Libraries BOM showing up in dependencies section:
> {noformat}
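> <dependency>
>   <groupId>com.google.cloud</groupId>
>   <artifactId>libraries-bom</artifactId>
>   <version>4.2.0</version>
>   <scope>compile</scope>
>   <exclusions>
>     <exclusion>
>       <groupId>com.google.guava</groupId>
>       <artifactId>guava-jdk5</artifactId>
> ...
>   </exclusions>
> </dependency>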
> {noformat}
> h2. The artifacts that use the BOM in Gradle are missing the version in the 
> dependency.
> {noformat}
> <dependency>
>   <groupId>com.google.api</groupId>
>   <artifactId>gax</artifactId>
>   
>   <scope>compile</scope>
>   ...
> </dependency>
> {noformat}
> h1. DependencyManagement section in generated pom.xml
> How can I check whether an entry in dependencies is "platform"?
> !Screen Shot 2020-03-17 at 16.01.16.png|width=504,height=344!





[jira] [Work logged] (BEAM-9444) Shall we use GCP Libraries BOM to specify Google-related library versions?

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9444?focusedWorklogId=410754=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410754
 ]

ASF GitHub Bot logged work on BEAM-9444:


Author: ASF GitHub Bot
Created on: 27/Mar/20 01:43
Start Date: 27/Mar/20 01:43
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #11156: [BEAM-9444] Use GCP 
Libraries BOM for Google Cloud Dependencies
URL: https://github.com/apache/beam/pull/11156#issuecomment-604771821
 
 
   Run Java PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 410754)
Time Spent: 9.5h  (was: 9h 20m)

> Shall we use GCP Libraries BOM to specify Google-related library versions?
> --
>
> Key: BEAM-9444
> URL: https://issues.apache.org/jira/browse/BEAM-9444
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Attachments: Screen Shot 2020-03-13 at 13.33.01.png, Screen Shot 
> 2020-03-17 at 16.01.16.png
>
>  Time Spent: 9.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-9444) Shall we use GCP Libraries BOM to specify Google-related library versions?

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9444?focusedWorklogId=410755=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410755
 ]

ASF GitHub Bot logged work on BEAM-9444:


Author: ASF GitHub Bot
Created on: 27/Mar/20 01:43
Start Date: 27/Mar/20 01:43
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #11156: [BEAM-9444] Use GCP 
Libraries BOM for Google Cloud Dependencies
URL: https://github.com/apache/beam/pull/11156#issuecomment-604771862
 
 
   Run Java HadoopFormatIO Performance Test
 



Issue Time Tracking
---

Worklog Id: (was: 410755)
Time Spent: 9h 40m  (was: 9.5h)

> Shall we use GCP Libraries BOM to specify Google-related library versions?
> --
>
> Key: BEAM-9444
> URL: https://issues.apache.org/jira/browse/BEAM-9444
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Attachments: Screen Shot 2020-03-13 at 13.33.01.png, Screen Shot 
> 2020-03-17 at 16.01.16.png
>
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-9444) Shall we use GCP Libraries BOM to specify Google-related library versions?

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9444?focusedWorklogId=410756=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410756
 ]

ASF GitHub Bot logged work on BEAM-9444:


Author: ASF GitHub Bot
Created on: 27/Mar/20 01:43
Start Date: 27/Mar/20 01:43
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #11156: [BEAM-9444] Use GCP 
Libraries BOM for Google Cloud Dependencies
URL: https://github.com/apache/beam/pull/11156#issuecomment-604771894
 
 
   Run BigQueryIO Streaming Performance Test Java
 



Issue Time Tracking
---

Worklog Id: (was: 410756)
Time Spent: 9h 50m  (was: 9h 40m)

> Shall we use GCP Libraries BOM to specify Google-related library versions?
> --
>
> Key: BEAM-9444
> URL: https://issues.apache.org/jira/browse/BEAM-9444
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Attachments: Screen Shot 2020-03-13 at 13.33.01.png, Screen Shot 
> 2020-03-17 at 16.01.16.png
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-9444) Shall we use GCP Libraries BOM to specify Google-related library versions?

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9444?focusedWorklogId=410757=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410757
 ]

ASF GitHub Bot logged work on BEAM-9444:


Author: ASF GitHub Bot
Created on: 27/Mar/20 01:43
Start Date: 27/Mar/20 01:43
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #11156: [BEAM-9444] Use GCP 
Libraries BOM for Google Cloud Dependencies
URL: https://github.com/apache/beam/pull/11156#issuecomment-604771955
 
 
   Run Dataflow ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 410757)
Time Spent: 10h  (was: 9h 50m)

> Shall we use GCP Libraries BOM to specify Google-related library versions?
> --
>
> Key: BEAM-9444
> URL: https://issues.apache.org/jira/browse/BEAM-9444
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Attachments: Screen Shot 2020-03-13 at 13.33.01.png, Screen Shot 
> 2020-03-17 at 16.01.16.png
>
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> Shall we use GCP Libraries BOM to specify Google-related library versions?
>   
>  I've been working on Beam's dependency upgrades in the past few months. I 
> think it's time to consider a long-term solution to keep the libraries 
> up-to-date with small maintenance effort. To achieve that, I propose Beam to 
> use GCP Libraries BOM to set the Google-related library versions, rather than 
> trying to make changes in each of ~30 Google libraries.
>   
> h1. Background
> A BOM is pom.xml that provides dependencyManagement to importing projects.
>   
>  GCP Libraries BOM is a BOM that includes many Google Cloud related libraries 
> + gRPC + protobuf. We (Google Cloud Java Diamond Dependency team) maintain 
> the BOM so that the set of the libraries are compatible with each other.
>   
> h1. Implementation
> Notes for obstacles.
> h2. BeamModulePlugin's "force" does not take BOM into account (thus fails)
> {{forcedModules}} via version resolution strategy is playing bad. This causes
> {noformat}
> A problem occurred evaluating project ':sdks:java:extensions:sql'. 
> Could not resolve all dependencies for configuration 
> ':sdks:java:extensions:sql:fmppTemplates'.
> Invalid format: 'com.google.cloud:google-cloud-core'. Group, name and version 
> cannot be empty. Correct example: 'org.gradle:gradle-core:1.0'{noformat}
> !Screen Shot 2020-03-13 at 13.33.01.png|width=489,height=287! 
>   
> h2. :sdks:java:maven-archetypes:examples needs the version of 
> google-http-client
> The task requires the version for the library:
> {code:java}
> 'google-http-client.version': 
> dependencies.create(project.library.java.google_http_client).getVersion(),
> {code}
> This would throw a NullPointerException. To run gradlew without the 
> subproject:
>   
> {code:java}
> ./gradlew -p sdks/java check -x :sdks:java:maven-archetypes:examples:check
> {code}
> h1. Problem in Gradle-generated pom files
> The generated Maven artifact POM has invalid data due to the BOM change. For 
> example my locally installed 
> {{~/.m2/repository/org/apache/beam/beam-sdks-java-io-google-cloud-platform/2.21.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.21.0-SNAPSHOT.pom}}
>  had the following problems.
> h2. The GCP Libraries BOM showing up in dependencies section:
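> {noformat}
>   <dependencies>
>     <dependency>
>       <groupId>com.google.cloud</groupId>
>       <artifactId>libraries-bom</artifactId>
>       <version>4.2.0</version>
>       <scope>compile</scope>
>       <exclusions>
>         <exclusion>
>           <groupId>com.google.guava</groupId>
>           <artifactId>guava-jdk5</artifactId>
> ...
>       </exclusions>
>     </dependency>
> {noformat}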
> h2. Artifacts that use the BOM in Gradle are missing the version in the 
> dependency.
> {noformat}
> <dependency>
>   <groupId>com.google.api</groupId>
>   <artifactId>gax</artifactId>
>   
>   <scope>compile</scope>
>   ...
> </dependency>
> {noformat}
> h1. DependencyManagement section in generated pom.xml
> How can I check whether an entry in dependencies is "platform"?
> !Screen Shot 2020-03-17 at 16.01.16.png|width=504,height=344!
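
The two POM symptoms described above (a BOM listed as an ordinary dependency, and a dependency with no version) can be checked mechanically. The following is only a sketch under stated assumptions: the sample POM is a hypothetical, trimmed-down stand-in for the generated file, the function name `audit_pom` is made up for illustration, and treating any artifactId ending in "-bom" as a BOM is a naming heuristic, not something Gradle or Maven guarantees.

```python
import xml.etree.ElementTree as ET

# XML namespace used by Maven POM files.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical, trimmed-down POM reproducing the two symptoms from the issue.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>4.2.0</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.google.api</groupId>
      <artifactId>gax</artifactId>
      <scope>compile</scope>
    </dependency>
  </dependencies>
</project>"""


def audit_pom(pom_text):
    """Return (bom_like_dependencies, dependencies_without_version)."""
    root = ET.fromstring(pom_text)

    def coords(dep):
        # "groupId:artifactId" for one <dependency> element.
        group = dep.findtext("m:groupId", default="", namespaces=NS)
        artifact = dep.findtext("m:artifactId", default="", namespaces=NS)
        return group + ":" + artifact

    # Coordinates whose version is supplied by <dependencyManagement>, if any.
    managed = {
        coords(dep)
        for dep in root.findall(
            "m:dependencyManagement/m:dependencies/m:dependency", NS)
    }

    bom_like, unversioned = [], []
    for dep in root.findall("m:dependencies/m:dependency", NS):
        name = coords(dep)
        version = dep.findtext("m:version", default="", namespaces=NS).strip()
        # Assumption: BOM artifacts are conventionally named "*-bom".
        if name.endswith("-bom"):
            bom_like.append(name)
        elif not version and name not in managed:
            unversioned.append(name)
    return bom_like, unversioned
```

Running `audit_pom` on a locally installed POM (read from disk as text) would list the entries that need attention. It does not answer the Gradle-side question of how to detect a "platform" dependency while the POM is being generated.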





[jira] [Work logged] (BEAM-9444) Shall we use GCP Libraries BOM to specify Google-related library versions?

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9444?focusedWorklogId=410758=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410758
 ]

ASF GitHub Bot logged work on BEAM-9444:


Author: ASF GitHub Bot
Created on: 27/Mar/20 01:43
Start Date: 27/Mar/20 01:43
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #11156: [BEAM-9444] Use GCP 
Libraries BOM for Google Cloud Dependencies
URL: https://github.com/apache/beam/pull/11156#issuecomment-604771990
 
 
   Run Spark ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 410758)
Time Spent: 10h 10m  (was: 10h)

> Shall we use GCP Libraries BOM to specify Google-related library versions?
> --
>
> Key: BEAM-9444
> URL: https://issues.apache.org/jira/browse/BEAM-9444
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Attachments: Screen Shot 2020-03-13 at 13.33.01.png, Screen Shot 
> 2020-03-17 at 16.01.16.png
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-8751) Beam Dependency Update Request: com.google.apis:google-api-services-cloudresourcemanager

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8751?focusedWorklogId=410753=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410753
 ]

ASF GitHub Bot logged work on BEAM-8751:


Author: ASF GitHub Bot
Created on: 27/Mar/20 01:42
Start Date: 27/Mar/20 01:42
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #11208: [BEAM-8751] 
google-api-client 1.30.9
URL: https://github.com/apache/beam/pull/11208#issuecomment-604771741
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 410753)
Time Spent: 1h 50m  (was: 1h 40m)

> Beam Dependency Update Request: 
> com.google.apis:google-api-services-cloudresourcemanager
> 
>
> Key: BEAM-8751
> URL: https://issues.apache.org/jira/browse/BEAM-8751
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
>  - 2019-11-19 21:04:41.938497 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191018-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:09:51.401493 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191115-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:09:00.761817 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191115-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:09:01.384571 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191206-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:04:31.850871 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191206-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:08:07.241510 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191206-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-13 12:08:00.916536 
> -
> Please consider upgrading the dependency 
> com.google.apis:google-api-services-cloudresourcemanager. 
> The current version is v1-rev20181015-1.28.0. The latest version is 
> v2-rev20191206-1.30.3 
> cc: [~chamikara], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-20 

[jira] [Commented] (BEAM-9620) textio (and fileio in general) takes too long to estimate sizes of large globs

2020-03-26 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068187#comment-17068187
 ] 

Udi Meiri commented on BEAM-9620:
-

Since this is an estimation, perhaps there should be limits on how much it 
samples or a maximum amount of time it can spend sampling (overall).
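
A minimal sketch of what such a bounded estimate could look like, assuming a plain list of file paths and a caller-supplied `size_of` function. All names and default values here are hypothetical and are not taken from the Beam code base.

```python
import time


def estimate_total_size(paths, size_of, max_samples=100, time_budget_s=5.0,
                        clock=time.monotonic):
    """Estimate the combined size of `paths` from a bounded sample.

    At most `max_samples` files are measured, and sampling stops early once
    `time_budget_s` seconds have elapsed. The mean sampled size is then
    extrapolated to the full list.
    """
    if max_samples < 1:
        raise ValueError("max_samples must be at least 1")
    paths = list(paths)
    if not paths:
        return 0

    deadline = clock() + time_budget_s
    # Take evenly spaced samples so one region of the glob does not dominate.
    step = max(1, len(paths) // max_samples)
    sampled_bytes = 0
    sampled_count = 0
    for path in paths[::step][:max_samples]:
        # Always measure at least one file, then honour the time budget.
        if sampled_count and clock() >= deadline:
            break
        sampled_bytes += size_of(path)
        sampled_count += 1

    return int(sampled_bytes / sampled_count * len(paths))
```

The trade-off is accuracy for latency: with very uneven file sizes the extrapolated value can be far off, which is acceptable only because the result is an estimate.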

> textio (and fileio in general) takes too long to estimate sizes of large globs
> --
>
> Key: BEAM-9620
> URL: https://issues.apache.org/jira/browse/BEAM-9620
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
>
> As a workaround, we could introduce a way to skip size estimation when 
> reading large globs. For example, the Java SDK has the withHintMatchesManyFiles() 
> option.
>  
> [https://github.com/apache/beam/blob/850e8469de798d45ec535fe90cb2dc5dbda4974a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java#L371]
>  
> Additionally, it seems that we repeat the size estimation when the same 
> PCollection read from a file-based source is applied to multiple PTransforms.
>  
> See the following for more details.
> [https://stackoverflow.com/questions/60874942/avoid-recomputing-size-of-all-cloud-storage-files-in-gcsio-beam-python-sdk]
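
One way to avoid the repeated estimation described above is to compute the size once and reuse it. The sketch below is illustrative only: `CachedSizeSource` and `SlowSource` are hypothetical stand-ins, and the only thing borrowed from the discussion is the idea of an `estimate_size()` call that is expensive for large globs.

```python
class CachedSizeSource:
    """Hypothetical wrapper that caches an expensive estimate_size() result."""

    def __init__(self, source):
        self._source = source
        self._has_cached = False
        self._cached_size = None

    def estimate_size(self):
        # Delegate to the wrapped source only on the first call.
        if not self._has_cached:
            self._cached_size = self._source.estimate_size()
            self._has_cached = True
        return self._cached_size


class SlowSource:
    """Stand-in for a file-based source; counts how often it is asked."""

    def __init__(self, size):
        self.size = size
        self.calls = 0

    def estimate_size(self):
        self.calls += 1
        return self.size
```

With this wrapper, applying the same source to several consumers triggers the underlying estimation once. The cached value can go stale if the matched files change, so it only suits cases where that is acceptable.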





[jira] [Work logged] (BEAM-8603) Add Python SqlTransform MVP

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?focusedWorklogId=410752=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410752
 ]

ASF GitHub Bot logged work on BEAM-8603:


Author: ASF GitHub Bot
Created on: 27/Mar/20 01:16
Start Date: 27/Mar/20 01:16
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #10055: 
[BEAM-8603] Add Python SqlTransform
URL: https://github.com/apache/beam/pull/10055#discussion_r398982127
 
 

 ##
 File path: sdks/python/apache_beam/transforms/sql_test.py
 ##
 @@ -0,0 +1,109 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Tests for transforms that use the SQL Expansion service."""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import logging
+import typing
+import unittest
+
+from nose.plugins.attrib import attr
+from past.builtins import unicode
+
+import apache_beam as beam
+from apache_beam import coders
+from apache_beam.options.pipeline_options import DebugOptions
+from apache_beam.options.pipeline_options import StandardOptions
+from apache_beam.testing.test_pipeline import TestPipeline
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import equal_to
+from apache_beam.transforms.sql import SqlTransform
+from apache_beam.utils import subprocess_server
+
+SimpleRow = typing.NamedTuple(
+"SimpleRow", [("int", int), ("str", unicode), ("flt", float)])
+coders.registry.register_coder(SimpleRow, coders.RowCoder)
+
+
+@attr('UsesSqlExpansionService')
+@unittest.skipIf(
+TestPipeline().get_pipeline_options().view_as(StandardOptions).runner is
+None,
+"Must be run with a runner that supports cross-language transforms")
 
 Review comment:
   Actually, I also ran into an issue when running this test with the default 
runner that I'm having a hard time making sense of:
   
   ```
   E   ValueError: Missing requirement declaration: {'beam:requirement:pardo:splittable_dofn:v1'}
   
   sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py:651: ValueError
   ```
   
   It seems to indicate that my pipeline should contain a `splittable_dofn` 
requirement declaration but doesn't. I'm not clear on why it needs that 
declaration (I don't think anything in the pipeline needs to be splittable), or 
how to add it.
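
   For readers unfamiliar with this kind of error, here is a toy illustration of how a message of this shape can arise. It is not the actual `fn_runner.py` code; `check_requirements` and both of its arguments are hypothetical, and the sketch only assumes that a runner compares the requirement URNs a pipeline uses against the ones it declares.

```python
SPLITTABLE_DOFN_URN = "beam:requirement:pardo:splittable_dofn:v1"


def check_requirements(used, declared):
    """Raise if any used requirement URN is missing from the declared set."""
    missing = set(used) - set(declared)
    if missing:
        raise ValueError(f"Missing requirement declaration: {missing}")
```

   Under that assumption, the error means that some transform in the submitted pipeline was marked as using the splittable DoFn feature while the pipeline's declared requirements did not list it. Which transform that is cannot be determined from the message alone.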
 



Issue Time Tracking
---

Worklog Id: (was: 410752)
Time Spent: 5h  (was: 4h 50m)

> Add Python SqlTransform MVP
> ---
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (BEAM-9432) Create a separate expansion service package.

2020-03-26 Thread Robert Bradshaw (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Bradshaw resolved BEAM-9432.
---
Fix Version/s: 2.21.0
   Resolution: Fixed

> Create a separate expansion service package.
> 
>
> Key: BEAM-9432
> URL: https://issues.apache.org/jira/browse/BEAM-9432
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-9339) Declare capabilities in SDK environments

2020-03-26 Thread Robert Bradshaw (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068180#comment-17068180
 ] 

Robert Bradshaw commented on BEAM-9339:
---

This is now done.

> Declare capabilities in SDK environments
> 
>
> Key: BEAM-9339
> URL: https://issues.apache.org/jira/browse/BEAM-9339
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (BEAM-9339) Declare capabilities in SDK environments

2020-03-26 Thread Robert Bradshaw (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Bradshaw resolved BEAM-9339.
---
Fix Version/s: 2.21.0
   Resolution: Fixed

> Declare capabilities in SDK environments
> 
>
> Key: BEAM-9339
> URL: https://issues.apache.org/jira/browse/BEAM-9339
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (BEAM-9433) Create an expansion service artifact for common IOs

2020-03-26 Thread Robert Bradshaw (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Bradshaw resolved BEAM-9433.
---
Fix Version/s: 2.21.0
   Resolution: Fixed

> Create an expansion service artifact for common IOs
> ---
>
> Key: BEAM-9433
> URL: https://issues.apache.org/jira/browse/BEAM-9433
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, sdk-java-core, sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> This will allow users to easily leverage Java IOs from Python/Go/... 
> pipelines. 





[jira] [Work logged] (BEAM-9618) Allow SDKs to pull process bundle descriptors.

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9618?focusedWorklogId=410747=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410747
 ]

ASF GitHub Bot logged work on BEAM-9618:


Author: ASF GitHub Bot
Created on: 27/Mar/20 01:03
Start Date: 27/Mar/20 01:03
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11235: [BEAM-9618] Pull 
bundle descriptors.
URL: https://github.com/apache/beam/pull/11235#issuecomment-604762162
 
 
   R: @lukecwik this is rebased and should be ready for review. I will remove 
commit cebab89 before merging. 
 



Issue Time Tracking
---

Worklog Id: (was: 410747)
Remaining Estimate: 0h
Time Spent: 10m

> Allow SDKs to pull process bundle descriptors.
> --
>
> Key: BEAM-9618
> URL: https://issues.apache.org/jira/browse/BEAM-9618
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (BEAM-9618) Allow SDKs to pull process bundle descriptors.

2020-03-26 Thread Robert Bradshaw (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Bradshaw reassigned BEAM-9618:
-

Assignee: Robert Bradshaw

> Allow SDKs to pull process bundle descriptors.
> --
>
> Key: BEAM-9618
> URL: https://issues.apache.org/jira/browse/BEAM-9618
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>






[jira] [Work logged] (BEAM-3097) Allow BigQuerySource to take a ValueProvider as a table input.

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3097?focusedWorklogId=410744=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410744
 ]

ASF GitHub Bot logged work on BEAM-3097:


Author: ASF GitHub Bot
Created on: 27/Mar/20 00:50
Start Date: 27/Mar/20 00:50
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11244: [BEAM-3097] 
_ReadFromBigQuery supports valueprovider for table
URL: https://github.com/apache/beam/pull/11244
 
 
   **Please** add a meaningful description for your change here
   
   
   

[jira] [Work logged] (BEAM-3097) Allow BigQuerySource to take a ValueProvider as a table input.

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3097?focusedWorklogId=410745=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410745
 ]

ASF GitHub Bot logged work on BEAM-3097:


Author: ASF GitHub Bot
Created on: 27/Mar/20 00:50
Start Date: 27/Mar/20 00:50
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #11244: [BEAM-3097] 
_ReadFromBigQuery supports valueprovider for table
URL: https://github.com/apache/beam/pull/11244#issuecomment-604759269
 
 
   Run Python 3.7 PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 410745)
Remaining Estimate: 1h 40m  (was: 1h 50m)
Time Spent: 20m  (was: 10m)

> Allow BigQuerySource to take a ValueProvider as a table input.
> --
>
> Key: BEAM-3097
> URL: https://issues.apache.org/jira/browse/BEAM-3097
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ed Mothershaw
>Priority: Minor
>   Original Estimate: 2h
>  Time Spent: 20m
>  Remaining Estimate: 1h 40m
>
> In file sdks/python/apache_beam/io/gcp/bigquery.py, class BigQuery, line 389: 
> when a ValueProvider is passed as the table, the script fails.
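
The failure mode can be sketched without Beam itself. In the example below, `StaticValue` is a hypothetical stand-in for a value provider (an object whose value is only available through `get()`), and `parse_table` is an illustrative helper, not the code at the line referenced above. The assumption is that the original code applies string operations to the table argument directly, which cannot work on a provider object.

```python
class StaticValue:
    """Stand-in for a value provider: the value is only read via get()."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


def parse_table(table):
    """Split 'project:dataset.table' into its parts.

    Accepts either a plain string or a provider-like object with get().
    The provider is resolved here, at the point of use, rather than when
    the transform is constructed.
    """
    if callable(getattr(table, "get", None)):
        table = table.get()
    project, rest = table.split(":", 1)
    dataset, name = rest.split(".", 1)
    return project, dataset, name
```

The design point is deferral: a provider's value may not exist at construction time, so any parsing has to happen when the value is actually needed.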





[jira] [Comment Edited] (BEAM-9620) textio (and fileio in general) takes too long to estimate sizes of large globs

2020-03-26 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068168#comment-17068168
 ] 

Chamikara Madhusanka Jayalath edited comment on BEAM-9620 at 3/27/20, 12:48 AM:


Actually, I think we do have a workaround. ReadAllFromText (and the various other 
ReadAll transforms) should not run into this issue.

[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/textio.py#L438]


was (Author: chamikara):
Actually, I think we do have a workaround. ReadAllFromText (and other various 
ReadAll transforms, should not run into this issue).

[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/textio.py#L438]

> textio (and fileio in general) takes too long to estimate sizes of large globs
> --
>
> Key: BEAM-9620
> URL: https://issues.apache.org/jira/browse/BEAM-9620
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
>





[jira] [Work logged] (BEAM-8603) Add Python SqlTransform MVP

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?focusedWorklogId=410743=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410743
 ]

ASF GitHub Bot logged work on BEAM-8603:


Author: ASF GitHub Bot
Created on: 27/Mar/20 00:48
Start Date: 27/Mar/20 00:48
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #10055: [BEAM-8603] Add 
Python SqlTransform
URL: https://github.com/apache/beam/pull/10055#issuecomment-604758757
 
 
   > I reviewed everything but the groovy files, which I would like another set 
of eyes on.
   
   R: @ihji could you take a look at the groovy changes?
 



Issue Time Tracking
---

Worklog Id: (was: 410743)
Time Spent: 4h 50m  (was: 4h 40m)

> Add Python SqlTransform MVP
> ---
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-9620) textio (and fileio in general) takes too long to estimate sizes of large globs

2020-03-26 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068168#comment-17068168
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-9620:
-

Actually, I think we do have a workaround. ReadAllFromText (and the various other 
ReadAll transforms) should not run into this issue.

[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/textio.py#L438]

> textio (and fileio in general) takes too long to estimate sizes of large globs
> --
>
> Key: BEAM-9620
> URL: https://issues.apache.org/jira/browse/BEAM-9620
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
>





[jira] [Created] (BEAM-9623) Add support for TableProviders in Python SqlTransform

2020-03-26 Thread Brian Hulette (Jira)
Brian Hulette created BEAM-9623:
---

 Summary: Add support for TableProviders in Python SqlTransform
 Key: BEAM-9623
 URL: https://issues.apache.org/jira/browse/BEAM-9623
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql, sdk-py-core
Reporter: Brian Hulette


It should be possible to use e.g. DataCatalogTableProvider and access BigQuery, 
PubSub, and GCS in queries.





[jira] [Created] (BEAM-9622) Support for consuming tagged PCollections in Python SqlTransform

2020-03-26 Thread Brian Hulette (Jira)
Brian Hulette created BEAM-9622:
---

 Summary: Support for consuming tagged PCollections in Python 
SqlTransform
 Key: BEAM-9622
 URL: https://issues.apache.org/jira/browse/BEAM-9622
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql, sdk-py-core
Reporter: Brian Hulette








[jira] [Updated] (BEAM-8603) Add Python SqlTransform MVP

2020-03-26 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette updated BEAM-8603:

Component/s: dsl-sql

> Add Python SqlTransform MVP
> ---
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>






[jira] [Created] (BEAM-9621) Python SqlTransform follow-ups

2020-03-26 Thread Brian Hulette (Jira)
Brian Hulette created BEAM-9621:
---

 Summary: Python SqlTransform follow-ups
 Key: BEAM-9621
 URL: https://issues.apache.org/jira/browse/BEAM-9621
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql, sdk-py-core
Reporter: Brian Hulette
Assignee: Brian Hulette


Tracking JIRA for follow-up work to improve SqlTransform in Python





[jira] [Updated] (BEAM-8603) Add Python SqlTransform MVP

2020-03-26 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette updated BEAM-8603:

Summary: Add Python SqlTransform MVP  (was: Add Python SqlTransform example 
script)

> Add Python SqlTransform MVP
> ---
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (BEAM-9574) NamedTuple instances generated from schemas cannot be pickled

2020-03-26 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette resolved BEAM-9574.
-
Fix Version/s: 2.21.0
   Resolution: Fixed

> NamedTuple instances generated from schemas cannot be pickled
> -
>
> Key: BEAM-9574
> URL: https://issues.apache.org/jira/browse/BEAM-9574
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Attempting to pickle an instance of a generated NamedTuple class results in 
> the following:
> {code}
> _pickle.PicklingError: Can't pickle <class 'apache_beam.typehints.schemas.BeamSchema_a7de91e0_ae11_4c52_a041_0b58ada35ac1'>:
>  attribute lookup BeamSchema_a7de91e0_ae11_4c52_a041_0b58ada35ac1 on 
> apache_beam.typehints.schemas failed
> {code}
> In general, we shouldn't be pickling these instances, but occasionally it may 
> be necessary, and we should just do it rather than failing hard.
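
The failure above arises because pickle stores a class by module and name, and a class generated at runtime cannot be found again by attribute lookup. A minimal standard-library sketch of one way around this follows: the instance pickles a recipe (field names plus values) through `__reduce__`, and a module-level function rebuilds the class on load. The names `named_tuple_from_fields` and `_hydrate_instance` are illustrative stand-ins, and a plain tuple of field names stands in for the encoded schema; this is not presented as Beam's actual implementation.

```python
import pickle
import uuid
from collections import namedtuple


def _hydrate_instance(field_names, values):
    """Rebuild a generated class from its field names, then instantiate it."""
    return named_tuple_from_fields(field_names)(*values)


def named_tuple_from_fields(field_names):
    """Create a NamedTuple class at runtime with a unique, unregistered name."""
    field_names = tuple(field_names)
    type_name = "GeneratedSchema_" + uuid.uuid4().hex
    cls = namedtuple(type_name, field_names)

    def __reduce__(self):
        # Pickle the recipe (field names and values) instead of a class
        # reference that attribute lookup on the module could not resolve.
        return _hydrate_instance, (field_names, tuple(self))

    cls.__reduce__ = __reduce__
    return cls


if __name__ == "__main__":
    Row = named_tuple_from_fields(["id", "name"])
    restored = pickle.loads(pickle.dumps(Row(1, "a")))
    print(restored)
```

The restored instance belongs to a freshly generated class, so it compares equal to the original as a tuple and has the same fields, but its type is not identical to the original type.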





[jira] [Work logged] (BEAM-9574) NamedTuple instances generated from schemas cannot be pickled

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9574?focusedWorklogId=410742=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410742
 ]

ASF GitHub Bot logged work on BEAM-9574:


Author: ASF GitHub Bot
Created on: 27/Mar/20 00:35
Start Date: 27/Mar/20 00:35
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #11196: 
[BEAM-9574] Ensure that instances of generated NamedTuple classes can be pickled
URL: https://github.com/apache/beam/pull/11196#discussion_r398971677
 
 

 ##
 File path: sdks/python/apache_beam/typehints/schemas.py
 ##
 @@ -205,6 +218,11 @@ def typing_from_runner_api(fieldtype_proto):
 pass  # TODO
 
 
+def _hydrate_namedtuple_instance(encoded_schema, values):
+  return named_tuple_from_schema(
+  proto_utils.parse_Bytes(encoded_schema, schema_pb2.Schema))(*values)
+
+
 def named_tuple_from_schema(schema):
 
 Review comment:
   I went ahead and merged this. Let me know if you think it should be 
tweaked, and I can do that separately.
 



Issue Time Tracking
---

Worklog Id: (was: 410742)
Time Spent: 50m  (was: 40m)

> NamedTuple instances generated from schemas cannot be pickled
> -
>
> Key: BEAM-9574
> URL: https://issues.apache.org/jira/browse/BEAM-9574
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Attempting to pickle an instance of a generated NamedTuple class results in 
> the following:
> {code}
> _pickle.PicklingError: Can't pickle <class 'apache_beam.typehints.schemas.BeamSchema_a7de91e0_ae11_4c52_a041_0b58ada35ac1'>:
>  attribute lookup BeamSchema_a7de91e0_ae11_4c52_a041_0b58ada35ac1 on 
> apache_beam.typehints.schemas failed
> {code}
> In general, we shouldn't be pickling these instances, but occasionally it may 
> be necessary, and we should just do it rather than failing hard.





[jira] [Work logged] (BEAM-9574) NamedTuple instances generated from schemas cannot be pickled

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9574?focusedWorklogId=410740=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410740
 ]

ASF GitHub Bot logged work on BEAM-9574:


Author: ASF GitHub Bot
Created on: 27/Mar/20 00:34
Start Date: 27/Mar/20 00:34
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #11196: 
[BEAM-9574] Ensure that instances of generated NamedTuple classes can be pickled
URL: https://github.com/apache/beam/pull/11196
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 410740)
Time Spent: 40m  (was: 0.5h)

> NamedTuple instances generated from schemas cannot be pickled
> -
>
> Key: BEAM-9574
> URL: https://issues.apache.org/jira/browse/BEAM-9574
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Attempting to pickle an instance of a generated NamedTuple class results in 
> the following:
> {code}
> _pickle.PicklingError: Can't pickle <class 'apache_beam.typehints.schemas.BeamSchema_a7de91e0_ae11_4c52_a041_0b58ada35ac1'>:
>  attribute lookup BeamSchema_a7de91e0_ae11_4c52_a041_0b58ada35ac1 on 
> apache_beam.typehints.schemas failed
> {code}
> In general, we shouldn't be pickling these instances, but occasionally it may 
> be necessary, and we should just do it rather than failing hard.





[jira] [Commented] (BEAM-9620) textio (and fileio in general) takes too long to estimate sizes of large globs

2020-03-26 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068161#comment-17068161
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-9620:
-

cc: [~pabloem] [~udim]

> textio (and fileio in general) takes too long to estimate sizes of large globs
> --
>
> Key: BEAM-9620
> URL: https://issues.apache.org/jira/browse/BEAM-9620
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
>
> As a workaround, we could introduce a way to skip size estimation when 
> reading large globs. For example, the Java SDK has the 
> withHintMatchesManyFiles() option.
>  
> [https://github.com/apache/beam/blob/850e8469de798d45ec535fe90cb2dc5dbda4974a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java#L371]
>  
> Additionally, it seems that we repeat the size estimation when the same 
> PCollection, read from a file-based source, is applied to multiple PTransforms.
>  
> See the following for more details.
> [https://stackoverflow.com/questions/60874942/avoid-recomputing-size-of-all-cloud-storage-files-in-gcsio-beam-python-sdk]





[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410739=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410739
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 27/Mar/20 00:29
Start Date: 27/Mar/20 00:29
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11184: [BEAM-4374] Update 
protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#issuecomment-604754371
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 410739)
Time Spent: 33h 40m  (was: 33.5h)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 33h 40m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata





[jira] [Created] (BEAM-9620) textio (and fileio in general) takes too long to estimate sizes of large globs

2020-03-26 Thread Chamikara Madhusanka Jayalath (Jira)
Chamikara Madhusanka Jayalath created BEAM-9620:
---

 Summary: textio (and fileio in general) takes too long to estimate 
sizes of large globs
 Key: BEAM-9620
 URL: https://issues.apache.org/jira/browse/BEAM-9620
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Chamikara Madhusanka Jayalath


As a workaround, we could introduce a way to skip size estimation when 
reading large globs. For example, the Java SDK has the 
withHintMatchesManyFiles() option.

[https://github.com/apache/beam/blob/850e8469de798d45ec535fe90cb2dc5dbda4974a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java#L371]

Additionally, it seems that we repeat the size estimation when the same 
PCollection, read from a file-based source, is applied to multiple PTransforms.

See the following for more details.

[https://stackoverflow.com/questions/60874942/avoid-recomputing-size-of-all-cloud-storage-files-in-gcsio-beam-python-sdk]





[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=410730=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410730
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 27/Mar/20 00:00
Start Date: 27/Mar/20 00:00
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #11243: 
[BEAM-9136]Add licenses for dependencies for Java
URL: https://github.com/apache/beam/pull/11243
 
 

[jira] [Work logged] (BEAM-4150) Standardize use of PCollection coder proto attribute

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4150?focusedWorklogId=410726=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410726
 ]

ASF GitHub Bot logged work on BEAM-4150:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:53
Start Date: 26/Mar/20 23:53
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #11222: [BEAM-4150] Don't 
window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604744648
 
 
   may need to rebase to get passing Docker PreCommit?
 



Issue Time Tracking
---

Worklog Id: (was: 410726)
Time Spent: 8h 50m  (was: 8h 40m)

> Standardize use of PCollection coder proto attribute
> 
>
> Key: BEAM-4150
> URL: https://issues.apache.org/jira/browse/BEAM-4150
> Project: Beam
>  Issue Type: Task
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Luke Cwik
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> In some places it's expected to be a WindowedCoder, in others the raw 
> ElementCoder. We should use the same convention (decided in discussion to be 
> the raw ElementCoder) everywhere. The WindowCoder can be pulled out of the 
> attached windowing strategy, and the input/output ports should specify the 
> encoding directly rather than read the adjacent PCollection coder fields. 





[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410717=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410717
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:48
Start Date: 26/Mar/20 23:48
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11184: [BEAM-4374] 
Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r398957888
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -52,38 +61,160 @@ message Annotation {
   string value = 2;
 }
 
-// Populated MonitoringInfoSpecs for specific URNs.
-// Indicating the required fields to be set.
-// SDKs and RunnerHarnesses can load these instances into memory and write a
-// validator or code generator to assist with populating and validating
-// MonitoringInfo protos.
+// A set of well known MonitoringInfo specifications.
 message MonitoringInfoSpecs {
   enum Enum {
-// TODO(BEAM-6926): Add the PTRANSFORM name as a required label after
-// upgrading the python SDK.
-USER_COUNTER = 0 [(monitoring_info_spec) = {
-  urn: "beam:metric:user",
-  type_urn: "beam:metrics:sum_int_64",
+// Represents an integer counter where values are summed across bundles.
+USER_SUM_INT64 = 0 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:v1",
+  type: "beam:metrics:sum_int64:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents a double counter where values are summed across bundles.
+USER_SUM_DOUBLE = 1 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:v1",
+  type: "beam:metrics:sum_double:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
 
 Review comment:
   I chatted with Alex about this, and having the type URNs describe the 
encoding was enough.
 



Issue Time Tracking
---

Worklog Id: (was: 410717)
Time Spent: 33.5h  (was: 33h 20m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 33.5h
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata





[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410716=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410716
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:47
Start Date: 26/Mar/20 23:47
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11184: [BEAM-4374] 
Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r398957458
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -52,38 +62,160 @@ message Annotation {
   string value = 2;
 }
 
-// Populated MonitoringInfoSpecs for specific URNs.
-// Indicating the required fields to be set.
-// SDKs and RunnerHarnesses can load these instances into memory and write a
-// validator or code generator to assist with populating and validating
-// MonitoringInfo protos.
+// A set of well known MonitoringInfo specifications.
 message MonitoringInfoSpecs {
   enum Enum {
-// TODO(BEAM-6926): Add the PTRANSFORM name as a required label after
-// upgrading the python SDK.
-USER_COUNTER = 0 [(monitoring_info_spec) = {
-  urn: "beam:metric:user",
-  type_urn: "beam:metrics:sum_int_64",
+// Represents an integer counter where values are summed across bundles.
+USER_SUM_INT64 = 0 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:sum_int64:v1",
+  type: "beam:metrics:sum_int64:v1",
   required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
   annotations: [{
 key: "description",
-value: "URN utilized to report user numeric counters."
+value: "URN utilized to report user metric."
   }]
 }];
 
-ELEMENT_COUNT = 1 [(monitoring_info_spec) = {
+// Represents a double counter where values are summed across bundles.
+USER_SUM_DOUBLE = 1 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:sum_double:v1",
+  type: "beam:metrics:sum_double:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+USER_DISTRIBUTION_INT64 = 2 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:distribution_int64:v1",
+  type: "beam:metrics:distribution_int64:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents a distribution of a double value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+USER_DISTRIBUTION_DOUBLE = 3 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:distribution_double:v1",
+  type: "beam:metrics:distribution_double:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+USER_LATEST_INT64 = 4 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:latest_int64:v1",
+  type: "beam:metrics:latest_int64:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents the latest seen double value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+USER_LATEST_DOUBLE = 5 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:latest_double:v1",
+  type: "beam:metrics:latest_double:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents the largest set of integer values seen across bundles.
+USER_TOP_N_INT64 = 6 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:top_n_int64:v1",
+  type: "beam:metrics:top_n_int64:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+  

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410714=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410714
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:44
Start Date: 26/Mar/20 23:44
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11184: [BEAM-4374] 
Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r398953040
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -52,38 +62,160 @@ message Annotation {
   string value = 2;
 }
 
-// Populated MonitoringInfoSpecs for specific URNs.
-// Indicating the required fields to be set.
-// SDKs and RunnerHarnesses can load these instances into memory and write a
-// validator or code generator to assist with populating and validating
-// MonitoringInfo protos.
+// A set of well known MonitoringInfo specifications.
 message MonitoringInfoSpecs {
   enum Enum {
-// TODO(BEAM-6926): Add the PTRANSFORM name as a required label after
-// upgrading the python SDK.
-USER_COUNTER = 0 [(monitoring_info_spec) = {
-  urn: "beam:metric:user",
-  type_urn: "beam:metrics:sum_int_64",
+// Represents an integer counter where values are summed across bundles.
+USER_SUM_INT64 = 0 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:sum_int64:v1",
+  type: "beam:metrics:sum_int64:v1",
   required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
   annotations: [{
 key: "description",
-value: "URN utilized to report user numeric counters."
+value: "URN utilized to report user metric."
   }]
 }];
 
-ELEMENT_COUNT = 1 [(monitoring_info_spec) = {
+// Represents a double counter where values are summed across bundles.
+USER_SUM_DOUBLE = 1 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:sum_double:v1",
+  type: "beam:metrics:sum_double:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+USER_DISTRIBUTION_INT64 = 2 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:distribution_int64:v1",
+  type: "beam:metrics:distribution_int64:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents a distribution of a double value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+USER_DISTRIBUTION_DOUBLE = 3 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:distribution_double:v1",
+  type: "beam:metrics:distribution_double:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+USER_LATEST_INT64 = 4 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:latest_int64:v1",
+  type: "beam:metrics:latest_int64:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents the latest seen double value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+USER_LATEST_DOUBLE = 5 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:latest_double:v1",
+  type: "beam:metrics:latest_double:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents the largest set of integer values seen across bundles.
+USER_TOP_N_INT64 = 6 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:top_n_int64:v1",
+  type: "beam:metrics:top_n_int64:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+  

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410715=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410715
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:44
Start Date: 26/Mar/20 23:44
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11184: [BEAM-4374] 
Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r398952488
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -52,38 +61,160 @@ message Annotation {
   string value = 2;
 }
 
-// Populated MonitoringInfoSpecs for specific URNs.
-// Indicating the required fields to be set.
-// SDKs and RunnerHarnesses can load these instances into memory and write a
-// validator or code generator to assist with populating and validating
-// MonitoringInfo protos.
+// A set of well known MonitoringInfo specifications.
 message MonitoringInfoSpecs {
   enum Enum {
-// TODO(BEAM-6926): Add the PTRANSFORM name as a required label after
-// upgrading the python SDK.
-USER_COUNTER = 0 [(monitoring_info_spec) = {
-  urn: "beam:metric:user",
-  type_urn: "beam:metrics:sum_int_64",
+// Represents an integer counter where values are summed across bundles.
+USER_SUM_INT64 = 0 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:v1",
+  type: "beam:metrics:sum_int64:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents a double counter where values are summed across bundles.
+USER_SUM_DOUBLE = 1 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:v1",
+  type: "beam:metrics:sum_double:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
 
 Review comment:
   This is already explicit in the type field, which is a URN denoting exactly how the 
values are encoded. Do we need more?
 



Issue Time Tracking
---

Worklog Id: (was: 410715)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 33h 10m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9331) The Row object needs better builders

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=410711=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410711
 ]

ASF GitHub Bot logged work on BEAM-9331:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:30
Start Date: 26/Mar/20 23:30
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #10883: [BEAM-9331] Add 
better Row builders
URL: https://github.com/apache/beam/pull/10883#issuecomment-604738743
 
 
   run sql postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 410711)
Time Spent: 4h 20m  (was: 4h 10m)

> The Row object needs better builders
> 
>
> Key: BEAM-9331
> URL: https://issues.apache.org/jira/browse/BEAM-9331
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Users should be able to build a Row object by specifying field names. Desired 
> syntax:
>  
> Row.withSchema(schema)
>   .withFieldName("field1", "value")
>   .withFieldName("field2.field3", value)
>   .build()
>  
> Users should also have a builder that allows taking an existing row and 
> changing specific fields.
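
A rough illustration of the builder semantics described above, written in Python rather than Java. This is not the Beam Row API: the `RowBuilder` class, the dict-based schema, and the `base` argument are hypothetical, and the sketch assumes that a dotted name such as "field2.field3" addresses a field of a nested row.

```python
import copy


class RowBuilder:
    """Hypothetical builder; a sketch of the requested semantics, not Beam's Row."""

    def __init__(self, schema, base=None):
        # schema: dict mapping field name -> type, or -> nested schema dict.
        self._schema = schema
        # Starting from an existing row lets callers change specific fields.
        self._values = copy.deepcopy(base) if base else {}

    def with_field_name(self, path, value):
        parts = path.split(".")
        schema = self._schema
        target = self._values
        # Walk down through nested row fields (assumed meaning of the dot).
        for name in parts[:-1]:
            if not isinstance(schema.get(name), dict):
                raise KeyError(f"{name!r} is not a nested row field")
            schema = schema[name]
            target = target.setdefault(name, {})
        if parts[-1] not in schema:
            raise KeyError(f"unknown field {path!r}")
        target[parts[-1]] = value
        return self

    def build(self):
        # Return a copy so later builder calls cannot mutate the built row.
        return copy.deepcopy(self._values)


schema = {"field1": str, "field2": {"field3": int}}
row = (
    RowBuilder(schema)
    .with_field_name("field1", "value")
    .with_field_name("field2.field3", 7)
    .build()
)
updated = RowBuilder(schema, base=row).with_field_name("field1", "new").build()
```

Here `row` keeps its original values while `updated` differs only in `field1`, which mirrors the second request in the description (modify specific fields of an existing row).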



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410703=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410703
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:23
Start Date: 26/Mar/20 23:23
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11231: [BEAM-4374] 
Shortids for the Go SDK
URL: https://github.com/apache/beam/pull/11231#discussion_r398948198
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/monitoring.go
 ##
 @@ -16,20 +16,165 @@
 package harness
 
 import (
+   "bytes"
+   "strconv"
+   "sync"
+   "sync/atomic"
"time"
 
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/coder"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/mtime"
"github.com/apache/beam/sdks/go/pkg/beam/core/metrics"
"github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec"
fnpb "github.com/apache/beam/sdks/go/pkg/beam/model/fnexecution_v1"
ppb "github.com/apache/beam/sdks/go/pkg/beam/model/pipeline_v1"
"github.com/golang/protobuf/ptypes"
 )
 
-func monitoring(p *exec.Plan) (*fnpb.Metrics, []*ppb.MonitoringInfo) {
+type mUrn uint32
+type mType uint32
+
+// TODO: Pull these from the protos.
+var sUrns = []string{
+   "beam:metric:user:v1",
 
 Review comment:
   Heads up: this has now been exploded so that each MonitoringInfoSpec has 
a unique urn, meaning that you'll see:
   beam:metric:user:sum_int64:v1, beam:metric:user:sum_double:v1, ...
 



Issue Time Tracking
---

Worklog Id: (was: 410703)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 32h 50m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410702=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410702
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:23
Start Date: 26/Mar/20 23:23
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11231: [BEAM-4374] 
Shortids for the Go SDK
URL: https://github.com/apache/beam/pull/11231#discussion_r398949690
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/monitoring_test.go
 ##
 @@ -0,0 +1,122 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package harness
+
+import (
+   "testing"
+
+   "github.com/apache/beam/sdks/go/pkg/beam/core/metrics"
+)
+
+func TestGetShortID(t *testing.T) {
+   tests := []struct {
+   id   string
+   urn  mUrn
+   typ  mType
+   expectedUrn  string
+   expectedType string
+   }{
+   {
+   id:   "1",
+   urn:  urnUser,
 
 Review comment:
   Can you add a case showing that the same urn with different labels gets a 
different short id?
 



Issue Time Tracking
---

Worklog Id: (was: 410702)
Time Spent: 32h 50m  (was: 32h 40m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 32h 50m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410704=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410704
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:23
Start Date: 26/Mar/20 23:23
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11231: [BEAM-4374] 
Shortids for the Go SDK
URL: https://github.com/apache/beam/pull/11231#discussion_r398948604
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/monitoring.go
 ##
 @@ -16,20 +16,165 @@
 package harness
 
 import (
+   "bytes"
+   "strconv"
+   "sync"
+   "sync/atomic"
"time"
 
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/coder"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/mtime"
"github.com/apache/beam/sdks/go/pkg/beam/core/metrics"
"github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec"
fnpb "github.com/apache/beam/sdks/go/pkg/beam/model/fnexecution_v1"
ppb "github.com/apache/beam/sdks/go/pkg/beam/model/pipeline_v1"
"github.com/golang/protobuf/ptypes"
 )
 
-func monitoring(p *exec.Plan) (*fnpb.Metrics, []*ppb.MonitoringInfo) {
+type mUrn uint32
+type mType uint32
+
+// TODO: Pull these from the protos.
+var sUrns = []string{
+   "beam:metric:user:v1",
+   "beam:metric:element_count:v1",
+   "beam:metric:pardo_execution_time:start_bundle_msecs:v1",
+   "beam:metric:pardo_execution_time:process_bundle_msecs:v1",
+   "beam:metric:pardo_execution_time:finish_bundle_msecs:v1",
+   "beam:metric:ptransform_progress:remaining:v1",
+   "beam:metric:ptransform_progress:completed:v1",
+
+   "TestingSentinelUrn", // Must remain last.
+}
+
+const (
+   urnUser mUrn = iota
+   urnElementCount
+   urnStartBundle
+   urnProcessBundle
+   urnFinishBundle
+   urnProgressRemaining
+   urnProgressCompleted
+
+   urnTestSentinel // Must remain last.
+)
+
+var sTypes = []string{
+   "beam:metrics:sum_int64:v1",
+   "beam:metrics:sum_double:v1",
+   "beam:metrics:distribution_int64:v1",
+   "beam:metrics:distribution_double:v1",
+   "beam:metrics:latest_int64:v1",
+   "beam:metrics:latest_double:v1",
+   "beam:metrics:top_n_int64:v1",
+   "beam:metrics:top_n_double:v1",
+   "beam:metrics:bottom_n_int64:v1",
+   "beam:metrics:bottom_n_double:v1",
+   "beam:metrics:monitoring_table:v1",
+   "beam:metrics:progress:v1",
+
+   "TestingSentinelType", // Must remain last.
+}
+
+const (
 
 Review comment:
   Since the urns now uniquely identify the type, you don't need this anymore; 
a monitoring info is uniquely described by urn + labels.
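
To make the "urn + labels" keying concrete, here is a minimal sketch in Python (not the Go code under review). The `ShortIdCache` class, its method name, and the decimal-string id format are assumptions for illustration only; the urn and label names are the ones quoted in this thread.

```python
import itertools
import threading


class ShortIdCache:
    """Hypothetical cache that hands out one short id per (urn, labels) pair."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids = {}
        self._counter = itertools.count(1)

    def get_short_id(self, urn, labels):
        # Sort the labels so that dict ordering does not affect the key.
        key = (urn, tuple(sorted(labels.items())))
        with self._lock:
            if key not in self._ids:
                self._ids[key] = str(next(self._counter))
            return self._ids[key]


cache = ShortIdCache()
urn = "beam:metric:user:sum_int64:v1"
labels_a = {"PTRANSFORM": "p1", "NAMESPACE": "ns", "NAME": "a"}
labels_b = {"PTRANSFORM": "p1", "NAMESPACE": "ns", "NAME": "b"}

id_a = cache.get_short_id(urn, labels_a)
id_b = cache.get_short_id(urn, labels_b)
id_a_again = cache.get_short_id(urn, dict(labels_a))
```

With this keying, the same urn with different labels yields distinct ids, and repeating a lookup returns the id assigned the first time.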
 



Issue Time Tracking
---

Worklog Id: (was: 410704)
Time Spent: 33h  (was: 32h 50m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 33h
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-1894) Race conditions in python direct runner eager mode

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-1894?focusedWorklogId=410700=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410700
 ]

ASF GitHub Bot logged work on BEAM-1894:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:22
Start Date: 26/Mar/20 23:22
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #11242: [BEAM-1894] Remove 
obsolete EagerRunner test
URL: https://github.com/apache/beam/pull/11242#issuecomment-604736500
 
 
   R: @pabloem 
 



Issue Time Tracking
---

Worklog Id: (was: 410700)
Time Spent: 20m  (was: 10m)

> Race conditions in python direct runner eager mode
> --
>
> Key: BEAM-1894
> URL: https://issues.apache.org/jira/browse/BEAM-1894
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Vikas Kedigehalli
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> test_eager_pipeline 
> (https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pipeline_test.py#L283)
>  fails with the following error:
> ERROR: test_eager_pipeline (apache_beam.pipeline_test.PipelineTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/pipeline_test.py",
>  line 285, in test_eager_pipeline
> self.assertEqual([1, 4, 9], p | Create([1, 2, 3]) | Map(lambda x: x*x))
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 387, in __ror__
> p.run().wait_until_finish()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/pipeline.py",
>  line 160, in run
> self.to_runner_api(), self.runner, self.options).run(False)
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/pipeline.py",
>  line 169, in run
> return self.runner.run(self)
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 99, in run
> result.wait_until_finish()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 166, in wait_until_finish
> self._executor.await_completion()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/executor.py",
>  line 336, in await_completion
> self._executor.await_completion()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/executor.py",
>  line 308, in __call__
> uncommitted_bundle.get_elements_iterable())
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/evaluation_context.py",
>  line 176, in append_to_cache
> self._cache.append(applied_ptransform, tag, elements)
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 138, in append
> self._cache[(applied_ptransform, tag)].extend(elements)
> TypeError: 'NoneType' object has no attribute '__getitem__'
> This is triggered when Create is changed to a custom source. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=410701=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410701
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:22
Start Date: 26/Mar/20 23:22
Worklog Time Spent: 10m 
  Work Description: jaketf commented on pull request #11151: [BEAM-9468]  
Hl7v2 io
URL: https://github.com/apache/beam/pull/11151#discussion_r398949505
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/healthcare/HL7v2IO.java
 ##
 @@ -0,0 +1,636 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.healthcare;
+
+import com.google.api.services.healthcare.v1alpha2.model.Message;
+import com.google.auto.value.AutoValue;
+import java.io.IOException;
+import java.text.ParseException;
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.io.gcp.datastore.AdaptiveThrottler;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
+import org.apache.beam.sdk.metrics.Counter;
+import org.apache.beam.sdk.metrics.Metrics;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.util.Sleeper;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionTuple;
+import org.apache.beam.sdk.values.PInput;
+import org.apache.beam.sdk.values.POutput;
+import org.apache.beam.sdk.values.PValue;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.TupleTagList;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Throwables;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link HL7v2IO} provides an API for reading from and writing to <a href="https://cloud.google.com/healthcare/docs/concepts/hl7v2">Google Cloud 
Healthcare HL7v2 API</a>.
+ * 
+ *
+ * Read
 
 Review comment:
   @brianlucier PTAL at this updated doc string. 
   It describes my latest change in ba9d023, which avoids the double get whenever we 
read a whole HL7v2Store with the Messages.List API by adding the view=FULL param.
 



Issue Time Tracking
---

Worklog Id: (was: 410701)
Time Spent: 6h 50m  (was: 6h 40m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-1894) Race conditions in python direct runner eager mode

2020-03-26 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri reassigned BEAM-1894:
---

Assignee: Udi Meiri

> Race conditions in python direct runner eager mode
> --
>
> Key: BEAM-1894
> URL: https://issues.apache.org/jira/browse/BEAM-1894
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Vikas Kedigehalli
>Assignee: Udi Meiri
>Priority: Major
>
> test_eager_pipeline 
> (https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pipeline_test.py#L283)
>  fails with the following error:
> ERROR: test_eager_pipeline (apache_beam.pipeline_test.PipelineTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/pipeline_test.py",
>  line 285, in test_eager_pipeline
> self.assertEqual([1, 4, 9], p | Create([1, 2, 3]) | Map(lambda x: x*x))
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 387, in __ror__
> p.run().wait_until_finish()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/pipeline.py",
>  line 160, in run
> self.to_runner_api(), self.runner, self.options).run(False)
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/pipeline.py",
>  line 169, in run
> return self.runner.run(self)
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 99, in run
> result.wait_until_finish()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 166, in wait_until_finish
> self._executor.await_completion()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/executor.py",
>  line 336, in await_completion
> self._executor.await_completion()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/executor.py",
>  line 308, in __call__
> uncommitted_bundle.get_elements_iterable())
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/evaluation_context.py",
>  line 176, in append_to_cache
> self._cache.append(applied_ptransform, tag, elements)
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 138, in append
> self._cache[(applied_ptransform, tag)].extend(elements)
> TypeError: 'NoneType' object has no attribute '__getitem__'
> This is triggered when Create is changed to a custom source. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-1894) Race conditions in python direct runner eager mode

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-1894?focusedWorklogId=410699=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410699
 ]

ASF GitHub Bot logged work on BEAM-1894:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:21
Start Date: 26/Mar/20 23:21
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #11242: [BEAM-1894] 
Remove obsolete EagerRunner test
URL: https://github.com/apache/beam/pull/11242
 
 
   EagerRunner was removed in #4492.
   
   
   

[jira] [Resolved] (BEAM-9377) Python typehints: Map wrapper prevents Optional stripping

2020-03-26 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri resolved BEAM-9377.
-
Fix Version/s: Not applicable
   Resolution: Won't Fix

> Python typehints: Map wrapper prevents Optional stripping
> -
>
> Key: BEAM-9377
> URL: https://issues.apache.org/jira/browse/BEAM-9377
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
> Fix For: Not applicable
>
>
> This existing test is wrong:
> {code}
>   def test_map_wrapper_optional_output(self):
>     # Optional does affect output type (Nones are NOT ignored).
>     def map_fn(unused_element: int) -> typehints.Optional[int]:
>       return 1
>     th = beam.Map(map_fn).get_type_hints()
>     self.assertEqual(th.input_types, ((int, ), {}))
>     self.assertEqual(th.output_types, ((typehints.Optional[int], ), {}))
> {code}
> The resulting output type should be int.
> {code}
> initial output hint:
> Optional[int]
> with wrapper:
> Iterable[Optional[int]]
> with DoFn.default_type_hints:
> Optional[int]
> {code}
> However any Nones returned by a DoFn's process method are dropped, so the 
> actual element_type returned is plain int.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-9377) Python typehints: Map wrapper prevents Optional stripping

2020-03-26 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri reassigned BEAM-9377:
---

Assignee: Udi Meiri

> Python typehints: Map wrapper prevents Optional stripping
> -
>
> Key: BEAM-9377
> URL: https://issues.apache.org/jira/browse/BEAM-9377
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>
> This existing test is wrong:
> {code}
>   def test_map_wrapper_optional_output(self):
>     # Optional does affect output type (Nones are NOT ignored).
>     def map_fn(unused_element: int) -> typehints.Optional[int]:
>       return 1
>     th = beam.Map(map_fn).get_type_hints()
>     self.assertEqual(th.input_types, ((int, ), {}))
>     self.assertEqual(th.output_types, ((typehints.Optional[int], ), {}))
> {code}
> The resulting output type should be int.
> {code}
> initial output hint:
> Optional[int]
> with wrapper:
> Iterable[Optional[int]]
> with DoFn.default_type_hints:
> Optional[int]
> {code}
> However any Nones returned by a DoFn's process method are dropped, so the 
> actual element_type returned is plain int.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9377) Python typehints: Map wrapper prevents Optional stripping

2020-03-26 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068131#comment-17068131
 ] 

Udi Meiri commented on BEAM-9377:
-

Verified for myself that Nones returned from map_fn indeed appear in the 
PCollection:
{code}
  def test_typed_map_optional(self):
    # Optional does affect output type (Nones are NOT ignored).
    def map_fn(element: int) -> typehints.Optional[int]:
      if element == 1:
        return None
      else:
        return element

    result = [1, 2, 3] | beam.Map(map_fn)
    self.assertCountEqual([None, 2, 3], result)
{code}
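
The difference between the two observations can be sketched without Beam. This is a pure-Python illustration, assuming that Map wraps each return value in a one-element list while a process method that returns None emits nothing. The helpers `emit_from_process`, `map_wrapper` and `run` are hypothetical names for this sketch and are not part of apache_beam.

```python
def emit_from_process(process_result):
    """Mimic how a process() result is consumed (hypothetical helper).

    A None result means "no output"; anything else is iterated.
    """
    if process_result is None:
        return []
    return list(process_result)


def map_wrapper(fn):
    """Wrap fn so its return value becomes a one-element list."""
    return lambda element: [fn(element)]


def process_fn(element):
    # DoFn-style function: returning None emits nothing for this element.
    return None if element == 1 else [element]


def map_fn(element):
    # Map-style function: returns a single value, which may be None.
    return None if element == 1 else element


def run(process, elements):
    """Apply process to every element and collect everything it emits."""
    out = []
    for element in elements:
        out.extend(emit_from_process(process(element)))
    return out


dofn_output = run(process_fn, [1, 2, 3])
map_output = run(map_wrapper(map_fn), [1, 2, 3])
```

In this sketch `dofn_output` has no entry for element 1, while `map_output` keeps the None because the wrapper turns it into `[None]` before it is consumed.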

> Python typehints: Map wrapper prevents Optional stripping
> -
>
> Key: BEAM-9377
> URL: https://issues.apache.org/jira/browse/BEAM-9377
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Priority: Major
>
> This existing test is wrong:
> {code}
>   def test_map_wrapper_optional_output(self):
>     # Optional does affect output type (Nones are NOT ignored).
>     def map_fn(unused_element: int) -> typehints.Optional[int]:
>       return 1
>     th = beam.Map(map_fn).get_type_hints()
>     self.assertEqual(th.input_types, ((int, ), {}))
>     self.assertEqual(th.output_types, ((typehints.Optional[int], ), {}))
> {code}
> The resulting output type should be int.
> {code}
> initial output hint:
> Optional[int]
> with wrapper:
> Iterable[Optional[int]]
> with DoFn.default_type_hints:
> Optional[int]
> {code}
> However any Nones returned by a DoFn's process method are dropped, so the 
> actual element_type returned is plain int.





[jira] [Commented] (BEAM-9377) Python typehints: Map wrapper prevents Optional stripping

2020-03-26 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068132#comment-17068132
 ] 

Udi Meiri commented on BEAM-9377:
-

Nothing to do, closing

> Python typehints: Map wrapper prevents Optional stripping
> -
>
> Key: BEAM-9377
> URL: https://issues.apache.org/jira/browse/BEAM-9377
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Priority: Major
>
> This existing test is wrong:
> {code}
>   def test_map_wrapper_optional_output(self):
>     # Optional does affect output type (Nones are NOT ignored).
>     def map_fn(unused_element: int) -> typehints.Optional[int]:
>       return 1
>     th = beam.Map(map_fn).get_type_hints()
>     self.assertEqual(th.input_types, ((int, ), {}))
>     self.assertEqual(th.output_types, ((typehints.Optional[int], ), {}))
> {code}
> The resulting output type should be int.
> {code}
> initial output hint:
> Optional[int]
> with wrapper:
> Iterable[Optional[int]]
> with DoFn.default_type_hints:
> Optional[int]
> {code}
> However any Nones returned by a DoFn's process method are dropped, so the 
> actual element_type returned is plain int.





[jira] [Commented] (BEAM-1894) Race conditions in python direct runner eager mode

2020-03-26 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068130#comment-17068130
 ] 

Udi Meiri commented on BEAM-1894:
-

EagerRunner was removed in https://github.com/apache/beam/pull/4492

> Race conditions in python direct runner eager mode
> --
>
> Key: BEAM-1894
> URL: https://issues.apache.org/jira/browse/BEAM-1894
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Vikas Kedigehalli
>Priority: Major
>
> test_eager_pipeline 
> (https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pipeline_test.py#L283)
>  fails with the following error:
> ERROR: test_eager_pipeline (apache_beam.pipeline_test.PipelineTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/pipeline_test.py",
>  line 285, in test_eager_pipeline
> self.assertEqual([1, 4, 9], p | Create([1, 2, 3]) | Map(lambda x: x*x))
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 387, in __ror__
> p.run().wait_until_finish()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/pipeline.py",
>  line 160, in run
> self.to_runner_api(), self.runner, self.options).run(False)
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/pipeline.py",
>  line 169, in run
> return self.runner.run(self)
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 99, in run
> result.wait_until_finish()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 166, in wait_until_finish
> self._executor.await_completion()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/executor.py",
>  line 336, in await_completion
> self._executor.await_completion()
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/executor.py",
>  line 308, in __call__
> uncommitted_bundle.get_elements_iterable())
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/evaluation_context.py",
>  line 176, in append_to_cache
> self._cache.append(applied_ptransform, tag, elements)
>   File 
> "/usr/local/google/home/vikasrk/work/incubator-beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 138, in append
> self._cache[(applied_ptransform, tag)].extend(elements)
> TypeError: 'NoneType' object has no attribute '__getitem__'
> This is triggered when Create is changed to a custom source. 





[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=410694=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410694
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:15
Start Date: 26/Mar/20 23:15
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #11216: [BEAM-9562] Remove 
TimerSpec from Proto
URL: https://github.com/apache/beam/pull/11216#issuecomment-604734389
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 410694)
Time Spent: 5h  (was: 4h 50m)

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410692=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410692
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:07
Start Date: 26/Mar/20 23:07
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11184: [BEAM-4374] Update 
protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#issuecomment-604732007
 
 
   This is ready for the next review.
 



Issue Time Tracking
---

Worklog Id: (was: 410692)
Time Spent: 32h 40m  (was: 32.5h)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 32h 40m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata





[jira] [Work logged] (BEAM-4150) Standardize use of PCollection coder proto attribute

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4150?focusedWorklogId=410693=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410693
 ]

ASF GitHub Bot logged work on BEAM-4150:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:07
Start Date: 26/Mar/20 23:07
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11222: [BEAM-4150] Don't 
window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#issuecomment-604732125
 
 
   Run PythonDocker PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 410693)
Time Spent: 8h 40m  (was: 8.5h)

> Standardize use of PCollection coder proto attribute
> 
>
> Key: BEAM-4150
> URL: https://issues.apache.org/jira/browse/BEAM-4150
> Project: Beam
>  Issue Type: Task
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Luke Cwik
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> In some places it's expected to be a WindowedCoder, in others the raw 
> ElementCoder. We should use the same convention (decided in discussion to be 
> the raw ElementCoder) everywhere. The WindowCoder can be pulled out of the 
> attached windowing strategy, and the input/output ports should specify the 
> encoding directly rather than read the adjacent PCollection coder fields. 





[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410690=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410690
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 23:02
Start Date: 26/Mar/20 23:02
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11184: [BEAM-4374] 
Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r398942909
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -52,38 +55,157 @@ message Annotation {
   string value = 2;
 }
 
-// Populated MonitoringInfoSpecs for specific URNs.
-// Indicating the required fields to be set.
-// SDKs and RunnerHarnesses can load these instances into memory and write a
-// validator or code generator to assist with populating and validating
-// MonitoringInfo protos.
+// A set of well known MonitoringInfo specifications.
 message MonitoringInfoSpecs {
   enum Enum {
-// TODO(BEAM-6926): Add the PTRANSFORM name as a required label after
-// upgrading the python SDK.
-USER_COUNTER = 0 [(monitoring_info_spec) = {
-  urn: "beam:metric:user",
-  type_urn: "beam:metrics:sum_int_64",
+// Represents an integer counter where values are summed across bundles.
+USER_SUM_INT64 = 0 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:v1",
+  type: "beam:metrics:sum_int64:v1",
   required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
   annotations: [{
 key: "description",
-value: "URN utilized to report user numeric counters."
+value: "URN utilized to report user metric."
   }]
 }];
 
-ELEMENT_COUNT = 1 [(monitoring_info_spec) = {
+// Represents a double counter where values are summed across bundles.
+USER_SUM_DOUBLE = 1 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:v1",
 
 Review comment:
   Made all the URNs unique and added a test to make sure that they remain 
unique.
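
A uniqueness check of the kind mentioned in this review comment could look roughly like the following. This is only a sketch in Python, not the test added in the pull request: `find_duplicate_urns` is a hypothetical helper, the `before` mapping mirrors the shared URN visible in the diff above, and the URNs in `after` are made-up values for illustration.

```python
from collections import Counter


def find_duplicate_urns(specs):
    """Return a sorted list of URNs that are used by more than one spec."""
    counts = Counter(specs.values())
    return sorted(urn for urn, n in counts.items() if n > 1)


# Both specs share one URN, as in the diff hunk quoted above.
before = {
    "USER_SUM_INT64": "beam:metric:user:v1",
    "USER_SUM_DOUBLE": "beam:metric:user:v1",
}

# Hypothetical unique URNs, for illustration only.
after = {
    "USER_SUM_INT64": "beam:metric:user:sum_int64:v1",
    "USER_SUM_DOUBLE": "beam:metric:user:sum_double:v1",
}

duplicates_before = find_duplicate_urns(before)
duplicates_after = find_duplicate_urns(after)
```

A test built on this idea would simply assert that the list of duplicates is empty for the full set of specs.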
 



Issue Time Tracking
---

Worklog Id: (was: 410690)
Time Spent: 32.5h  (was: 32h 20m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 32.5h
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata





[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=410687=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410687
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 26/Mar/20 22:59
Start Date: 26/Mar/20 22:59
Worklog Time Spent: 10m 
  Work Description: jaketf commented on issue #11151: [BEAM-9468]  Hl7v2 io
URL: https://github.com/apache/beam/pull/11151#issuecomment-604655037
 
 
   OK, an update here from an internal thread with the API team.
   
   1. Message.List returning message contents is available in the beta API with 
the view parameter.
   1. Schematized data should be in the next beta release, in roughly 2 weeks.
   1. Right now the sink outputs schematized data JSON wrapped in 
"{data=}".
   
   In light of these I will do the following refactors: 
   1. [x] Change how we batch read so that we always avoid the double get. This 
makes it a completely separate code path from the real-time path, but I think 
that's OK.
   1. [ ] Refactor to use the beta client library (once it includes schematizedData).
   1. [x] Strip out the `{data=}` wrapper to make this easier for users.
 



Issue Time Tracking
---

Worklog Id: (was: 410687)
Time Spent: 6h 40m  (was: 6.5h)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Work logged] (BEAM-9573) Watermark hold for timer output timestamp is not computed correctly

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9573?focusedWorklogId=410685=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410685
 ]

ASF GitHub Bot logged work on BEAM-9573:


Author: ASF GitHub Bot
Created on: 26/Mar/20 22:58
Start Date: 26/Mar/20 22:58
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #11220: 
[BEAM-9573][release-2.20] Correct computing of watermark hold for timer output 
timestamp
URL: https://github.com/apache/beam/pull/11220#issuecomment-604729513
 
 
   Great! All tests have passed!
 



Issue Time Tracking
---

Worklog Id: (was: 410685)
Time Spent: 8h  (was: 7h 50m)

> Watermark hold for timer output timestamp is not computed correctly
> ---
>
> Key: BEAM-9573
> URL: https://issues.apache.org/jira/browse/BEAM-9573
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.20.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Blocker
> Fix For: 2.20.0
>
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> With the introduction of timer output timestamp, a new watermark hold had 
> been added to the Flink Runner. The watermark computation works on the keyed 
> state backend which computes a key-scoped watermark hold and not the desired 
> operator-wide watermark hold.
> Computation: 
> https://github.com/apache/beam/blob/b564239081e9351c56fb0e7d263495b95dd3f8f3/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java#L1140
> Key-scoped state: 
> https://github.com/apache/beam/blob/b564239081e9351c56fb0e7d263495b95dd3f8f3/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java#L1130
> We need to change this to operate on all keys. This has to be done before 
> fixing BEAM-9566.





[jira] [Work logged] (BEAM-9573) Watermark hold for timer output timestamp is not computed correctly

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9573?focusedWorklogId=410686=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410686
 ]

ASF GitHub Bot logged work on BEAM-9573:


Author: ASF GitHub Bot
Created on: 26/Mar/20 22:58
Start Date: 26/Mar/20 22:58
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #11220: 
[BEAM-9573][release-2.20] Correct computing of watermark hold for timer output 
timestamp
URL: https://github.com/apache/beam/pull/11220#issuecomment-604729513
 
 
   Great! All tests have passed!
 



Issue Time Tracking
---

Worklog Id: (was: 410686)
Time Spent: 8h 10m  (was: 8h)

> Watermark hold for timer output timestamp is not computed correctly
> ---
>
> Key: BEAM-9573
> URL: https://issues.apache.org/jira/browse/BEAM-9573
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.20.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Blocker
> Fix For: 2.20.0
>
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> With the introduction of timer output timestamp, a new watermark hold had 
> been added to the Flink Runner. The watermark computation works on the keyed 
> state backend which computes a key-scoped watermark hold and not the desired 
> operator-wide watermark hold.
> Computation: 
> https://github.com/apache/beam/blob/b564239081e9351c56fb0e7d263495b95dd3f8f3/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java#L1140
> Key-scoped state: 
> https://github.com/apache/beam/blob/b564239081e9351c56fb0e7d263495b95dd3f8f3/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java#L1130
> We need to change this to operate on all keys. This has to be done before 
> fixing BEAM-9566.





[jira] [Work logged] (BEAM-9557) Error setting processing time timers near end-of-window

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9557?focusedWorklogId=410684=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410684
 ]

ASF GitHub Bot logged work on BEAM-9557:


Author: ASF GitHub Bot
Created on: 26/Mar/20 22:56
Start Date: 26/Mar/20 22:56
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #11226: [BEAM-9557] Fix 
timer window boundary checking
URL: https://github.com/apache/beam/pull/11226#issuecomment-604729044
 
 
   @amaliujia it appears that the test looks for specific strings in the 
exception (always a recipe for a brittle test). I changed "event time timer" to 
"event-time timer", which broke that test. Will fix.
 



Issue Time Tracking
---

Worklog Id: (was: 410684)
Time Spent: 1h  (was: 50m)

> Error setting processing time timers near end-of-window
> ---
>
> Key: BEAM-9557
> URL: https://issues.apache.org/jira/browse/BEAM-9557
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Steve Niemitz
>Assignee: Reuven Lax
>Priority: Critical
> Fix For: 2.20.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Previously, it was possible to set a processing time timer past the end of a 
> window, and it would simply not fire.
> However, now, this results in an error:
> {code:java}
> java.lang.IllegalArgumentException: Attempted to set event time timer that 
> outputs for 2020-03-19T18:01:35.000Z but that is after the expiration of 
> window 2020-03-19T17:59:59.999Z
> 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:440)
> 
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$TimerInternalsTimer.setAndVerifyOutputTimestamp(SimpleDoFnRunner.java:1011)
> 
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$TimerInternalsTimer.setRelative(SimpleDoFnRunner.java:934)
> .processElement(???.scala:187)
>  {code}
>  
> I think the regression was introduced in commit 
> a005fd765a762183ca88df90f261f6d4a20cf3e0.  Also note that the error message 
> is wrong: it says "event time timer", but the timer is in the processing 
> time domain.





[jira] [Work logged] (BEAM-9331) The Row object needs better builders

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=410683=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410683
 ]

ASF GitHub Bot logged work on BEAM-9331:


Author: ASF GitHub Bot
Created on: 26/Mar/20 22:47
Start Date: 26/Mar/20 22:47
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #10883: [BEAM-9331] Add 
better Row builders
URL: https://github.com/apache/beam/pull/10883#issuecomment-604726387
 
 
   @alexvanboxel rebased and fixed bugs. Previously I was blocked on getting 
logical types to work, but now that we natively store logical types in Row, 
it's become much easier.
 



Issue Time Tracking
---

Worklog Id: (was: 410683)
Time Spent: 4h 10m  (was: 4h)

> The Row object needs better builders
> 
>
> Key: BEAM-9331
> URL: https://issues.apache.org/jira/browse/BEAM-9331
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Users should be able to build a Row object by specifying field names. Desired 
> syntax:
>  
> Row.withSchema(schema)
>   .withFieldName("field1", "value")
>   .withFieldName("field2.field3", value)
>   .build()
>  
> Users should also have a builder that allows taking an existing row and 
> changing specific fields.
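
The builder behaviour requested above (setting fields by name, including dotted nested names, and deriving a new row from an existing one) can be illustrated with a small language-neutral sketch. It is written in Python and is not the Beam Java API: `RowBuilder` and `with_field_name` are hypothetical names, rows are modelled as nested dicts, and no schema validation is attempted.

```python
import copy


class RowBuilder:
    """Hypothetical builder that sets fields by (possibly dotted) name."""

    def __init__(self, base=None):
        # Copy the base row so an existing row is never mutated.
        self._values = copy.deepcopy(base) if base is not None else {}

    def with_field_name(self, name, value):
        # "field2.field3" addresses field3 inside the nested row field2.
        parts = name.split(".")
        node = self._values
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
        return self

    def build(self):
        # Return a copy so later builder calls cannot change the built row.
        return copy.deepcopy(self._values)


row = (
    RowBuilder()
    .with_field_name("field1", "value")
    .with_field_name("field2.field3", 42)
    .build()
)

# Start from an existing row and change one specific field.
updated = RowBuilder(row).with_field_name("field2.field3", 43).build()
```

The copy-on-construct choice keeps the original row unchanged, which matches the idea of rows being immutable values that builders derive from.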





[jira] [Work logged] (BEAM-5422) Update BigQueryIO DynamicDestinations documentation to clarify usage of getDestination() and getTable()

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5422?focusedWorklogId=410682=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410682
 ]

ASF GitHub Bot logged work on BEAM-5422:


Author: ASF GitHub Bot
Created on: 26/Mar/20 22:46
Start Date: 26/Mar/20 22:46
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #11241: [BEAM-5422] Document 
DynamicDestinations.getTable uniqueness requirement
URL: https://github.com/apache/beam/pull/11241#issuecomment-604725835
 
 
   R: @chamikaramj 
 



Issue Time Tracking
---

Worklog Id: (was: 410682)
Time Spent: 20m  (was: 10m)

> Update BigQueryIO DynamicDestinations documentation to clarify usage of 
> getDestination() and getTable()
> ---
>
> Key: BEAM-5422
> URL: https://issues.apache.org/jira/browse/BEAM-5422
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, there are some details related to these methods that should be 
> further clarified. For example, getTable() is expected to return a unique 
> value for each destination.





[jira] [Work logged] (BEAM-5422) Update BigQueryIO DynamicDestinations documentation to clarify usage of getDestination() and getTable()

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5422?focusedWorklogId=410681=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410681
 ]

ASF GitHub Bot logged work on BEAM-5422:


Author: ASF GitHub Bot
Created on: 26/Mar/20 22:44
Start Date: 26/Mar/20 22:44
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #11241: [BEAM-5422] 
Document DynamicDestinations.getTable uniqueness requirement
URL: https://github.com/apache/beam/pull/11241
 
 
   Load job IDs are keyed by table (among other things), but not the
   destination. Thus multiple DestinationTs mapping to the same table will
   have the same BQ job ID. The first one will succeed, and the rest will
   seem to Beam as retries (no-ops because the job has already
   started/completed).
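
   The collision described above can be reproduced in a toy model. This is a
   Python sketch under stated assumptions, not BigQueryIO's implementation: the
   job ID format, `load_job_id` and `submit_all` are hypothetical, and the only
   property carried over from the description is that the job ID depends on the
   table and not on the destination.

```python
def load_job_id(job_prefix, table):
    """Hypothetical job ID derived from the table only, not the destination."""
    return f"{job_prefix}_{table}"


def submit_all(destination_to_table, job_prefix="beam_load"):
    """Submit one load job per destination and report which ones were skipped."""
    started = {}  # job_id -> destination that actually started the job
    skipped = []  # destinations whose submission looked like a retry
    for destination, table in destination_to_table.items():
        job_id = load_job_id(job_prefix, table)
        if job_id in started:
            # Same job ID as an earlier submission: treated as a no-op retry.
            skipped.append(destination)
        else:
            started[job_id] = destination
    return started, skipped


# Two destinations map to the same table, so they share a job ID.
started, skipped = submit_all({
    "dest_a": "project:dataset.t1",
    "dest_b": "project:dataset.t1",
    "dest_c": "project:dataset.t2",
})
```

   In this model the data for `dest_b` is never loaded, which is why getTable()
   needs to return a unique table per destination.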
   
   
   

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410665=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410665
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 22:10
Start Date: 26/Mar/20 22:10
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #11231: [BEAM-4374] 
Shortids for the Go SDK
URL: https://github.com/apache/beam/pull/11231#issuecomment-604713667
 
 
   Just as a note, I'll wait until https://github.com/apache/beam/pull/11184 is 
in, and resolve the merge conflicts on my end before we merge this one. 
 



Issue Time Tracking
---

Worklog Id: (was: 410665)
Time Spent: 32h 20m  (was: 32h 10m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 32h 20m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=410662&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410662
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 26/Mar/20 22:07
Start Date: 26/Mar/20 22:07
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #11067: [BEAM-9136]Add 
licenses for dependencies for Python
URL: https://github.com/apache/beam/pull/11067#issuecomment-604712009
 
 
   I changed this PR to Python only.
   New PRs will be created for Java and Go.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410662)
Time Spent: 8h 10m  (was: 8h)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=410659&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410659
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:57
Start Date: 26/Mar/20 21:57
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #11067: [BEAM-9136]Add 
licenses for dependencies
URL: https://github.com/apache/beam/pull/11067#issuecomment-604707757
 
 
   Run Python DockerBuild PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410659)
Time Spent: 8h  (was: 7h 50m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410658
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:55
Start Date: 26/Mar/20 21:55
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #11184: [BEAM-4374] 
Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r398916506
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -52,38 +61,160 @@ message Annotation {
   string value = 2;
 }
 
-// Populated MonitoringInfoSpecs for specific URNs.
-// Indicating the required fields to be set.
-// SDKs and RunnerHarnesses can load these instances into memory and write a
-// validator or code generator to assist with populating and validating
-// MonitoringInfo protos.
+// A set of well known MonitoringInfo specifications.
 message MonitoringInfoSpecs {
   enum Enum {
-// TODO(BEAM-6926): Add the PTRANSFORM name as a required label after
-// upgrading the python SDK.
-USER_COUNTER = 0 [(monitoring_info_spec) = {
-  urn: "beam:metric:user",
-  type_urn: "beam:metrics:sum_int_64",
+// Represents an integer counter where values are summed across bundles.
+USER_SUM_INT64 = 0 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:v1",
+  type: "beam:metrics:sum_int64:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents a double counter where values are summed across bundles.
+USER_SUM_DOUBLE = 1 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:v1",
+  type: "beam:metrics:sum_double:v1",
+  required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
+  annotations: [{
+key: "description",
+value: "URN utilized to report user metric."
+  }]
+}];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
 
 Review comment:
   Now it seems like there technically aren't any fields named "count", "sum", 
"min", "max". Just 4 encoded varints in that specific order. There is no longer 
a proto or anything which defines this format.
   
   If we are going to keep type urns, I think that there should be somewhere in 
this file where you could define a "TypeSpec", which describes how to encode each 
opaque bytes payload. i.e. the coders used for each value, the order they must 
be encoded. Or a proto that should be serialized into that bytes field, etc. A 
description that will work for all languages. Right now you can only know that 
from looking at your encoding code.
   
   I think it would be best if SDK implementers could look at a reference file 
like this and know how to populate the MonitoringInfo. That was the original 
intention behind MonitoringInfoSpec, and I believe that is a bit lost now with 
this change.
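
To make the ordering concern concrete, here is a minimal Python sketch of a distribution payload written as four varints in the order count, sum, min, max. It assumes a base-128 varint in which negative values are treated as 64-bit two's complement; the function names are hypothetical, and this is not the encoding code from the PR.

```python
def encode_varint(value):
    """Encode an integer as a base-128 varint (assumed wire format)."""
    # Assumption: negative values are encoded as 64-bit two's complement.
    value &= (1 << 64) - 1
    out = bytearray()
    while True:
        bits = value & 0x7F
        value >>= 7
        if value:
            out.append(bits | 0x80)  # more bytes follow
        else:
            out.append(bits)  # last byte
            return bytes(out)


def decode_varint(data, pos=0):
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    if result >= 1 << 63:  # map back to a signed 64-bit value
        result -= 1 << 64
    return result, pos


def encode_distribution(count, total, minimum, maximum):
    """Concatenate the four values in the fixed order count, sum, min, max."""
    return b"".join(
        encode_varint(v) for v in (count, total, minimum, maximum))


def decode_distribution(payload):
    """Read four varints back in the same fixed order."""
    values = []
    pos = 0
    for _ in range(4):
        value, pos = decode_varint(payload, pos)
        values.append(value)
    return tuple(values)
```

Nothing in the payload names the fields, so a reader has to know the order in advance, which is the gap a written-down "TypeSpec" would close.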
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410658)
Time Spent: 32h 10m  (was: 32h)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 32h 10m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9557) Error setting processing time timers near end-of-window

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9557?focusedWorklogId=410657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410657
 ]

ASF GitHub Bot logged work on BEAM-9557:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:52
Start Date: 26/Mar/20 21:52
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #11226: [BEAM-9557] Fix 
timer window boundary checking
URL: https://github.com/apache/beam/pull/11226#issuecomment-604620053
 
 
   This failed test might be relevant to this PR: 
org.apache.beam.sdk.transforms.ParDoTest$TimerTests.testOutOfBoundsEventTimeTimer
   
   (link https://builds.apache.org/job/beam_PreCommit_Java_Phrase/1930/)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410657)
Time Spent: 50m  (was: 40m)

> Error setting processing time timers near end-of-window
> ---
>
> Key: BEAM-9557
> URL: https://issues.apache.org/jira/browse/BEAM-9557
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Steve Niemitz
>Assignee: Reuven Lax
>Priority: Critical
> Fix For: 2.20.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Previously, it was possible to set a processing time timer past the end of a 
> window, and it would simply not fire.
> However, now, this results in an error:
> {code:java}
> java.lang.IllegalArgumentException: Attempted to set event time timer that 
> outputs for 2020-03-19T18:01:35.000Z but that is after the expiration of 
> window 2020-03-19T17:59:59.999Z
> 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:440)
> 
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$TimerInternalsTimer.setAndVerifyOutputTimestamp(SimpleDoFnRunner.java:1011)
> 
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$TimerInternalsTimer.setRelative(SimpleDoFnRunner.java:934)
> .processElement(???.scala:187)
>  {code}
>  
> I think the regression was introduced in commit 
> a005fd765a762183ca88df90f261f6d4a20cf3e0.  Also notice that the error message 
> is wrong: it says "event time timer", but the timer is in the processing 
> time domain.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9619) Install Python 3.8 on Jenkins workers

2020-03-26 Thread Valentyn Tymofieiev (Jira)
Valentyn Tymofieiev created BEAM-9619:
-

 Summary: Install Python 3.8 on Jenkins workers
 Key: BEAM-9619
 URL: https://issues.apache.org/jira/browse/BEAM-9619
 Project: Beam
  Issue Type: Sub-task
  Components: testing
Reporter: Valentyn Tymofieiev






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9550) beam_PostCommit_Python_Chicago_Taxi_Flink OOM

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9550?focusedWorklogId=410653&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410653
 ]

ASF GitHub Bot logged work on BEAM-9550:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:43
Start Date: 26/Mar/20 21:43
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #11193: [BEAM-9550] Increase 
JVM Metaspace size for the TaskExecutors.
URL: https://github.com/apache/beam/pull/11193#issuecomment-604702445
 
 
   Run Seed Job
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410653)
Time Spent: 3h  (was: 2h 50m)

> beam_PostCommit_Python_Chicago_Taxi_Flink OOM
> -
>
> Key: BEAM-9550
> URL: https://issues.apache.org/jira/browse/BEAM-9550
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, test-failures
>Reporter: Kyle Weaver
>Assignee: Kamil Wasilewski
>Priority: Major
>  Labels: currently-failing
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> https://builds.apache.org/job/beam_PostCommit_Python_Chicago_Taxi_Flink/
> The following error has been occurring consistently for several days:
> 07:57:26 ERROR:root:java.lang.OutOfMemoryError: Metaspace
> 07:57:27 Traceback (most recent call last):
> 07:57:27   File "tfdv_analyze_and_validate.py", line 227, in <module>
> 07:57:27 main()
> 07:57:27   File "tfdv_analyze_and_validate.py", line 212, in main
> 07:57:27 project=known_args.metric_reporting_project)
> 07:57:27   File "tfdv_analyze_and_validate.py", line 132, in compute_stats
> 07:57:27 result.wait_until_finish()
> 07:57:27   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Flink/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/runners/portability/portable_runner.py",
>  line 545, in wait_until_finish
> 07:57:27 (self._job_id, self._state, self._last_error_message()))
> 07:57:27 RuntimeError: Pipeline 
> chicago-taxi-tfdv-20200317-144954-eval_9742ac2b-26bf-4d1d-835e-572d4efacfcb 
> failed in state FAILED: java.lang.OutOfMemoryError: Metaspace



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=410650&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410650
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:37
Start Date: 26/Mar/20 21:37
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #11067: 
[BEAM-9136]Add licenses for dependencies
URL: https://github.com/apache/beam/pull/11067#discussion_r398907891
 
 

 ##
 File path: .test-infra/jenkins/job_PreCommit_Python_DockerBuild.groovy
 ##
 @@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import PrecommitJobBuilder
+
+PrecommitJobBuilder builder = new PrecommitJobBuilder(
+scope: this,
+nameBase: 'Python',
 
 Review comment:
   They are not for Python Precommit.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410650)
Time Spent: 7h 50m  (was: 7h 40m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410637&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410637
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:20
Start Date: 26/Mar/20 21:20
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11184: [BEAM-4374] 
Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r398899707
 
 

 ##
 File path: 
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/ElementCountMonitoringInfoToCounterUpdateTransformer.java
 ##
 @@ -95,7 +99,12 @@ public CounterUpdate transform(MonitoringInfo 
monitoringInfo) {
   return null;
 }
 
-long value = monitoringInfo.getMetric().getCounterData().getInt64Value();
+long value;
+try {
+  value = VARINT_CODER.decode(monitoringInfo.getPayload().newInput());
 
 Review comment:
   Done here and elsewhere. I introduced a MonitoringInfoEncodings class with 
the convenience methods for the currently used encodings.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410637)
Time Spent: 31h 50m  (was: 31h 40m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 31h 50m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=410638&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410638
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:20
Start Date: 26/Mar/20 21:20
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #11067: [BEAM-9136]Add 
licenses for dependencies
URL: https://github.com/apache/beam/pull/11067#issuecomment-604692858
 
 
   > I wonder if it would make sense to have separate, more focused PRs for 
each of Python, Go, and Java.
   
   yep, will create separate PRs for each language.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410638)
Time Spent: 7h 40m  (was: 7.5h)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410639&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410639
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:20
Start Date: 26/Mar/20 21:20
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11184: [BEAM-4374] 
Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r398899707
 
 

 ##
 File path: 
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/ElementCountMonitoringInfoToCounterUpdateTransformer.java
 ##
 @@ -95,7 +99,12 @@ public CounterUpdate transform(MonitoringInfo 
monitoringInfo) {
   return null;
 }
 
-long value = monitoringInfo.getMetric().getCounterData().getInt64Value();
+long value;
+try {
+  value = VARINT_CODER.decode(monitoringInfo.getPayload().newInput());
 
 Review comment:
   Done here and elsewhere. I introduced a MonitoringInfoEncodings class with 
the convenience methods for the currently used encodings.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410639)
Time Spent: 32h  (was: 31h 50m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 32h
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8603) Add Python SqlTransform example script

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?focusedWorklogId=410632&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410632
 ]

ASF GitHub Bot logged work on BEAM-8603:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:16
Start Date: 26/Mar/20 21:16
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #10055: 
[BEAM-8603] Add Python SqlTransform
URL: https://github.com/apache/beam/pull/10055#discussion_r398897788
 
 

 ##
 File path: sdks/python/apache_beam/transforms/sql_test.py
 ##
 @@ -0,0 +1,109 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Tests for transforms that use the SQL Expansion service."""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import logging
+import typing
+import unittest
+
+from nose.plugins.attrib import attr
+from past.builtins import unicode
+
+import apache_beam as beam
+from apache_beam import coders
+from apache_beam.options.pipeline_options import DebugOptions
+from apache_beam.options.pipeline_options import StandardOptions
+from apache_beam.testing.test_pipeline import TestPipeline
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import equal_to
+from apache_beam.transforms.sql import SqlTransform
+from apache_beam.utils import subprocess_server
+
+SimpleRow = typing.NamedTuple(
+"SimpleRow", [("int", int), ("str", unicode), ("flt", float)])
+coders.registry.register_coder(SimpleRow, coders.RowCoder)
+
+
+@attr('UsesSqlExpansionService')
+@unittest.skipIf(
+TestPipeline().get_pipeline_options().view_as(StandardOptions).runner is
+None,
+"Must be run with a runner that supports cross-language transforms")
 
 Review comment:
   Oh ok I didn't realize that.
   
   Really I just needed a way to prevent this test from running in the Python 
PreCommit, since the SQL expansion service isn't built in that context. The 
other xlang test suite handles that by checking if the `EXPANSION_PORT` env var 
is set.
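
For reference, the environment-variable approach mentioned above can be sketched with the standard `unittest.skipUnless` decorator. This is a minimal illustration, not the code of the existing xlang suite; the helper and test class names are hypothetical.

```python
import os
import unittest


def expansion_service_available(environ=os.environ):
    """Return True when EXPANSION_PORT is set to a non-empty value."""
    return bool(environ.get("EXPANSION_PORT"))


@unittest.skipUnless(
    expansion_service_available(),
    "EXPANSION_PORT is not set; no expansion service to connect to")
class ExpansionServiceDependentTest(unittest.TestCase):
    """Hypothetical test that only makes sense with a running service."""

    def test_port_is_usable(self):
        # Runs only when the variable is present, so the lookup is safe.
        port = int(os.environ["EXPANSION_PORT"])
        self.assertGreater(port, 0)
```

With this guard the test is reported as skipped in environments such as the Python PreCommit, where the expansion service is not built, instead of failing.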
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410632)
Time Spent: 4h 40m  (was: 4.5h)

> Add Python SqlTransform example script
> --
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=410629&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410629
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:11
Start Date: 26/Mar/20 21:11
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #11216: [BEAM-9562] Remove 
TimerSpec from Proto
URL: https://github.com/apache/beam/pull/11216#issuecomment-604688942
 
 
   Run Java_Examples_Dataflow PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410629)
Time Spent: 4h 50m  (was: 4h 40m)

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=410628&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410628
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:11
Start Date: 26/Mar/20 21:11
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #11216: [BEAM-9562] Remove 
TimerSpec from Proto
URL: https://github.com/apache/beam/pull/11216#issuecomment-604688875
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410628)
Time Spent: 4h 40m  (was: 4.5h)

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9574) NamedTuple instances generated from schemas cannot be pickled

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9574?focusedWorklogId=410627&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410627
 ]

ASF GitHub Bot logged work on BEAM-9574:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:10
Start Date: 26/Mar/20 21:10
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #11196: 
[BEAM-9574] Ensure that instances of generated NamedTuple classes can be pickled
URL: https://github.com/apache/beam/pull/11196#discussion_r398895069
 
 

 ##
 File path: sdks/python/apache_beam/typehints/schemas.py
 ##
 @@ -205,6 +218,11 @@ def typing_from_runner_api(fieldtype_proto):
 pass  # TODO
 
 
+def _hydrate_namedtuple_instance(encoded_schema, values):
+  return named_tuple_from_schema(
+  proto_utils.parse_Bytes(encoded_schema, schema_pb2.Schema))(*values)
+
+
 def named_tuple_from_schema(schema):
 
 Review comment:
   It's effectively memoized with SCHEMA_REGISTRY inside 
`typing_from_runner_api`. We could short-circuit it here as well though
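
The general pattern in this diff (a `__reduce__` that points at a module-level rebuild function, with a registry so repeated rebuilds reuse one class) can be sketched without Beam. The names below are hypothetical, and a tuple of field names stands in for the encoded schema; this is an illustration of the idea, not the PR's implementation.

```python
import collections
import pickle

# Hypothetical stand-in for a schema registry: field names -> generated class.
_CLASS_REGISTRY = {}


def class_from_fields(fields):
    """Return the generated class for these fields, creating it only once."""
    fields = tuple(fields)
    cls = _CLASS_REGISTRY.get(fields)
    if cls is None:
        base = collections.namedtuple("Generated_" + "_".join(fields), fields)

        def __reduce__(self):
            # Pickle a recipe (rebuild function plus plain data) instead of a
            # reference to the class, which is not importable by name.
            return (_hydrate_instance, (fields, tuple(self)))

        cls = type(
            base.__name__, (base,),
            {"__slots__": (), "__reduce__": __reduce__})
        _CLASS_REGISTRY[fields] = cls
    return cls


def _hydrate_instance(fields, values):
    """Module-level rebuild function that pickle can look up by name."""
    return class_from_fields(fields)(*values)


if __name__ == "__main__":
    row = class_from_fields(["id", "name"])(1, "a")
    restored = pickle.loads(pickle.dumps(row))
    print(restored == row, type(restored) is type(row))
```

Because the registry lookup happens first, rebuilding many instances of the same shape does not create a new class each time, which is the short-circuit idea raised in the comment.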
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410627)
Time Spent: 0.5h  (was: 20m)

> NamedTuple instances generated from schemas cannot be pickled
> -
>
> Key: BEAM-9574
> URL: https://issues.apache.org/jira/browse/BEAM-9574
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Attempting to pickle an instance of a generated NamedTuple class results in 
> the following:
> {code}
> _pickle.PicklingError: Can't pickle <class 'apache_beam.typehints.schemas.BeamSchema_a7de91e0_ae11_4c52_a041_0b58ada35ac1'>:
>  attribute lookup BeamSchema_a7de91e0_ae11_4c52_a041_0b58ada35ac1 on 
> apache_beam.typehints.schemas failed
> {code}
> In general, we shouldn't be pickling these instances, but occasionally it may 
> be necessary, and we should just do it rather than failing hard.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=410623&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410623
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:04
Start Date: 26/Mar/20 21:04
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11067: 
[BEAM-9136]Add licenses for dependencies
URL: https://github.com/apache/beam/pull/11067#discussion_r398891416
 
 

 ##
 File path: .test-infra/jenkins/job_PreCommit_Python_DockerBuild.groovy
 ##
 @@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import PrecommitJobBuilder
+
+PrecommitJobBuilder builder = new PrecommitJobBuilder(
+scope: this,
+nameBase: 'Python',
 
 Review comment:
   Isn't docker built as part of the precommit tests already?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410623)
Time Spent: 7h 20m  (was: 7h 10m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
> Reporter: Hannah Jiang
> Assignee: Hannah Jiang
> Priority: Major
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.





[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=410624=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410624
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 26/Mar/20 21:04
Start Date: 26/Mar/20 21:04
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11067: [BEAM-9136]Add 
licenses for dependencies
URL: https://github.com/apache/beam/pull/11067#issuecomment-604685294
 
 
   I wonder if it would make sense to have separate, more focused PRs for each 
of Python, Go, and Java. 
 



Issue Time Tracking
---

Worklog Id: (was: 410624)
Time Spent: 7.5h  (was: 7h 20m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
> Reporter: Hannah Jiang
> Assignee: Hannah Jiang
> Priority: Major
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.





[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410617=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410617
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 20:57
Start Date: 26/Mar/20 20:57
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11231: [BEAM-4374] 
Shortids for the Go SDK
URL: https://github.com/apache/beam/pull/11231#discussion_r398886664
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/monitoring.go
 ##
 @@ -16,20 +16,71 @@
 package harness
 
 import (
+   "bytes"
+   "strconv"
+   "sync"
+   "sync/atomic"
    "time"
 
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/coder"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/mtime"
    "github.com/apache/beam/sdks/go/pkg/beam/core/metrics"
    "github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec"
    fnpb "github.com/apache/beam/sdks/go/pkg/beam/model/fnexecution_v1"
    ppb "github.com/apache/beam/sdks/go/pkg/beam/model/pipeline_v1"
    "github.com/golang/protobuf/ptypes"
 )
 
-func monitoring(p *exec.Plan) (*fnpb.Metrics, []*ppb.MonitoringInfo) {
+// TODO: 2020/03/26 - measure mutex overhead vs sync.Map for this case.
+// sync.Map might have lower contention for this read heavy load.
+var (
+   shortMu sync.Mutex
+   labels2ShortIds map[metrics.Labels]string
 
 Review comment:
   Ah, good point. 
   Protos can't be used as Go map keys because of all the magic fields they 
have, but I can use other things.
   
   I've put in aligned constants, types, and string arrays for the 
proto-specified strings, so these lookups don't end up hashing the strings 
every time (and instead use a uint32, which is very fast for Go maps to deal 
with). There's still the hashing of the fields in metrics.Labels, but we can 
do the same hashing in the metrics code at a later time to allow for faster 
lookups for those as well.
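
   The approach described in the comment above can be sketched roughly as 
follows. This is a minimal illustration, not the code from the pull request: 
the names urnID, urnNames, shortKey, and shortIDCache, the placeholder URN 
strings, and the base-36 short id format are all assumptions made for the 
example.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// urnID is a hypothetical compact stand-in for a proto-specified URN string.
type urnID uint32

const (
	urnUserCounter urnID = iota
	urnElementCount
	urnCount // number of known URNs; keeps the array below aligned
)

// urnNames is aligned with the constants above, so urnNames[id] recovers the
// string. The strings are placeholders, not the real Beam URNs.
var urnNames = [urnCount]string{
	urnUserCounter:  "example:urn:user_counter:v1",
	urnElementCount: "example:urn:element_count:v1",
}

// shortKey is a comparable struct, so it can serve as a map key. The URN is
// carried as a uint32 rather than a string; the label fields are still
// strings and are still hashed on every lookup.
type shortKey struct {
	urn        urnID
	step, name string
}

// shortIDCache hands out one stable short id per distinct key.
type shortIDCache struct {
	mu   sync.Mutex
	ids  map[shortKey]string
	next int64
}

func newShortIDCache() *shortIDCache {
	return &shortIDCache{ids: make(map[shortKey]string)}
}

// get returns the short id for k, allocating a new one on first use.
func (c *shortIDCache) get(k shortKey) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if id, ok := c.ids[k]; ok {
		return id
	}
	id := strconv.FormatInt(c.next, 36)
	c.next++
	c.ids[k] = id
	return id
}

func main() {
	c := newShortIDCache()
	k := shortKey{urn: urnUserCounter, step: "step1", name: "count"}
	// Both lookups return the same short id; the array maps the id back to
	// its (placeholder) URN string.
	fmt.Println(c.get(k), c.get(k), urnNames[k.urn])
}
```

   A plain mutex is used here to match the quoted diff; whether sync.Map 
would do better under a read-heavy load is the open question raised in the 
TODO there.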
 



Issue Time Tracking
---

Worklog Id: (was: 410617)
Time Spent: 31.5h  (was: 31h 20m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
> Reporter: Alex Amato
> Priority: Major
>  Time Spent: 31.5h
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata





[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=410618=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410618
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 26/Mar/20 20:57
Start Date: 26/Mar/20 20:57
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #11231: [BEAM-4374] 
Shortids for the Go SDK
URL: https://github.com/apache/beam/pull/11231#issuecomment-604681904
 
 
   Run Go Postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 410618)
Time Spent: 31h 40m  (was: 31.5h)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
> Reporter: Alex Amato
> Priority: Major
>  Time Spent: 31h 40m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata




