[jira] [Work logged] (BEAM-9770) Add BigQuery DeadLetter pattern to Patterns Page

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9770?focusedWorklogId=433565&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433565
 ]

ASF GitHub Bot logged work on BEAM-9770:


Author: ASF GitHub Bot
Created on: 15/May/20 05:40
Start Date: 15/May/20 05:40
Worklog Time Spent: 10m 
  Work Description: rezarokni commented on pull request #11437:
URL: https://github.com/apache/beam/pull/11437#issuecomment-629039064


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 433565)
Time Spent: 2h 50m  (was: 2h 40m)

> Add BigQuery DeadLetter pattern to Patterns Page
> 
>
> Key: BEAM-9770
> URL: https://issues.apache.org/jira/browse/BEAM-9770
> Project: Beam
>  Issue Type: New Feature
>  Components: website
>Reporter: Reza ardeshir rokni
>Assignee: Reza ardeshir rokni
>Priority: Trivial
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433560&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433560
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 15/May/20 05:14
Start Date: 15/May/20 05:14
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #11717:
URL: https://github.com/apache/beam/pull/11717#issuecomment-629031663


   Please merge it if it looks good. The PR was reviewed at 
https://github.com/apache/beam/pull/11549.





Issue Time Tracking
---

Worklog Id: (was: 433560)
Time Spent: 30.5h  (was: 30h 20m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 30.5h
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.





[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433558&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433558
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 15/May/20 05:12
Start Date: 15/May/20 05:12
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #11584:
URL: https://github.com/apache/beam/pull/11584#issuecomment-629030902


   It is rebased. Please take a look when you have time.





Issue Time Tracking
---

Worklog Id: (was: 433558)
Time Spent: 30h 20m  (was: 30h 10m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 30h 20m
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.





[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433541
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 15/May/20 04:50
Start Date: 15/May/20 04:50
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang closed pull request #11549:
URL: https://github.com/apache/beam/pull/11549


   





Issue Time Tracking
---

Worklog Id: (was: 433541)
Time Spent: 30h 10m  (was: 30h)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 30h 10m
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.





[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433540
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 15/May/20 04:50
Start Date: 15/May/20 04:50
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #11549:
URL: https://github.com/apache/beam/pull/11549#issuecomment-629024197


   > @Hannah-Jiang - you can continue with this change now. However you will need to rebase.
   
   Thanks for letting me know. I created a new PR https://github.com/apache/beam/pull/11549 and will close this one.





Issue Time Tracking
---

Worklog Id: (was: 433540)
Time Spent: 30h  (was: 29h 50m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 30h
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.





[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433538
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 15/May/20 04:48
Start Date: 15/May/20 04:48
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang opened a new pull request #11717:
URL: https://github.com/apache/beam/pull/11717


   @Kyle

[jira] [Work logged] (BEAM-9977) Build Kafka Read on top of Java SplittableDoFn

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9977?focusedWorklogId=433510&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433510
 ]

ASF GitHub Bot logged work on BEAM-9977:


Author: ASF GitHub Bot
Created on: 15/May/20 04:00
Start Date: 15/May/20 04:00
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on a change in pull request #11715:
URL: https://github.com/apache/beam/pull/11715#discussion_r425553418



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/GrowableOffsetRangeTracker.java
##
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms.splittabledofn;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.io.range.OffsetRange;
+
+/**
+ * A special {@link OffsetRangeTracker} for tracking a growable offset range. The Long.MAX_VALUE is
+ * used as end range to indicate the possibility of infinity.
+ *
+ * An offset range is considered as growable when the end offset could grow (or change) during
+ * execution time (e.g., Kafka backlog, appended file).
+ */
+@Experimental(Kind.SPLITTABLE_DO_FN)
+public class GrowableOffsetRangeTracker extends OffsetRangeTracker {
+  /**
+   * An interface that should be implemented to fetch estimated end offset of range.
+   *
+   * {@code estimateRangeEnd} is called to give the end offset when {@code trySplit} or {@code
+   * getProgress} is invoked. The end offset is exclusive for the range. It's not necessary to
+   * increase monotonically but it's only taken into computation when it's larger than the current
+   * position. When returning Long.MAX_VALUE as estimate, it means the largest possible position for
+   * the range is Long.MAX_VALUE - 1. Having a good estimate is important for providing a good
Review comment:
   Currently I treat `Long.MAX_VALUE` as a numeric end offset. But we may also need a way to say that the `OffsetPoller` doesn't have a good estimate and still wants to keep the current range as infinite. `Long.MAX_VALUE` is not suitable because it's possible that the actual end is `Long.MAX_VALUE`. Do we want to provide such a notion here? Like `null`?
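
The ambiguity raised in this comment can be sketched in plain Java. This is only an illustration under stated assumptions, not the code in the pull request: `EndEstimateSketch`, `EndEstimator`, `effectiveEnd` and `hasKnownEnd` are hypothetical names, and `OptionalLong` stands in for the `null` idea floated above. An empty result means "no good estimate", so a real end offset of `Long.MAX_VALUE` stays distinguishable.

```java
import java.util.OptionalLong;

/** Hypothetical sketch, not Beam's API: separates "no estimate" from a real end offset. */
final class EndEstimateSketch {

  /** Hypothetical stand-in for the poller discussed above; empty means "no good estimate". */
  interface EndEstimator {
    OptionalLong estimateRangeEnd();
  }

  /**
   * Returns the end offset to use for progress: the estimate when it is present and larger than
   * the current position, otherwise Long.MAX_VALUE so the range is still treated as unbounded.
   */
  static long effectiveEnd(long currentPosition, EndEstimator estimator) {
    OptionalLong estimate = estimator.estimateRangeEnd();
    if (estimate.isPresent() && estimate.getAsLong() > currentPosition) {
      return estimate.getAsLong();
    }
    return Long.MAX_VALUE;
  }

  /** True only when a concrete end was reported, even if that end is Long.MAX_VALUE itself. */
  static boolean hasKnownEnd(EndEstimator estimator) {
    return estimator.estimateRangeEnd().isPresent();
  }
}
```

With a plain `long` return type, the two cases checked by `hasKnownEnd` would collapse into the same value, which is the problem the comment describes.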







Issue Time Tracking
---

Worklog Id: (was: 433510)
Time Spent: 20m  (was: 10m)

> Build Kafka Read on top of Java SplittableDoFn
> --
>
> Key: BEAM-9977
> URL: https://issues.apache.org/jira/browse/BEAM-9977
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433506&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433506
 ]

ASF GitHub Bot logged work on BEAM-9825:


Author: ASF GitHub Bot
Created on: 15/May/20 03:50
Start Date: 15/May/20 03:50
Worklog Time Spent: 10m 
  Work Description: darshanj commented on a change in pull request #11610:
URL: https://github.com/apache/beam/pull/11610#discussion_r425550940



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java
##
@@ -0,0 +1,528 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.transforms.join.CoGbkResult;
+import org.apache.beam.sdk.transforms.join.CoGroupByKey;
+import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionList;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables;
+
+public class SetFns {

Review comment:
   Thanks. I was thinking of adding: "If you have multiple triggers configured and fired, output of this transform will be calculated on data which is in the respective trigger."







Issue Time Tracking
---

Worklog Id: (was: 433506)
Remaining Estimate: 88h 10m  (was: 88h 20m)
Time Spent: 7h 50m  (was: 7h 40m)

> Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
> --
>
> Key: BEAM-9825
> URL: https://issues.apache.org/jira/browse/BEAM-9825
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: Major
>   Original Estimate: 96h
>  Time Spent: 7h 50m
>  Remaining Estimate: 88h 10m
>
> I'd like to propose the following new high-level transforms.
>  * Intersect
> Compute the intersection between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are common to both _leftCollection_ and 
> _rightCollection_.
>  
>  * Except
> Compute the difference between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are in _leftCollection_ but not in 
> _rightCollection_.
>  * Union
> Find the elements that are in either of two PCollections.
> Implement IntersectAll, ExceptAll and UnionAll variants of the transforms.
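
The semantics proposed in the issue description above can be illustrated on ordinary Java collections. This is a minimal sketch, not the `SetFns` implementation under review: `SetSemanticsSketch` and its methods are hypothetical names, lists stand in for PCollections, and `unionAll` assumes SQL-style `UNION ALL` behaviour (duplicates kept), which the issue text does not spell out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java illustration of the proposed semantics; not the Beam SetFns implementation. */
final class SetSemanticsSketch {

  /** Distinct elements present in both inputs. */
  static <T> Set<T> intersect(List<T> left, List<T> right) {
    Set<T> result = new LinkedHashSet<>(left);
    result.retainAll(new LinkedHashSet<>(right));
    return result;
  }

  /** Distinct elements present in left but not in right. */
  static <T> Set<T> except(List<T> left, List<T> right) {
    Set<T> result = new LinkedHashSet<>(left);
    result.removeAll(new LinkedHashSet<>(right));
    return result;
  }

  /** Distinct elements present in either input. */
  static <T> Set<T> union(List<T> left, List<T> right) {
    Set<T> result = new LinkedHashSet<>(left);
    result.addAll(right);
    return result;
  }

  /** Assumed SQL-style UNION ALL: every element of both inputs, duplicates kept. */
  static <T> List<T> unionAll(List<T> left, List<T> right) {
    List<T> result = new ArrayList<>(left);
    result.addAll(right);
    return result;
  }
}
```

For example, with left = [a, a, b, c] and right = [b, b, d], `intersect` yields the single element b, `except` yields a and c, and `unionAll` keeps all seven elements.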





[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433503&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433503
 ]

ASF GitHub Bot logged work on BEAM-9825:


Author: ASF GitHub Bot
Created on: 15/May/20 03:46
Start Date: 15/May/20 03:46
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #11610:
URL: https://github.com/apache/beam/pull/11610#discussion_r425547978



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java
##
@@ -0,0 +1,528 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.transforms.join.CoGbkResult;
+import org.apache.beam.sdk.transforms.join.CoGroupByKey;
+import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionList;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables;
+
+public class SetFns {

Review comment:
   Your understanding is correct that users will get results based upon whatever data is fired because of the trigger, but from a cursory reading of the docs, we mention doing intersect/distinct/... over PCollections and not trigger firings, which could confuse a user into thinking that intersect/distinct will be over all elements in these PCollections.
   
   This is why I believe it's important to insert a statement something like:
   ```
   Triggers with multiple firings may lead to nondeterministic results since the intersect/distinct/... is only computed over each individual firing.
   ```
   
   This would go well with your current statement about having compatible triggers in all your methods.
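
The concern about multiple firings can be simulated without Beam. The sketch below is a simplified model under loud assumptions: `FiringSketch` and its methods are hypothetical, each firing is modelled as a separate list, and the i-th firing of one input is paired with the i-th firing of the other. It does not use or reproduce Beam's trigger API; it only shows how a per-firing intersect can differ from an intersect over all elements.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java simulation of per-firing results; it does not use or model Beam's trigger API. */
final class FiringSketch {

  /** Distinct elements common to both inputs. */
  static Set<String> intersect(List<String> left, List<String> right) {
    Set<String> result = new LinkedHashSet<>(left);
    result.retainAll(new LinkedHashSet<>(right));
    return result;
  }

  /** Intersects the i-th firing of left with the i-th firing of right and collects all outputs. */
  static Set<String> intersectPerFiring(
      List<List<String>> leftFirings, List<List<String>> rightFirings) {
    Set<String> emitted = new LinkedHashSet<>();
    int firings = Math.min(leftFirings.size(), rightFirings.size());
    for (int i = 0; i < firings; i++) {
      emitted.addAll(intersect(leftFirings.get(i), rightFirings.get(i)));
    }
    return emitted;
  }

  /** Flattens all firings into one list, as if the whole collection were seen at once. */
  static List<String> flatten(List<List<String>> firings) {
    List<String> all = new ArrayList<>();
    for (List<String> firing : firings) {
      all.addAll(firing);
    }
    return all;
  }
}
```

If left fires [a, b] then [c] while right fires [c] then [a, b], `intersectPerFiring` emits nothing, although the flattened inputs share a, b and c. If right instead fires [a, b] then [c], all three elements are emitted, so the output depends on how elements happen to be split across firings.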







Issue Time Tracking
---

Worklog Id: (was: 433503)
Remaining Estimate: 88h 20m  (was: 88.5h)
Time Spent: 7h 40m  (was: 7.5h)

> Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
> --
>
> Key: BEAM-9825
> URL: https://issues.apache.org/jira/browse/BEAM-9825
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: Major
>   Original Estimate: 96h
>  Time Spent: 7h 40m
>  Remaining Estimate: 88h 20m
>
> I'd like to propose the following new high-level transforms.
>  * Intersect
> Compute the intersection between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are common to both _leftCollection_ and 
> _rightCollection_.
>  
>  * Except
> Compute the difference between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are in _leftCollection_ but not in 
> _rightCollection_.
>  * Union
> Find the elements that are in either of two PCollections.
> Implement IntersectAll, ExceptAll and UnionAll variants of the transforms.





[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433502&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433502
 ]

ASF GitHub Bot logged work on BEAM-9825:


Author: ASF GitHub Bot
Created on: 15/May/20 03:43
Start Date: 15/May/20 03:43
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #11610:
URL: https://github.com/apache/beam/pull/11610#discussion_r425547978



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java
##
@@ -0,0 +1,528 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.transforms.join.CoGbkResult;
+import org.apache.beam.sdk.transforms.join.CoGroupByKey;
+import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionList;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables;
+
+public class SetFns {

Review comment:
   Your understanding is correct that users will get results based upon whatever data is fired because of the trigger, but from a cursory reading of the docs, we mention doing intersect/distinct/... over PCollections and not trigger firings, which could confuse a user into thinking that intersect/distinct will be over all elements in these PCollections.







Issue Time Tracking
---

Worklog Id: (was: 433502)
Remaining Estimate: 88.5h  (was: 88h 40m)
Time Spent: 7.5h  (was: 7h 20m)

> Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
> --
>
> Key: BEAM-9825
> URL: https://issues.apache.org/jira/browse/BEAM-9825
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: Major
>   Original Estimate: 96h
>  Time Spent: 7.5h
>  Remaining Estimate: 88.5h
>
> I'd like to propose the following new high-level transforms.
>  * Intersect
> Compute the intersection between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are common to both _leftCollection_ and 
> _rightCollection_.
>  
>  * Except
> Compute the difference between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are in _leftCollection_ but not in 
> _rightCollection_.
>  * Union
> Find the elements that are in either of two PCollections.
> Implement IntersectAll, ExceptAll and UnionAll variants of the transforms.





[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433501&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433501
 ]

ASF GitHub Bot logged work on BEAM-9825:


Author: ASF GitHub Bot
Created on: 15/May/20 03:36
Start Date: 15/May/20 03:36
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #11610:
URL: https://github.com/apache/beam/pull/11610#discussion_r425547978



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java
##
@@ -0,0 +1,528 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.transforms.join.CoGbkResult;
+import org.apache.beam.sdk.transforms.join.CoGroupByKey;
+import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionList;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables;
+
+public class SetFns {

Review comment:
   Your understanding is correct that users will get results based upon whatever data is fired because of a trigger, but from a cursory reading of the docs, we mention doing intersect/distinct/... over PCollections and not trigger firings, which could confuse a user into thinking that intersect/distinct will be over all elements in these PCollections.







Issue Time Tracking
---

Worklog Id: (was: 433501)
Remaining Estimate: 88h 40m  (was: 88h 50m)
Time Spent: 7h 20m  (was: 7h 10m)

> Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
> --
>
> Key: BEAM-9825
> URL: https://issues.apache.org/jira/browse/BEAM-9825
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: Major
>   Original Estimate: 96h
>  Time Spent: 7h 20m
>  Remaining Estimate: 88h 40m
>
> I'd like to propose the following new high-level transforms.
>  * Intersect
> Compute the intersection between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are common to both _leftCollection_ and 
> _rightCollection_.
>  
>  * Except
> Compute the difference between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are in _leftCollection_ but not in 
> _rightCollection_.
>  * Union
> Find the elements that are in either of two PCollections.
> Implement IntersectAll, ExceptAll and UnionAll variants of the transforms.
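
The semantics proposed above can be sketched outside Beam with plain multisets. This is only an illustration, assuming the distinct variants behave like SQL INTERSECT / EXCEPT / UNION and the All variants like their ALL counterparts; the function names are hypothetical and are not the SetFns API.

```python
from collections import Counter


def intersect_all(left, right):
    """Keep each element min(left count, right count) times."""
    return sorted((Counter(left) & Counter(right)).elements())


def except_all(left, right):
    """Keep each element (left count - right count) times, never below zero."""
    return sorted((Counter(left) - Counter(right)).elements())


def union_all(left, right):
    """Keep every occurrence from both sides."""
    return sorted((Counter(left) + Counter(right)).elements())


def intersect_distinct(left, right):
    """Elements present on both sides, each reported once."""
    return sorted(set(left) & set(right))


def except_distinct(left, right):
    """Elements present on the left but absent on the right, each reported once."""
    return sorted(set(left) - set(right))


def union_distinct(left, right):
    """Elements present on either side, each reported once."""
    return sorted(set(left) | set(right))


# Hypothetical sample inputs standing in for two PCollections.
left = ["a", "a", "b", "c"]
right = ["a", "b", "b", "d"]
```

With these inputs, `except_distinct` drops "a" entirely because it appears on the right at least once, while `except_all` keeps one "a" because the left side has one more copy than the right.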





[jira] [Work logged] (BEAM-9770) Add BigQuery DeadLetter pattern to Patterns Page

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9770?focusedWorklogId=433498&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433498
 ]

ASF GitHub Bot logged work on BEAM-9770:


Author: ASF GitHub Bot
Created on: 15/May/20 03:19
Start Date: 15/May/20 03:19
Worklog Time Spent: 10m 
  Work Description: rezarokni commented on pull request #11437:
URL: https://github.com/apache/beam/pull/11437#issuecomment-629003109


   Local tests ran ok, but also raised:
   https://issues.apache.org/jira/browse/BEAM-10003
   
   So this PR is now just the code bits, rather than code + website bits.





Issue Time Tracking
---

Worklog Id: (was: 433498)
Time Spent: 2h 40m  (was: 2.5h)

> Add BigQuery DeadLetter pattern to Patterns Page
> 
>
> Key: BEAM-9770
> URL: https://issues.apache.org/jira/browse/BEAM-9770
> Project: Beam
>  Issue Type: New Feature
>  Components: website
>Reporter: Reza ardeshir rokni
>Assignee: Reza ardeshir rokni
>Priority: Trivial
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>






[jira] [Created] (BEAM-10003) Need two PR to submit snippets to website

2020-05-14 Thread Reza ardeshir rokni (Jira)
Reza ardeshir rokni created BEAM-10003:
--

 Summary: Need two PR to submit snippets to website
 Key: BEAM-10003
 URL: https://issues.apache.org/jira/browse/BEAM-10003
 Project: Beam
  Issue Type: New Feature
  Components: website
Reporter: Reza ardeshir rokni


It looks like build_github_samples.sh uses code that is already in the repo to 
build the local serving:

do
  fileName=$(echo "$url" | sed -e 's/\//_/g')
  curl -o "$DIST_DIR"/"$fileName" "https://raw.githubusercontent.com$url";
done

So when trying to test locally, the code needs to already be in Beam. 

Ideally the script should make use of local code when building, so that:

1- It is easier to build & test changes.
2- There is no need to raise two PRs for what is a single change.
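
One possible shape for the "use local code" idea, sketched in Python rather than in the script's own shell. It assumes each `$url` looks like `/<org>/<repo>/<branch>/<path in repo>`, which the snippet above does not confirm; `copy_local_sample` and its parameters are hypothetical names.

```python
import shutil
from pathlib import Path


def flatten_name(url_path):
    """Mirror the script's sed step: replace every '/' with '_'."""
    return url_path.replace("/", "_")


def copy_local_sample(repo_root, url_path, dist_dir):
    """Copy a sample from a local checkout instead of downloading it.

    Assumption: url_path is '/<org>/<repo>/<branch>/<path in repo>', so the
    first three components are dropped to locate the file under repo_root.
    """
    parts = url_path.strip("/").split("/")
    local_file = Path(repo_root).joinpath(*parts[3:])
    target = Path(dist_dir) / flatten_name(url_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(local_file, target)
    return target
```

Because the target file name is built the same way as in the curl loop, anything that later reads from the distribution directory would not need to change.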

 





[jira] [Updated] (BEAM-10002) Cursor not found if work items take a long time

2020-05-14 Thread Corvin Deboeser (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corvin Deboeser updated BEAM-10002:
---
Description: 
If some work items take a lot of processing time and the cursor of a bundle is 
not queried for too long, then MongoDB will time out the cursor, which results in
{code:java}
pymongo.errors.CursorNotFound: cursor id ... not found
{code}
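
One general way to avoid holding a server-side cursor open during slow work is to page by `_id` with short queries that are fully consumed before processing starts. The sketch below illustrates that pattern only; it is not taken from mongodbio.py, and `fetch_page` is a hypothetical callable standing in for a sorted, limited query.

```python
def process_with_short_lived_queries(fetch_page, process, page_size=2):
    """Page through documents by _id so no cursor stays open during slow work.

    fetch_page(after_id, limit) is a hypothetical callable that returns up to
    `limit` documents with _id greater than after_id, sorted by _id.
    """
    last_id = None
    results = []
    while True:
        page = fetch_page(last_id, page_size)  # short query, fully consumed
        if not page:
            return results
        for doc in page:
            results.append(process(doc))  # slow step runs with no open cursor
        last_id = page[-1]["_id"]


# In-memory stand-in for a collection, for illustration only.
DOCS = [{"_id": i, "value": i * 10} for i in range(1, 6)]


def fake_fetch_page(after_id, limit):
    """Return the next `limit` documents after after_id, in _id order."""
    matching = [d for d in DOCS if after_id is None or d["_id"] > after_id]
    return matching[:limit]
```

The trade-off is one extra query per page, in exchange for never depending on the server keeping an idle cursor alive.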
 

  was:
If some work items take a lot of processing time and the cursor of a bundle is 
not queried for too long, then MongoDB will time out the cursor, which results in

```

pymongo.errors.CursorNotFound: cursor id ... not found

```


> Cursor not found if work items take a long time
> ---
>
> Key: BEAM-10002
> URL: https://issues.apache.org/jira/browse/BEAM-10002
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-mongodb
>Affects Versions: 2.20.0
>Reporter: Corvin Deboeser
>Assignee: Yichi Zhang
>Priority: Major
>
> If some work items take a lot of processing time and the cursor of a bundle 
> is not queried for too long, then MongoDB will time out the cursor, which 
> results in
> {code:java}
> pymongo.errors.CursorNotFound: cursor id ... not found
> {code}
>  





[jira] [Created] (BEAM-10002) Cursor not found if work items take a long time

2020-05-14 Thread Corvin Deboeser (Jira)
Corvin Deboeser created BEAM-10002:
--

 Summary: Cursor not found if work items take a long time
 Key: BEAM-10002
 URL: https://issues.apache.org/jira/browse/BEAM-10002
 Project: Beam
  Issue Type: Bug
  Components: io-py-mongodb
Affects Versions: 2.20.0
Reporter: Corvin Deboeser
Assignee: Yichi Zhang


If some work items take a lot of processing time and the cursor of a bundle is 
not queried for too long, then MongoDB will time out the cursor, which results in

```

pymongo.errors.CursorNotFound: cursor id ... not found

```





[jira] [Updated] (BEAM-9960) Python MongoDBIO fails when response of split vector command is larger than 16mb

2020-05-14 Thread Corvin Deboeser (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corvin Deboeser updated BEAM-9960:
--
Component/s: (was: sdk-py-core)

> Python MongoDBIO fails when response of split vector command is larger than 
> 16mb
> 
>
> Key: BEAM-9960
> URL: https://issues.apache.org/jira/browse/BEAM-9960
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-mongodb
>Affects Versions: 2.20.0
>Reporter: Corvin Deboeser
>Priority: Major
>
> When using MongoDBIO on a large collection whose documents are large on 
> average, the split vector command produces a lot of splits if the desired 
> bundle size is small. In extreme cases, the response from the split vector 
> command can be larger than 16mb, which is not supported by pymongo / MongoDB:
> {{pymongo.errors.ProtocolError: Message length (33699186) is larger than 
> server max message size (33554432)}}
>  
> Environment: Was running this on Google Dataflow / Beam Python SDK 2.20.
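
The numbers in the error message can be checked with a little arithmetic. The two byte counts below are copied from the error; the split-count estimate is only a rough sketch that assumes evenly sized splits, and `estimated_split_count` is a hypothetical helper, not part of MongoDBIO.

```python
MAX_MESSAGE_BYTES = 33554432  # server limit quoted in the error
RESPONSE_BYTES = 33699186     # response size quoted in the error

# The quoted limit equals 32 * 1024 * 1024 bytes.
LIMIT_IS_32_MIB = MAX_MESSAGE_BYTES == 32 * 1024 * 1024

# The response exceeds the limit by 144754 bytes.
overshoot = RESPONSE_BYTES - MAX_MESSAGE_BYTES


def estimated_split_count(collection_bytes, bundle_bytes):
    """Rough number of bundles, assuming evenly sized splits (ceiling division)."""
    return -(-collection_bytes // bundle_bytes)
```

For example, under that even-split assumption a 10 GiB collection with a 1 MiB desired bundle size would need about `estimated_split_count(10 * 1024**3, 1024**2)` = 10240 split points, so the response grows as the bundle size shrinks.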





[jira] [Updated] (BEAM-9960) Python MongoDBIO fails when response of split vector command is larger than 16mb

2020-05-14 Thread Corvin Deboeser (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corvin Deboeser updated BEAM-9960:
--
Description: 
When using MongoDBIO on a large collection whose documents are large on 
average, the split vector command produces a lot of splits if the desired bundle 
size is small. In extreme cases, the response from the split vector command can 
be larger than 16mb, which is not supported by pymongo / MongoDB:

{{pymongo.errors.ProtocolError: Message length (33699186) is larger than server 
max message size (33554432)}}

 

Environment: Was running this on Google Dataflow / Beam Python SDK 2.20.

  was:
When using MongoDBIO on a large collection and the source bundle size was 
determined to be 1, the response from the split vector command can be 
larger than 16mb, which is not supported by pymongo / MongoDB:

{{pymongo.errors.ProtocolError: Message length (33699186) is larger than server 
max message size (33554432)}}

 

Environment: Was running this on Google Dataflow / Beam Python SDK 2.20.


> Python MongoDBIO fails when response of split vector command is larger than 
> 16mb
> 
>
> Key: BEAM-9960
> URL: https://issues.apache.org/jira/browse/BEAM-9960
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-mongodb, sdk-py-core
>Affects Versions: 2.20.0
>Reporter: Corvin Deboeser
>Priority: Major
>
> When using MongoDBIO on a large collection whose documents are large on 
> average, the split vector command produces a lot of splits if the desired 
> bundle size is small. In extreme cases, the response from the split vector 
> command can be larger than 16mb, which is not supported by pymongo / MongoDB:
> {{pymongo.errors.ProtocolError: Message length (33699186) is larger than 
> server max message size (33554432)}}
>  
> Environment: Was running this on Google Dataflow / Beam Python SDK 2.20.





[jira] [Updated] (BEAM-9960) Python MongoDBIO fails when response of split vector command is larger than 16mb

2020-05-14 Thread Corvin Deboeser (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corvin Deboeser updated BEAM-9960:
--
Component/s: io-py-mongodb

> Python MongoDBIO fails when response of split vector command is larger than 
> 16mb
> 
>
> Key: BEAM-9960
> URL: https://issues.apache.org/jira/browse/BEAM-9960
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-mongodb, sdk-py-core
>Affects Versions: 2.20.0
>Reporter: Corvin Deboeser
>Priority: Major
>
> When using MongoDBIO on a large collection and the source bundle size was 
> determined to be 1, the response from the split vector command can be 
> larger than 16mb, which is not supported by pymongo / MongoDB:
> {{pymongo.errors.ProtocolError: Message length (33699186) is larger than 
> server max message size (33554432)}}
>  
> Environment: Was running this on Google Dataflow / Beam Python SDK 2.20.





[jira] [Updated] (BEAM-9961) Python MongoDBIO does not apply projection

2020-05-14 Thread Corvin Deboeser (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corvin Deboeser updated BEAM-9961:
--
Component/s: io-py-mongodb

> Python MongoDBIO does not apply projection
> --
>
> Key: BEAM-9961
> URL: https://issues.apache.org/jira/browse/BEAM-9961
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-mongodb, sdk-py-core
>Affects Versions: 2.20.0
>Reporter: Corvin Deboeser
>Priority: Minor
>
> ReadFromMongoDB does not apply the provided projection when reading from the 
> client. Only the filter is applied, as you can see here:
> https://github.com/apache/beam/blob/9f0cb649d39ee6236ea27f111acb4b66591a80ec/sdks/python/apache_beam/io/mongodbio.py#L204
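
A minimal sketch of passing both arguments through, assuming the reader ends up calling pymongo's `Collection.find`, which accepts `filter` and `projection` keyword arguments. `build_find_kwargs` is a hypothetical helper and is not code from mongodbio.py.

```python
def build_find_kwargs(query_filter=None, projection=None):
    """Build keyword arguments for a find() call, keeping the projection."""
    kwargs = {"filter": query_filter or {}}
    if projection is not None:
        # Forward the projection so only the requested fields are returned.
        kwargs["projection"] = projection
    return kwargs


# Intended use with a pymongo collection (not executed here):
#   cursor = collection.find(**build_find_kwargs({"status": "ok"}, {"name": 1}))
```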





[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=433486&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433486
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 15/May/20 02:31
Start Date: 15/May/20 02:31
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11716:
URL: https://github.com/apache/beam/pull/11716#issuecomment-628990029


   Run Java PreCommit





Issue Time Tracking
---

Worklog Id: (was: 433486)
Time Spent: 32h 10m  (was: 32h)

> Fn API SDF support
> --
>
> Key: BEAM-2939
> URL: https://issues.apache.org/jira/browse/BEAM-2939
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 32h 10m
>  Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.





[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433483&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433483
 ]

ASF GitHub Bot logged work on BEAM-9825:


Author: ASF GitHub Bot
Created on: 15/May/20 02:19
Start Date: 15/May/20 02:19
Worklog Time Spent: 10m 
  Work Description: darshanj commented on a change in pull request #11610:
URL: https://github.com/apache/beam/pull/11610#discussion_r425529534



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java
##
@@ -0,0 +1,528 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.transforms.join.CoGbkResult;
+import org.apache.beam.sdk.transforms.join.CoGroupByKey;
+import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionList;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables;
+
+public class SetFns {

Review comment:
   Added comments for coders. 
   
   Regarding triggers, I assume multiple triggers should work if they are 
compatible.
   It uses CGBK (which internally uses Flatten), which checks that the triggers 
and windowFns are compatible. The user will get results based on 
whatever data is triggered.
   
   My understanding may be flawed. Please correct me if you think that is not the 
case.
   
   I have added a comment about compatible triggers and identical windowFns.







Issue Time Tracking
---

Worklog Id: (was: 433483)
Remaining Estimate: 88h 50m  (was: 89h)
Time Spent: 7h 10m  (was: 7h)

> Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
> --
>
> Key: BEAM-9825
> URL: https://issues.apache.org/jira/browse/BEAM-9825
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: Major
>   Original Estimate: 96h
>  Time Spent: 7h 10m
>  Remaining Estimate: 88h 50m
>
> I'd like to propose the following new high-level transforms.
>  * Intersect
> Compute the intersection between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are common to both _leftCollection_ and 
> _rightCollection_.
>  
>  * Except
> Compute the difference between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are in _leftCollection_ but not in 
> _rightCollection_.
>  * Union
> Find the elements that are in either of two PCollections.
> Implement IntersectAll, ExceptAll and UnionAll variants of the transforms.





[jira] [Updated] (BEAM-9679) Core Transforms | Go SDK Code Katas

2020-05-14 Thread Damon Douglas (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damon Douglas updated BEAM-9679:

Description: 
A kata devoted to core Beam transforms, patterned after 
[https://github.com/apache/beam/tree/master/learning/katas/java/Core%20Transforms],
 where the takeaway is an individual's ability to master the following in 
an Apache Beam pipeline using the Golang SDK.
 * Branching
 * [CoGroupByKey|https://github.com/damondouglas/beam/tree/BEAM-9679-core-transform-groupbykey]
 * Combine
 * Composite Transform
 * DoFn Additional Parameters
 * Flatten
 * GroupByKey
 * [Map|https://github.com/apache/beam/pull/11564]
 * Partition
 * Side Input

  was:
A kata devoted to core Beam transforms, patterned after 
[https://github.com/apache/beam/tree/master/learning/katas/java/Core%20Transforms],
 where the takeaway is an individual's ability to master the following in 
an Apache Beam pipeline using the Golang SDK.
 * Branching
 * CoGroupByKey
 * Combine
 * Composite Transform
 * DoFn Additional Parameters
 * Flatten
 * GroupByKey
 * [Map|https://github.com/apache/beam/pull/11564]
 * Partition
 * Side Input


> Core Transforms | Go SDK Code Katas
> ---
>
> Key: BEAM-9679
> URL: https://issues.apache.org/jira/browse/BEAM-9679
> Project: Beam
>  Issue Type: Sub-task
>  Components: katas, sdk-go
>Reporter: Damon Douglas
>Assignee: Damon Douglas
>Priority: Major
>
> A kata devoted to core Beam transforms, patterned after 
> [https://github.com/apache/beam/tree/master/learning/katas/java/Core%20Transforms],
>  where the takeaway is an individual's ability to master the following in 
> an Apache Beam pipeline using the Golang SDK.
>  * Branching
>  * [CoGroupByKey|https://github.com/damondouglas/beam/tree/BEAM-9679-core-transform-groupbykey]
>  * Combine
>  * Composite Transform
>  * DoFn Additional Parameters
>  * Flatten
>  * GroupByKey
>  * [Map|https://github.com/apache/beam/pull/11564]
>  * Partition
>  * Side Input





[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=433469&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433469
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 15/May/20 01:36
Start Date: 15/May/20 01:36
Worklog Time Spent: 10m 
  Work Description: lukecwik opened a new pull request #11716:
URL: https://github.com/apache/beam/pull/11716


   This got rid of the NullPointerException for the Kafka checkpoint because 
the checkpoint itself isn't serializable. When it gets deserialized, the 
optional reader field is null which is what was causing the NPE.
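
The failure mode described here, a field that is not carried through serialization and comes back empty, can be sketched in a few lines. This is a Python analogue for illustration only: the actual change is in Java, the fix in the pull request may differ, and the `Checkpoint` class with its `to_json`, `from_json` and `finalize` methods is hypothetical.

```python
import json


class Checkpoint:
    """Toy checkpoint whose reader is not part of the serialized state."""

    def __init__(self, offset, reader=None):
        self.offset = offset
        self.reader = reader  # live handle, deliberately not serialized

    def to_json(self):
        """Serialize only the offset; the reader is left out."""
        return json.dumps({"offset": self.offset})

    @classmethod
    def from_json(cls, payload):
        """Rebuild a checkpoint; the reader comes back as None."""
        return cls(json.loads(payload)["offset"])

    def finalize(self):
        """Guard against a missing reader instead of dereferencing it."""
        if self.reader is None:
            return "skipped"
        return self.reader.commit(self.offset)
```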
   
   
   

[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=433470&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433470
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 15/May/20 01:36
Start Date: 15/May/20 01:36
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11716:
URL: https://github.com/apache/beam/pull/11716#issuecomment-628974287


   R: @ihji 
   CC: @chamikaramj 





Issue Time Tracking
---

Worklog Id: (was: 433470)
Time Spent: 32h  (was: 31h 50m)

> Fn API SDF support
> --
>
> Key: BEAM-2939
> URL: https://issues.apache.org/jira/browse/BEAM-2939
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 32h
>  Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.





[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433460&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433460
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 15/May/20 01:07
Start Date: 15/May/20 01:07
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #11584:
URL: https://github.com/apache/beam/pull/11584#issuecomment-628965737


   @Hannah-Jiang - you can continue with this change now. However you will need 
to rebase.





Issue Time Tracking
---

Worklog Id: (was: 433460)
Time Spent: 29h 40m  (was: 29.5h)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 29h 40m
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.





[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433459&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433459
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 15/May/20 01:07
Start Date: 15/May/20 01:07
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #11549:
URL: https://github.com/apache/beam/pull/11549#issuecomment-628965670


   @Hannah-Jiang - you can continue with this change now. However you will need 
to rebase.





Issue Time Tracking
---

Worklog Id: (was: 433459)
Time Spent: 29.5h  (was: 29h 20m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 29.5h
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.





[jira] [Work logged] (BEAM-9977) Build Kafka Read on top of Java SplittableDoFn

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9977?focusedWorklogId=433456&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433456
 ]

ASF GitHub Bot logged work on BEAM-9977:


Author: ASF GitHub Bot
Created on: 15/May/20 00:53
Start Date: 15/May/20 00:53
Worklog Time Spent: 10m 
  Work Description: boyuanzz opened a new pull request #11715:
URL: https://github.com/apache/beam/pull/11715



[jira] [Work logged] (BEAM-9951) Create Go SDK synthetic sources.

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9951?focusedWorklogId=433452&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433452
 ]

ASF GitHub Bot logged work on BEAM-9951:


Author: ASF GitHub Bot
Created on: 15/May/20 00:41
Start Date: 15/May/20 00:41
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11665:
URL: https://github.com/apache/beam/pull/11665#issuecomment-628958995


   I think this LGTM.
   Overall, it's probably fine either way. In terms of effort, the risk is 
often that "the pipelines emit nothing/very little" and terminate very quickly, 
which throws off other metrics that expect certain amounts of data. The main risk is that the 
user doesn't have validation on the profiling pipeline and thinks things are 
going very, very fast. But given that performance metrics tend to be "per element", 
they'll pay attention to things like that.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 433452)
Time Spent: 1h 10m  (was: 1h)

> Create Go SDK synthetic sources.
> 
>
> Key: BEAM-9951
> URL: https://issues.apache.org/jira/browse/BEAM-9951
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Create synthetic sources for the Go SDK like 
> [Java|https://github.com/apache/beam/tree/master/sdks/java/io/synthetic/src/main/java/org/apache/beam/sdk/io/synthetic]
>  and 
> [Python|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/synthetic_pipeline.py]
>  have.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9999) Remove support for EOLed runners (Apex, etc.)

2020-05-14 Thread Manu Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107808#comment-17107808
 ] 

Manu Zhang commented on BEAM-9999:
--

Same for Gearpump. Thanks for pinging me.

> Remove support for EOLed runners (Apex, etc.)
> -
>
> Key: BEAM-9999
> URL: https://issues.apache.org/jira/browse/BEAM-9999
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex, runner-core
>Reporter: Ahmet Altay
>Priority: Major
>
> These runners look EOLed, not maintained:
> - Apex (last release 2+ years ago)
> - Gearpump (last release 1+ year ago)
> Removing support for these could reduce the code base size, reduce flaky 
> tests, and make it easier to add new features.
> /cc [~kenn][~tysonjh]





[jira] [Resolved] (BEAM-9983) bigquery_read_it_test.ReadNewTypesTests.test_iobase_source failing

2020-05-14 Thread Pablo Estrada (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada resolved BEAM-9983.
-
Fix Version/s: Not applicable
   Resolution: Fixed

This PR was reverted and later rolled forward with fixes.

> bigquery_read_it_test.ReadNewTypesTests.test_iobase_source failing
> --
>
> Key: BEAM-9983
> URL: https://issues.apache.org/jira/browse/BEAM-9983
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp, test-failures
>Reporter: Kyle Weaver
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This failure seems to afflict all Python postcommits.
> apache_beam.io.gcp.bigquery_read_it_test.ReadNewTypesTests.test_iobase_source 
> (from nosetests)
> Failing for the past 1 build (Since Failed#2429 )
> Took 9 min 57 sec.
> Error Message
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/apache_beam/utils/retry.py", 
> line 246, in wrapper
> sleep_interval = next(retry_intervals)
> StopIteration
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.6/site-packages/dataflow_worker/batchworker.py", line 
> 647, in do_work
> work_executor.execute()
>   File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", 
> line 226, in execute
> self._split_task)
>   File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", 
> line 234, in _perform_source_split_considering_api_limits
> desired_bundle_size)
>   File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", 
> line 271, in _perform_source_split
> for split in source.split(desired_bundle_size):
>   File 
> "/usr/local/lib/python3.6/site-packages/apache_beam/io/gcp/bigquery.py", line 
> 698, in split
> self.table_reference = self._execute_query(bq)
>   File 
> "/usr/local/lib/python3.6/site-packages/apache_beam/options/value_provider.py",
>  line 135, in _f
> return fnc(self, *args, **kwargs)
>   File 
> "/usr/local/lib/python3.6/site-packages/apache_beam/io/gcp/bigquery.py", line 
> 744, in _execute_query
> job_labels=self.bigquery_job_labels)
>   File "/usr/local/lib/python3.6/site-packages/apache_beam/utils/retry.py", 
> line 249, in wrapper
> raise_with_traceback(exn, exn_traceback)
>   File "/usr/local/lib/python3.6/site-packages/future/utils/__init__.py", 
> line 446, in raise_with_traceback
> raise exc.with_traceback(traceback)
>   File "/usr/local/lib/python3.6/site-packages/apache_beam/utils/retry.py", 
> line 236, in wrapper
> return fun(*args, **kwargs)
>   File 
> "/usr/local/lib/python3.6/site-packages/apache_beam/io/gcp/bigquery_tools.py",
>  line 415, in _start_query_job
> labels=job_labels or {},
>   File 
> "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py",
>  line 791, in __init__
> setattr(self, name, value)
>   File 
> "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py",
>  line 973, in __setattr__
> object.__setattr__(self, name, value)
>   File 
> "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py",
>  line 1651, in __set__
> value = t(**value)
>   File 
> "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py",
>  line 791, in __init__
> setattr(self, name, value)
>   File 
> "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py",
>  line 976, in __setattr__
> "to message %s" % (name, type(self).__name__))
> AttributeError: May not assign arbitrary value owner to message LabelsValue





[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433451&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433451
 ]

ASF GitHub Bot logged work on BEAM-9964:


Author: ASF GitHub Bot
Created on: 15/May/20 00:38
Start Date: 15/May/20 00:38
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628958485


   Seems like Java precommits are broken on master, but this change LGTM. I'll 
wait for precommits to be fixed if possible. Thanks @omarismail94 !





Issue Time Tracking
---

Worklog Id: (was: 433451)
Time Spent: 2.5h  (was: 2h 20m)

> Setting workerCacheMb to make its way to the WindmillStateCache Constructor
> ---
>
> Key: BEAM-9964
> URL: https://issues.apache.org/jira/browse/BEAM-9964
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Omar Ismail
>Assignee: Omar Ismail
>Priority: Minor
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Setting --workerCacheMB seems to affect batch pipelines only. For streaming, 
> the cache seems to be hardcoded to 100 MB [1]. If possible, I would like to 
> make the cache value configurable in streaming as well when setting 
> --workerCacheMB.
> I've never made changes to the Beam SDK, so I am super excited to work on 
> this! 
>  
> [[1] 
> https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]





[jira] [Work logged] (BEAM-9951) Create Go SDK synthetic sources.

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9951?focusedWorklogId=433450&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433450
 ]

ASF GitHub Bot logged work on BEAM-9951:


Author: ASF GitHub Bot
Created on: 15/May/20 00:34
Start Date: 15/May/20 00:34
Worklog Time Spent: 10m 
  Work Description: lostluck commented on a change in pull request #11665:
URL: https://github.com/apache/beam/pull/11665#discussion_r425502985



##
File path: sdks/go/pkg/beam/io/synthetic/source.go
##
@@ -0,0 +1,151 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Package synthetic contains transforms for creating synthetic pipelines.
+// Synthetic pipelines are pipelines that simulate the behavior of possible
+// pipelines in order to test performance, splitting, liquid sharding, and
+// various other infrastructure used for running pipelines. This category of
+// tests are not concerned with the correctness of the elements themselves, but
+// need to simulate transforms that output many elements throughout varying
+// pipeline shapes.
+package synthetic
+
+import (
+   "github.com/apache/beam/sdks/go/pkg/beam"
+   "github.com/apache/beam/sdks/go/pkg/beam/io/rtrackers/offsetrange"
+   "math/rand"
+   "time"
+)
+
+// Source creates a synthetic source transform that emits randomly
+// generated KV<[]byte, []byte> elements.
+//
+// This transform accepts a PCollection of SourceConfig, where each SourceConfig
+// determines the synthetic source's behavior for that element.

Review comment:
   I think I'm coming to the position that it might be overengineering to 
protect users, since the risky fields are known with the first version. 
   
   It's probably better to simply panic/error out with a clear message at some 
early stage if it's configured incorrectly, graceful behavior isn't 
possible, and it's not typically the value that needs to be set (e.g. 
NumElements or OutputsPerInput).
   
   Regardless, this is a good exercise to consider, whatever you choose to do.
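
   The "error out early with a clear message" option could look roughly like 
the sketch below. It is written in Python only for brevity and is not the Go 
SDK code under review: the `SourceConfig` class, its snake_case field names, 
the default values, and the `validate` helper are all assumptions made for 
illustration.

```python
from dataclasses import dataclass


@dataclass
class SourceConfig:
    """Hypothetical stand-in for the synthetic source configuration."""
    num_elements: int = 1    # assumed default, not taken from the PR
    initial_splits: int = 1  # assumed default, not taken from the PR


def validate(cfg: SourceConfig) -> SourceConfig:
    """Fail fast with a clear message instead of silently adjusting values."""
    if cfg.num_elements <= 0:
        raise ValueError(
            f"SourceConfig.num_elements must be > 0, got {cfg.num_elements}")
    if cfg.initial_splits <= 0:
        raise ValueError(
            f"SourceConfig.initial_splits must be > 0, got {cfg.initial_splits}")
    return cfg
```

   Calling `validate` once at pipeline construction time would surface a bad 
value before any elements are generated.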







Issue Time Tracking
---

Worklog Id: (was: 433450)
Time Spent: 1h  (was: 50m)

> Create Go SDK synthetic sources.
> 
>
> Key: BEAM-9951
> URL: https://issues.apache.org/jira/browse/BEAM-9951
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Create synthetic sources for the Go SDK like 
> [Java|https://github.com/apache/beam/tree/master/sdks/java/io/synthetic/src/main/java/org/apache/beam/sdk/io/synthetic]
>  and 
> [Python|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/synthetic_pipeline.py]
>  have.





[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433449&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433449
 ]

ASF GitHub Bot logged work on BEAM-9577:


Author: ASF GitHub Bot
Created on: 15/May/20 00:31
Start Date: 15/May/20 00:31
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11708:
URL: https://github.com/apache/beam/pull/11708#issuecomment-628956526


   Still trying to figure out why the test fails on Jenkins but passes locally, 
but other than that it should be ready to be looked at again.





Issue Time Tracking
---

Worklog Id: (was: 433449)
Time Spent: 22h 10m  (was: 22h)

> Update artifact staging and retrieval protocols to be dependency aware.
> ---
>
> Key: BEAM-9577
> URL: https://issues.apache.org/jira/browse/BEAM-9577
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 22h 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-10000) Support BIT_XOR aggregation function in BeamSQL

2020-05-14 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107795#comment-17107795
 ] 

Rui Wang commented on BEAM-10000:
-

Yeah!

> Support BIT_XOR aggregation function in BeamSQL
> ---
>
> Key: BEAM-10000
> URL: https://issues.apache.org/jira/browse/BEAM-10000
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>
> See reference: 
> https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#bit_xor





[jira] [Work logged] (BEAM-9641) Support ZetaSQL DATE functions in BeamSQL

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9641?focusedWorklogId=433441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433441
 ]

ASF GitHub Bot logged work on BEAM-9641:


Author: ASF GitHub Bot
Created on: 14/May/20 23:51
Start Date: 14/May/20 23:51
Worklog Time Spent: 10m 
  Work Description: apilloud commented on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-628945312


   retest this please





Issue Time Tracking
---

Worklog Id: (was: 433441)
Time Spent: 3h 10m  (was: 3h)

> Support ZetaSQL DATE functions in BeamSQL
> -
>
> Key: BEAM-9641
> URL: https://issues.apache.org/jira/browse/BEAM-9641
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql-zetasql
>Reporter: Yueyang Qiu
>Assignee: Yueyang Qiu
>Priority: Major
>  Labels: zetasql-compliance
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-9999) Remove support for EOLed runners (Apex, etc.)

2020-05-14 Thread Thomas Weise (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107792#comment-17107792
 ] 

Thomas Weise commented on BEAM-9999:


Apache Apex itself has moved to the Attic, and there are no users of the Beam Apex 
runner that I know of.

 

> Remove support for EOLed runners (Apex, etc.)
> -
>
> Key: BEAM-9999
> URL: https://issues.apache.org/jira/browse/BEAM-9999
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex, runner-core
>Reporter: Ahmet Altay
>Priority: Major
>
> These runners look EOLed, not maintained:
> - Apex (last release 2+ years ago)
> - Gearpump (last release 1+ year ago)
> Removing support for these could reduce the code base size, reduce flaky 
> tests, and make it easier to add new features.
> /cc [~kenn][~tysonjh]





[jira] [Commented] (BEAM-9698) BeamUncollectRel UncollectDoFn NullPointerException

2020-05-14 Thread Andrew Pilloud (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107791#comment-17107791
 ] 

Andrew Pilloud commented on BEAM-9698:
--

Looks like this is coming out of BeamZetaSqlCalcRel...

BeamZetaSqlCalcRel(expr#0=[{inputs}], $unnest1=[$t0])
  BeamUncollectRel
    BeamZetaSqlCalcRel(expr#0=[{inputs}], expr#1=[null:BIGINT NOT NULL ARRAY], $array$unnest1=[$t1])
      BeamValuesRel(tuples=[[{ 0 }]])

> BeamUncollectRel UncollectDoFn NullPointerException
> ---
>
> Key: BEAM-9698
> URL: https://issues.apache.org/jira/browse/BEAM-9698
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Labels: zetasql-compliance
>
> two failures in shard 19
> {code}
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.NullPointerException
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:348)
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:318)
>   at 
> org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:213)
>   at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.runCollector(BeamEnumerableConverter.java:201)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.collectRows(BeamEnumerableConverter.java:218)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:150)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:127)
>   at 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:329)
>   at 
> com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423)
>   at 
> com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
>   at 
> com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
>   at 
> com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711)
>   at 
> com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at 
> com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamUncollectRel$UncollectDoFn.process(BeamUncollectRel.java:103)
> {code}
> {code}
> Apr 01, 2020 5:58:27 PM 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl 
> executeQuery
> INFO: Processing Sql statement: SELECT e FROM UNNEST(CAST(NULL AS 
> ARRAY)) e
> Apr 01, 2020 5:58:27 PM 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl 
> executeQuery
> INFO: Processing Sql statement: SELECT e FROM UNNEST(CAST(NULL AS 
> ARRAY>)) e
> {code}





[jira] [Assigned] (BEAM-9698) BeamUncollectRel UncollectDoFn NullPointerException

2020-05-14 Thread Andrew Pilloud (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Pilloud reassigned BEAM-9698:


Assignee: Andrew Pilloud

> BeamUncollectRel UncollectDoFn NullPointerException
> ---
>
> Key: BEAM-9698
> URL: https://issues.apache.org/jira/browse/BEAM-9698
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Labels: zetasql-compliance
>





[jira] [Commented] (BEAM-9999) Remove support for EOLed runners (Apex, etc.)

2020-05-14 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107790#comment-17107790
 ] 

Kenneth Knowles commented on BEAM-9999:
---

Makes sense to me, especially if Beam is moving on in ways that require updates 
(like Java 11) or if there is maintenance burden. I think the important thing 
is whether they have users. You can get some general ideas from 
https://repository.apache.org/#central-stat (committers only access) but it 
cannot distinguish continuous testing downloads of course.

CC [~t...@apache.org] for comment on Apex
CC [~mauzhang] for comment on Gearpump

> Remove support for EOLed runners (Apex, etc.)
> -
>
> Key: BEAM-9999
> URL: https://issues.apache.org/jira/browse/BEAM-9999
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex, runner-core
>Reporter: Ahmet Altay
>Priority: Major
>
> These runners look EOLed, not maintained:
> - Apex (last release 2+ years ago)
> - Gearpump (last release 1+ year ago)
> Removing support for these could reduce the code base size, reduce flaky 
> tests, and make it easier to add new features.
> /cc [~kenn][~tysonjh]





[jira] [Commented] (BEAM-9994) Cannot create a virtualenv using Python 3.8 on Jenkins machines

2020-05-14 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107785#comment-17107785
 ] 

Valentyn Tymofieiev commented on BEAM-9994:
---

Can you try:
python3.8 -m venv env ?
I am not sure what is going on, but possibly the version of Ubuntu we have on 
Jenkins is missing some packages for Python 3.8 or virtualenv. The best course 
of action would be to clone a Jenkins VM image, create a VM, and experiment to see 
what needs to be fixed.

If the fix is not easy, we may need to implement BEAM-8152 to unblock this.
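
One quick way to test the "missing packages" theory on such a VM is to ask the 
interpreter which standard-library modules it can locate. The sketch below is 
only an illustration: `missing_modules` is a hypothetical helper, and probing 
`venv` and `ensurepip` alongside `distutils.spawn` is an assumption about what 
might be absent, not something confirmed by the failing build.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of module names that this interpreter cannot locate."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # find_spec imports the parent package of a dotted name first and
            # raises if that parent (e.g. distutils) is itself absent.
            found = False
        if not found:
            missing.append(name)
    return missing
```

Running `missing_modules(["distutils.spawn", "venv", "ensurepip"])` under 
/usr/bin/python3.8 would list whichever of those modules the interpreter 
cannot find.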

cc: [~wintermelons] [~yifanzou] [~yoshiki.obata]
  

> Cannot create a virtualenv using Python 3.8 on Jenkins machines
> ---
>
> Key: BEAM-9994
> URL: https://issues.apache.org/jira/browse/BEAM-9994
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kamil Wasilewski
>Priority: Blocker
>
> Command: *virtualenv --python /usr/bin/python3.8 env*
> Output:
> {noformat}
> Running virtualenv with interpreter /usr/bin/python3.8
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.5/dist-packages/virtualenv.py", line 22, in <module>
> import distutils.spawn
> ModuleNotFoundError: No module named 'distutils.spawn'
> {noformat}
> Example test affected: 
> https://builds.apache.org/job/beam_PreCommit_PythonFormatter_Commit/1723/console





[jira] [Commented] (BEAM-10000) Support BIT_XOR aggregation function in BeamSQL

2020-05-14 Thread Kai Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107787#comment-17107787
 ] 

Kai Jiang commented on BEAM-10000:
--

wow! #1

> Support BIT_XOR aggregation function in BeamSQL
> ---
>
> Key: BEAM-10000
> URL: https://issues.apache.org/jira/browse/BEAM-10000
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>
> See reference: 
> https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#bit_xor





[jira] [Work logged] (BEAM-9951) Create Go SDK synthetic sources.

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9951?focusedWorklogId=433434&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433434
 ]

ASF GitHub Bot logged work on BEAM-9951:


Author: ASF GitHub Bot
Created on: 14/May/20 23:18
Start Date: 14/May/20 23:18
Worklog Time Spent: 10m 
  Work Description: youngoli commented on a change in pull request #11665:
URL: https://github.com/apache/beam/pull/11665#discussion_r425481703



##
File path: sdks/go/pkg/beam/io/synthetic/source.go
##
@@ -33,22 +33,30 @@ import (
 // generated KV<[]byte, []byte> elements.
 //
 // This transform accepts a PCollection of SourceConfig, where each SourceConfig
-// determines the synthetic source's behavior for that element.
+// determines the synthetic source's behavior for that element and outputs the
+// randomly generated elements.
 //
-// This transform outputs a PCollection of randomly generated
-// KV elements.
+// SourceConfigs are recommended to be created via the DefaultSourceConfig and
+// then sent to a beam.Create transform once modified. Example:
+//
+//    cfg1 := synthetic.DefaultSourceConfig()
+//    cfg1.NumElements = 1000
+//    cfg2 := synthetic.DefaultSourceConfig()

Review comment:
   Yeah I was thinking the same thing. I guess it technically works that 
way right now because the code catches the 0 initial splits case and clamps it 
to 1, but I'm still not a huge fan of allowing it to be used as a value like 
that. Hence, why I used that "proper" example.
   
   While working on the step PR I've been thinking that the most appealing 
approach to me right now is using a builder so you can do 
DefaultSourceConfig().NumElements(1000).Build() and have Build catch any 
invalid values. It's a little over-engineered for what we have now, but I 
prefer over-engineered default values that at least have a user-friendly API 
over implicitly changing 0 values to the defaults we actually want ("0 initial 
splits? That's invalid so I'll just set it to 1 for you.")
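
   The builder idea described above could be sketched as follows. This is an 
illustration in Python rather than the Go API from the PR: the class names, 
the snake_case methods, and the default values are assumptions, and only the 
chained `DefaultSourceConfig().NumElements(1000).Build()` shape comes from the 
comment itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceConfig:
    """Hypothetical immutable config produced by the builder."""
    num_elements: int
    initial_splits: int


class SourceConfigBuilder:
    """Collects settings and validates them only when build() is called."""

    def __init__(self):
        # Explicit defaults, so a zero value is never silently reinterpreted.
        self._num_elements = 1
        self._initial_splits = 1

    def num_elements(self, n):
        self._num_elements = n
        return self  # returning self allows chained calls

    def initial_splits(self, n):
        self._initial_splits = n
        return self

    def build(self):
        if self._num_elements <= 0:
            raise ValueError(f"num_elements must be > 0, got {self._num_elements}")
        if self._initial_splits <= 0:
            raise ValueError(f"initial_splits must be > 0, got {self._initial_splits}")
        return SourceConfig(self._num_elements, self._initial_splits)


def default_source_config():
    """Entry point mirroring the DefaultSourceConfig name used in the comment."""
    return SourceConfigBuilder()
```

   With this shape, `default_source_config().num_elements(1000).build()` 
returns a valid config, while `initial_splits(0)` is rejected at `build()` 
instead of being clamped to 1.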







Issue Time Tracking
---

Worklog Id: (was: 433434)
Time Spent: 50m  (was: 40m)

> Create Go SDK synthetic sources.
> 
>
> Key: BEAM-9951
> URL: https://issues.apache.org/jira/browse/BEAM-9951
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Create synthetic sources for the Go SDK like 
> [Java|https://github.com/apache/beam/tree/master/sdks/java/io/synthetic/src/main/java/org/apache/beam/sdk/io/synthetic]
>  and 
> [Python|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/synthetic_pipeline.py]
>  have.





[jira] [Work logged] (BEAM-9876) Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9876?focusedWorklogId=433433&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433433
 ]

ASF GitHub Bot logged work on BEAM-9876:


Author: ASF GitHub Bot
Created on: 14/May/20 23:12
Start Date: 14/May/20 23:12
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11554:
URL: https://github.com/apache/beam/pull/11554#issuecomment-628933708


   I've tagged the commit 1d2700818474c008eaa324ac1b5c49c9d2857298 with the 
`website-to-hugo` tag.
   fyi @aijamalnk @bntnam et al





Issue Time Tracking
---

Worklog Id: (was: 433433)
Time Spent: 12.5h  (was: 12h 20m)

> Migrate the Beam website from Jekyll to Hugo to enable localization of the 
> site content
> ---
>
> Key: BEAM-9876
> URL: https://issues.apache.org/jira/browse/BEAM-9876
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Aizhamal Nurmamat kyzy
>Assignee: Aizhamal Nurmamat kyzy
>Priority: Major
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> Enable internationalization of the Apache Beam website to increase the reach 
> of the project, and facilitate adoption and growth of its community.
> The proposal was to do this by migrating the current Apache Beam website from 
> Jekyll to Hugo [1]. Hugo supports internationalization out-of-the-box, making 
> it easier for both contributors and maintainers to support the 
> internationalization effort.
> Further discussion of the implementation can be viewed here [2]
> [1] 
> [https://lists.apache.org/thread.html/rfab4cc1411318c3f4667bee051df68f37be11846ada877f3576c41a9%40%3Cdev.beam.apache.org%3E]
> [2] 
> [https://lists.apache.org/thread.html/r6b999b6d7d1f6cbb94e16bb2deed2b65098a6b14c4ac98707fe0c36a%40%3Cdev.beam.apache.org%3E]
>  





[jira] [Created] (BEAM-10001) Change the code block colors from grey to blue to increase the contrast between text and background.

2020-05-14 Thread Aizhamal Nurmamat kyzy (Jira)
Aizhamal Nurmamat kyzy created BEAM-10001:
-

 Summary: Change the code block colors from grey to blue to 
increase the contrast between text and background.
 Key: BEAM-10001
 URL: https://issues.apache.org/jira/browse/BEAM-10001
 Project: Beam
  Issue Type: Task
  Components: website
Reporter: Aizhamal Nurmamat kyzy


Example: [https://beam.apache.org/get-started/try-apache-beam/]

The old background color: 
[http://apache-beam-website-pull-requests.storage.googleapis.com/11705/documentation/programming-guide/index.html#creating-a-pipeline]
 





[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-14 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107778#comment-17107778
 ] 

Brian Hulette commented on BEAM-9975:
-

It looks like there are (at least) two problems here:
- get_all_options relies on 
[__subclasses__|https://github.com/apache/beam/blob/5d00ccba5b905584f07c1b0275841113d4921a8c/sdks/python/apache_beam/options/pipeline_options.py#L283]
 to find every PipelineOptions subclass, which finds all the [subclasses that 
have had their definition 
executed|https://stackoverflow.com/questions/3862310/how-to-find-all-the-subclasses-of-a-class-given-its-name].
 It seems when running tests it's possible for this to pull in definitions from 
previously executed tests. I tried to repro this locally by running two tests 
with pytest and I couldn't do it. I'm not sure what's different on Jenkins.
- I'm pretty sure we should check for instances of ValueProvider and call get() 
before trying to convert to a proto struct.
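
Both points can be illustrated with a small, self-contained sketch. It does not use Beam: `Options`, `FakeValueProvider`, and `resolve_values` are hypothetical stand-ins, and the only real API relied on is Python's built-in `__subclasses__()`.

```python
class Options:
    """Stand-in for a base options class (hypothetical, not Beam's PipelineOptions)."""


class FirstTestOptions(Options):
    pass


def all_subclasses(cls):
    """Recursively collect every subclass whose definition has been executed."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


before = {c.__name__ for c in all_subclasses(Options)}


# A class defined later (for example by another test module) is visible to
# every subsequent lookup, even if the current test never imported it.
class SecondTestOptions(Options):
    pass


after = {c.__name__ for c in all_subclasses(Options)}


class FakeValueProvider:
    """Hypothetical stand-in for a deferred value exposing get()."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


def resolve_values(options):
    """Replace provider-like objects with plain values before serialization."""
    return {
        key: value.get() if isinstance(value, FakeValueProvider) else value
        for key, value in options.items()
    }


resolved = resolve_values(
    {"runner": "PortableRunner", "shards": FakeValueProvider(3)})
```

The first half shows why the result of a subclass scan depends on what has already been executed in the process; the second shows the kind of unwrapping step that would leave only JSON-compatible values in the dict.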


> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/pipeline.py:550: in __exit__
> self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
> return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
> job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
> prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
> pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in 
> get_pipeline_options
> return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
>     def _ConvertValueMessage(self, value, message):
>       """Convert a JSON representation into Value message."""
>       if isinstance(value, dict):
>         self._ConvertStructMessage(value, message.struct_value)
>       elif isinstance(value, list):
>         self._ConvertListValueMessage(value, message.list_value)
>       elif value is None:
>         message.null_value = 0
>       elif isinstance(value, bool):
>         message.bool_value = value
>       elif isinstance(value, six.string_types):
>         message.string_value = value
>       elif isinstance(value, _INT_OR_FLOAT):
>         message.number_value = value
>       else:
> >       raise ParseError('Unexpected type for Value message.')
> E   google.protobuf.json_format.ParseError: Unexpected type for Value 
> message.
> {code}





[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433432&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433432
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/May/20 23:11
Start Date: 14/May/20 23:11
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11339:
URL: https://github.com/apache/beam/pull/11339#issuecomment-628933465


   retest this please





Issue Time Tracking
---

Worklog Id: (was: 433432)
Time Spent: 42h  (was: 41h 50m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 42h
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Updated] (BEAM-10000) Support BIT_XOR aggregation function in BeamSQL

2020-05-14 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-10000:

Description: See reference: 
https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#bit_xor

> Support BIT_XOR aggregation function in BeamSQL
> ---
>
> Key: BEAM-10000
> URL: https://issues.apache.org/jira/browse/BEAM-10000
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>
> See reference: 
> https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#bit_xor





[jira] [Updated] (BEAM-10000) Support BIT_XOR aggregation function in BeamSQL

2020-05-14 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-10000:

Status: Open  (was: Triage Needed)

> Support BIT_XOR aggregation function in BeamSQL
> ---
>
> Key: BEAM-10000
> URL: https://issues.apache.org/jira/browse/BEAM-10000
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>






[jira] [Created] (BEAM-10000) Support BIT_XOR aggregation function in BeamSQL

2020-05-14 Thread Rui Wang (Jira)
Rui Wang created BEAM-10000:
---

 Summary: Support BIT_XOR aggregation function in BeamSQL
 Key: BEAM-10000
 URL: https://issues.apache.org/jira/browse/BEAM-10000
 Project: Beam
  Issue Type: Task
  Components: dsl-sql
Reporter: Rui Wang








[jira] [Work logged] (BEAM-9993) Add option defaults for Flink Python tests

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9993?focusedWorklogId=433425&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433425
 ]

ASF GitHub Bot logged work on BEAM-9993:


Author: ASF GitHub Bot
Created on: 14/May/20 22:48
Start Date: 14/May/20 22:48
Worklog Time Spent: 10m 
  Work Description: ibzib merged pull request #11711:
URL: https://github.com/apache/beam/pull/11711


   





Issue Time Tracking
---

Worklog Id: (was: 433425)
Time Spent: 20m  (was: 10m)

> Add option defaults for Flink Python tests
> --
>
> Key: BEAM-9993
> URL: https://issues.apache.org/jira/browse/BEAM-9993
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Minor
>  Labels: portability-flink
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I want to run a single Flink Python test:
> python -m apache_beam.runners.portability.flink_runner_test 
> FlinkRunnerTest.test_metrics
> But I get this error:
> TypeError: expected str, bytes or os.PathLike object, not NoneType
> Turns out flink_job_server_jar isn't set, and there's no default value. We 
> should set a default.
> We should also change the default environment type to LOOPBACK for basic 
> testing purposes because it requires the least setup.
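
The failure mode and the proposed fix can be sketched without Beam. The parser below is a hypothetical stand-in for the test's option handling, and the jar path is made up; only `argparse` and `os.fspath` from the standard library are used.

```python
import argparse
import os


def make_parser(default_jar=None):
    """Hypothetical stand-in for the test's option parser."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--flink_job_server_jar", default=default_jar)
    # LOOPBACK as the default environment type, as suggested in the issue.
    parser.add_argument("--environment_type", default="LOOPBACK")
    return parser


# Without a default, the option is None and any path handling fails.
args = make_parser().parse_args([])
try:
    os.fspath(args.flink_job_server_jar)
    error = None
except TypeError as exc:
    error = str(exc)

# With a default, the same call succeeds. The path is an invented example.
args_with_default = make_parser("/tmp/hypothetical-job-server.jar").parse_args([])
jar_path = os.fspath(args_with_default.flink_job_server_jar)
```

Here `error` carries the same message quoted in the issue, which suggests the original failure comes from a `None` value reaching a path-handling call.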





[jira] [Created] (BEAM-9999) Remove support for EOLed runners (Apex, etc.)

2020-05-14 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-9999:
-

 Summary: Remove support for EOLed runners (Apex, etc.)
 Key: BEAM-9999
 URL: https://issues.apache.org/jira/browse/BEAM-9999
 Project: Beam
  Issue Type: Bug
  Components: runner-apex, runner-core
Reporter: Ahmet Altay


These runners look EOLed, not maintained:
- Apex (last release 2+ years ago)
- Gearpump (last release 1+ year ago)

Removing support for these could reduce the code base size, reduce flaky tests, 
and make it easier to add new features.

/cc [~kenn][~tysonjh]





[jira] [Work logged] (BEAM-6950) Beam Dependency Update Request: com.github.spotbugs

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6950?focusedWorklogId=433422&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433422
 ]

ASF GitHub Bot logged work on BEAM-6950:


Author: ASF GitHub Bot
Created on: 14/May/20 22:45
Start Date: 14/May/20 22:45
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #11712:
URL: https://github.com/apache/beam/pull/11712#issuecomment-628924367


   Run JavaPortabilityApi PreCommit





Issue Time Tracking
---

Worklog Id: (was: 433422)
Time Spent: 0.5h  (was: 20m)

> Beam Dependency Update Request: com.github.spotbugs
> ---
>
> Key: BEAM-6950
> URL: https://issues.apache.org/jira/browse/BEAM-6950
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  - 2019-04-01 12:15:04.215434 
> -
> Please consider upgrading the dependency com.github.spotbugs. 
> The current version is None. The latest version is None 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-6950) Beam Dependency Update Request: com.github.spotbugs

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6950?focusedWorklogId=433423&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433423
 ]

ASF GitHub Bot logged work on BEAM-6950:


Author: ASF GitHub Bot
Created on: 14/May/20 22:45
Start Date: 14/May/20 22:45
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #11712:
URL: https://github.com/apache/beam/pull/11712#issuecomment-628924410


   Run JavaPortabilityApiJava11 PreCommit





Issue Time Tracking
---

Worklog Id: (was: 433423)
Time Spent: 40m  (was: 0.5h)

> Beam Dependency Update Request: com.github.spotbugs
> ---
>
> Key: BEAM-6950
> URL: https://issues.apache.org/jira/browse/BEAM-6950
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
>  - 2019-04-01 12:15:04.215434 
> -
> Please consider upgrading the dependency com.github.spotbugs. 
> The current version is None. The latest version is None 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-6950) Beam Dependency Update Request: com.github.spotbugs

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6950?focusedWorklogId=433421&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433421
 ]

ASF GitHub Bot logged work on BEAM-6950:


Author: ASF GitHub Bot
Created on: 14/May/20 22:45
Start Date: 14/May/20 22:45
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #11712:
URL: https://github.com/apache/beam/pull/11712#issuecomment-628924278


   Run Java PreCommit





Issue Time Tracking
---

Worklog Id: (was: 433421)
Time Spent: 20m  (was: 10m)

> Beam Dependency Update Request: com.github.spotbugs
> ---
>
> Key: BEAM-6950
> URL: https://issues.apache.org/jira/browse/BEAM-6950
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  - 2019-04-01 12:15:04.215434 
> -
> Please consider upgrading the dependency com.github.spotbugs. 
> The current version is None. The latest version is None 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433420&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433420
 ]

ASF GitHub Bot logged work on BEAM-9964:


Author: ASF GitHub Bot
Created on: 14/May/20 22:44
Start Date: 14/May/20 22:44
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628924201


   Run Java PostCommit





Issue Time Tracking
---

Worklog Id: (was: 433420)
Time Spent: 2h 20m  (was: 2h 10m)

> Setting workerCacheMb to make its way to the WindmillStateCache Constructor
> ---
>
> Key: BEAM-9964
> URL: https://issues.apache.org/jira/browse/BEAM-9964
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Omar Ismail
>Assignee: Omar Ismail
>Priority: Minor
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, 
> the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to 
> make the cache value configurable in Streaming when setting 
> --workerCacheMB.
> I've never made changes to the Beam SDK, so I am super excited to work on 
> this! 
>  
> [[1] 
> https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]





[jira] [Work logged] (BEAM-9876) Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9876?focusedWorklogId=433413&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433413
 ]

ASF GitHub Bot logged work on BEAM-9876:


Author: ASF GitHub Bot
Created on: 14/May/20 22:40
Start Date: 14/May/20 22:40
Worklog Time Spent: 10m 
  Work Description: aaltay merged pull request #11554:
URL: https://github.com/apache/beam/pull/11554


   





Issue Time Tracking
---

Worklog Id: (was: 433413)
Time Spent: 12h 20m  (was: 12h 10m)

> Migrate the Beam website from Jekyll to Hugo to enable localization of the 
> site content
> ---
>
> Key: BEAM-9876
> URL: https://issues.apache.org/jira/browse/BEAM-9876
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Aizhamal Nurmamat kyzy
>Assignee: Aizhamal Nurmamat kyzy
>Priority: Major
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> Enable internationalization of the Apache Beam website to increase the reach 
> of the project, and facilitate adoption and growth of its community.
> The proposal was to do this by migrating the current Apache Beam website from 
> Jekyll to Hugo [1]. Hugo supports internationalization out-of-the-box, making 
> it easier for both contributors and maintainers to support the 
> internationalization effort.
> Further discussion of the implementation can be viewed here [2]
> [1] 
> [https://lists.apache.org/thread.html/rfab4cc1411318c3f4667bee051df68f37be11846ada877f3576c41a9%40%3Cdev.beam.apache.org%3E]
> [2] 
> [https://lists.apache.org/thread.html/r6b999b6d7d1f6cbb94e16bb2deed2b65098a6b14c4ac98707fe0c36a%40%3Cdev.beam.apache.org%3E]
>  





[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-14 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107742#comment-17107742
 ] 

Brian Hulette commented on BEAM-9975:
-

Note that log is from PortableRunnerTest.test_read

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/pipeline.py:550: in __exit__
> self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
> return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
> job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
> prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
> pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in 
> get_pipeline_options
> return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
>     def _ConvertValueMessage(self, value, message):
>       """Convert a JSON representation into Value message."""
>       if isinstance(value, dict):
>         self._ConvertStructMessage(value, message.struct_value)
>       elif isinstance(value, list):
>         self._ConvertListValueMessage(value, message.list_value)
>       elif value is None:
>         message.null_value = 0
>       elif isinstance(value, bool):
>         message.bool_value = value
>       elif isinstance(value, six.string_types):
>         message.string_value = value
>       elif isinstance(value, _INT_OR_FLOAT):
>         message.number_value = value
>       else:
> >       raise ParseError('Unexpected type for Value message.')
> E   google.protobuf.json_format.ParseError: Unexpected type for Value 
> message.
> {code}





[jira] [Work logged] (BEAM-9641) Support ZetaSQL DATE functions in BeamSQL

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9641?focusedWorklogId=433411&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433411
 ]

ASF GitHub Bot logged work on BEAM-9641:


Author: ASF GitHub Bot
Created on: 14/May/20 22:36
Start Date: 14/May/20 22:36
Worklog Time Spent: 10m 
  Work Description: apilloud commented on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-628921575


   retest this please





Issue Time Tracking
---

Worklog Id: (was: 433411)
Time Spent: 3h  (was: 2h 50m)

> Support ZetaSQL DATE functions in BeamSQL
> -
>
> Key: BEAM-9641
> URL: https://issues.apache.org/jira/browse/BEAM-9641
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql-zetasql
>Reporter: Yueyang Qiu
>Assignee: Yueyang Qiu
>Priority: Major
>  Labels: zetasql-compliance
>  Time Spent: 3h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433407&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433407
 ]

ASF GitHub Bot logged work on BEAM-9577:


Author: ASF GitHub Bot
Created on: 14/May/20 22:34
Start Date: 14/May/20 22:34
Worklog Time Spent: 10m 
  Work Description: ibzib commented on a change in pull request #11708:
URL: https://github.com/apache/beam/pull/11708#discussion_r425468106



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/ClassLoaderFileSystem.java
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io;
+
+import static 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.service.AutoService;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.channels.Channels;
+import java.nio.channels.ReadableByteChannel;
+import java.nio.channels.WritableByteChannel;
+import java.util.Collection;
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.io.fs.CreateOptions;
+import org.apache.beam.sdk.io.fs.MatchResult;
+import org.apache.beam.sdk.io.fs.ResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.options.PipelineOptions;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+
+/** A read-only {@link FileSystem} implementation looking up resources using a 
ClassLoader. */

Review comment:
   That's what I get for only reading half the PR...







Issue Time Tracking
---

Worklog Id: (was: 433407)
Time Spent: 22h  (was: 21h 50m)

> Update artifact staging and retrieval protocols to be dependency aware.
> ---
>
> Key: BEAM-9577
> URL: https://issues.apache.org/jira/browse/BEAM-9577
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 22h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9833) Add .asf.yaml file

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9833?focusedWorklogId=433401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433401
 ]

ASF GitHub Bot logged work on BEAM-9833:


Author: ASF GitHub Bot
Created on: 14/May/20 22:26
Start Date: 14/May/20 22:26
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11613:
URL: https://github.com/apache/beam/pull/11613#issuecomment-628918098


   LGTM!





Issue Time Tracking
---

Worklog Id: (was: 433401)
Time Spent: 2h 20m  (was: 2h 10m)

> Add .asf.yaml file
> --
>
> Key: BEAM-9833
> URL: https://issues.apache.org/jira/browse/BEAM-9833
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Kyle Weaver
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Github links haven't been automatically added to Jira for a week or so. 
> According to comments on INFRA-20171 and related issues, we need to add a 
> .asf.yaml file to configure our notification settings.
> https://s.apache.org/asfyaml-notify





[jira] [Work logged] (BEAM-9833) Add .asf.yaml file

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9833?focusedWorklogId=433402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433402
 ]

ASF GitHub Bot logged work on BEAM-9833:


Author: ASF GitHub Bot
Created on: 14/May/20 22:26
Start Date: 14/May/20 22:26
Worklog Time Spent: 10m 
  Work Description: pabloem merged pull request #11613:
URL: https://github.com/apache/beam/pull/11613


   





Issue Time Tracking
---

Worklog Id: (was: 433402)
Time Spent: 2.5h  (was: 2h 20m)

> Add .asf.yaml file
> --
>
> Key: BEAM-9833
> URL: https://issues.apache.org/jira/browse/BEAM-9833
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Kyle Weaver
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Github links haven't been automatically added to Jira for a week or so. 
> According to comments on INFRA-20171 and related issues, we need to add a 
> .asf.yaml file to configure our notification settings.
> https://s.apache.org/asfyaml-notify





[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433400&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433400
 ]

ASF GitHub Bot logged work on BEAM-9577:


Author: ASF GitHub Bot
Created on: 14/May/20 22:20
Start Date: 14/May/20 22:20
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #11708:
URL: https://github.com/apache/beam/pull/11708#discussion_r425462944



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/ClassLoaderFileSystem.java
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.service.AutoService;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.channels.Channels;
+import java.nio.channels.ReadableByteChannel;
+import java.nio.channels.WritableByteChannel;
+import java.util.Collection;
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.io.fs.CreateOptions;
+import org.apache.beam.sdk.io.fs.MatchResult;
+import org.apache.beam.sdk.io.fs.ResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+
+/** A read-only {@link FileSystem} implementation looking up resources using a ClassLoader. */

Review comment:
   It's used on the Python side.







Issue Time Tracking
---

Worklog Id: (was: 433400)
Time Spent: 21h 50m  (was: 21h 40m)

> Update artifact staging and retrieval protocols to be dependency aware.
> ---
>
> Key: BEAM-9577
> URL: https://issues.apache.org/jira/browse/BEAM-9577
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 21h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433398&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433398
 ]

ASF GitHub Bot logged work on BEAM-9577:


Author: ASF GitHub Bot
Created on: 14/May/20 22:19
Start Date: 14/May/20 22:19
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #11708:
URL: https://github.com/apache/beam/pull/11708#discussion_r425462707



##
File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/ClassLoaderFileSystemTest.java
##
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io;
+
+import static java.nio.channels.Channels.newInputStream;
+import static org.junit.Assert.assertArrayEquals;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.channels.ReadableByteChannel;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+@RunWith(JUnit4.class)
+public class ClassLoaderFileSystemTest {
+
+  private static final String SOME_CLASS =
+      "classpath://org/apache/beam/sdk/io/ClassLoaderFileSystem.class";
+
+  @Test
+  public void testOpen() throws IOException {
+    ClassLoaderFileSystem filesystem = new ClassLoaderFileSystem();
+    ReadableByteChannel channel = filesystem.open(filesystem.matchNewResource(SOME_CLASS, false));
+    checkIsClass(channel);
+  }
+
+  @Test
+  public void testRegistrar() throws IOException {
+    ReadableByteChannel channel = FileSystems.open(FileSystems.matchNewResource(SOME_CLASS, false));
+    checkIsClass(channel);
+  }
+
+  public void checkIsClass(ReadableByteChannel channel) throws IOException {
+    FileSystems.setDefaultPipelineOptions(PipelineOptionsFactory.create());
+    InputStream inputStream = newInputStream(channel);
+    byte[] magic = new byte[4];
+    inputStream.read(magic);
+    assertArrayEquals(magic, new byte[] {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE});

Review comment:
   It's used on the Python side.
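
The `checkIsClass` helper quoted above relies on compiled Java class files beginning with the four bytes 0xCA 0xFE 0xBA 0xBE. As a minimal sketch of the same check in Python (the name `looks_like_class_file` is hypothetical and appears nowhere in the pull request):

```python
# The four-byte magic number that begins a compiled Java class file.
CLASS_MAGIC = bytes([0xCA, 0xFE, 0xBA, 0xBE])


def looks_like_class_file(data: bytes) -> bool:
    """Return True if data starts with the Java class-file magic number."""
    # Slicing never raises, so inputs shorter than four bytes simply fail.
    return data[:len(CLASS_MAGIC)] == CLASS_MAGIC
```

Comparing a slice rather than indexing individual bytes means a short or empty read is reported as "not a class file" instead of raising.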







Issue Time Tracking
---

Worklog Id: (was: 433398)
Time Spent: 21.5h  (was: 21h 20m)

> Update artifact staging and retrieval protocols to be dependency aware.
> ---
>
> Key: BEAM-9577
> URL: https://issues.apache.org/jira/browse/BEAM-9577
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 21.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433399&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433399
 ]

ASF GitHub Bot logged work on BEAM-9577:


Author: ASF GitHub Bot
Created on: 14/May/20 22:19
Start Date: 14/May/20 22:19
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #11708:
URL: https://github.com/apache/beam/pull/11708#discussion_r425462830



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/ClassLoaderFileSystem.java
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.service.AutoService;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.channels.Channels;
+import java.nio.channels.ReadableByteChannel;
+import java.nio.channels.WritableByteChannel;
+import java.util.Collection;
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.io.fs.CreateOptions;
+import org.apache.beam.sdk.io.fs.MatchResult;
+import org.apache.beam.sdk.io.fs.ResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+
+/** A read-only {@link FileSystem} implementation looking up resources using a ClassLoader. */
+public class ClassLoaderFileSystem extends FileSystem<ClassLoaderFileSystem.ClassLoaderResourceId> {
+
+  public static final String SCHEMA = "classpath";
+  private static final String PREFIX = SCHEMA + "://";
+
+  ClassLoaderFileSystem() {}
+
+  @Override
+  protected List<MatchResult> match(List<String> specs) throws IOException {
+    throw new UnsupportedOperationException("Un-globable filesystem.");
+  }
+
+  @Override
+  protected WritableByteChannel create(
+      ClassLoaderResourceId resourceId, CreateOptions createOptions) throws IOException {
+    throw new UnsupportedOperationException("Read-only filesystem.");
+  }
+
+  @Override
+  protected ReadableByteChannel open(ClassLoaderResourceId resourceId) throws IOException {
+    ClassLoader classLoader = getClass().getClassLoader();
+    InputStream inputStream =
+        classLoader.getResourceAsStream(resourceId.path.substring(PREFIX.length()));
+    if (inputStream == null) {
+      throw new IOException("Unable to load " + resourceId.path + " with " + classLoader);
+    }
+    return Channels.newChannel(inputStream);
+  }
+
+  @Override
+  protected void copy(
+      List<ClassLoaderResourceId> srcResourceIds, List<ClassLoaderResourceId> destResourceIds)
+      throws IOException {
+    throw new UnsupportedOperationException("Read-only filesystem.");
+  }
+
+  @Override
+  protected void rename(
+      List<ClassLoaderResourceId> srcResourceIds, List<ClassLoaderResourceId> destResourceIds)
+      throws IOException {
+    throw new UnsupportedOperationException("Read-only filesystem.");
+  }
+
+  @Override
+  protected void delete(Collection<ClassLoaderResourceId> resourceIds) throws IOException {
+    throw new UnsupportedOperationException("Read-only filesystem.");
+  }
+
+  @Override
+  protected ClassLoaderResourceId matchNewResource(String path, boolean isDirectory) {
+    return new ClassLoaderResourceId(path);
+  }
+
+  @Override
+  protected String getScheme() {
+    return SCHEMA;
+  }
+
+  public static class ClassLoaderResourceId implements ResourceId {
+
+    private final String path;
+
+    private ClassLoaderResourceId(String path) {
+      checkArgument(path.startsWith(PREFIX), path);
+      this.path = path;
+    }
+
+    @Override
+    public ResourceId resolve(String other, ResolveOptions resolveOptions) {

Review comment:
   The documentation is in the super classes, but I added a test. 
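
For readers following the quoted diff, the path handling in `ClassLoaderResourceId` and `open` comes down to two steps: reject any path that lacks the `classpath://` prefix, then strip the prefix to get the resource name passed to the class loader. A minimal Python sketch of that logic, where `resource_name` is a hypothetical helper and only the scheme string is taken from the diff:

```python
SCHEME = "classpath"
PREFIX = SCHEME + "://"


def resource_name(path: str) -> str:
    """Strip the scheme prefix, rejecting paths that do not carry it."""
    if not path.startswith(PREFIX):
        # Mirrors the precondition check made when a resource id is created.
        raise ValueError(f"expected a path starting with {PREFIX!r}, got {path!r}")
    return path[len(PREFIX):]
```

Validating at construction time, as the diff does, means a later `open` can strip the prefix without checking it again.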






[jira] [Work logged] (BEAM-9641) Support ZetaSQL DATE functions in BeamSQL

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9641?focusedWorklogId=433397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433397
 ]

ASF GitHub Bot logged work on BEAM-9641:


Author: ASF GitHub Bot
Created on: 14/May/20 22:16
Start Date: 14/May/20 22:16
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-628914752


   Rebased against master. Please run precommit tests again.





Issue Time Tracking
---

Worklog Id: (was: 433397)
Time Spent: 2h 50m  (was: 2h 40m)

> Support ZetaSQL DATE functions in BeamSQL
> -
>
> Key: BEAM-9641
> URL: https://issues.apache.org/jira/browse/BEAM-9641
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql-zetasql
>Reporter: Yueyang Qiu
>Assignee: Yueyang Qiu
>Priority: Major
>  Labels: zetasql-compliance
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-14 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107733#comment-17107733
 ] 

Brian Hulette commented on BEAM-9975:
-

vpt_vp_arg13 and vpt_vp_arg14 are from [another test|https://github.com/apache/beam/blob/b91560cc354da471e3de502aad78dd059997a3d0/sdks/python/apache_beam/options/value_provider_test.py#L187]. It looks like something isn't being cleaned up properly between test runs.

(Also, I wonder if we need to resolve ValueProvider instances when converting pipeline options to proto?)
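
One way to read the question above, as a minimal sketch and not Beam's actual fix: before the options dict is converted, replace every provider-like value with its resolved value when one is available, and leave it out otherwise. `ProviderStub` and `resolve_options` are hypothetical names used only for illustration. The stub imitates an `is_accessible()`/`get()` style interface and is not the real `apache_beam` class.

```python
from typing import Any, Dict


class ProviderStub:
    """Hypothetical stand-in for a value provider (not the Beam class)."""

    def __init__(self, value: Any = None, accessible: bool = False):
        self._value = value
        self._accessible = accessible

    def is_accessible(self) -> bool:
        return self._accessible

    def get(self) -> Any:
        if not self._accessible:
            raise RuntimeError("value is not available yet")
        return self._value


def resolve_options(options: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of options with provider-like values resolved.

    Accessible providers are replaced by their value. Inaccessible ones
    are omitted (an assumption of this sketch), so only plain values remain.
    """
    resolved = {}
    for key, value in options.items():
        if isinstance(value, ProviderStub):
            if value.is_accessible():
                resolved[key] = value.get()
        else:
            resolved[key] = value
    return resolved
```

Whether an unresolved provider should be dropped, serialized as a placeholder, or treated as an error is a design decision this sketch does not settle.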

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> apache_beam/pipeline.py:550: in __exit__
>     self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
>     return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
>     job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
>     prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
>     pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in get_pipeline_options
>     return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
>     return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: in ParseDict
>     parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: in ConvertMessage
>     methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: in _ConvertStructMessage
>     self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
>     def _ConvertValueMessage(self, value, message):
>       """Convert a JSON representation into Value message."""
>       if isinstance(value, dict):
>         self._ConvertStructMessage(value, message.struct_value)
>       elif isinstance(value, list):
>         self._ConvertListValueMessage(value, message.list_value)
>       elif value is None:
>         message.null_value = 0
>       elif isinstance(value, bool):
>         message.bool_value = value
>       elif isinstance(value, six.string_types):
>         message.string_value = value
>       elif isinstance(value, _INT_OR_FLOAT):
>         message.number_value = value
>       else:
> >       raise ParseError('Unexpected type for Value message.')
> E       google.protobuf.json_format.ParseError: Unexpected type for Value message.
> {code}





[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-14 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107729#comment-17107729
 ] 

Brian Hulette commented on BEAM-9975:
-

The RuntimeValueProvider instances are probably the issue
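
A rough way to check this suspicion, sketched under assumptions: walk the options dict with the same type dispatch that `_ConvertValueMessage` applies in the trace quoted in the issue description (dict, list, None, bool, string, number) and report every entry that would fall through to the `ParseError` branch. `find_unconvertible` is a hypothetical helper, not part of Beam or protobuf, and it only approximates the library's checks.

```python
from typing import Any, List


def find_unconvertible(value: Any, path: str = "") -> List[str]:
    """Return the paths of values that are not JSON-style types."""
    if isinstance(value, dict):
        bad = []
        for key, item in value.items():
            bad.extend(find_unconvertible(item, f"{path}/{key}"))
        return bad
    if isinstance(value, list):
        bad = []
        for index, item in enumerate(value):
            bad.extend(find_unconvertible(item, f"{path}[{index}]"))
        return bad
    if value is None or isinstance(value, (bool, str, int, float)):
        return []
    # Anything else would reach the final else branch in the quoted code.
    return [path]
```

Run against the dict from a failing test, the returned paths name the option keys whose values are the offending objects.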

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>





[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-14 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107727#comment-17107727
 ] 

Brian Hulette commented on BEAM-9975:
-

Got an [error 
log|https://builds.apache.org/job/beam_PreCommit_Python_Cron/2753/testReport/junit/apache_beam.runners.portability.portable_runner_test/PortableRunnerTest/test_read/]:
{code}
ERROR:root:Failed to parse dict {'beam:option:streaming:v1': False, 
'beam:option:beam_services:v1': {}, 'beam:option:type_check_strictness:v1': 
'DEFAULT_TO_ANY', 'beam:option:pipeline_type_check:v1': True, 
'beam:option:runtime_type_check:v1': False, 
'beam:option:direct_runner_use_stacked_bundle:v1': True, 
'beam:option:direct_runner_bundle_repeat:v1': '0', 
'beam:option:direct_num_workers:v1': '1', 'beam:option:direct_running_mode:v1': 
'in_memory', 'beam:option:dataflow_endpoint:v1': 
'https://dataflow.googleapis.com', 'beam:option:job_name:v1': 
'test_read_1589482267.7994738', 'beam:option:no_auth:v1': False, 
'beam:option:update:v1': False, 'beam:option:enable_streaming_engine:v1': 
False, 'beam:option:hdfs_full_urls:v1': False, 'beam:option:experiments:v1': 
['state_cache_size=100', 'data_buffer_time_limit_ms=1000', 'beam_fn_api'], 
'beam:option:profile_cpu:v1': False, 'beam:option:profile_memory:v1': False, 
'beam:option:profile_sample_rate:v1': 1.0, 'beam:option:save_main_session:v1': 
False, 'beam:option:sdk_location:v1': 'container', 
'beam:option:job_endpoint:v1': 'localhost:35763', 
'beam:option:job_server_timeout:v1': '60', 'beam:option:environment_type:v1': 
'beam:env:embedded_python:v1', 'beam:option:sdk_worker_parallelism:v1': '1', 
'beam:option:environment_cache_millis:v1': '0', 'beam:option:job_port:v1': '0', 
'beam:option:artifact_port:v1': '0', 'beam:option:expansion_port:v1': '0', 
'beam:option:flink_master:v1': '[auto]', 'beam:option:flink_version:v1': 
'1.10', 'beam:option:flink_submit_uber_jar:v1': False, 
'beam:option:spark_master_url:v1': 'local[4]', 
'beam:option:spark_submit_uber_jar:v1': False, 'beam:option:dry_run:v1': False, 
'beam:option:style:v1': 'scrambled', 'beam:option:influx_hostname:v1': 
'http://localhost:8086', 'beam:option:timeout_ms:v1': '0', 
'beam:option:mock_flag:v1': False, 'beam:option:fake_flag:v1': False, 
'beam:option:m_flag:v1': False, 'beam:option:male:v1': False, 
'beam:option:redefined_flag:v1': False, 'beam:option:vpt_vp_arg13:v1': 
, 'beam:option:vpt_vp_arg14:v1': 
}
{code}

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>

[jira] [Assigned] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-14 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette reassigned BEAM-9975:
---

Assignee: Brian Hulette

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-9951) Create Go SDK synthetic sources.

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9951?focusedWorklogId=433387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433387
 ]

ASF GitHub Bot logged work on BEAM-9951:


Author: ASF GitHub Bot
Created on: 14/May/20 21:56
Start Date: 14/May/20 21:56
Worklog Time Spent: 10m 
  Work Description: youngoli commented on a change in pull request #11665:
URL: https://github.com/apache/beam/pull/11665#discussion_r425453509



##
File path: sdks/go/pkg/beam/io/synthetic/source.go
##
@@ -0,0 +1,151 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Package synthetic contains transforms for creating synthetic pipelines.
+// Synthetic pipelines are pipelines that simulate the behavior of possible
+// pipelines in order to test performance, splitting, liquid sharding, and
+// various other infrastructure used for running pipelines. This category of
+// tests is not concerned with the correctness of the elements themselves, but
+// needs to simulate transforms that output many elements throughout varying
+// pipeline shapes.
+package synthetic
+
+import (
+   "github.com/apache/beam/sdks/go/pkg/beam"
+   "github.com/apache/beam/sdks/go/pkg/beam/io/rtrackers/offsetrange"
+   "math/rand"
+   "time"
+)
+
+// Source creates a synthetic source transform that emits randomly
+// generated KV<[]byte, []byte> elements.
+//
+// This transform accepts a PCollection of SourceConfig, where each SourceConfig
+// determines the synthetic source's behavior for that element.

Review comment:
   No, it applies to the synthetic steps, so it would be in an upcoming 
StepConfig, but that would probably still have a similar interface.







Issue Time Tracking
---

Worklog Id: (was: 433387)
Time Spent: 40m  (was: 0.5h)

> Create Go SDK synthetic sources.
> 
>
> Key: BEAM-9951
> URL: https://issues.apache.org/jira/browse/BEAM-9951
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Create synthetic sources for the Go SDK like 
> [Java|https://github.com/apache/beam/tree/master/sdks/java/io/synthetic/src/main/java/org/apache/beam/sdk/io/synthetic]
>  and 
> [Python|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/synthetic_pipeline.py]
>  have.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8949) Add Spanner IO Integration Test for Python

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8949?focusedWorklogId=433382&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433382
 ]

ASF GitHub Bot logged work on BEAM-8949:


Author: ASF GitHub Bot
Created on: 14/May/20 21:44
Start Date: 14/May/20 21:44
Worklog Time Spent: 10m 
  Work Description: pabloem merged pull request #11210:
URL: https://github.com/apache/beam/pull/11210


   





Issue Time Tracking
---

Worklog Id: (was: 433382)
Time Spent: 12.5h  (was: 12h 20m)

> Add Spanner IO Integration Test for Python
> --
>
> Key: BEAM-8949
> URL: https://issues.apache.org/jira/browse/BEAM-8949
> Project: Beam
>  Issue Type: Test
>  Components: io-py-gcp
>Reporter: Shoaib Zafar
>Assignee: Shoaib Zafar
>Priority: Major
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> Spanner IO (Python SDK) contains a PTransform that uses the Batch API to read 
> from Spanner. Currently, it only has direct runner unit tests. To make this 
> functionality available to users, integration tests also need to be added.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433381&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433381
 ]

ASF GitHub Bot logged work on BEAM-9964:


Author: ASF GitHub Bot
Created on: 14/May/20 21:41
Start Date: 14/May/20 21:41
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628901207


   Run Java PostCommit





Issue Time Tracking
---

Worklog Id: (was: 433381)
Time Spent: 2h 10m  (was: 2h)

> Setting workerCacheMb to make its way to the WindmillStateCache Constructor
> ---
>
> Key: BEAM-9964
> URL: https://issues.apache.org/jira/browse/BEAM-9964
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Omar Ismail
>Assignee: Omar Ismail
>Priority: Minor
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Setting --workerCacheMb seems to affect batch pipelines only. For streaming, 
> the cache size seems to be hardcoded to 100 MB [1]. If possible, I would like 
> to make the cache size configurable for streaming pipelines via 
> --workerCacheMb.
> I've never made changes to the Beam SDK, so I am super excited to work on 
> this! 
>  
> [[1] 
> https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433376
 ]

ASF GitHub Bot logged work on BEAM-9577:


Author: ASF GitHub Bot
Created on: 14/May/20 21:30
Start Date: 14/May/20 21:30
Worklog Time Spent: 10m 
  Work Description: ibzib commented on a change in pull request #11708:
URL: https://github.com/apache/beam/pull/11708#discussion_r425439878



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/ClassLoaderFileSystem.java
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.service.AutoService;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.channels.Channels;
+import java.nio.channels.ReadableByteChannel;
+import java.nio.channels.WritableByteChannel;
+import java.util.Collection;
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.io.fs.CreateOptions;
+import org.apache.beam.sdk.io.fs.MatchResult;
+import org.apache.beam.sdk.io.fs.ResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+
+/** A read-only {@link FileSystem} implementation looking up resources using a ClassLoader. */
+public class ClassLoaderFileSystem extends FileSystem<ClassLoaderFileSystem.ClassLoaderResourceId> {
+
+  public static final String SCHEMA = "classpath";
+  private static final String PREFIX = SCHEMA + "://";
+
+  ClassLoaderFileSystem() {}
+
+  @Override
+  protected List<MatchResult> match(List<String> specs) throws IOException {
+    throw new UnsupportedOperationException("Un-globable filesystem.");
+  }
+
+  @Override
+  protected WritableByteChannel create(
+      ClassLoaderResourceId resourceId, CreateOptions createOptions) throws IOException {
+    throw new UnsupportedOperationException("Read-only filesystem.");
+  }
+
+  @Override
+  protected ReadableByteChannel open(ClassLoaderResourceId resourceId) throws IOException {
+    ClassLoader classLoader = getClass().getClassLoader();
+    InputStream inputStream =
+        classLoader.getResourceAsStream(resourceId.path.substring(PREFIX.length()));
+    if (inputStream == null) {
+      throw new IOException("Unable to load " + resourceId.path + " with " + classLoader);
+    }
+    return Channels.newChannel(inputStream);
+  }
+
+  @Override
+  protected void copy(
+      List<ClassLoaderResourceId> srcResourceIds, List<ClassLoaderResourceId> destResourceIds)
+      throws IOException {
+    throw new UnsupportedOperationException("Read-only filesystem.");
+  }
+
+  @Override
+  protected void rename(
+      List<ClassLoaderResourceId> srcResourceIds, List<ClassLoaderResourceId> destResourceIds)
+      throws IOException {
+    throw new UnsupportedOperationException("Read-only filesystem.");
+  }
+
+  @Override
+  protected void delete(Collection<ClassLoaderResourceId> resourceIds) throws IOException {
+    throw new UnsupportedOperationException("Read-only filesystem.");
+  }
+
+  @Override
+  protected ClassLoaderResourceId matchNewResource(String path, boolean isDirectory) {
+    return new ClassLoaderResourceId(path);
+  }
+
+  @Override
+  protected String getScheme() {
+    return SCHEMA;
+  }
+
+  public static class ClassLoaderResourceId implements ResourceId {
+
+    private final String path;
+
+    private ClassLoaderResourceId(String path) {
+      checkArgument(path.startsWith(PREFIX), path);
+      this.path = path;
+    }
+
+    @Override
+    public ResourceId resolve(String other, ResolveOptions resolveOptions) {

Review comment:
   Can we add a couple trivial unit tests as sanity checks / documentation 
for `resolve` and `getCurrentDirectory`?
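
For readers following the diff, the lookup that `open` performs can be sketched with the JDK alone. This is a minimal illustration, not the PR's code: `ClasspathResources`, `toResourceName` and the standalone `open` helper are hypothetical names introduced here, and only the `classpath://` prefix handling and the null check mirror what the diff shows.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Beam: it only mirrors the prefix handling
// and the null check visible in the diff above.
final class ClasspathResources {
  static final String PREFIX = "classpath://";

  private ClasspathResources() {}

  /** Strips the scheme prefix, rejecting paths that do not carry it. */
  static String toResourceName(String path) {
    if (!path.startsWith(PREFIX)) {
      throw new IllegalArgumentException(path);
    }
    return path.substring(PREFIX.length());
  }

  /** Opens the named resource; fails with IOException when the loader cannot find it. */
  static InputStream open(String path, ClassLoader classLoader) throws IOException {
    InputStream in = classLoader.getResourceAsStream(toResourceName(path));
    if (in == null) {
      throw new IOException("Unable to load " + path + " with " + classLoader);
    }
    return in;
  }
}
```

A sanity test in this style would check that a prefixed path maps to the bare resource name, and that a missing resource surfaces as an `IOException` rather than a null stream.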

##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/ClassLoaderFileSystem.java
##
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. 

[jira] [Work logged] (BEAM-9833) Add .asf.yaml file

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9833?focusedWorklogId=433368&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433368
 ]

ASF GitHub Bot logged work on BEAM-9833:


Author: ASF GitHub Bot
Created on: 14/May/20 21:17
Start Date: 14/May/20 21:17
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #11613:
URL: https://github.com/apache/beam/pull/11613#issuecomment-628891422


   Made the changes suggested by @robertwb; only rebase is left disabled. Can 
someone PTAL? Thanks.





Issue Time Tracking
---

Worklog Id: (was: 433368)
Time Spent: 2h 10m  (was: 2h)

> Add .asf.yaml file
> --
>
> Key: BEAM-9833
> URL: https://issues.apache.org/jira/browse/BEAM-9833
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Kyle Weaver
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Github links haven't been automatically added to Jira for a week or so. 
> According to comments on INFRA-20171 and related issues, we need to add a 
> .asf.yaml file to configure our notification settings.
> https://s.apache.org/asfyaml-notify



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433367&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433367
 ]

ASF GitHub Bot logged work on BEAM-9964:


Author: ASF GitHub Bot
Created on: 14/May/20 21:17
Start Date: 14/May/20 21:17
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628891444


   retest this please
   
   





Issue Time Tracking
---

Worklog Id: (was: 433367)
Time Spent: 2h  (was: 1h 50m)

> Setting workerCacheMb to make its way to the WindmillStateCache Constructor
> ---
>
> Key: BEAM-9964
> URL: https://issues.apache.org/jira/browse/BEAM-9964
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Omar Ismail
>Assignee: Omar Ismail
>Priority: Minor
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Setting --workerCacheMb seems to affect batch pipelines only. For streaming, 
> the cache size seems to be hardcoded to 100 MB [1]. If possible, I would like 
> to make the cache size configurable for streaming pipelines via 
> --workerCacheMb.
> I've never made changes to the Beam SDK, so I am super excited to work on 
> this! 
>  
> [[1] 
> https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433365&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433365
 ]

ASF GitHub Bot logged work on BEAM-9964:


Author: ASF GitHub Bot
Created on: 14/May/20 21:16
Start Date: 14/May/20 21:16
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628890978


   retest this please





Issue Time Tracking
---

Worklog Id: (was: 433365)
Time Spent: 1h 50m  (was: 1h 40m)

> Setting workerCacheMb to make its way to the WindmillStateCache Constructor
> ---
>
> Key: BEAM-9964
> URL: https://issues.apache.org/jira/browse/BEAM-9964
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Omar Ismail
>Assignee: Omar Ismail
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Setting --workerCacheMb seems to affect batch pipelines only. For streaming, 
> the cache size seems to be hardcoded to 100 MB [1]. If possible, I would like 
> to make the cache size configurable for streaming pipelines via 
> --workerCacheMb.
> I've never made changes to the Beam SDK, so I am super excited to work on 
> this! 
>  
> [[1] 
> https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-6950) Beam Dependency Update Request: com.github.spotbugs

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6950?focusedWorklogId=433364&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433364
 ]

ASF GitHub Bot logged work on BEAM-6950:


Author: ASF GitHub Bot
Created on: 14/May/20 21:12
Start Date: 14/May/20 21:12
Worklog Time Spent: 10m 
  Work Description: iemejia opened a new pull request #11712:
URL: https://github.com/apache/beam/pull/11712


   R: @pabloem 





Issue Time Tracking
---

Worklog Id: (was: 433364)
Remaining Estimate: 0h
Time Spent: 10m

> Beam Dependency Update Request: com.github.spotbugs
> ---
>
> Key: BEAM-6950
> URL: https://issues.apache.org/jira/browse/BEAM-6950
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  - 2019-04-01 12:15:04.215434 
> -
> Please consider upgrading the dependency com.github.spotbugs. 
> The current version is None. The latest version is None 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433363
 ]

ASF GitHub Bot logged work on BEAM-9964:


Author: ASF GitHub Bot
Created on: 14/May/20 21:11
Start Date: 14/May/20 21:11
Worklog Time Spent: 10m 
  Work Description: omarismail94 commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-62729


   New changes passed ./gradlew -p runners/google-cloud-dataflow-java check on 
my computer
   
   





Issue Time Tracking
---

Worklog Id: (was: 433363)
Time Spent: 1h 40m  (was: 1.5h)

> Setting workerCacheMb to make its way to the WindmillStateCache Constructor
> ---
>
> Key: BEAM-9964
> URL: https://issues.apache.org/jira/browse/BEAM-9964
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Omar Ismail
>Assignee: Omar Ismail
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Setting --workerCacheMb seems to affect batch pipelines only. For streaming, 
> the cache size seems to be hardcoded to 100 MB [1]. If possible, I would like 
> to make the cache size configurable for streaming pipelines via 
> --workerCacheMb.
> I've never made changes to the Beam SDK, so I am super excited to work on 
> this! 
>  
> [[1] 
> https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433362&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433362
 ]

ASF GitHub Bot logged work on BEAM-9825:


Author: ASF GitHub Bot
Created on: 14/May/20 21:09
Start Date: 14/May/20 21:09
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #11610:
URL: https://github.com/apache/beam/pull/11610#discussion_r425432135



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java
##
@@ -0,0 +1,528 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.transforms.join.CoGbkResult;
+import org.apache.beam.sdk.transforms.join.CoGroupByKey;
+import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionList;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables;
+
+public class SetFns {

Review comment:
   We should mention in the comments that we rely on the deterministic 
encoding of the coder similar to how we do GroupByKey.
   
   Also, this implementation assumes that there will only be a single firing of 
the trigger. If there are multiple then the results are likely undefined.
   
   







Issue Time Tracking
---

Worklog Id: (was: 433362)
Remaining Estimate: 89h  (was: 89h 10m)
Time Spent: 7h  (was: 6h 50m)

> Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
> --
>
> Key: BEAM-9825
> URL: https://issues.apache.org/jira/browse/BEAM-9825
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: Major
>   Original Estimate: 96h
>  Time Spent: 7h
>  Remaining Estimate: 89h
>
> I'd like to propose the following new high-level transforms.
>  * Intersect
> Compute the intersection between the elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing the elements that are common to both _leftCollection_ 
> and _rightCollection_.
>  
>  * Except
> Compute the difference between the elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing the elements that are in _leftCollection_ but not in 
> _rightCollection_.
>  * Union
> Find the elements that are in either of two PCollections.
> Also implement the IntersectAll, ExceptAll and UnionAll variants of these transforms.
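
The difference between the distinct and the "All" variants proposed above can be illustrated on plain Java lists. This is only a sketch of the intended semantics, assuming SQL-style multiset rules for the "All" variants (minimum of the counts for intersect, difference of the counts for except, sum of the counts for union). `SetSemantics` and its methods are hypothetical names; it is not the SetFns implementation, which operates on PCollections.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: plain-Java multiset arithmetic, not the Beam SetFns code.
final class SetSemantics {
  private SetSemantics() {}

  /** Counts occurrences of each element, keeping first-seen order. */
  static <T> Map<T, Integer> counts(List<T> items) {
    Map<T, Integer> counts = new LinkedHashMap<>();
    for (T item : items) {
      counts.merge(item, 1, Integer::sum);
    }
    return counts;
  }

  private static <T> void emit(List<T> out, T value, int times) {
    for (int i = 0; i < times; i++) {
      out.add(value);
    }
  }

  /** Distinct: each common element once. All: min(leftCount, rightCount) copies. */
  static <T> List<T> intersect(List<T> left, List<T> right, boolean all) {
    Map<T, Integer> rightCounts = counts(right);
    List<T> out = new ArrayList<>();
    for (Map.Entry<T, Integer> e : counts(left).entrySet()) {
      int common = Math.min(e.getValue(), rightCounts.getOrDefault(e.getKey(), 0));
      emit(out, e.getKey(), all ? common : Math.min(common, 1));
    }
    return out;
  }

  /** Distinct: left elements absent from right. All: max(leftCount - rightCount, 0) copies. */
  static <T> List<T> except(List<T> left, List<T> right, boolean all) {
    Map<T, Integer> rightCounts = counts(right);
    List<T> out = new ArrayList<>();
    for (Map.Entry<T, Integer> e : counts(left).entrySet()) {
      int inRight = rightCounts.getOrDefault(e.getKey(), 0);
      int remaining = Math.max(e.getValue() - inRight, 0);
      emit(out, e.getKey(), all ? remaining : (inRight == 0 ? 1 : 0));
    }
    return out;
  }

  /** Distinct: each element once. All: leftCount + rightCount copies. */
  static <T> List<T> union(List<T> left, List<T> right, boolean all) {
    Map<T, Integer> merged = counts(left);
    for (Map.Entry<T, Integer> e : counts(right).entrySet()) {
      merged.merge(e.getKey(), e.getValue(), Integer::sum);
    }
    List<T> out = new ArrayList<>();
    for (Map.Entry<T, Integer> e : merged.entrySet()) {
      emit(out, e.getKey(), all ? e.getValue() : 1);
    }
    return out;
  }
}
```

With left = [a, a, a, b, c] and right = [a, a, b, d], the distinct intersect yields [a, b] while the "All" variant yields [a, a, b]. The sketch counts occurrences per element, which is the in-memory analogue of grouping equal elements by key.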



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433359&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433359
 ]

ASF GitHub Bot logged work on BEAM-9964:


Author: ASF GitHub Bot
Created on: 14/May/20 21:07
Start Date: 14/May/20 21:07
Worklog Time Spent: 10m 
  Work Description: omarismail94 commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628886984


   retest this please





Issue Time Tracking
---

Worklog Id: (was: 433359)
Time Spent: 1.5h  (was: 1h 20m)

> Setting workerCacheMb to make its way to the WindmillStateCache Constructor
> ---
>
> Key: BEAM-9964
> URL: https://issues.apache.org/jira/browse/BEAM-9964
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Omar Ismail
>Assignee: Omar Ismail
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Setting --workerCacheMb seems to affect batch pipelines only. For streaming, 
> the cache size seems to be hardcoded to 100 MB [1]. If possible, I would like 
> to make the cache size configurable for streaming pipelines via 
> --workerCacheMb.
> I've never made changes to the Beam SDK, so I am super excited to work on 
> this! 
>  
> [[1] 
> https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-4596) Release Candidates' Maven Repository

2020-05-14 Thread Jira


[ 
https://issues.apache.org/jira/browse/BEAM-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107676#comment-17107676
 ] 

Ismaël Mejía commented on BEAM-4596:


Release candidate repositories are published at the time of the vote and 
remain available until the release is done (or a new RC is out), so closing 
this one.

> Release Candidates' Maven Repository
> 
>
> Key: BEAM-4596
> URL: https://issues.apache.org/jira/browse/BEAM-4596
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, website
>Affects Versions: 2.4.0
>Reporter: Cemalettin Koç
>Priority: Minor
>
> I would like to try the 2.5.0-RC2 release candidate, but I could not find 
> these files anywhere. Please provide the necessary repositories. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-4596) Release Candidates' Maven Repository

2020-05-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía resolved BEAM-4596.

Fix Version/s: Not applicable
   Resolution: Invalid

> Release Candidates' Maven Repository
> 
>
> Key: BEAM-4596
> URL: https://issues.apache.org/jira/browse/BEAM-4596
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, website
>Affects Versions: 2.4.0
>Reporter: Cemalettin Koç
>Priority: Minor
> Fix For: Not applicable
>
>
> I would like to try the 2.5.0-RC2 release candidate, but I could not find 
> these files anywhere. Please provide the necessary repositories. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433358&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433358
 ]

ASF GitHub Bot logged work on BEAM-9964:


Author: ASF GitHub Bot
Created on: 14/May/20 21:05
Start Date: 14/May/20 21:05
Worklog Time Spent: 10m 
  Work Description: omarismail94 commented on a change in pull request #11710:
URL: https://github.com/apache/beam/pull/11710#discussion_r425430039



##
File path: 
runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillStateCacheTest.java
##
@@ -130,7 +133,8 @@ private static StateNamespace triggerNamespace(long start, int triggerIdx) {
 
   @Before
   public void setUp() {
-cache = new WindmillStateCache();
+options = PipelineOptionsFactory.as(DataflowWorkerHarnessOptions.class);
+cache = new WindmillStateCache(options.getWorkerCacheMb());
 assertEquals(0, cache.getWeight());

Review comment:
   Fixed this by adding a new test method to the WindmillStateCacheTest class. 
I added a getter to WindmillStateCache that returns the maximum weight in 
bytes, and compared its result to the value initially set.
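
The kind of check described in this comment could look roughly like the sketch below. It is not the PR's code: `SizedCache`, `getMaxWeight` and the 1024 * 1024 bytes-per-MB factor are assumptions made for illustration, standing in for a cache whose constructor takes the configured size in megabytes.

```java
// Hypothetical stand-in for the cache under test; the class name, the getter
// name, and the MB-to-bytes factor are assumptions, not the Beam API.
final class SizedCache {
  private static final long BYTES_PER_MB = 1024L * 1024L;

  private final long maxWeightBytes;
  private long weight = 0;

  SizedCache(int sizeMb) {
    if (sizeMb < 0) {
      throw new IllegalArgumentException("sizeMb must be non-negative: " + sizeMb);
    }
    // Long arithmetic avoids int overflow for large configured sizes.
    this.maxWeightBytes = sizeMb * BYTES_PER_MB;
  }

  /** Maximum total weight, in bytes, derived from the configured size. */
  long getMaxWeight() {
    return maxWeightBytes;
  }

  /** Current weight of cached entries; a fresh cache holds nothing. */
  long getWeight() {
    return weight;
  }
}
```

A test would then construct the cache from the configured option value and assert that the getter returns that value converted to bytes, in addition to the existing check that a new cache has weight 0.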







Issue Time Tracking
---

Worklog Id: (was: 433358)
Time Spent: 1h 20m  (was: 1h 10m)

> Setting workerCacheMb to make its way to the WindmillStateCache Constructor
> ---
>
> Key: BEAM-9964
> URL: https://issues.apache.org/jira/browse/BEAM-9964
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Omar Ismail
>Assignee: Omar Ismail
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Setting --workerCacheMb seems to affect batch pipelines only. For streaming, 
> the cache size seems to be hardcoded to 100 MB [1]. If possible, I would like 
> to make the cache size configurable for streaming pipelines via 
> --workerCacheMb.
> I've never made changes to the Beam SDK, so I am super excited to work on 
> this! 
>  
> [[1] 
> https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433355&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433355
 ]

ASF GitHub Bot logged work on BEAM-9825:


Author: ASF GitHub Bot
Created on: 14/May/20 20:59
Start Date: 14/May/20 20:59
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #11610:
URL: https://github.com/apache/beam/pull/11610#issuecomment-628883607


   Thank you! The Jira looks good to me!
   
   Will merge this PR tomorrow if there is no other comments.





Issue Time Tracking
---

Worklog Id: (was: 433355)
Remaining Estimate: 89h 10m  (was: 89h 20m)
Time Spent: 6h 50m  (was: 6h 40m)

> Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
> --
>
> Key: BEAM-9825
> URL: https://issues.apache.org/jira/browse/BEAM-9825
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: Major
>   Original Estimate: 96h
>  Time Spent: 6h 50m
>  Remaining Estimate: 89h 10m
>
> I'd like to propose the following new high-level transforms.
>  * Intersect
> Compute the intersection between the elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing the elements that are common to both _leftCollection_ 
> and _rightCollection_.
>  
>  * Except
> Compute the difference between the elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing the elements that are in _leftCollection_ but not in 
> _rightCollection_.
>  * Union
> Find the elements that are in either of two PCollections.
> Also implement the IntersectAll, ExceptAll and UnionAll variants of these transforms.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9992) Migrate BeamSQL's SET operators to SetFns transforms

2020-05-14 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-9992:
---
Status: Open  (was: Triage Needed)

> Migrate BeamSQL's SET operators to SetFns transforms
> 
>
> Key: BEAM-9992
> URL: https://issues.apache.org/jira/browse/BEAM-9992
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: Major
>
> As part of [BEAM-9946|https://issues.apache.org/jira/browse/BEAM-9946] we have 
> new SetFns transforms for intersect, union and except.
> This Jira is to use them to remove the existing Set operators in the BeamSQL code.
> Tasks:
> # Remove: 
> [BeamSetOperatorRelBase.java|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamSetOperatorRelBase.java]
> # use SetFns transforms from
> ## 
> [BeamIntersectRel.java|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamIntersectRel.java]
> ## 
> [BeamMinusRel.java|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamMinusRel.java]
> ## 
> [BeamUnionRel.java|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamUnionRel.java]
> # 
> Remove:[BeamSetOperatorsTransforms.java|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/transform/BeamSetOperatorsTransforms.java]





[jira] [Work logged] (BEAM-9946) Enhance Partition transform to provide partitionfn with SideInputs

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9946?focusedWorklogId=433351&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433351
 ]

ASF GitHub Bot logged work on BEAM-9946:


Author: ASF GitHub Bot
Created on: 14/May/20 20:52
Start Date: 14/May/20 20:52
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #11682:
URL: https://github.com/apache/beam/pull/11682#issuecomment-628880635


   This LGTM and well written. I will trigger the tests.
   
   Added @kennknowles in case I am missing something related to the java apis.





Issue Time Tracking
---

Worklog Id: (was: 433351)
Remaining Estimate: 95h 50m  (was: 96h)
Time Spent: 10m

> Enhance Partition transform to provide partitionfn with SideInputs
> --
>
> Key: BEAM-9946
> URL: https://issues.apache.org/jira/browse/BEAM-9946
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: Major
>   Original Estimate: 96h
>  Time Spent: 10m
>  Remaining Estimate: 95h 50m
>
> Currently the _Partition_ transform partitions a collection into n 
> collections, and _PartitionFn_ can use only the _element_ value to decide 
> which partition a particular element belongs to.
> {code:java}
> public interface PartitionFn<T> extends Serializable {
>   int partitionFor(T elem, int numPartitions);
> }
> public static <T> Partition<T> of(int numPartitions, PartitionFn<? super T> 
>     partitionFn) {
>   return new Partition<>(new PartitionDoFn<T>(numPartitions, partitionFn));
> }
> {code}
> It would be useful to introduce a new API that also provides _sideInputs_ to 
> the partition function. Users would then be able to write logic that uses 
> both the _element_ value and the _sideInputs_ to decide which partition a 
> particular element belongs to.
> Option-1: Proposed new API:
> {code:java}
> public interface PartitionWithSideInputsFn<T> extends Serializable {
>   int partitionFor(T elem, int numPartitions, Context c);
> }
> public static <T> Partition<T> of(int numPartitions, 
>     PartitionWithSideInputsFn<? super T> partitionFn, Requirements requirements) {
>   ...
> }
> {code}
> Users can use either of the two APIs, depending on their partitioning logic.
> Option-2: Redesign the old API with a builder pattern that can optionally 
> take a _Requirements_ with _sideInputs_. Deprecate the old API.
> {code:java}
> // using sideviews
> Partition.into(numberOfPartitions).via(
>     fn(
>         (input, c) -> {
>           // use c.sideInput(view)
>           // use input
>           // return partitionnumber
>         },
>         requiresSideInputs(view)))
> // without using sideviews
> Partition.into(numberOfPartitions).via(
>     fn(
>         (input, c) -> {
>           // use input
>           // return partitionnumber
>         }))
> {code}
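
As a rough sketch of the idea behind both options, the following plain-Java model partitions an in-memory list using the element together with a read-only side value. It does not use Beam. `PartitionWithSideInputFn`, `partition` and the region lookup map are hypothetical names chosen for this illustration, and the side input is modelled as an ordinary object rather than a PCollectionView.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the Beam SDK.
final class PartitionWithSideInputSketch {

  // Stand-in for the proposed function: it sees the element, the number of
  // partitions, and a side value that is the same for every element.
  interface PartitionWithSideInputFn<T, S> {
    int partitionFor(T elem, int numPartitions, S sideInput);
  }

  // Routes each element to the partition chosen by fn.
  static <T, S> List<List<T>> partition(
      List<T> input, int numPartitions, S sideInput, PartitionWithSideInputFn<T, S> fn) {
    List<List<T>> out = new ArrayList<>();
    for (int i = 0; i < numPartitions; i++) {
      out.add(new ArrayList<>());
    }
    for (T elem : input) {
      int p = fn.partitionFor(elem, numPartitions, sideInput);
      if (p < 0 || p >= numPartitions) {
        throw new IllegalArgumentException("partition " + p + " is out of range");
      }
      out.get(p).add(elem);
    }
    return out;
  }

  // Example: the side value maps a region prefix to a partition index;
  // elements with an unknown region go to the last partition.
  static List<List<String>> example() {
    Map<String, Integer> regionToPartition = Map.of("eu", 0, "us", 1);
    return partition(
        List.of("eu:a", "us:b", "eu:c", "apac:d"),
        3,
        regionToPartition,
        (elem, n, side) -> side.getOrDefault(elem.substring(0, elem.indexOf(':')), n - 1));
  }

  public static void main(String[] args) {
    System.out.println(example());
  }
}
```

The point of the extra argument is that the routing table can be computed elsewhere in the pipeline instead of being hard-coded into the partition function.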
>  





[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433348&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433348
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/May/20 20:40
Start Date: 14/May/20 20:40
Worklog Time Spent: 10m 
  Work Description: jaketf commented on pull request #11339:
URL: https://github.com/apache/beam/pull/11339#issuecomment-628875209


   @pabloem  I'm not sure what's going on with the class loading IT issue but I 
think we could re-run the precommit





Issue Time Tracking
---

Worklog Id: (was: 433348)
Time Spent: 41h 50m  (was: 41h 40m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 41h 50m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Work logged] (BEAM-9770) Add BigQuery DeadLetter pattern to Patterns Page

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9770?focusedWorklogId=433345&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433345
 ]

ASF GitHub Bot logged work on BEAM-9770:


Author: ASF GitHub Bot
Created on: 14/May/20 20:38
Start Date: 14/May/20 20:38
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #11437:
URL: https://github.com/apache/beam/pull/11437#issuecomment-628874108


   retest this please





Issue Time Tracking
---

Worklog Id: (was: 433345)
Time Spent: 2.5h  (was: 2h 20m)

> Add BigQuery DeadLetter pattern to Patterns Page
> 
>
> Key: BEAM-9770
> URL: https://issues.apache.org/jira/browse/BEAM-9770
> Project: Beam
>  Issue Type: New Feature
>  Components: website
>Reporter: Reza ardeshir rokni
>Assignee: Reza ardeshir rokni
>Priority: Trivial
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433342&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433342
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/May/20 20:34
Start Date: 14/May/20 20:34
Worklog Time Spent: 10m 
  Work Description: jaketf commented on pull request #11339:
URL: https://github.com/apache/beam/pull/11339#issuecomment-628872612


   - [x] Fixed javadoc checkstyle issue
   - [x] Added pubsub.close() to the @After in FhirIOReadIT (this should fix 
the pubsub not shutdown properly message)
   - [x] Added KV coders to FhirIO.Import
   
   see pre-commit output here.
   https://gist.github.com/jaketf/e7c9116daed60e7babffdb4c99ae1d54





Issue Time Tracking
---

Worklog Id: (was: 433342)
Time Spent: 41h 40m  (was: 41.5h)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 41h 40m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Work logged] (BEAM-9941) Add a test to prevent a regression in Dataflow when using a Flatten with different input/output coder followed by a GBK

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9941?focusedWorklogId=44&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-44
 ]

ASF GitHub Bot logged work on BEAM-9941:


Author: ASF GitHub Bot
Created on: 14/May/20 20:15
Start Date: 14/May/20 20:15
Worklog Time Spent: 10m 
  Work Description: lukecwik merged pull request #11666:
URL: https://github.com/apache/beam/pull/11666


   





Issue Time Tracking
---

Worklog Id: (was: 44)
Time Spent: 1.5h  (was: 1h 20m)

> Add a test to prevent a regression in Dataflow when using a Flatten with 
> different input/output coder followed by a GBK
> ---
>
> Key: BEAM-9941
> URL: https://issues.apache.org/jira/browse/BEAM-9941
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Luke Cwik
>Assignee: Craig Chambers
>Priority: Minor
> Fix For: 2.22.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Dataflow used to fail when having an input coder that differed from the 
> output coder when followed by a GBK because of an optimization it is 
> performing.





[jira] [Updated] (BEAM-9998) Figures and associated text in Windowing section of programming guide should be updated

2020-05-14 Thread David Wrede (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Wrede updated BEAM-9998:
--
Description: 
All figures and associated text in [section 8.1 of the programming 
guide|https://beam.apache.org/documentation/programming-guide/#windowing-basics]
 should be updated and clarified to be more intuitive (i.e. not sourcing from a 
database table, changing "Table rows" to some other intermediate PCollection, 
etc.).

Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to 
TextIO and describes windowing with a bounded collection, so the figure should 
be updated to match the text and scenario described.

 

 

 

  was:
All figures and associated text in [section 8.1 of the programming 
guide|#windowing-basics] should be updated and clarified to be more intuitive 
(i.e. not sourcing from a database table, changing "Table rows" to some other 
intermediate PCollection, etc.).

Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to 
TextIO and describes windowing with a bounded collection, so the figure should 
be updated to match the text and scenario described.

 

 

 


> Figures and associated text in Windowing section of programming guide should 
> be updated
> ---
>
> Key: BEAM-9998
> URL: https://issues.apache.org/jira/browse/BEAM-9998
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: David Wrede
>Priority: Major
>
> All figures and associated text in [section 8.1 of the programming 
> guide|https://beam.apache.org/documentation/programming-guide/#windowing-basics]
>  should be updated and clarified to be more intuitive (i.e. not sourcing from 
> a database table, changing "Table rows" to some other intermediate 
> PCollection, etc.).
> Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to 
> TextIO and describes windowing with a bounded collection, so the figure 
> should be updated to match the text and scenario described.
>  
>  
>  





[jira] [Updated] (BEAM-9998) Figures and associated text in Windowing section of programming guide should be updated

2020-05-14 Thread David Wrede (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Wrede updated BEAM-9998:
--
Description: 
All figures and associated text in [section 8.1 of the programming 
guide|#windowing-basics] should be updated and clarified to be more intuitive 
(i.e. not sourcing from a database table, changing "Table rows" to some other 
intermediate PCollection, etc.).

Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to 
TextIO and describes windowing with a bounded collection, so the figure should 
be updated to match the text and scenario described.

 

 

 

  was:
All figures and associated text in [section 8.1 of the programming 
guide|[https://beam.apache.org/documentation/programming-guide/#windowing-basics]]
 should be updated and clarified to be more intuitive (i.e. not sourcing from a 
database table, changing "Table rows" to some other intermediate PCollection, 
etc.).

Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to 
TextIO and describes windowing with a bounded collection, so the figure should 
be updated to match the text and scenario described.

 

 

 


> Figures and associated text in Windowing section of programming guide should 
> be updated
> ---
>
> Key: BEAM-9998
> URL: https://issues.apache.org/jira/browse/BEAM-9998
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: David Wrede
>Priority: Major
>
> All figures and associated text in [section 8.1 of the programming 
> guide|#windowing-basics] should be updated and clarified to be more intuitive 
> (i.e. not sourcing from a database table, changing "Table rows" to some other 
> intermediate PCollection, etc.).
> Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to 
> TextIO and describes windowing with a bounded collection, so the figure 
> should be updated to match the text and scenario described.
>  
>  
>  





[jira] [Work logged] (BEAM-8949) Add Spanner IO Integration Test for Python

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8949?focusedWorklogId=433324&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433324
 ]

ASF GitHub Bot logged work on BEAM-8949:


Author: ASF GitHub Bot
Created on: 14/May/20 19:46
Start Date: 14/May/20 19:46
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11210:
URL: https://github.com/apache/beam/pull/11210#issuecomment-628850653


   Run Python 3.7 PostCommit





Issue Time Tracking
---

Worklog Id: (was: 433324)
Time Spent: 12h 20m  (was: 12h 10m)

> Add Spanner IO Integration Test for Python
> --
>
> Key: BEAM-8949
> URL: https://issues.apache.org/jira/browse/BEAM-8949
> Project: Beam
>  Issue Type: Test
>  Components: io-py-gcp
>Reporter: Shoaib Zafar
>Assignee: Shoaib Zafar
>Priority: Major
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> Spanner IO (Python SDK) contains a PTransform that uses the Batch API to 
> read from Spanner. Currently, it only has direct runner unit tests. To make 
> this functionality available to users, integration tests also need to be 
> added.





[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433323&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433323
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/May/20 19:46
Start Date: 14/May/20 19:46
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11339:
URL: https://github.com/apache/beam/pull/11339#issuecomment-628850532


   afaik, the test classes should be bundled together (see 
https://github.com/apache/beam/blob/4c7d5460ba1d643e7fd57fa8f2f8e6a87d80bee5/sdks/java/io/google-cloud-platform/build.gradle#L118-L122)
   
   @mwalenia do you know why the testutil classes could be missing from the IT 
test?





Issue Time Tracking
---

Worklog Id: (was: 433323)
Time Spent: 41.5h  (was: 41h 20m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 41.5h
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Work logged] (BEAM-9821) SpannerIO does not include all batching parameters in DisplayData.

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9821?focusedWorklogId=433317&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433317
 ]

ASF GitHub Bot logged work on BEAM-9821:


Author: ASF GitHub Bot
Created on: 14/May/20 19:40
Start Date: 14/May/20 19:40
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on a change in pull request 
#11528:
URL: https://github.com/apache/beam/pull/11528#discussion_r425385260



##
File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##
@@ -991,6 +1001,24 @@ public WriteGrouped(Write spec) {
   this.spec = spec;
 }
 
+@Override
+public void populateDisplayData(DisplayData.Builder builder) {
+  super.populateDisplayData(builder);
+  spec.getSpannerConfig().populateDisplayData(builder);
+  builder.add(
+  DisplayData.item("batchSizeBytes", spec.getBatchSizeBytes())
+  .withLabel("Max batch size in bytes"));
+  builder.add(
+  DisplayData.item("maxNumMutations", spec.getMaxNumMutations())
+  .withLabel("Max number of mutated cells in each batch"));
+  builder.add(
+  DisplayData.item("maxNumRows", spec.getMaxNumRows())
+  .withLabel("Max number of rows in each batch"));
+  builder.add(
+  DisplayData.item("groupingFactor", spec.getGroupingFactor())
+  .withLabel("Number of batches to sort over"));
+}
+

Review comment:
   I added some commits to tweak these descriptions a bit. @nielm can you 
confirm that they're still correct?
   
   Also I wonder if you can re-use the implementation in `Write` by calling 
`spec.populateDisplayData(builder)` here instead?
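
A minimal sketch of the delegation suggested in this comment, using simplified stand-in classes rather than the real SpannerIO and DisplayData types. The `Builder` class and the field set below are assumptions made for illustration; only the item keys are taken from the diff above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins only; these are not the Beam classes.
final class DisplayDataDelegationSketch {

  // Stand-in for a display-data builder: an ordered key -> value map.
  static final class Builder {
    final Map<String, Object> items = new LinkedHashMap<>();

    Builder add(String key, Object value) {
      items.put(key, value);
      return this;
    }
  }

  // Stand-in for the write spec that owns the batching parameters.
  static final class Write {
    final long batchSizeBytes;
    final long maxNumMutations;
    final long maxNumRows;
    final int groupingFactor;

    Write(long batchSizeBytes, long maxNumMutations, long maxNumRows, int groupingFactor) {
      this.batchSizeBytes = batchSizeBytes;
      this.maxNumMutations = maxNumMutations;
      this.maxNumRows = maxNumRows;
      this.groupingFactor = groupingFactor;
    }

    // Single place where the batching parameters are reported.
    void populateDisplayData(Builder builder) {
      builder
          .add("batchSizeBytes", batchSizeBytes)
          .add("maxNumMutations", maxNumMutations)
          .add("maxNumRows", maxNumRows)
          .add("groupingFactor", groupingFactor);
    }
  }

  // Stand-in for the grouped variant: it reuses the spec's implementation
  // instead of repeating each item.
  static final class WriteGrouped {
    private final Write spec;

    WriteGrouped(Write spec) {
      this.spec = spec;
    }

    void populateDisplayData(Builder builder) {
      spec.populateDisplayData(builder);
    }
  }

  public static void main(String[] args) {
    Builder builder = new Builder();
    new WriteGrouped(new Write(1024L, 5000L, 500L, 1000)).populateDisplayData(builder);
    System.out.println(builder.items.keySet());
  }
}
```

With this shape, a parameter added to the spec later is reported by both transforms without a second edit.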







Issue Time Tracking
---

Worklog Id: (was: 433317)
Time Spent: 40m  (was: 0.5h)

> SpannerIO does not include all batching parameters in DisplayData.
> --
>
> Key: BEAM-9821
> URL: https://issues.apache.org/jira/browse/BEAM-9821
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.20.0, 2.21.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Minor
>  Labels: google-cloud-spanner
> Fix For: 2.22.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> SpannerIO Write and WriteGrouped do not populate all of the batching/grouping 
> parameters in their DisplayData – they only show "batchSizeBytes"





[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433315&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433315
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/May/20 19:38
Start Date: 14/May/20 19:38
Worklog Time Spent: 10m 
  Work Description: jaketf edited a comment on pull request #11339:
URL: https://github.com/apache/beam/pull/11339#issuecomment-628842736


   Yeah, all the FhirIO read tests are parameterized tests that are all failing 
like this:
   ```
   WARNING: No terminal state was returned within allotted timeout. State value 
RUNNING
   May 14, 2020 2:37:23 AM org.apache.beam.runners.dataflow.TestDataflowRunner 
waitForStreamingJobTermination
   INFO: Dataflow job 2020-05-13_19_27_22-9842849986096969383 took longer than 
600 seconds to complete, cancelling.
   May 14, 2020 2:37:23 AM org.apache.beam.runners.dataflow.TestDataflowRunner 
run
   WARNING: Dataflow job 2020-05-13_19_27_22-9842849986096969383 did not output 
a success or failure metric.
   May 14, 2020 2:37:24 AM 
io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference cleanQueue
   SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=15, 
target=pubsub.googleapis.com:443} was not shutdown properly!!! ~*~*~*
   Make sure to call shutdown()/shutdownNow() and wait until 
awaitTermination() returns true.
   java.lang.RuntimeException: ManagedChannel allocation site
   ```





Issue Time Tracking
---

Worklog Id: (was: 433315)
Time Spent: 41h 20m  (was: 41h 10m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 41h 20m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433314&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433314
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/May/20 19:37
Start Date: 14/May/20 19:37
Worklog Time Spent: 10m 
  Work Description: jaketf commented on pull request #11339:
URL: https://github.com/apache/beam/pull/11339#issuecomment-628846183


   There is a separate crop of issues of 
   
   ```java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.beam.sdk.io.gcp.healthcare.FhirIOTestUtil```
   
   In the execute bundles test. Are test utility classes not bundled up and sent 
to Dataflow? Should I move this under main in the source tree?





Issue Time Tracking
---

Worklog Id: (was: 433314)
Time Spent: 41h 10m  (was: 41h)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 41h 10m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433313&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433313
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/May/20 19:35
Start Date: 14/May/20 19:35
Worklog Time Spent: 10m 
  Work Description: jaketf commented on pull request #11339:
URL: https://github.com/apache/beam/pull/11339#issuecomment-628845239


   
[FhirIOReadIT](https://github.com/apache/beam/pull/11339/files#diff-7a1359c60a094e73275769adb69b35d3R57)
 borrows the 
[TestPubsubSignal](https://github.com/apache/beam/blob/44f3d3f2d52e93a34b05068bf76f6f9d2611bf77/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/TestPubsubSignal.java)
 pattern from 
[PubsubReadIT](https://github.com/apache/beam/blob/a9e14ff7f4b1aafa9915cb95e1bb7e7d3ab6a28b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubReadIT.java#L41)
   This works fine on the direct runner for me. 
   It seems to be a Dataflow runner / test Pub/Sub signal issue.
   I will look into whether I'm failing to close some channel in my tests.

   @pabloem Have we seen anything flaky on this before?





Issue Time Tracking
---

Worklog Id: (was: 433313)
Time Spent: 41h  (was: 40h 50m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 41h
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Work logged] (BEAM-9993) Add option defaults for Flink Python tests

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9993?focusedWorklogId=433310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433310
 ]

ASF GitHub Bot logged work on BEAM-9993:


Author: ASF GitHub Bot
Created on: 14/May/20 19:31
Start Date: 14/May/20 19:31
Worklog Time Spent: 10m 
  Work Description: ibzib opened a new pull request #11711:
URL: https://github.com/apache/beam/pull/11711


   With this change, it is now possible to run Flink Python unit tests without 
any setup or options.
   
   R: @robertwb 
   
   
   

[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433309&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433309
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/May/20 19:29
Start Date: 14/May/20 19:29
Worklog Time Spent: 10m 
  Work Description: jaketf commented on pull request #11339:
URL: https://github.com/apache/beam/pull/11339#issuecomment-628842736


   Yeah, they are parameterized tests that are all failing like this:
   ```
   WARNING: No terminal state was returned within allotted timeout. State value RUNNING
   May 14, 2020 2:37:23 AM org.apache.beam.runners.dataflow.TestDataflowRunner waitForStreamingJobTermination
   INFO: Dataflow job 2020-05-13_19_27_22-9842849986096969383 took longer than 600 seconds to complete, cancelling.
   May 14, 2020 2:37:23 AM org.apache.beam.runners.dataflow.TestDataflowRunner run
   WARNING: Dataflow job 2020-05-13_19_27_22-9842849986096969383 did not output a success or failure metric.
   May 14, 2020 2:37:24 AM io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference cleanQueue
   SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=15, target=pubsub.googleapis.com:443} was not shutdown properly!!! ~*~*~*
   Make sure to call shutdown()/shutdownNow() and wait until awaitTermination() returns true.
   java.lang.RuntimeException: ManagedChannel allocation site
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 433309)
Time Spent: 40h 50m  (was: 40h 40m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 40h 50m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

