[jira] [Work logged] (BEAM-9770) Add BigQuery DeadLetter pattern to Patterns Page
[ https://issues.apache.org/jira/browse/BEAM-9770?focusedWorklogId=433565&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433565 ] ASF GitHub Bot logged work on BEAM-9770: Author: ASF GitHub Bot Created on: 15/May/20 05:40 Start Date: 15/May/20 05:40 Worklog Time Spent: 10m Work Description: rezarokni commented on pull request #11437: URL: https://github.com/apache/beam/pull/11437#issuecomment-629039064 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433565) Time Spent: 2h 50m (was: 2h 40m) > Add BigQuery DeadLetter pattern to Patterns Page > > > Key: BEAM-9770 > URL: https://issues.apache.org/jira/browse/BEAM-9770 > Project: Beam > Issue Type: New Feature > Components: website >Reporter: Reza ardeshir rokni >Assignee: Reza ardeshir rokni >Priority: Trivial > Time Spent: 2h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images
[ https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433560&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433560 ] ASF GitHub Bot logged work on BEAM-9136: Author: ASF GitHub Bot Created on: 15/May/20 05:14 Start Date: 15/May/20 05:14 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on pull request #11717: URL: https://github.com/apache/beam/pull/11717#issuecomment-629031663 Please merge it if it looks good. The PR was reviewed at https://github.com/apache/beam/pull/11549. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433560) Time Spent: 30.5h (was: 30h 20m) > Add LICENSES and NOTICES to docker images > - > > Key: BEAM-9136 > URL: https://issues.apache.org/jira/browse/BEAM-9136 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.21.0 > > Time Spent: 30.5h > Remaining Estimate: 0h > > Scan dependencies and add licenses and notices of the dependencies to SDK > docker images. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images
[ https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433558&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433558 ] ASF GitHub Bot logged work on BEAM-9136: Author: ASF GitHub Bot Created on: 15/May/20 05:12 Start Date: 15/May/20 05:12 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on pull request #11584: URL: https://github.com/apache/beam/pull/11584#issuecomment-629030902 It is rebased. Please take a look when you have time. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433558) Time Spent: 30h 20m (was: 30h 10m) > Add LICENSES and NOTICES to docker images > - > > Key: BEAM-9136 > URL: https://issues.apache.org/jira/browse/BEAM-9136 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.21.0 > > Time Spent: 30h 20m > Remaining Estimate: 0h > > Scan dependencies and add licenses and notices of the dependencies to SDK > docker images. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images
[ https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433541 ] ASF GitHub Bot logged work on BEAM-9136: Author: ASF GitHub Bot Created on: 15/May/20 04:50 Start Date: 15/May/20 04:50 Worklog Time Spent: 10m Work Description: Hannah-Jiang closed pull request #11549: URL: https://github.com/apache/beam/pull/11549 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433541) Time Spent: 30h 10m (was: 30h) > Add LICENSES and NOTICES to docker images > - > > Key: BEAM-9136 > URL: https://issues.apache.org/jira/browse/BEAM-9136 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.21.0 > > Time Spent: 30h 10m > Remaining Estimate: 0h > > Scan dependencies and add licenses and notices of the dependencies to SDK > docker images. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images
[ https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433540 ] ASF GitHub Bot logged work on BEAM-9136: Author: ASF GitHub Bot Created on: 15/May/20 04:50 Start Date: 15/May/20 04:50 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on pull request #11549: URL: https://github.com/apache/beam/pull/11549#issuecomment-629024197 > @Hannah-Jiang - you can continue with this change now. However you will need to rebase. Thanks for letting me know. I created a new PR https://github.com/apache/beam/pull/11717 and will close this one. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433540) Time Spent: 30h (was: 29h 50m) > Add LICENSES and NOTICES to docker images > - > > Key: BEAM-9136 > URL: https://issues.apache.org/jira/browse/BEAM-9136 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.21.0 > > Time Spent: 30h > Remaining Estimate: 0h > > Scan dependencies and add licenses and notices of the dependencies to SDK > docker images. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images
[ https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433538 ] ASF GitHub Bot logged work on BEAM-9136: Author: ASF GitHub Bot Created on: 15/May/20 04:48 Start Date: 15/May/20 04:48 Worklog Time Spent: 10m Work Description: Hannah-Jiang opened a new pull request #11717: URL: https://github.com/apache/beam/pull/11717 @Kyle Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). 
[jira] [Work logged] (BEAM-9977) Build Kafka Read on top of Java SplittableDoFn
[ https://issues.apache.org/jira/browse/BEAM-9977?focusedWorklogId=433510&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433510 ] ASF GitHub Bot logged work on BEAM-9977: Author: ASF GitHub Bot Created on: 15/May/20 04:00 Start Date: 15/May/20 04:00 Worklog Time Spent: 10m Work Description: boyuanzz commented on a change in pull request #11715: URL: https://github.com/apache/beam/pull/11715#discussion_r425553418 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/GrowableOffsetRangeTracker.java ## @@ -0,0 +1,103 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.transforms.splittabledofn; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull; + +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.annotations.Experimental.Kind; +import org.apache.beam.sdk.io.range.OffsetRange; + +/** + * A special {@link OffsetRangeTracker} for tracking a growable offset range. The Long.MAX_VALUE is + * used as end range to indicate the possibility of infinity. 
+ * + * An offset range is considered growable when the end offset could grow (or change) during + * execution time (e.g., Kafka backlog, appended file). + */ +@Experimental(Kind.SPLITTABLE_DO_FN) +public class GrowableOffsetRangeTracker extends OffsetRangeTracker { + /** + * An interface that should be implemented to fetch the estimated end offset of the range. + * + * {@code estimateRangeEnd} is called to give the end offset when {@code trySplit} or {@code + * getProgress} is invoked. The end offset is exclusive for the range. It's not necessary to + * increase monotonically but it's only taken into computation when it's larger than the current + * position. When returning Long.MAX_VALUE as estimate, it means the largest possible position for + * the range is Long.MAX_VALUE - 1. Having a good estimate is important for providing a good Review comment: Currently I take `Long.MAX_VALUE` as a numeric end offset. But we may also need a way to say that the `OffsetPoller` doesn't have a good estimate and still wants to keep the current range infinite. `Long.MAX_VALUE` is not suitable because the actual end could itself be `Long.MAX_VALUE`. Do we want to provide such a notion here? Like `null`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433510) Time Spent: 20m (was: 10m) > Build Kafka Read on top of Java SplittableDoFn > -- > > Key: BEAM-9977 > URL: https://issues.apache.org/jira/browse/BEAM-9977 > Project: Beam > Issue Type: New Feature > Components: io-java-kafka >Reporter: Boyuan Zhang >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
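Editor's note on the review comment in BEAM-9977 above: the question is how an estimator could say "no good estimate" without overloading `Long.MAX_VALUE`, which may be a real end offset. The following is only a sketch of one possible shape, using `java.util.OptionalLong` instead of the `null` suggested in the comment. `RangeEndEstimateSketch`, `EndEstimator` and `usableEnd` are hypothetical names and are not part of the Beam API or of the PR; treating an estimate at or below the current position as "unknown" is also an assumption, since the Javadoc only says such an estimate is not taken into computation.

```java
import java.util.OptionalLong;

final class RangeEndEstimateSketch {
  /** Hypothetical stand-in for the estimator under review; NOT the Beam interface. */
  interface EndEstimator {
    /** Empty means "no good estimate"; a present value is an exclusive end offset. */
    OptionalLong estimate();
  }

  /**
   * Returns the end offset that may be used for progress or splitting, or empty
   * when the range should still be treated as unbounded.
   * Assumption: an estimate not larger than the current position is ignored.
   */
  static OptionalLong usableEnd(long currentPosition, EndEstimator estimator) {
    OptionalLong estimate = estimator.estimate();
    if (!estimate.isPresent() || estimate.getAsLong() <= currentPosition) {
      return OptionalLong.empty();
    }
    return estimate;
  }

  public static void main(String[] args) {
    // A real end of Long.MAX_VALUE stays distinguishable from "unknown".
    System.out.println(usableEnd(10L, () -> OptionalLong.of(Long.MAX_VALUE)).isPresent());
    System.out.println(usableEnd(10L, OptionalLong::empty).isPresent());
  }
}
```

With this shape, a caller can tell "the end really is `Long.MAX_VALUE`" apart from "the estimator does not know", which is the ambiguity the comment raises.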
[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
[ https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433506&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433506 ] ASF GitHub Bot logged work on BEAM-9825: Author: ASF GitHub Bot Created on: 15/May/20 03:50 Start Date: 15/May/20 03:50 Worklog Time Spent: 10m Work Description: darshanj commented on a change in pull request #11610: URL: https://github.com/apache/beam/pull/11610#discussion_r425550940 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java ## @@ -0,0 +1,528 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.transforms; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull; + +import java.util.ArrayList; +import java.util.List; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.transforms.join.CoGbkResult; +import org.apache.beam.sdk.transforms.join.CoGroupByKey; +import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollectionList; +import org.apache.beam.sdk.values.TupleTag; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables; + +public class SetFns { Review comment: Thanks. I was thinking of adding: "If multiple triggers are configured and fire, the output of this transform is calculated only on the data in the respective trigger firing." This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433506) Remaining Estimate: 88h 10m (was: 88h 20m) Time Spent: 7h 50m (was: 7h 40m) > Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll > -- > > Key: BEAM-9825 > URL: https://issues.apache.org/jira/browse/BEAM-9825 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Darshan Jani >Assignee: Darshan Jani >Priority: Major > Original Estimate: 96h > Time Spent: 7h 50m > Remaining Estimate: 88h 10m > > I'd like to propose the following new high-level transforms. > * Intersect > Compute the intersection between elements of two PCollections. 
> Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are common to both _leftCollection_ and > _rightCollection_ > > * Except > Compute the difference between elements of two PCollections. > Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are in _leftCollection_ but not in > _rightCollection_ > * Union > Find the elements that are in either of two PCollections. > Implement IntersectAll, ExceptAll and UnionAll variants of transforms. -- This message was sent by Atlassian Jira (v8.3.4#803005)
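Editor's note on the BEAM-9825 description above: the three distinct transforms it proposes correspond to ordinary set operations. The sketch below models them on in-memory lists with plain `java.util` collections; it is not the `SetFns` code from the PR and does not use the Beam SDK. `SetOpsSketch` and its method names are hypothetical. The `All` variants are left out because the description does not define how they treat duplicates.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SetOpsSketch {
  /** Elements common to both inputs, without duplicates. */
  static <T> Set<T> intersect(List<T> left, List<T> right) {
    Set<T> result = new LinkedHashSet<>(left);
    result.retainAll(new LinkedHashSet<>(right));
    return result;
  }

  /** Elements that are in left but not in right, without duplicates. */
  static <T> Set<T> except(List<T> left, List<T> right) {
    Set<T> result = new LinkedHashSet<>(left);
    result.removeAll(new LinkedHashSet<>(right));
    return result;
  }

  /** Elements that are in either input, without duplicates. */
  static <T> Set<T> union(List<T> left, List<T> right) {
    Set<T> result = new LinkedHashSet<>(left);
    result.addAll(right);
    return result;
  }

  public static void main(String[] args) {
    List<String> left = Arrays.asList("a", "b", "b", "c");
    List<String> right = Arrays.asList("b", "c", "d");
    System.out.println(intersect(left, right)); // prints [b, c]
    System.out.println(except(left, right));    // prints [a]
    System.out.println(union(left, right));     // prints [a, b, c, d]
  }
}
```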
[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
[ https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433503&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433503 ] ASF GitHub Bot logged work on BEAM-9825: Author: ASF GitHub Bot Created on: 15/May/20 03:46 Start Date: 15/May/20 03:46 Worklog Time Spent: 10m Work Description: lukecwik commented on a change in pull request #11610: URL: https://github.com/apache/beam/pull/11610#discussion_r425547978 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java ## @@ -0,0 +1,528 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.transforms; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull; + +import java.util.ArrayList; +import java.util.List; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.transforms.join.CoGbkResult; +import org.apache.beam.sdk.transforms.join.CoGroupByKey; +import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollectionList; +import org.apache.beam.sdk.values.TupleTag; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables; + +public class SetFns { Review comment: Your understanding is correct that users will get results based upon whatever data is fired because of the trigger but from a cursory reading of the docs, we mention doing intersect/distinct/... over PCollections and not trigger firings which could confuse a user into thinking that intersect/distinct will be over all elements in these PCollections. This is why I believe it's important to insert a statement something like: ``` Triggers with multiple firings may lead to nondeterministic results since the intersect/distinct/... is only computed over each individual firing. ``` This would go well with your current statement about having compatible triggers in all your methods. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433503) Remaining Estimate: 88h 20m (was: 88.5h) Time Spent: 7h 40m (was: 7.5h) > Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll > -- > > Key: BEAM-9825 > URL: https://issues.apache.org/jira/browse/BEAM-9825 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Darshan Jani >Assignee: Darshan Jani >Priority: Major > Original Estimate: 96h > Time Spent: 7h 40m > Remaining Estimate: 88h 20m > > I'd like to propose the following new high-level transforms. > * Intersect > Compute the intersection between elements of two PCollections. > Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are common to both _leftCollection_ and > _rightCollection_ > > * Except > Compute the difference between elements of two PCollections. > Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are in _leftCollection_ but not in > _rightCollection_ > * Union > Find the elements that are in either of two PCollections. > Implement IntersectAll, ExceptAll and UnionAll variants of transforms. -- This message was sent by Atlassian Jira (v8.3.4#803005)
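Editor's note on lukecwik's review comment above: the concern is that an intersect computed per trigger firing can differ from an intersect over all elements of the two PCollections. The plain-Java model below illustrates only that arithmetic. It is not Beam code: it assumes each firing sees only its own elements and that firing i of the left input is paired with firing i of the right input, whereas real pane contents depend on the trigger, the accumulation mode and the runner. `PerFiringIntersectSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class PerFiringIntersectSketch {
  /** Distinct elements present in both lists. */
  static Set<String> intersect(List<String> left, List<String> right) {
    Set<String> result = new LinkedHashSet<>(left);
    result.retainAll(new LinkedHashSet<>(right));
    return result;
  }

  /** Intersects firing i of left with firing i of right and merges the outputs. */
  static Set<String> intersectPerFiring(
      List<List<String>> leftFirings, List<List<String>> rightFirings) {
    Set<String> merged = new LinkedHashSet<>();
    for (int i = 0; i < leftFirings.size(); i++) {
      merged.addAll(intersect(leftFirings.get(i), rightFirings.get(i)));
    }
    return merged;
  }

  /** Concatenates all firings into one list, as if nothing had fired early. */
  static List<String> flatten(List<List<String>> firings) {
    List<String> all = new ArrayList<>();
    for (List<String> firing : firings) {
      all.addAll(firing);
    }
    return all;
  }

  public static void main(String[] args) {
    List<List<String>> left = Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c"));
    List<List<String>> right = Arrays.asList(Arrays.asList("b"), Arrays.asList("a", "c"));
    // "a" arrives in different firings on each side, so the per-firing result misses it.
    System.out.println(intersectPerFiring(left, right));            // prints [b, c]
    System.out.println(intersect(flatten(left), flatten(right)));   // prints [a, b, c]
  }
}
```

In this model the per-firing result changes if the same elements are split across firings differently, which matches the wording "may lead to nondeterministic results" proposed in the comment.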
[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
[ https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433502&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433502 ] ASF GitHub Bot logged work on BEAM-9825: Author: ASF GitHub Bot Created on: 15/May/20 03:43 Start Date: 15/May/20 03:43 Worklog Time Spent: 10m Work Description: lukecwik commented on a change in pull request #11610: URL: https://github.com/apache/beam/pull/11610#discussion_r425547978 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java ## @@ -0,0 +1,528 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.transforms; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull; + +import java.util.ArrayList; +import java.util.List; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.transforms.join.CoGbkResult; +import org.apache.beam.sdk.transforms.join.CoGroupByKey; +import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollectionList; +import org.apache.beam.sdk.values.TupleTag; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables; + +public class SetFns { Review comment: Your understanding is correct that users will get results based upon whatever data is fired because of the trigger but from a cursory reading of the docs, we mention doing intersect/distinct/... over PCollections and not trigger firings which could confuse a user into thinking that intersect/distinct will be over all elements in these PCollections. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433502) Remaining Estimate: 88.5h (was: 88h 40m) Time Spent: 7.5h (was: 7h 20m) > Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll > -- > > Key: BEAM-9825 > URL: https://issues.apache.org/jira/browse/BEAM-9825 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Darshan Jani >Assignee: Darshan Jani >Priority: Major > Original Estimate: 96h > Time Spent: 7.5h > Remaining Estimate: 88.5h > > I'd like to propose the following new high-level transforms. > * Intersect > Compute the intersection between elements of two PCollections. 
> Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are common to both _leftCollection_ and > _rightCollection_ > > * Except > Compute the difference between elements of two PCollections. > Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are in _leftCollection_ but not in > _rightCollection_ > * Union > Find the elements that are in either of two PCollections. > Implement IntersectAll, ExceptAll and UnionAll variants of transforms. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
[ https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433501&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433501 ] ASF GitHub Bot logged work on BEAM-9825: Author: ASF GitHub Bot Created on: 15/May/20 03:36 Start Date: 15/May/20 03:36 Worklog Time Spent: 10m Work Description: lukecwik commented on a change in pull request #11610: URL: https://github.com/apache/beam/pull/11610#discussion_r425547978 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java ## @@ -0,0 +1,528 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.transforms; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull; + +import java.util.ArrayList; +import java.util.List; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.transforms.join.CoGbkResult; +import org.apache.beam.sdk.transforms.join.CoGroupByKey; +import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollectionList; +import org.apache.beam.sdk.values.TupleTag; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables; + +public class SetFns { Review comment: Your understanding is correct that users will get results based upon whatever data is fired because of a trigger but from a cursory reading of the docs, we mention doing intersect/distinct/... over PCollections and not trigger firings which could confuse a user into thinking that intersect/distinct will be over all elements in these PCollections. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433501) Remaining Estimate: 88h 40m (was: 88h 50m) Time Spent: 7h 20m (was: 7h 10m) > Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll > -- > > Key: BEAM-9825 > URL: https://issues.apache.org/jira/browse/BEAM-9825 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Darshan Jani >Assignee: Darshan Jani >Priority: Major > Original Estimate: 96h > Time Spent: 7h 20m > Remaining Estimate: 88h 40m > > I'd like to propose the following new high-level transforms. > * Intersect > Compute the intersection between elements of two PCollections. 
> Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are common to both _leftCollection_ and > _rightCollection_ > > * Except > Compute the difference between elements of two PCollections. > Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are in _leftCollection_ but not in > _rightCollection_ > * Union > Find the elements that are in either of two PCollections. > Implement IntersectAll, ExceptAll and UnionAll variants of transforms. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9770) Add BigQuery DeadLetter pattern to Patterns Page
[ https://issues.apache.org/jira/browse/BEAM-9770?focusedWorklogId=433498&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433498 ] ASF GitHub Bot logged work on BEAM-9770: Author: ASF GitHub Bot Created on: 15/May/20 03:19 Start Date: 15/May/20 03:19 Worklog Time Spent: 10m Work Description: rezarokni commented on pull request #11437: URL: https://github.com/apache/beam/pull/11437#issuecomment-629003109 Local tests ran OK, but I also raised https://issues.apache.org/jira/browse/BEAM-10003, so this PR now contains just the code bits, rather than code + website bits. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433498) Time Spent: 2h 40m (was: 2.5h) > Add BigQuery DeadLetter pattern to Patterns Page > > > Key: BEAM-9770 > URL: https://issues.apache.org/jira/browse/BEAM-9770 > Project: Beam > Issue Type: New Feature > Components: website >Reporter: Reza ardeshir rokni >Assignee: Reza ardeshir rokni >Priority: Trivial > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-10003) Need two PR to submit snippets to website
Reza ardeshir rokni created BEAM-10003: -- Summary: Need two PR to submit snippets to website Key: BEAM-10003 URL: https://issues.apache.org/jira/browse/BEAM-10003 Project: Beam Issue Type: New Feature Components: website Reporter: Reza ardeshir rokni Looks like build_github_samples.sh uses code already in the repo to build the local serving: do fileName=$(echo "$url" | sed -e 's/\//_/g') curl -o "$DIST_DIR"/"$fileName" "https://raw.githubusercontent.com$url"; done So when trying to test locally, the code needs to already be in Beam. Ideally the script should make use of local code when building, so that: 1- It is easier to build & test changes. 2- There is no need to raise two PRs for what is a single change. -- This message was sent by Atlassian Jira (v8.3.4#803005)
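A minimal sketch of the local-first lookup that BEAM-10003 asks for, written in Python rather than as a patch to build_github_samples.sh. The file-name rule mirrors the sed expression quoted in the report. Everything else is an assumption made for illustration: the helper names `dist_file_name`, `local_source_path` and `resolve_snippet` are hypothetical, and so is the idea that each URL path starts with owner, repository and branch components.

```python
from pathlib import Path


def dist_file_name(url_path: str) -> str:
    # Same effect as the sed expression in the report: every "/" becomes "_".
    return url_path.replace("/", "_")


def local_source_path(url_path: str, repo_root: Path) -> Path:
    # Assumption: the path looks like "/<owner>/<repo>/<branch>/<file path>",
    # so the first three components are dropped to get a checkout-relative path.
    parts = url_path.strip("/").split("/")
    return repo_root.joinpath(*parts[3:])


def resolve_snippet(url_path: str, repo_root: Path) -> str:
    # Prefer the file in the local checkout; fall back to the remote URL
    # only when the file does not exist locally.
    candidate = local_source_path(url_path, repo_root)
    if candidate.is_file():
        return str(candidate)
    return "https://raw.githubusercontent.com" + url_path
```

With a lookup like this, a snippet that exists only in the working tree could be served locally without first being merged, which is the single-PR workflow the report describes.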
[jira] [Updated] (BEAM-10002) Cursor not found if work items take a long time
[ https://issues.apache.org/jira/browse/BEAM-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corvin Deboeser updated BEAM-10002: --- Description: If some work items take a lot of processing time and the cursor of a bundle is not queried for too long, then mongodb will timeout the cursor which results in {code:java} pymongo.errors.CursorNotFound: cursor id ... not found {code} was: If some work items take a lot of processing time and the cursor of a bundle is not queried for too long, then mongodb will timeout the cursor which results in ``` pymongo.errors.CursorNotFound: cursor id ... not found ``` > Cursor not found if work items take a long time > --- > > Key: BEAM-10002 > URL: https://issues.apache.org/jira/browse/BEAM-10002 > Project: Beam > Issue Type: Bug > Components: io-py-mongodb >Affects Versions: 2.20.0 >Reporter: Corvin Deboeser >Assignee: Yichi Zhang >Priority: Major > > If some work items take a lot of processing time and the cursor of a bundle > is not queried for too long, then mongodb will timeout the cursor which > results in > {code:java} > pymongo.errors.CursorNotFound: cursor id ... not found > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-10002) Cursor not found if work items take a long time
Corvin Deboeser created BEAM-10002: -- Summary: Cursor not found if work items take a long time Key: BEAM-10002 URL: https://issues.apache.org/jira/browse/BEAM-10002 Project: Beam Issue Type: Bug Components: io-py-mongodb Affects Versions: 2.20.0 Reporter: Corvin Deboeser Assignee: Yichi Zhang If some work items take a lot of processing time and the cursor of a bundle is not queried for too long, then mongodb will timeout the cursor which results in ``` pymongo.errors.CursorNotFound: cursor id ... not found ``` -- This message was sent by Atlassian Jira (v8.3.4#803005)
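One possible mitigation for the timeout described in BEAM-10002, sketched under assumptions and not taken from Beam's mongodbio.py: read in small chunks and open a fresh cursor per chunk, resuming after the last `_id` seen, so that no server-side cursor sits idle while slow work items are processed. The function name `iter_in_chunks` is hypothetical, `collection` is assumed to behave like a pymongo `Collection` (only `find`, `sort` and `limit` are used), and `base_filter` is assumed not to constrain `_id` itself.

```python
def iter_in_chunks(collection, base_filter, chunk_size=1000):
    """Yield documents in `_id` order, opening a fresh cursor for each chunk."""
    last_id = None
    while True:
        query = dict(base_filter)
        if last_id is not None:
            # Resume strictly after the last document already yielded.
            query["_id"] = {"$gt": last_id}
        # list() drains the short-lived cursor before any slow processing starts.
        chunk = list(collection.find(query).sort("_id", 1).limit(chunk_size))
        if not chunk:
            return
        for doc in chunk:
            yield doc
        last_id = chunk[-1]["_id"]
```

The trade-off is one extra query per chunk in exchange for cursors that live only as long as a single fetch. pymongo's `find` also accepts `no_cursor_timeout=True`, which keeps one cursor alive instead but requires the caller to close it explicitly.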
[jira] [Updated] (BEAM-9960) Python MongoDBIO fails when response of split vector command is larger than 16mb
[ https://issues.apache.org/jira/browse/BEAM-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corvin Deboeser updated BEAM-9960: -- Component/s: (was: sdk-py-core) > Python MongoDBIO fails when response of split vector command is larger than > 16mb > > > Key: BEAM-9960 > URL: https://issues.apache.org/jira/browse/BEAM-9960 > Project: Beam > Issue Type: Bug > Components: io-py-mongodb >Affects Versions: 2.20.0 >Reporter: Corvin Deboeser >Priority: Major > > When using MongoDBIO on a large collection with large documents on average, > then the split vector command results in a lot of splits if the desired > bundle size is small. In extreme cases, the response from the split vector > command can be larger than 16mb which is not supported by pymongo / MongoDB: > {{pymongo.errors.ProtocolError: Message length (33699186) is larger than > server max message size (33554432)}} > > Environment: Was running this on Google Dataflow / Beam Python SDK 2.20. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9960) Python MongoDBIO fails when response of split vector command is larger than 16mb
[ https://issues.apache.org/jira/browse/BEAM-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corvin Deboeser updated BEAM-9960: -- Description: When using MongoDBIO on a large collection with large documents on average, then the split vector command results in a lot of splits if the desired bundle size is small. In extreme cases, the response from the split vector command can be larger than 16mb which is not supported by pymongo / MongoDB: {{pymongo.errors.ProtocolError: Message length (33699186) is larger than server max message size (33554432)}} Environment: Was running this on Google Dataflow / Beam Python SDK 2.20. was: When using MongoDBIO on a large collection and the source bundle size was determined to be 1, then the response from the split vector command can be larger than 16mb which is not supported by pymongo / MongoDB: {{pymongo.errors.ProtocolError: Message length (33699186) is larger than server max message size (33554432)}} Environment: Was running this on Google Dataflow / Beam Python SDK 2.20. > Python MongoDBIO fails when response of split vector command is larger than > 16mb > > > Key: BEAM-9960 > URL: https://issues.apache.org/jira/browse/BEAM-9960 > Project: Beam > Issue Type: Bug > Components: io-py-mongodb, sdk-py-core >Affects Versions: 2.20.0 >Reporter: Corvin Deboeser >Priority: Major > > When using MongoDBIO on a large collection with large documents on average, > then the split vector command results in a lot of splits if the desired > bundle size is small. In extreme cases, the response from the split vector > command can be larger than 16mb which is not supported by pymongo / MongoDB: > {{pymongo.errors.ProtocolError: Message length (33699186) is larger than > server max message size (33554432)}} > > Environment: Was running this on Google Dataflow / Beam Python SDK 2.20. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9960) Python MongoDBIO fails when response of split vector command is larger than 16mb
[ https://issues.apache.org/jira/browse/BEAM-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corvin Deboeser updated BEAM-9960: -- Component/s: io-py-mongodb > Python MongoDBIO fails when response of split vector command is larger than > 16mb > > > Key: BEAM-9960 > URL: https://issues.apache.org/jira/browse/BEAM-9960 > Project: Beam > Issue Type: Bug > Components: io-py-mongodb, sdk-py-core >Affects Versions: 2.20.0 >Reporter: Corvin Deboeser >Priority: Major > > When using MongoDBIO on a large collection and the source bundle size was > determined to be 1, then the response from the split vector command can be > larger than 16mb which is not supported by pymongo / MongoDB: > {{pymongo.errors.ProtocolError: Message length (33699186) is larger than > server max message size (33554432)}} > > Environment: Was running this on Google Dataflow / Beam Python SDK 2.20. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9961) Python MongoDBIO does not apply projection
[ https://issues.apache.org/jira/browse/BEAM-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corvin Deboeser updated BEAM-9961: -- Component/s: io-py-mongodb > Python MongoDBIO does not apply projection > -- > > Key: BEAM-9961 > URL: https://issues.apache.org/jira/browse/BEAM-9961 > Project: Beam > Issue Type: Bug > Components: io-py-mongodb, sdk-py-core >Affects Versions: 2.20.0 >Reporter: Corvin Deboeser >Priority: Minor > > ReadFromMongoDB does not apply the provided projection when reading from the > client - only filter is being applied as you can see here: > https://github.com/apache/beam/blob/9f0cb649d39ee6236ea27f111acb4b66591a80ec/sdks/python/apache_beam/io/mongodbio.py#L204 -- This message was sent by Atlassian Jira (v8.3.4#803005)
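The behaviour requested in BEAM-9961 can be illustrated with a small sketch. It is not Beam's code: `read_documents` is a hypothetical helper, and `collection` is assumed to behave like a pymongo `Collection`, whose `find` method accepts both a `filter` and a `projection` argument.

```python
def read_documents(collection, query_filter=None, projection=None):
    """Return matching documents, forwarding both filter and projection."""
    # Forwarding only the filter would return whole documents and silently
    # ignore the caller's projection, which is the bug the report describes.
    return list(collection.find(filter=query_filter or {}, projection=projection))
```

Applying the projection in the query rather than after the read means unneeded fields never leave the server, which matters most for the large documents mentioned in the related MongoDBIO reports.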
[jira] [Work logged] (BEAM-2939) Fn API SDF support
[ https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=433486&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433486 ] ASF GitHub Bot logged work on BEAM-2939: Author: ASF GitHub Bot Created on: 15/May/20 02:31 Start Date: 15/May/20 02:31 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11716: URL: https://github.com/apache/beam/pull/11716#issuecomment-628990029 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433486) Time Spent: 32h 10m (was: 32h) > Fn API SDF support > -- > > Key: BEAM-2939 > URL: https://issues.apache.org/jira/browse/BEAM-2939 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Henning Rohde >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 32h 10m > Remaining Estimate: 0h > > The Fn API should support streaming SDF. Detailed design TBD. > Once design is ready, expand subtasks similarly to BEAM-2822. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
[ https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433483&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433483 ] ASF GitHub Bot logged work on BEAM-9825: Author: ASF GitHub Bot Created on: 15/May/20 02:19 Start Date: 15/May/20 02:19 Worklog Time Spent: 10m Work Description: darshanj commented on a change in pull request #11610: URL: https://github.com/apache/beam/pull/11610#discussion_r425529534 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java ## @@ -0,0 +1,528 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.transforms; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull; + +import java.util.ArrayList; +import java.util.List; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.transforms.join.CoGbkResult; +import org.apache.beam.sdk.transforms.join.CoGroupByKey; +import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollectionList; +import org.apache.beam.sdk.values.TupleTag; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables; + +public class SetFns { Review comment: Added Comments for coders. Regarding triggers, I assume, multiple triggers should work if they are compatible. It uses CGBK (internally uses flattens) , which checks if triggers compatible and windowFns are compatible. User will get results based on whatever data is triggered. My understanding may be flawed. Please correct if you think that is not the case. I have added a comment for compatible triggers and same windowFns. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433483) Remaining Estimate: 88h 50m (was: 89h) Time Spent: 7h 10m (was: 7h) > Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll > -- > > Key: BEAM-9825 > URL: https://issues.apache.org/jira/browse/BEAM-9825 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Darshan Jani >Assignee: Darshan Jani >Priority: Major > Original Estimate: 96h > Time Spent: 7h 10m > Remaining Estimate: 88h 50m > > I'd like to propose following new high-level transforms. 
> * Intersect > Compute the intersection between elements of two PCollections. > Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are common to both _leftCollection_ and > _rightCollection_ > > * Except > Compute the difference between elements of two PCollections. > Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are in _leftCollection_ but not in > _rightCollection_ > * Union > Find the elements that are in either of two PCollections. > Implement IntersectAll, ExceptAll and UnionAll variants of transforms. -- This message was sent by Atlassian Jira (v8.3.4#803005)
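The difference between the plain and the "All" variants proposed in BEAM-9825 can be shown on ordinary Python lists. This is only a sketch of the intended semantics, not the SetFns implementation, and it ignores windowing and triggers entirely. It assumes the "All" variants follow SQL bag semantics (INTERSECT ALL, EXCEPT ALL, UNION ALL), which the issue text does not spell out. Results are sorted only to make the output deterministic.

```python
from collections import Counter


def intersect(left, right):
    """Distinct elements present in both inputs."""
    return sorted(set(left) & set(right))


def intersect_all(left, right):
    """Each common element repeated min(count_left, count_right) times."""
    return sorted((Counter(left) & Counter(right)).elements())


def except_(left, right):
    """Distinct elements of left that do not appear in right."""
    return sorted(set(left) - set(right))


def except_all(left, right):
    """Each element repeated max(count_left - count_right, 0) times."""
    return sorted((Counter(left) - Counter(right)).elements())


def union(left, right):
    """Distinct elements present in either input."""
    return sorted(set(left) | set(right))


def union_all(left, right):
    """All elements of both inputs, duplicates kept."""
    return sorted(list(left) + list(right))
```

For example, with left = a, a, a, b, c and right = a, a, b, d, `intersect` yields a, b while `intersect_all` yields a, a, b, and `except_` yields c while `except_all` yields a, c.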
[jira] [Updated] (BEAM-9679) Core Transforms | Go SDK Code Katas
[ https://issues.apache.org/jira/browse/BEAM-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damon Douglas updated BEAM-9679: Description: A kata devoted to core beam transforms patterns after [https://github.com/apache/beam/tree/master/learning/katas/java/Core%20Transforms] where the take away is an individual's ability to master the following using an Apache Beam pipeline using the Golang SDK. * Branching * [CoGroupByKey|[https://github.com/damondouglas/beam/tree/BEAM-9679-core-transform-groupbykey]] * Combine * Composite Transform * DoFn Additional Parameters * Flatten * GroupByKey * [Map|[https://github.com/apache/beam/pull/11564]] * Partition * Side Input was: A kata devoted to core beam transforms patterns after [https://github.com/apache/beam/tree/master/learning/katas/java/Core%20Transforms] where the take away is an individual's ability to master the following using an Apache Beam pipeline using the Golang SDK. * Branching * CoGroupByKey * Combine * Composite Transform * DoFn Additional Parameters * Flatten * GroupByKey * [Map|[https://github.com/apache/beam/pull/11564]] * Partition * Side Input > Core Transforms | Go SDK Code Katas > --- > > Key: BEAM-9679 > URL: https://issues.apache.org/jira/browse/BEAM-9679 > Project: Beam > Issue Type: Sub-task > Components: katas, sdk-go >Reporter: Damon Douglas >Assignee: Damon Douglas >Priority: Major > > A kata devoted to core beam transforms patterns after > [https://github.com/apache/beam/tree/master/learning/katas/java/Core%20Transforms] > where the take away is an individual's ability to master the following using > an Apache Beam pipeline using the Golang SDK. 
> * Branching > * > [CoGroupByKey|[https://github.com/damondouglas/beam/tree/BEAM-9679-core-transform-groupbykey]] > * Combine > * Composite Transform > * DoFn Additional Parameters > * Flatten > * GroupByKey > * [Map|[https://github.com/apache/beam/pull/11564]] > * Partition > * Side Input -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-2939) Fn API SDF support
[ https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=433469&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433469 ] ASF GitHub Bot logged work on BEAM-2939: Author: ASF GitHub Bot Created on: 15/May/20 01:36 Start Date: 15/May/20 01:36 Worklog Time Spent: 10m Work Description: lukecwik opened a new pull request #11716: URL: https://github.com/apache/beam/pull/11716 This got rid of the NullPointerException for the Kafka checkpoint because the checkpoint itself isn't serializable. When it gets deserialized, the optional reader field is null which is what was causing the NPE.
[jira] [Work logged] (BEAM-2939) Fn API SDF support
[ https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=433470&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433470 ] ASF GitHub Bot logged work on BEAM-2939: Author: ASF GitHub Bot Created on: 15/May/20 01:36 Start Date: 15/May/20 01:36 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11716: URL: https://github.com/apache/beam/pull/11716#issuecomment-628974287 R: @ihji CC: @chamikaramj This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433470) Time Spent: 32h (was: 31h 50m) > Fn API SDF support > -- > > Key: BEAM-2939 > URL: https://issues.apache.org/jira/browse/BEAM-2939 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Henning Rohde >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 32h > Remaining Estimate: 0h > > The Fn API should support streaming SDF. Detailed design TBD. > Once design is ready, expand subtasks similarly to BEAM-2822. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images
[ https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433460&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433460 ] ASF GitHub Bot logged work on BEAM-9136: Author: ASF GitHub Bot Created on: 15/May/20 01:07 Start Date: 15/May/20 01:07 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #11584: URL: https://github.com/apache/beam/pull/11584#issuecomment-628965737 @Hannah-Jiang - you can continue with this change now. However you will need to rebase. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433460) Time Spent: 29h 40m (was: 29.5h) > Add LICENSES and NOTICES to docker images > - > > Key: BEAM-9136 > URL: https://issues.apache.org/jira/browse/BEAM-9136 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.21.0 > > Time Spent: 29h 40m > Remaining Estimate: 0h > > Scan dependencies and add licenses and notices of the dependencies to SDK > docker images. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images
[ https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=433459&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433459 ] ASF GitHub Bot logged work on BEAM-9136: Author: ASF GitHub Bot Created on: 15/May/20 01:07 Start Date: 15/May/20 01:07 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #11549: URL: https://github.com/apache/beam/pull/11549#issuecomment-628965670 @Hannah-Jiang - you can continue with this change now. However you will need to rebase. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433459) Time Spent: 29.5h (was: 29h 20m) > Add LICENSES and NOTICES to docker images > - > > Key: BEAM-9136 > URL: https://issues.apache.org/jira/browse/BEAM-9136 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.21.0 > > Time Spent: 29.5h > Remaining Estimate: 0h > > Scan dependencies and add licenses and notices of the dependencies to SDK > docker images. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9977) Build Kafka Read on top of Java SplittableDoFn
[ https://issues.apache.org/jira/browse/BEAM-9977?focusedWorklogId=433456&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433456 ] ASF GitHub Bot logged work on BEAM-9977: Author: ASF GitHub Bot Created on: 15/May/20 00:53 Start Date: 15/May/20 00:53 Worklog Time Spent: 10m Work Description: boyuanzz opened a new pull request #11715: URL: https://github.com/apache/beam/pull/11715
[jira] [Work logged] (BEAM-9951) Create Go SDK synthetic sources.
[ https://issues.apache.org/jira/browse/BEAM-9951?focusedWorklogId=433452&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433452 ] ASF GitHub Bot logged work on BEAM-9951: Author: ASF GitHub Bot Created on: 15/May/20 00:41 Start Date: 15/May/20 00:41 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11665: URL: https://github.com/apache/beam/pull/11665#issuecomment-628958995 I think this LGTM. Overall, it's probably fine either way. In terms of effort, the risk is often "the pipelines emit nothing/very little" and terminate very quickly, which other metrics that expect certain amounts of data. The main risk is the user doesn't have validation on the profiling pipeline and think things are going very very fast. But given performance metrics tend to be "per element", they'll pay attention to things like that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433452) Time Spent: 1h 10m (was: 1h) > Create Go SDK synthetic sources. > > > Key: BEAM-9951 > URL: https://issues.apache.org/jira/browse/BEAM-9951 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Create synthetic sources for the Go SDK like > [Java|https://github.com/apache/beam/tree/master/sdks/java/io/synthetic/src/main/java/org/apache/beam/sdk/io/synthetic] > and > [Python|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/synthetic_pipeline.py] > have. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9999) Remove support for EOLed runners (Apex, etc.)
[ https://issues.apache.org/jira/browse/BEAM-9999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107808#comment-17107808 ] Manu Zhang commented on BEAM-9999: -- Same for Gearpump. Thanks for pinging me. > Remove support for EOLed runners (Apex, etc.) > - > > Key: BEAM-9999 > URL: https://issues.apache.org/jira/browse/BEAM-9999 > Project: Beam > Issue Type: Bug > Components: runner-apex, runner-core >Reporter: Ahmet Altay >Priority: Major > > These runners look EOLed, not maintained: > - Apex (last release 2+ years ago) > - Gearpump (last release 1+ year ago) > Removing support for these could reduce the code base size, reduce flaky > tests, and make it easier to add new features. > /cc [~kenn][~tysonjh] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9983) bigquery_read_it_test.ReadNewTypesTests.test_iobase_source failing
[ https://issues.apache.org/jira/browse/BEAM-9983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-9983. - Fix Version/s: Not applicable Resolution: Fixed this pr was reverted and rolled-forward later on with fixes. > bigquery_read_it_test.ReadNewTypesTests.test_iobase_source failing > -- > > Key: BEAM-9983 > URL: https://issues.apache.org/jira/browse/BEAM-9983 > Project: Beam > Issue Type: Bug > Components: io-py-gcp, test-failures >Reporter: Kyle Weaver >Assignee: Pablo Estrada >Priority: Major > Fix For: Not applicable > > Time Spent: 10m > Remaining Estimate: 0h > > This failure seems to afflict all Python postcommits. > apache_beam.io.gcp.bigquery_read_it_test.ReadNewTypesTests.test_iobase_source > (from nosetests) > Failing for the past 1 build (Since Failed#2429 ) > Took 9 min 57 sec. > Error Message > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File "/usr/local/lib/python3.6/site-packages/apache_beam/utils/retry.py", > line 246, in wrapper > sleep_interval = next(retry_intervals) > StopIteration > During handling of the above exception, another exception occurred: > Traceback (most recent call last): > File > "/usr/local/lib/python3.6/site-packages/dataflow_worker/batchworker.py", line > 647, in do_work > work_executor.execute() > File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", > line 226, in execute > self._split_task) > File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", > line 234, in _perform_source_split_considering_api_limits > desired_bundle_size) > File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", > line 271, in _perform_source_split > for split in source.split(desired_bundle_size): > File > "/usr/local/lib/python3.6/site-packages/apache_beam/io/gcp/bigquery.py", line > 698, in split > self.table_reference = self._execute_query(bq) > File > 
"/usr/local/lib/python3.6/site-packages/apache_beam/options/value_provider.py", > line 135, in _f > return fnc(self, *args, **kwargs) > File > "/usr/local/lib/python3.6/site-packages/apache_beam/io/gcp/bigquery.py", line > 744, in _execute_query > job_labels=self.bigquery_job_labels) > File "/usr/local/lib/python3.6/site-packages/apache_beam/utils/retry.py", > line 249, in wrapper > raise_with_traceback(exn, exn_traceback) > File "/usr/local/lib/python3.6/site-packages/future/utils/__init__.py", > line 446, in raise_with_traceback > raise exc.with_traceback(traceback) > File "/usr/local/lib/python3.6/site-packages/apache_beam/utils/retry.py", > line 236, in wrapper > return fun(*args, **kwargs) > File > "/usr/local/lib/python3.6/site-packages/apache_beam/io/gcp/bigquery_tools.py", > line 415, in _start_query_job > labels=job_labels or {}, > File > "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py", > line 791, in __init__ > setattr(self, name, value) > File > "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py", > line 973, in __setattr__ > object.__setattr__(self, name, value) > File > "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py", > line 1651, in __set__ > value = t(**value) > File > "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py", > line 791, in __init__ > setattr(self, name, value) > File > "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py", > line 976, in __setattr__ > "to message %s" % (name, type(self).__name__)) > AttributeError: May not assign arbitrary value owner to message LabelsValue -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433451&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433451 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 15/May/20 00:38 Start Date: 15/May/20 00:38 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-628958485 Seems like Java precommits are broken on master, but this change LGTM. I'll wait for precommits to be fixed if possible. Thanks @omarismail94! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433451) Time Spent: 2.5h (was: 2h 20m) > Setting workerCacheMb to make its way to the WindmillStateCache Constructor > --- > > Key: BEAM-9964 > URL: https://issues.apache.org/jira/browse/BEAM-9964 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Omar Ismail >Assignee: Omar Ismail >Priority: Minor > Time Spent: 2.5h > Remaining Estimate: 0h > > Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, > the cache seems to be hardcoded to 100 MB [1]. If possible, I would like to > make it possible to change the cache value in Streaming by setting > --workerCacheMB. > I've never made changes to the Beam SDK, so I am super excited to work on > this! 
> > [[1] > https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9951) Create Go SDK synthetic sources.
[ https://issues.apache.org/jira/browse/BEAM-9951?focusedWorklogId=433450&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433450 ] ASF GitHub Bot logged work on BEAM-9951: Author: ASF GitHub Bot Created on: 15/May/20 00:34 Start Date: 15/May/20 00:34 Worklog Time Spent: 10m Work Description: lostluck commented on a change in pull request #11665: URL: https://github.com/apache/beam/pull/11665#discussion_r425502985 ## File path: sdks/go/pkg/beam/io/synthetic/source.go ## @@ -0,0 +1,151 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package synthetic contains transforms for creating synthetic pipelines. +// Synthetic pipelines are pipelines that simulate the behavior of possible +// pipelines in order to test performance, splitting, liquid sharding, and +// various other infrastructure used for running pipelines. This category of +// tests are not concerned with the correctness of the elements themselves, but +// need to simulate transforms that output many elements throughout varying +// pipeline shapes. 
+package synthetic + +import ( + "github.com/apache/beam/sdks/go/pkg/beam" + "github.com/apache/beam/sdks/go/pkg/beam/io/rtrackers/offsetrange" + "math/rand" + "time" +) + +// Source creates a synthetic source transform that emits randomly +// generated KV<[]byte, []byte> elements. +// +// This transform accepts a PCollection of SourceConfig, where each SourceConfig +// determines the synthetic source's behavior for that element. Review comment: I think I'm coming to the position that it might be overengineering to protect users, since the risky fields are known with the first version. It's probably better to simply panic/error out with a clear message at some early stage if it's configured incorrectly, if graceful behavior isn't possible, and it's not typically the value that needs to be set (eg. NumElements or OutputsPerInput). Regardless, this is a good exercise to consider, whatever you choose to do. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433450) Time Spent: 1h (was: 50m) > Create Go SDK synthetic sources. > > > Key: BEAM-9951 > URL: https://issues.apache.org/jira/browse/BEAM-9951 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Create synthetic sources for the Go SDK like > [Java|https://github.com/apache/beam/tree/master/sdks/java/io/synthetic/src/main/java/org/apache/beam/sdk/io/synthetic] > and > [Python|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/synthetic_pipeline.py] > have. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.
[ https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433449&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433449 ] ASF GitHub Bot logged work on BEAM-9577: Author: ASF GitHub Bot Created on: 15/May/20 00:31 Start Date: 15/May/20 00:31 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11708: URL: https://github.com/apache/beam/pull/11708#issuecomment-628956526 Still trying to figure out why the test fails on jenkins but passes locally, but other than that it should be ready to be looked at again. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433449) Time Spent: 22h 10m (was: 22h) > Update artifact staging and retrieval protocols to be dependency aware. > --- > > Key: BEAM-9577 > URL: https://issues.apache.org/jira/browse/BEAM-9577 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Time Spent: 22h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-10000) Support BIT_XOR aggregation function in BeamSQL
[ https://issues.apache.org/jira/browse/BEAM-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107795#comment-17107795 ] Rui Wang commented on BEAM-10000: - Yeah! > Support BIT_XOR aggregation function in BeamSQL > --- > > Key: BEAM-10000 > URL: https://issues.apache.org/jira/browse/BEAM-10000 > Project: Beam > Issue Type: Task > Components: dsl-sql >Reporter: Rui Wang >Priority: Major > > See reference: > https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#bit_xor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9641) Support ZetaSQL DATE functions in BeamSQL
[ https://issues.apache.org/jira/browse/BEAM-9641?focusedWorklogId=433441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433441 ] ASF GitHub Bot logged work on BEAM-9641: Author: ASF GitHub Bot Created on: 14/May/20 23:51 Start Date: 14/May/20 23:51 Worklog Time Spent: 10m Work Description: apilloud commented on pull request #11272: URL: https://github.com/apache/beam/pull/11272#issuecomment-628945312 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433441) Time Spent: 3h 10m (was: 3h) > Support ZetaSQL DATE functions in BeamSQL > - > > Key: BEAM-9641 > URL: https://issues.apache.org/jira/browse/BEAM-9641 > Project: Beam > Issue Type: New Feature > Components: dsl-sql-zetasql >Reporter: Yueyang Qiu >Assignee: Yueyang Qiu >Priority: Major > Labels: zetasql-compliance > Time Spent: 3h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9999) Remove support for EOLed runners (Apex, etc.)
[ https://issues.apache.org/jira/browse/BEAM-9999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107792#comment-17107792 ] Thomas Weise commented on BEAM-9999: Apache Apex itself has moved to the Apache Attic, and there are no users of the Beam Apex runners that I know of. > Remove support for EOLed runners (Apex, etc.) > - > > Key: BEAM-9999 > URL: https://issues.apache.org/jira/browse/BEAM-9999 > Project: Beam > Issue Type: Bug > Components: runner-apex, runner-core >Reporter: Ahmet Altay >Priority: Major > > These runners look EOLed, not maintained: > - Apex (last release 2+ years ago) > - Gearpump (last release 1+ year ago) > Removing support for these could reduce the code base size, reduce flaky > tests, and make it easier to add new features. > /cc [~kenn][~tysonjh] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9698) BeamUncollectRel UncollectDoFn NullPointerException
[ https://issues.apache.org/jira/browse/BEAM-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107791#comment-17107791 ] Andrew Pilloud commented on BEAM-9698: -- Looks like this is coming out of BeamZetaSqlCalcRel... BeamZetaSqlCalcRel(expr#0=[{inputs}], $unnest1=[$t0]) BeamUncollectRel BeamZetaSqlCalcRel(expr#0=[{inputs}], expr#1=[null:BIGINT NOT NULL ARRAY], $array$unnest1=[$t1]) BeamValuesRel(tuples=[[{ 0 }]]) > BeamUncollectRel UncollectDoFn NullPointerException > --- > > Key: BEAM-9698 > URL: https://issues.apache.org/jira/browse/BEAM-9698 > Project: Beam > Issue Type: Bug > Components: dsl-sql-zetasql >Reporter: Andrew Pilloud >Assignee: Andrew Pilloud >Priority: Major > Labels: zetasql-compliance > > two failures in shard 19 > {code} > org.apache.beam.sdk.Pipeline$PipelineExecutionException: > java.lang.NullPointerException > at > org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:348) > at > org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:318) > at > org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:213) > at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.runCollector(BeamEnumerableConverter.java:201) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.collectRows(BeamEnumerableConverter.java:218) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:150) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:127) > at > cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:329) > at > 
com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423) > at > com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171) > at > com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283) > at > com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711) > at > com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NullPointerException > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamUncollectRel$UncollectDoFn.process(BeamUncollectRel.java:103) > {code} > {code} > Apr 01, 2020 5:58:27 PM > cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl > executeQuery > INFO: Processing Sql statement: SELECT e FROM UNNEST(CAST(NULL AS > ARRAY)) e > Apr 01, 2020 5:58:27 PM > cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl > executeQuery > INFO: Processing Sql statement: SELECT e FROM UNNEST(CAST(NULL AS > ARRAY>)) e > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-9698) BeamUncollectRel UncollectDoFn NullPointerException
[ https://issues.apache.org/jira/browse/BEAM-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Pilloud reassigned BEAM-9698: Assignee: Andrew Pilloud > BeamUncollectRel UncollectDoFn NullPointerException > --- > > Key: BEAM-9698 > URL: https://issues.apache.org/jira/browse/BEAM-9698 > Project: Beam > Issue Type: Bug > Components: dsl-sql-zetasql >Reporter: Andrew Pilloud >Assignee: Andrew Pilloud >Priority: Major > Labels: zetasql-compliance > > two failures in shard 19 > {code} > org.apache.beam.sdk.Pipeline$PipelineExecutionException: > java.lang.NullPointerException > at > org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:348) > at > org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:318) > at > org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:213) > at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.runCollector(BeamEnumerableConverter.java:201) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.collectRows(BeamEnumerableConverter.java:218) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:150) > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:127) > at > cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:329) > at > com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423) > at > com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171) > at > 
com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283) > at > com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711) > at > com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NullPointerException > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamUncollectRel$UncollectDoFn.process(BeamUncollectRel.java:103) > {code} > {code} > Apr 01, 2020 5:58:27 PM > cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl > executeQuery > INFO: Processing Sql statement: SELECT e FROM UNNEST(CAST(NULL AS > ARRAY)) e > Apr 01, 2020 5:58:27 PM > cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl > executeQuery > INFO: Processing Sql statement: SELECT e FROM UNNEST(CAST(NULL AS > ARRAY>)) e > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9999) Remove support for EOLed runners (Apex, etc.)
[ https://issues.apache.org/jira/browse/BEAM-9999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107790#comment-17107790 ] Kenneth Knowles commented on BEAM-9999: --- Makes sense to me, especially if Beam is moving on in ways that require updates (like Java 11) or if there is a maintenance burden. I think the important thing is whether they have users. You can get some general ideas from https://repository.apache.org/#central-stat (committer-only access), but of course it cannot distinguish downloads made by continuous testing. CC [~t...@apache.org] for comment on Apex CC [~mauzhang] for comment on Gearpump > Remove support for EOLed runners (Apex, etc.) > - > > Key: BEAM-9999 > URL: https://issues.apache.org/jira/browse/BEAM-9999 > Project: Beam > Issue Type: Bug > Components: runner-apex, runner-core >Reporter: Ahmet Altay >Priority: Major > > These runners look EOLed, not maintained: > - Apex (last release 2+ years ago) > - Gearpump (last release 1+ year ago) > Removing support for these could reduce the code base size, reduce flaky > tests, and make it easier to add new features. > /cc [~kenn][~tysonjh] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9994) Cannot create a virtualenv using Python 3.8 on Jenkins machines
[ https://issues.apache.org/jira/browse/BEAM-9994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107785#comment-17107785 ] Valentyn Tymofieiev commented on BEAM-9994: --- Can you try: python3.8 -m venv env ? I am not sure what is going on but possibly the version of ubuntu we have on Jenkins is missing some packages for Python 3.8 or virtualenv. The best course of action would be to clone a Jenkins VM image, create a VM, experiment to see what needs to be fixed. If the fix is not easy, we may need to implement BEAM-8152 to unblock this. cc: [~wintermelons] [~yifanzou] [~yoshiki.obata] > Cannot create a virtualenv using Python 3.8 on Jenkins machines > --- > > Key: BEAM-9994 > URL: https://issues.apache.org/jira/browse/BEAM-9994 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: Kamil Wasilewski >Priority: Blocker > > Command: *virtualenv --python /usr/bin/python3.8 env* > Output: > {noformat} > Running virtualenv with interpreter /usr/bin/python3.8 > Traceback (most recent call last): > File "/usr/local/lib/python3.5/dist-packages/virtualenv.py", line 22, in > > import distutils.spawn > ModuleNotFoundError: No module named 'distutils.spawn' > {noformat} > Example test affected: > https://builds.apache.org/job/beam_PreCommit_PythonFormatter_Commit/1723/console -- This message was sent by Atlassian Jira (v8.3.4#803005)
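The traceback in the report above ends in `ModuleNotFoundError: No module named 'distutils.spawn'`, which is consistent with the suggestion that the interpreter is missing some packages. A minimal sketch for checking this on a given machine follows; it is meant to be run with the interpreter under suspicion (for example `/usr/bin/python3.8`). The helper names `module_available` and `report` are made up for this illustration, and the list of modules to probe is an assumption, not something taken from the Jenkins configuration.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if this interpreter can locate the module `name`."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages first; a missing parent such as
        # `distutils` surfaces here as ModuleNotFoundError (an ImportError).
        return False


def report(modules=("distutils.spawn", "venv", "ensurepip")):
    """Map each probed module name to whether it is importable here."""
    return {name: module_available(name) for name in modules}


if __name__ == "__main__":
    print(sys.version)
    for name, ok in report().items():
        print(name, "available" if ok else "MISSING")
```

If `distutils.spawn` is reported as missing while `venv` is available, the `python3.8 -m venv env` route suggested in the comment may work where `virtualenv --python` does not.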
[jira] [Commented] (BEAM-10000) Support BIT_XOR aggregation function in BeamSQL
[ https://issues.apache.org/jira/browse/BEAM-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107787#comment-17107787 ] Kai Jiang commented on BEAM-10000: -- wow! #1 > Support BIT_XOR aggregation function in BeamSQL > --- > > Key: BEAM-10000 > URL: https://issues.apache.org/jira/browse/BEAM-10000 > Project: Beam > Issue Type: Task > Components: dsl-sql >Reporter: Rui Wang >Priority: Major > > See reference: > https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#bit_xor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9951) Create Go SDK synthetic sources.
[ https://issues.apache.org/jira/browse/BEAM-9951?focusedWorklogId=433434&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433434 ] ASF GitHub Bot logged work on BEAM-9951: Author: ASF GitHub Bot Created on: 14/May/20 23:18 Start Date: 14/May/20 23:18 Worklog Time Spent: 10m Work Description: youngoli commented on a change in pull request #11665: URL: https://github.com/apache/beam/pull/11665#discussion_r425481703 ## File path: sdks/go/pkg/beam/io/synthetic/source.go ## @@ -33,22 +33,30 @@ import ( // generated KV<[]byte, []byte> elements. // // This transform accepts a PCollection of SourceConfig, where each SourceConfig -// determines the synthetic source's behavior for that element. +// determines the synthetic source's behavior for that element and outputs the +// randomly generated elements. // -// This transform outputs a PCollection of randomly generated -// KV elements. +// SourceConfigs are recommended to be created via the DefaultSourceConfig and +// then sent to a beam.Create transform once modified. Example: +// +//cfg1 := synthetic.DefaultSourceConfig() +//cfg1.NumElements = 1000 +//cfg2 := synthetic.DefaultSourceConfig() Review comment: Yeah I was thinking the same thing. I guess it technically works that way right now because the code catches the 0 initial splits case and clamps it to 1, but I'm still not a huge fan of allowing it to be used as a value like that. Hence, why I used that "proper" example. While working on the step PR I've been thinking that the most appealing approach to me right now is using a builder so you can do DefaultSourceConfig().NumElements(1000).Build() and have Build catch any invalid values. It's a little over-engineered for what we have now, but I prefer over-engineered default values that at least have a user-friendly API over implicitly changing 0 values to the defaults we actually want ("0 initial splits? 
That's invalid so I'll just set it to 1 for you.") This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433434) Time Spent: 50m (was: 40m) > Create Go SDK synthetic sources. > > > Key: BEAM-9951 > URL: https://issues.apache.org/jira/browse/BEAM-9951 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Create synthetic sources for the Go SDK like > [Java|https://github.com/apache/beam/tree/master/sdks/java/io/synthetic/src/main/java/org/apache/beam/sdk/io/synthetic] > and > [Python|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/synthetic_pipeline.py] > have. -- This message was sent by Atlassian Jira (v8.3.4#803005)
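The builder idea in the review comment above (validate in `Build` instead of silently clamping zero values to defaults) can be sketched independently of the Go SDK. The following is a language-neutral illustration written in Python, not the API of the pull request: the class `SourceConfigBuilder`, its method names, and the default values are all hypothetical, and only the field ideas (number of elements, initial splits) come from the discussion.

```python
class SourceConfigBuilder:
    """Hypothetical builder; defaults below are assumed for illustration."""

    def __init__(self):
        self._num_elements = 1    # assumed default
        self._initial_splits = 1  # assumed default

    def num_elements(self, n):
        self._num_elements = n
        return self  # returning self allows call chaining

    def initial_splits(self, n):
        self._initial_splits = n
        return self

    def build(self):
        """Validate all fields and fail loudly instead of clamping."""
        if self._num_elements < 1:
            raise ValueError(
                "num_elements must be >= 1, got %d" % self._num_elements)
        if self._initial_splits < 1:
            raise ValueError(
                "initial_splits must be >= 1, got %d" % self._initial_splits)
        return {
            "num_elements": self._num_elements,
            "initial_splits": self._initial_splits,
        }


if __name__ == "__main__":
    print(SourceConfigBuilder().num_elements(1000).build())
    try:
        SourceConfigBuilder().initial_splits(0).build()
    except ValueError as err:
        print("rejected:", err)
```

The trade-off raised in the thread still applies: this is more machinery than a plain struct with defaults, but an invalid value is reported at construction time with a clear message rather than being rewritten behind the user's back.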
[jira] [Work logged] (BEAM-9876) Migrate the Beam website from Jekyll to Hugo to enable localization of the site content
[ https://issues.apache.org/jira/browse/BEAM-9876?focusedWorklogId=433433&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433433 ] ASF GitHub Bot logged work on BEAM-9876: Author: ASF GitHub Bot Created on: 14/May/20 23:12 Start Date: 14/May/20 23:12 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11554: URL: https://github.com/apache/beam/pull/11554#issuecomment-628933708 I've tagged the commit 1d2700818474c008eaa324ac1b5c49c9d2857298 with the `website-to-hugo` tag. FYI @aijamalnk @bntnam et al. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433433) Time Spent: 12.5h (was: 12h 20m) > Migrate the Beam website from Jekyll to Hugo to enable localization of the > site content > --- > > Key: BEAM-9876 > URL: https://issues.apache.org/jira/browse/BEAM-9876 > Project: Beam > Issue Type: Task > Components: website >Reporter: Aizhamal Nurmamat kyzy >Assignee: Aizhamal Nurmamat kyzy >Priority: Major > Time Spent: 12.5h > Remaining Estimate: 0h > > Enable internationalization of the Apache Beam website to increase the reach > of the project, and facilitate adoption and growth of its community. > The proposal was to do this by migrating the current Apache Beam website from > Jekyll to Hugo [1]. Hugo supports internationalization out-of-the-box, making > it easier for both contributors and maintainers to support the > internationalization effort. 
> The further discussion on implementation can be viewed here [2] > [1] > [https://lists.apache.org/thread.html/rfab4cc1411318c3f4667bee051df68f37be11846ada877f3576c41a9%40%3Cdev.beam.apache.org%3E] > [2] > [https://lists.apache.org/thread.html/r6b999b6d7d1f6cbb94e16bb2deed2b65098a6b14c4ac98707fe0c36a%40%3Cdev.beam.apache.org%3E] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-10001) Change the code block colors from grey to blue to increase the contrast between text and background.
Aizhamal Nurmamat kyzy created BEAM-10001: - Summary: Change the code block colors from grey to blue to increase the contrast between text and background. Key: BEAM-10001 URL: https://issues.apache.org/jira/browse/BEAM-10001 Project: Beam Issue Type: Task Components: website Reporter: Aizhamal Nurmamat kyzy Example: [https://beam.apache.org/get-started/try-apache-beam/] The old background color: [http://apache-beam-website-pull-requests.storage.googleapis.com/11705/documentation/programming-guide/index.html#creating-a-pipeline] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107778#comment-17107778 ] Brian Hulette commented on BEAM-9975: - It looks like there are (at least) two problems here: - get_all_options relies on [__subclasses__|https://github.com/apache/beam/blob/5d00ccba5b905584f07c1b0275841113d4921a8c/sdks/python/apache_beam/options/pipeline_options.py#L283] to find every PipelineOptions subclass, which finds all the [subclasses that have had their definition executed|https://stackoverflow.com/questions/3862310/how-to-find-all-the-subclasses-of-a-class-given-its-name]. It seems when running tests it's possible for this to pull in definitions from previously executed tests. I tried to repro this locally by running two tests with pytest and I couldn't do it. I'm not sure what's different on jenkins. - I'm pretty sure we should check for instances of ValueProvider and call get() before trying to convert to a proto struct. > PortableRunnerTest flake "ParseError: Unexpected type for Value message." > - > > Key: BEAM-9975 > URL: https://issues.apache.org/jira/browse/BEAM-9975 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Error looks similar to the one in BEAM-9907. 
Example from > https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732 > {code} > apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/pipeline.py:550: in __exit__ > self.run().wait_until_finish() > apache_beam/pipeline.py:529: in run > return self.runner.run_pipeline(self, self._options) > apache_beam/runners/portability/portable_runner.py:426: in run_pipeline > job_service_handle.submit(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:107: in submit > prepare_response = self.prepare(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:184: in prepare > pipeline_options=self.get_pipeline_options()), > apache_beam/runners/portability/portable_runner.py:174: in > get_pipeline_options > return job_utils.dict_to_struct(p_options) > apache_beam/runners/job/utils.py:33: in dict_to_struct > return json_format.ParseDict(dict_obj, struct_pb2.Struct()) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: > in ParseDict > parser.ConvertMessage(js_dict, message) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: > in ConvertMessage > methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: > in _ConvertStructMessage > self._ConvertValueMessage(value[key], message.fields[key]) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > self = > value = 0x7f69eb7b3ac8> > message = > def _ConvertValueMessage(self, value, message): > """Convert a JSON representation into Value message.""" > if isinstance(value, dict): > self._ConvertStructMessage(value, message.struct_value) > elif isinstance(value, list): > self. 
_ConvertListValueMessage(value, message.list_value) > elif value is None: > message.null_value = 0 > elif isinstance(value, bool): > message.bool_value = value > elif isinstance(value, six.string_types): > message.string_value = value > elif isinstance(value, _INT_OR_FLOAT): > message.number_value = value > else: > > raise ParseError('Unexpected type for Value message.') > E google.protobuf.json_format.ParseError: Unexpected type for Value > message. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
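The second point in the comment above (call `get()` on ValueProvider instances before converting the options to a proto struct) can be sketched without Beam or protobuf installed. In the sketch below, `StaticProvider` is a hypothetical stand-in for a Beam ValueProvider and `resolve` is an invented helper; the accepted scalar types mirror the branches of `_ConvertValueMessage` quoted in the traceback (dict, list, None, bool, string, number). This is an illustration of the idea, not the fix that was merged.

```python
class StaticProvider:
    """Hypothetical stand-in for a ValueProvider: wraps a value, exposes get()."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


# Scalar types that the quoted _ConvertValueMessage code accepts.
JSON_SCALARS = (bool, str, int, float, type(None))


def resolve(value):
    """Recursively replace provider objects with plain JSON-compatible values."""
    if isinstance(value, StaticProvider):
        return resolve(value.get())
    if isinstance(value, dict):
        return {key: resolve(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [resolve(item) for item in value]
    if isinstance(value, JSON_SCALARS):
        return value
    # Fail early with the offending value instead of a generic ParseError.
    raise TypeError("cannot convert %r to a Struct value" % (value,))


if __name__ == "__main__":
    options = {
        "runner": "PortableRunner",
        "labels": StaticProvider(["a", "b"]),
        "retries": StaticProvider(3),
    }
    print(resolve(options))
```

Resolving before conversion means a dictionary passed on to something like `dict_to_struct` would contain only types that the JSON-to-Struct parser recognizes, and anything else is rejected with a message that names the bad value.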
[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors
[ https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433432&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433432 ] ASF GitHub Bot logged work on BEAM-9468: Author: ASF GitHub Bot Created on: 14/May/20 23:11 Start Date: 14/May/20 23:11 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11339: URL: https://github.com/apache/beam/pull/11339#issuecomment-628933465 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433432) Time Spent: 42h (was: 41h 50m) > Add Google Cloud Healthcare API IO Connectors > - > > Key: BEAM-9468 > URL: https://issues.apache.org/jira/browse/BEAM-9468 > Project: Beam > Issue Type: New Feature > Components: io-java-gcp >Reporter: Jacob Ferriero >Assignee: Jacob Ferriero >Priority: Minor > Time Spent: 42h > Remaining Estimate: 0h > > Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud > Healthcare API|https://cloud.google.com/healthcare/docs/] > HL7v2IO > FHIRIO > DICOM -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-10000) Support BIT_XOR aggregation function in BeamSQL
[ https://issues.apache.org/jira/browse/BEAM-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated BEAM-10000: Description: See reference: https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#bit_xor > Support BIT_XOR aggregation function in BeamSQL > --- > > Key: BEAM-10000 > URL: https://issues.apache.org/jira/browse/BEAM-10000 > Project: Beam > Issue Type: Task > Components: dsl-sql >Reporter: Rui Wang >Priority: Major > > See reference: > https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#bit_xor -- This message was sent by Atlassian Jira (v8.3.4#803005)
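For readers unfamiliar with the function, here is a minimal Python sketch of the semantics that, as far as I understand the linked BigQuery reference, BIT_XOR has: the bitwise XOR of all non-NULL inputs, and NULL when there is no non-NULL input. The name `bit_xor` is illustrative only; it is not a BeamSQL API, and the ticket does not say how the aggregation would be implemented.

```python
from functools import reduce
from operator import xor
from typing import Iterable, Optional


def bit_xor(values: Iterable[Optional[int]]) -> Optional[int]:
    """Illustrative BIT_XOR aggregate over integers (not BeamSQL code).

    None stands in for SQL NULL: NULL inputs are skipped, and the result
    is None when no non-NULL input remains.
    """
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None
    # XOR is associative and commutative, so the fold order does not matter.
    return reduce(xor, non_null)
```

Because XOR is associative and commutative, partial results computed on separate groups of rows can be XOR-ed together afterwards, which is the property a distributed combiner would rely on.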
[jira] [Updated] (BEAM-10000) Support BIT_XOR aggregation function in BeamSQL
[ https://issues.apache.org/jira/browse/BEAM-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated BEAM-10000: Status: Open (was: Triage Needed) > Support BIT_XOR aggregation function in BeamSQL > --- > > Key: BEAM-10000 > URL: https://issues.apache.org/jira/browse/BEAM-10000 > Project: Beam > Issue Type: Task > Components: dsl-sql >Reporter: Rui Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-10000) Support BIT_XOR aggregation function in BeamSQL
Rui Wang created BEAM-10000: --- Summary: Support BIT_XOR aggregation function in BeamSQL Key: BEAM-10000 URL: https://issues.apache.org/jira/browse/BEAM-10000 Project: Beam Issue Type: Task Components: dsl-sql Reporter: Rui Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9993) Add option defaults for Flink Python tests
[ https://issues.apache.org/jira/browse/BEAM-9993?focusedWorklogId=433425&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433425 ] ASF GitHub Bot logged work on BEAM-9993: Author: ASF GitHub Bot Created on: 14/May/20 22:48 Start Date: 14/May/20 22:48 Worklog Time Spent: 10m Work Description: ibzib merged pull request #11711: URL: https://github.com/apache/beam/pull/11711 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433425) Time Spent: 20m (was: 10m) > Add option defaults for Flink Python tests > -- > > Key: BEAM-9993 > URL: https://issues.apache.org/jira/browse/BEAM-9993 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Minor > Labels: portability-flink > Time Spent: 20m > Remaining Estimate: 0h > > I want to run a single Flink Python test: > python -m apache_beam.runners.portability.flink_runner_test > FlinkRunnerTest.test_metrics > But I get this error: > TypeError: expected str, bytes or os.PathLike object, not NoneType > Turns out flink_job_server_jar isn't set, and there's no default value. We > should set a default. > We should also change the default environment type to LOOPBACK for basic > testing purposes because it requires the least setup. -- This message was sent by Atlassian Jira (v8.3.4#803005)
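A minimal sketch of the failure mode described above, using only the standard library. It assumes the jar path ends up in a path-handling call while still `None`; the parser function, its `default_jar` parameter, and the jar file name are hypothetical and are not taken from flink_runner_test or from the merged fix.

```python
import argparse
import os


def parse_flink_test_args(argv, default_jar=None):
    """Hypothetical option parsing for a test entry point (not Beam code)."""
    parser = argparse.ArgumentParser()
    # With default=None, an omitted flag leaves the attribute set to None.
    parser.add_argument('--flink_job_server_jar', default=default_jar)
    # LOOPBACK is the default environment type proposed in the ticket.
    parser.add_argument('--environment_type', default='LOOPBACK')
    known_args, _ = parser.parse_known_args(argv)
    return known_args


# No flag and no default: path handling on None raises the reported TypeError.
args = parse_flink_test_args(['FlinkRunnerTest.test_metrics'])
try:
    os.fspath(args.flink_job_server_jar)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With a default in place (file name is made up), the same call succeeds.
args_with_default = parse_flink_test_args(
    ['FlinkRunnerTest.test_metrics'], default_jar='job-server.jar')
jar_path = os.fspath(args_with_default.flink_job_server_jar)
```

Supplying the default at parse time keeps the `None` check out of every call site that later touches the path.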
[jira] [Created] (BEAM-9999) Remove support for EOLed runners (Apex, etc.)
Ahmet Altay created BEAM-9999: - Summary: Remove support for EOLed runners (Apex, etc.) Key: BEAM-9999 URL: https://issues.apache.org/jira/browse/BEAM-9999 Project: Beam Issue Type: Bug Components: runner-apex, runner-core Reporter: Ahmet Altay These runners look EOLed and unmaintained: - Apex (last release 2+ years ago) - Gearpump (last release 1+ year ago) Removing support for these could reduce the code base size, reduce flaky tests, and make it easier to add new features. /cc [~kenn][~tysonjh] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6950) Beam Dependency Update Request: com.github.spotbugs
[ https://issues.apache.org/jira/browse/BEAM-6950?focusedWorklogId=433422&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433422 ] ASF GitHub Bot logged work on BEAM-6950: Author: ASF GitHub Bot Created on: 14/May/20 22:45 Start Date: 14/May/20 22:45 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #11712: URL: https://github.com/apache/beam/pull/11712#issuecomment-628924367 Run JavaPortabilityApi PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433422) Time Spent: 0.5h (was: 20m) > Beam Dependency Update Request: com.github.spotbugs > --- > > Key: BEAM-6950 > URL: https://issues.apache.org/jira/browse/BEAM-6950 > Project: Beam > Issue Type: Bug > Components: dependencies >Reporter: Beam JIRA Bot >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > - 2019-04-01 12:15:04.215434 > - > Please consider upgrading the dependency com.github.spotbugs. > The current version is None. The latest version is None > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6950) Beam Dependency Update Request: com.github.spotbugs
[ https://issues.apache.org/jira/browse/BEAM-6950?focusedWorklogId=433423&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433423 ] ASF GitHub Bot logged work on BEAM-6950: Author: ASF GitHub Bot Created on: 14/May/20 22:45 Start Date: 14/May/20 22:45 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #11712: URL: https://github.com/apache/beam/pull/11712#issuecomment-628924410 Run JavaPortabilityApiJava11 PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433423) Time Spent: 40m (was: 0.5h) > Beam Dependency Update Request: com.github.spotbugs > --- > > Key: BEAM-6950 > URL: https://issues.apache.org/jira/browse/BEAM-6950 > Project: Beam > Issue Type: Bug > Components: dependencies >Reporter: Beam JIRA Bot >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > - 2019-04-01 12:15:04.215434 > - > Please consider upgrading the dependency com.github.spotbugs. > The current version is None. The latest version is None > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6950) Beam Dependency Update Request: com.github.spotbugs
[ https://issues.apache.org/jira/browse/BEAM-6950?focusedWorklogId=433421&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433421 ] ASF GitHub Bot logged work on BEAM-6950: Author: ASF GitHub Bot Created on: 14/May/20 22:45 Start Date: 14/May/20 22:45 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #11712: URL: https://github.com/apache/beam/pull/11712#issuecomment-628924278 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433421) Time Spent: 20m (was: 10m) > Beam Dependency Update Request: com.github.spotbugs > --- > > Key: BEAM-6950 > URL: https://issues.apache.org/jira/browse/BEAM-6950 > Project: Beam > Issue Type: Bug > Components: dependencies >Reporter: Beam JIRA Bot >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > - 2019-04-01 12:15:04.215434 > - > Please consider upgrading the dependency com.github.spotbugs. > The current version is None. The latest version is None > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433420&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433420 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 22:44 Start Date: 14/May/20 22:44 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-628924201 Run Java PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433420) Time Spent: 2h 20m (was: 2h 10m) > Setting workerCacheMb to make its way to the WindmillStateCache Constructor > --- > > Key: BEAM-9964 > URL: https://issues.apache.org/jira/browse/BEAM-9964 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Omar Ismail >Assignee: Omar Ismail >Priority: Minor > Time Spent: 2h 20m > Remaining Estimate: 0h > > Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, > the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to > make it allowable to change the cache value in Streaming when setting > -workerCacheMB. > I've never made changes to the Beam SDK, so I am super excited to work on > this! > > [[1] > https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9876) Migrate the Beam website from Jekyll to Hugo to enable localization of the site content
[ https://issues.apache.org/jira/browse/BEAM-9876?focusedWorklogId=433413&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433413 ] ASF GitHub Bot logged work on BEAM-9876: Author: ASF GitHub Bot Created on: 14/May/20 22:40 Start Date: 14/May/20 22:40 Worklog Time Spent: 10m Work Description: aaltay merged pull request #11554: URL: https://github.com/apache/beam/pull/11554 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433413) Time Spent: 12h 20m (was: 12h 10m) > Migrate the Beam website from Jekyll to Hugo to enable localization of the > site content > --- > > Key: BEAM-9876 > URL: https://issues.apache.org/jira/browse/BEAM-9876 > Project: Beam > Issue Type: Task > Components: website >Reporter: Aizhamal Nurmamat kyzy >Assignee: Aizhamal Nurmamat kyzy >Priority: Major > Time Spent: 12h 20m > Remaining Estimate: 0h > > Enable internationalization of the Apache Beam website to increase the reach > of the project, and facilitate adoption and growth of its community. > The proposal was to do this by migrating the current Apache Beam website from > Jekyll to Hugo [1]. Hugo supports internationalization out-of-the-box, making > it easier for both contributors and maintainers to support the > internationalization effort. > Further discussion of the implementation can be viewed here [2] > [1] > [https://lists.apache.org/thread.html/rfab4cc1411318c3f4667bee051df68f37be11846ada877f3576c41a9%40%3Cdev.beam.apache.org%3E] > [2] > [https://lists.apache.org/thread.html/r6b999b6d7d1f6cbb94e16bb2deed2b65098a6b14c4ac98707fe0c36a%40%3Cdev.beam.apache.org%3E] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107742#comment-17107742 ] Brian Hulette commented on BEAM-9975: - Note that log is from PortableRunnerTest.test_read > PortableRunnerTest flake "ParseError: Unexpected type for Value message." > - > > Key: BEAM-9975 > URL: https://issues.apache.org/jira/browse/BEAM-9975 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Error looks similar to the one in BEAM-9907. Example from > https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732 > {code} > apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/pipeline.py:550: in __exit__ > self.run().wait_until_finish() > apache_beam/pipeline.py:529: in run > return self.runner.run_pipeline(self, self._options) > apache_beam/runners/portability/portable_runner.py:426: in run_pipeline > job_service_handle.submit(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:107: in submit > prepare_response = self.prepare(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:184: in prepare > pipeline_options=self.get_pipeline_options()), > apache_beam/runners/portability/portable_runner.py:174: in > get_pipeline_options > return job_utils.dict_to_struct(p_options) > apache_beam/runners/job/utils.py:33: in dict_to_struct > return json_format.ParseDict(dict_obj, struct_pb2.Struct()) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: > in ParseDict > parser.ConvertMessage(js_dict, message) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: > in ConvertMessage > methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self) > 
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: > in _ConvertStructMessage > self._ConvertValueMessage(value[key], message.fields[key]) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > self = > value = 0x7f69eb7b3ac8> > message = > def _ConvertValueMessage(self, value, message): > """Convert a JSON representation into Value message.""" > if isinstance(value, dict): > self._ConvertStructMessage(value, message.struct_value) > elif isinstance(value, list): > self._ConvertListValueMessage(value, message.list_value) > elif value is None: > message.null_value = 0 > elif isinstance(value, bool): > message.bool_value = value > elif isinstance(value, six.string_types): > message.string_value = value > elif isinstance(value, _INT_OR_FLOAT): > message.number_value = value > else: > > raise ParseError('Unexpected type for Value message.') > E google.protobuf.json_format.ParseError: Unexpected type for Value > message. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
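To make the quoted type check easier to reason about, here is a simplified stand-in written against the standard library only. It mirrors the `isinstance` chain shown in the traceback but is not the protobuf implementation: the `ParseError` class and `classify_value` function are defined locally for illustration, and treating `six.string_types` as `str` and `_INT_OR_FLOAT` as `(int, float)` is an assumption about Python 3. The log does not say which pipeline option carried the offending value, so the sketch does not guess at it.

```python
class ParseError(Exception):
    """Local stand-in for google.protobuf.json_format.ParseError."""


def classify_value(value):
    """Return the name of the Value field the quoted chain would populate.

    Any type not covered by the chain raises ParseError, which is the
    failure reported in the flaky test.
    """
    if isinstance(value, dict):
        return 'struct_value'
    elif isinstance(value, list):
        return 'list_value'
    elif value is None:
        return 'null_value'
    elif isinstance(value, bool):
        # bool is checked before numbers because bool is a subclass of int.
        return 'bool_value'
    elif isinstance(value, str):
        return 'string_value'
    elif isinstance(value, (int, float)):
        return 'number_value'
    raise ParseError('Unexpected type for Value message.')


def find_unconvertible_keys(options):
    """Hypothetical debugging helper: list keys whose values would fail."""
    bad_keys = []
    for key, value in options.items():
        try:
            classify_value(value)
        except ParseError:
            bad_keys.append(key)
    return sorted(bad_keys)
```

Under these assumptions, tuples, sets, bytes, and arbitrary objects all fall through to the error branch, so a helper like `find_unconvertible_keys` could be run over an options dictionary before conversion to narrow down the culprit.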
[jira] [Work logged] (BEAM-9641) Support ZetaSQL DATE functions in BeamSQL
[ https://issues.apache.org/jira/browse/BEAM-9641?focusedWorklogId=433411&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433411 ] ASF GitHub Bot logged work on BEAM-9641: Author: ASF GitHub Bot Created on: 14/May/20 22:36 Start Date: 14/May/20 22:36 Worklog Time Spent: 10m Work Description: apilloud commented on pull request #11272: URL: https://github.com/apache/beam/pull/11272#issuecomment-628921575 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433411) Time Spent: 3h (was: 2h 50m) > Support ZetaSQL DATE functions in BeamSQL > - > > Key: BEAM-9641 > URL: https://issues.apache.org/jira/browse/BEAM-9641 > Project: Beam > Issue Type: New Feature > Components: dsl-sql-zetasql >Reporter: Yueyang Qiu >Assignee: Yueyang Qiu >Priority: Major > Labels: zetasql-compliance > Time Spent: 3h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.
[ https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433407&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433407 ] ASF GitHub Bot logged work on BEAM-9577: Author: ASF GitHub Bot Created on: 14/May/20 22:34 Start Date: 14/May/20 22:34 Worklog Time Spent: 10m Work Description: ibzib commented on a change in pull request #11708: URL: https://github.com/apache/beam/pull/11708#discussion_r425468106 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/ClassLoaderFileSystem.java ## @@ -0,0 +1,154 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.io; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument; + +import com.google.auto.service.AutoService; +import java.io.IOException; +import java.io.InputStream; +import java.nio.channels.Channels; +import java.nio.channels.ReadableByteChannel; +import java.nio.channels.WritableByteChannel; +import java.util.Collection; +import java.util.List; +import javax.annotation.Nullable; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.io.fs.CreateOptions; +import org.apache.beam.sdk.io.fs.MatchResult; +import org.apache.beam.sdk.io.fs.ResolveOptions; +import org.apache.beam.sdk.io.fs.ResourceId; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList; + +/** A read-only {@link FileSystem} implementation looking up resources using a ClassLoader. */ Review comment: That's what I get for only reading half the PR... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433407) Time Spent: 22h (was: 21h 50m) > Update artifact staging and retrieval protocols to be dependency aware. > --- > > Key: BEAM-9577 > URL: https://issues.apache.org/jira/browse/BEAM-9577 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Time Spent: 22h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9833) Add .asf.yaml file
[ https://issues.apache.org/jira/browse/BEAM-9833?focusedWorklogId=433401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433401 ] ASF GitHub Bot logged work on BEAM-9833: Author: ASF GitHub Bot Created on: 14/May/20 22:26 Start Date: 14/May/20 22:26 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11613: URL: https://github.com/apache/beam/pull/11613#issuecomment-628918098 LGTM! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433401) Time Spent: 2h 20m (was: 2h 10m) > Add .asf.yaml file > -- > > Key: BEAM-9833 > URL: https://issues.apache.org/jira/browse/BEAM-9833 > Project: Beam > Issue Type: New Feature > Components: build-system >Reporter: Kyle Weaver >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > Github links haven't been automatically added to Jira for a week or so. > According to comments on INFRA-20171 and related issues, we need to add a > .asf.yaml file to configure our notification settings. > https://s.apache.org/asfyaml-notify -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9833) Add .asf.yaml file
[ https://issues.apache.org/jira/browse/BEAM-9833?focusedWorklogId=433402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433402 ] ASF GitHub Bot logged work on BEAM-9833: Author: ASF GitHub Bot Created on: 14/May/20 22:26 Start Date: 14/May/20 22:26 Worklog Time Spent: 10m Work Description: pabloem merged pull request #11613: URL: https://github.com/apache/beam/pull/11613 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433402) Time Spent: 2.5h (was: 2h 20m) > Add .asf.yaml file > -- > > Key: BEAM-9833 > URL: https://issues.apache.org/jira/browse/BEAM-9833 > Project: Beam > Issue Type: New Feature > Components: build-system >Reporter: Kyle Weaver >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > > Github links haven't been automatically added to Jira for a week or so. > According to comments on INFRA-20171 and related issues, we need to add a > .asf.yaml file to configure our notification settings. > https://s.apache.org/asfyaml-notify -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.
[ https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433400&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433400 ] ASF GitHub Bot logged work on BEAM-9577: Author: ASF GitHub Bot Created on: 14/May/20 22:20 Start Date: 14/May/20 22:20 Worklog Time Spent: 10m Work Description: robertwb commented on a change in pull request #11708: URL: https://github.com/apache/beam/pull/11708#discussion_r425462944 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/ClassLoaderFileSystem.java ## @@ -0,0 +1,154 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.io; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument; + +import com.google.auto.service.AutoService; +import java.io.IOException; +import java.io.InputStream; +import java.nio.channels.Channels; +import java.nio.channels.ReadableByteChannel; +import java.nio.channels.WritableByteChannel; +import java.util.Collection; +import java.util.List; +import javax.annotation.Nullable; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.io.fs.CreateOptions; +import org.apache.beam.sdk.io.fs.MatchResult; +import org.apache.beam.sdk.io.fs.ResolveOptions; +import org.apache.beam.sdk.io.fs.ResourceId; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList; + +/** A read-only {@link FileSystem} implementation looking up resources using a ClassLoader. */ Review comment: It's used on the Python side. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433400) Time Spent: 21h 50m (was: 21h 40m) > Update artifact staging and retrieval protocols to be dependency aware. > --- > > Key: BEAM-9577 > URL: https://issues.apache.org/jira/browse/BEAM-9577 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Time Spent: 21h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.
[ https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433398&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433398 ] ASF GitHub Bot logged work on BEAM-9577: Author: ASF GitHub Bot Created on: 14/May/20 22:19 Start Date: 14/May/20 22:19 Worklog Time Spent: 10m Work Description: robertwb commented on a change in pull request #11708: URL: https://github.com/apache/beam/pull/11708#discussion_r425462707 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/io/ClassLoaderFileSystemTest.java ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.io; + +import static java.nio.channels.Channels.newInputStream; +import static org.junit.Assert.assertArrayEquals; + +import java.io.IOException; +import java.io.InputStream; +import java.nio.channels.ReadableByteChannel; +import org.apache.beam.sdk.options.PipelineOptionsFactory; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.junit.runners.JUnit4; + +@RunWith(JUnit4.class) +public class ClassLoaderFileSystemTest { + + private static final String SOME_CLASS = + "classpath://org/apache/beam/sdk/io/ClassLoaderFilesystem.class"; + + @Test + public void testOpen() throws IOException { +ClassLoaderFileSystem filesystem = new ClassLoaderFileSystem(); +ReadableByteChannel channel = filesystem.open(filesystem.matchNewResource(SOME_CLASS, false)); +checkIsClass(channel); + } + + @Test + public void testRegistrar() throws IOException { +ReadableByteChannel channel = FileSystems.open(FileSystems.matchNewResource(SOME_CLASS, false)); +checkIsClass(channel); + } + + public void checkIsClass(ReadableByteChannel channel) throws IOException { +FileSystems.setDefaultPipelineOptions(PipelineOptionsFactory.create()); +InputStream inputStream = newInputStream(channel); +byte[] magic = new byte[4]; +inputStream.read(magic); +assertArrayEquals(magic, new byte[] {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE}); Review comment: It's used on the Python side. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433398) Time Spent: 21.5h (was: 21h 20m) > Update artifact staging and retrieval protocols to be dependency aware. 
> --- > > Key: BEAM-9577 > URL: https://issues.apache.org/jira/browse/BEAM-9577 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Time Spent: 21.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
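The magic-number check in the quoted test can be sketched in Python as follows. The four bytes 0xCA 0xFE 0xBA 0xBE are the standard header of a Java class file; the helper names are illustrative and not part of Beam. The read loop is there because a single `read()` on a stream may return fewer bytes than requested, an assumption the quoted Java test does not guard against.

```python
import io

# Every Java class file starts with these four bytes.
CLASS_FILE_MAGIC = bytes([0xCA, 0xFE, 0xBA, 0xBE])


def read_exactly(stream, count):
    """Read up to `count` bytes, looping until done or the stream ends."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break  # End of stream before `count` bytes were available.
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)


def looks_like_class_file(stream):
    """Return True when the stream begins with the class-file magic number."""
    return read_exactly(stream, len(CLASS_FILE_MAGIC)) == CLASS_FILE_MAGIC


class OneByteAtATime:
    """Test double that returns at most one byte per read() call."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, count):
        return self._buffer.read(1)
```

With the loop, the check still passes on a stream that hands back one byte per call, whereas comparing the result of a single short read would fail spuriously.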
[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.
[ https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433399&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433399 ] ASF GitHub Bot logged work on BEAM-9577: Author: ASF GitHub Bot Created on: 14/May/20 22:19 Start Date: 14/May/20 22:19 Worklog Time Spent: 10m Work Description: robertwb commented on a change in pull request #11708: URL: https://github.com/apache/beam/pull/11708#discussion_r425462830 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/ClassLoaderFileSystem.java ## @@ -0,0 +1,154 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.io; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument; + +import com.google.auto.service.AutoService; +import java.io.IOException; +import java.io.InputStream; +import java.nio.channels.Channels; +import java.nio.channels.ReadableByteChannel; +import java.nio.channels.WritableByteChannel; +import java.util.Collection; +import java.util.List; +import javax.annotation.Nullable; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.io.fs.CreateOptions; +import org.apache.beam.sdk.io.fs.MatchResult; +import org.apache.beam.sdk.io.fs.ResolveOptions; +import org.apache.beam.sdk.io.fs.ResourceId; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList; + +/** A read-only {@link FileSystem} implementation looking up resources using a ClassLoader. */ +public class ClassLoaderFileSystem extends FileSystem { + + public static final String SCHEMA = "classpath"; + private static final String PREFIX = SCHEMA + "://"; + + ClassLoaderFileSystem() {} + + @Override + protected List match(List specs) throws IOException { +throw new UnsupportedOperationException("Un-globable filesystem."); + } + + @Override + protected WritableByteChannel create( + ClassLoaderResourceId resourceId, CreateOptions createOptions) throws IOException { +throw new UnsupportedOperationException("Read-only filesystem."); + } + + @Override + protected ReadableByteChannel open(ClassLoaderResourceId resourceId) throws IOException { +ClassLoader classLoader = getClass().getClassLoader(); +InputStream inputStream = + classLoader.getResourceAsStream(resourceId.path.substring(PREFIX.length())); +if (inputStream == null) { + throw new IOException("Unable to load " + resourceId.path + " with " + classLoader); +} +return Channels.newChannel(inputStream); + } + + @Override + protected void copy( + List srcResourceIds, 
List destResourceIds) + throws IOException { +throw new UnsupportedOperationException("Read-only filesystem."); + } + + @Override + protected void rename( + List srcResourceIds, List destResourceIds) + throws IOException { +throw new UnsupportedOperationException("Read-only filesystem."); + } + + @Override + protected void delete(Collection resourceIds) throws IOException { +throw new UnsupportedOperationException("Read-only filesystem."); + } + + @Override + protected ClassLoaderResourceId matchNewResource(String path, boolean isDirectory) { +return new ClassLoaderResourceId(path); + } + + @Override + protected String getScheme() { +return SCHEMA; + } + + public static class ClassLoaderResourceId implements ResourceId { + +private final String path; + +private ClassLoaderResourceId(String path) { + checkArgument(path.startsWith(PREFIX), path); + this.path = path; +} + +@Override +public ResourceId resolve(String other, ResolveOptions resolveOptions) { Review comment: The documentation is in the super classes, but I added a test. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastruct
[jira] [Work logged] (BEAM-9641) Support ZetaSQL DATE functions in BeamSQL
[ https://issues.apache.org/jira/browse/BEAM-9641?focusedWorklogId=433397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433397 ] ASF GitHub Bot logged work on BEAM-9641: Author: ASF GitHub Bot Created on: 14/May/20 22:16 Start Date: 14/May/20 22:16 Worklog Time Spent: 10m Work Description: robinyqiu commented on pull request #11272: URL: https://github.com/apache/beam/pull/11272#issuecomment-628914752 Rebased against master. Please run precommit tests again. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433397) Time Spent: 2h 50m (was: 2h 40m) > Support ZetaSQL DATE functions in BeamSQL > - > > Key: BEAM-9641 > URL: https://issues.apache.org/jira/browse/BEAM-9641 > Project: Beam > Issue Type: New Feature > Components: dsl-sql-zetasql >Reporter: Yueyang Qiu >Assignee: Yueyang Qiu >Priority: Major > Labels: zetasql-compliance > Time Spent: 2h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107733#comment-17107733 ]

Brian Hulette commented on BEAM-9975:
-------------------------------------

vpt_vp_arg13 and vpt_vp_arg14 are from [another test|https://github.com/apache/beam/blob/b91560cc354da471e3de502aad78dd059997a3d0/sdks/python/apache_beam/options/value_provider_test.py#L187]. It looks like something isn't being cleaned up properly between test runs. (Also I wonder if we need to resolve ValueProvider instances when converting pipeline options to proto?)

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -------------------------------------------------------------------------
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Brian Hulette
> Assignee: Brian Hulette
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Error looks similar to the one in BEAM-9907. Example from
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> apache_beam/pipeline.py:550: in __exit__
>     self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
>     return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
>     job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
>     prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
>     pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in get_pipeline_options
>     return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
>     return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: in ParseDict
>     parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: in ConvertMessage
>     methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: in _ConvertStructMessage
>     self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> self =
> value = 0x7f69eb7b3ac8>
> message =
>     def _ConvertValueMessage(self, value, message):
>       """Convert a JSON representation into Value message."""
>       if isinstance(value, dict):
>         self._ConvertStructMessage(value, message.struct_value)
>       elif isinstance(value, list):
>         self._ConvertListValueMessage(value, message.list_value)
>       elif value is None:
>         message.null_value = 0
>       elif isinstance(value, bool):
>         message.bool_value = value
>       elif isinstance(value, six.string_types):
>         message.string_value = value
>       elif isinstance(value, _INT_OR_FLOAT):
>         message.number_value = value
>       else:
> >       raise ParseError('Unexpected type for Value message.')
> E       google.protobuf.json_format.ParseError: Unexpected type for Value message.
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
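The comment above suspects that leftover ValueProvider objects in the pipeline options are what the Struct conversion rejects. A minimal sketch of how such entries could be located before conversion follows. It only mirrors, approximately, the branch order of the `_ConvertValueMessage` code quoted in the traceback and does not call protobuf or Beam; `find_unconvertible` and `FakeValueProvider` are hypothetical names introduced for illustration, and `numbers.Real` is an assumed stand-in for `_INT_OR_FLOAT`.

```python
import numbers


class FakeValueProvider:
    """Hypothetical stand-in for a leftover ValueProvider object (not Beam's class)."""


def find_unconvertible(options, path=""):
    """Return the paths of values that match none of the quoted type branches."""
    bad = []
    if isinstance(options, dict):
        # Struct branch: recurse into every field.
        for key, value in options.items():
            bad.extend(find_unconvertible(value, f"{path}/{key}" if path else key))
    elif isinstance(options, list):
        # ListValue branch: recurse into every element.
        for index, value in enumerate(options):
            bad.extend(find_unconvertible(value, f"{path}[{index}]"))
    elif options is None or isinstance(options, (bool, str, numbers.Real)):
        # null, bool, string and number branches are all accepted.
        pass
    else:
        # Anything else would reach the final `else` and raise ParseError.
        bad.append(path)
    return bad


# Illustrative options dict; the keys are copied from the logged dict,
# the FakeValueProvider value is an assumption about what was stripped there.
options = {
    "beam:option:streaming:v1": False,
    "beam:option:experiments:v1": ["beam_fn_api"],
    "beam:option:profile_sample_rate:v1": 1.0,
    "beam:option:vpt_vp_arg13:v1": FakeValueProvider(),
}
print(find_unconvertible(options))
```

Under these assumptions the check flags only the `vpt_vp_arg13` entry, which is consistent with the suspicion that options registered by another test leak into this one. Whether such values should be resolved or dropped before conversion is the open question raised in the comment.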
[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107729#comment-17107729 ] Brian Hulette commented on BEAM-9975: - The RuntimeValueProvider instances are probably the issue > PortableRunnerTest flake "ParseError: Unexpected type for Value message." > - > > Key: BEAM-9975 > URL: https://issues.apache.org/jira/browse/BEAM-9975 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107727#comment-17107727 ] Brian Hulette commented on BEAM-9975: - Got an [error log|https://builds.apache.org/job/beam_PreCommit_Python_Cron/2753/testReport/junit/apache_beam.runners.portability.portable_runner_test/PortableRunnerTest/test_read/]: {code} ERROR:root:Failed to parse dict {'beam:option:streaming:v1': False, 'beam:option:beam_services:v1': {}, 'beam:option:type_check_strictness:v1': 'DEFAULT_TO_ANY', 'beam:option:pipeline_type_check:v1': True, 'beam:option:runtime_type_check:v1': False, 'beam:option:direct_runner_use_stacked_bundle:v1': True, 'beam:option:direct_runner_bundle_repeat:v1': '0', 'beam:option:direct_num_workers:v1': '1', 'beam:option:direct_running_mode:v1': 'in_memory', 'beam:option:dataflow_endpoint:v1': 'https://dataflow.googleapis.com', 'beam:option:job_name:v1': 'test_read_1589482267.7994738', 'beam:option:no_auth:v1': False, 'beam:option:update:v1': False, 'beam:option:enable_streaming_engine:v1': False, 'beam:option:hdfs_full_urls:v1': False, 'beam:option:experiments:v1': ['state_cache_size=100', 'data_buffer_time_limit_ms=1000', 'beam_fn_api'], 'beam:option:profile_cpu:v1': False, 'beam:option:profile_memory:v1': False, 'beam:option:profile_sample_rate:v1': 1.0, 'beam:option:save_main_session:v1': False, 'beam:option:sdk_location:v1': 'container', 'beam:option:job_endpoint:v1': 'localhost:35763', 'beam:option:job_server_timeout:v1': '60', 'beam:option:environment_type:v1': 'beam:env:embedded_python:v1', 'beam:option:sdk_worker_parallelism:v1': '1', 'beam:option:environment_cache_millis:v1': '0', 'beam:option:job_port:v1': '0', 'beam:option:artifact_port:v1': '0', 'beam:option:expansion_port:v1': '0', 'beam:option:flink_master:v1': '[auto]', 'beam:option:flink_version:v1': '1.10', 'beam:option:flink_submit_uber_jar:v1': False, 'beam:option:spark_master_url:v1': 'local[4]', 
'beam:option:spark_submit_uber_jar:v1': False, 'beam:option:dry_run:v1': False, 'beam:option:style:v1': 'scrambled', 'beam:option:influx_hostname:v1': 'http://localhost:8086', 'beam:option:timeout_ms:v1': '0', 'beam:option:mock_flag:v1': False, 'beam:option:fake_flag:v1': False, 'beam:option:m_flag:v1': False, 'beam:option:male:v1': False, 'beam:option:redefined_flag:v1': False, 'beam:option:vpt_vp_arg13:v1': , 'beam:option:vpt_vp_arg14:v1': } {code} > PortableRunnerTest flake "ParseError: Unexpected type for Value message." > - > > Key: BEAM-9975 > URL: https://issues.apache.org/jira/browse/BEAM-9975 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h
[jira] [Assigned] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette reassigned BEAM-9975: --- Assignee: Brian Hulette > PortableRunnerTest flake "ParseError: Unexpected type for Value message." > - > > Key: BEAM-9975 > URL: https://issues.apache.org/jira/browse/BEAM-9975 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9951) Create Go SDK synthetic sources.
[ https://issues.apache.org/jira/browse/BEAM-9951?focusedWorklogId=433387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433387 ] ASF GitHub Bot logged work on BEAM-9951: Author: ASF GitHub Bot Created on: 14/May/20 21:56 Start Date: 14/May/20 21:56 Worklog Time Spent: 10m Work Description: youngoli commented on a change in pull request #11665: URL: https://github.com/apache/beam/pull/11665#discussion_r425453509 ## File path: sdks/go/pkg/beam/io/synthetic/source.go ## @@ -0,0 +1,151 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package synthetic contains transforms for creating synthetic pipelines. +// Synthetic pipelines are pipelines that simulate the behavior of possible +// pipelines in order to test performance, splitting, liquid sharding, and +// various other infrastructure used for running pipelines. This category of +// tests are not concerned with the correctness of the elements themselves, but +// need to simulate transforms that output many elements throughout varying +// pipeline shapes. 
+package synthetic + +import ( + "github.com/apache/beam/sdks/go/pkg/beam" + "github.com/apache/beam/sdks/go/pkg/beam/io/rtrackers/offsetrange" + "math/rand" + "time" +) + +// Source creates a synthetic source transform that emits randomly +// generated KV<[]byte, []byte> elements. +// +// This transform accepts a PCollection of SourceConfig, where each SourceConfig +// determines the synthetic source's behavior for that element. Review comment: No, it applies to the synthetic steps, so it would be in an upcoming StepConfig, but that would probably still have a similar interface. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433387) Time Spent: 40m (was: 0.5h) > Create Go SDK synthetic sources. > > > Key: BEAM-9951 > URL: https://issues.apache.org/jira/browse/BEAM-9951 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Create synthetic sources for the Go SDK like > [Java|https://github.com/apache/beam/tree/master/sdks/java/io/synthetic/src/main/java/org/apache/beam/sdk/io/synthetic] > and > [Python|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/synthetic_pipeline.py] > have. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8949) Add Spanner IO Integration Test for Python
[ https://issues.apache.org/jira/browse/BEAM-8949?focusedWorklogId=433382&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433382 ] ASF GitHub Bot logged work on BEAM-8949: Author: ASF GitHub Bot Created on: 14/May/20 21:44 Start Date: 14/May/20 21:44 Worklog Time Spent: 10m Work Description: pabloem merged pull request #11210: URL: https://github.com/apache/beam/pull/11210 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433382) Time Spent: 12.5h (was: 12h 20m) > Add Spanner IO Integration Test for Python > -- > > Key: BEAM-8949 > URL: https://issues.apache.org/jira/browse/BEAM-8949 > Project: Beam > Issue Type: Test > Components: io-py-gcp >Reporter: Shoaib Zafar >Assignee: Shoaib Zafar >Priority: Major > Time Spent: 12.5h > Remaining Estimate: 0h > > Spanner IO (Python SDK) contains PTransform which uses the BatchAPI to read > from the spanner. Currently, it only contains direct runner unit tests. In > order to make this functionality available for the users, integration tests > also need to be added. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433381&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433381 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 21:41 Start Date: 14/May/20 21:41 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-628901207 Run Java PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433381) Time Spent: 2h 10m (was: 2h) > Setting workerCacheMb to make its way to the WindmillStateCache Constructor > --- > > Key: BEAM-9964 > URL: https://issues.apache.org/jira/browse/BEAM-9964 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Omar Ismail >Assignee: Omar Ismail >Priority: Minor > Time Spent: 2h 10m > Remaining Estimate: 0h > > Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, > the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to > make it allowable to change the cache value in Streaming when setting > -workerCacheMB. > I've never made changes to the Beam SDK, so I am super excited to work on > this! > > [[1] > https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.
[ https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=433376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433376 ] ASF GitHub Bot logged work on BEAM-9577: Author: ASF GitHub Bot Created on: 14/May/20 21:30 Start Date: 14/May/20 21:30 Worklog Time Spent: 10m Work Description: ibzib commented on a change in pull request #11708: URL: https://github.com/apache/beam/pull/11708#discussion_r425439878 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/ClassLoaderFileSystem.java ## @@ -0,0 +1,154 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.io; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument; + +import com.google.auto.service.AutoService; +import java.io.IOException; +import java.io.InputStream; +import java.nio.channels.Channels; +import java.nio.channels.ReadableByteChannel; +import java.nio.channels.WritableByteChannel; +import java.util.Collection; +import java.util.List; +import javax.annotation.Nullable; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.io.fs.CreateOptions; +import org.apache.beam.sdk.io.fs.MatchResult; +import org.apache.beam.sdk.io.fs.ResolveOptions; +import org.apache.beam.sdk.io.fs.ResourceId; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList; + +/** A read-only {@link FileSystem} implementation looking up resources using a ClassLoader. */ +public class ClassLoaderFileSystem extends FileSystem { + + public static final String SCHEMA = "classpath"; + private static final String PREFIX = SCHEMA + "://"; + + ClassLoaderFileSystem() {} + + @Override + protected List match(List specs) throws IOException { +throw new UnsupportedOperationException("Un-globable filesystem."); + } + + @Override + protected WritableByteChannel create( + ClassLoaderResourceId resourceId, CreateOptions createOptions) throws IOException { +throw new UnsupportedOperationException("Read-only filesystem."); + } + + @Override + protected ReadableByteChannel open(ClassLoaderResourceId resourceId) throws IOException { +ClassLoader classLoader = getClass().getClassLoader(); +InputStream inputStream = + classLoader.getResourceAsStream(resourceId.path.substring(PREFIX.length())); +if (inputStream == null) { + throw new IOException("Unable to load " + resourceId.path + " with " + classLoader); +} +return Channels.newChannel(inputStream); + } + + @Override + protected void copy( + List srcResourceIds, 
List destResourceIds) + throws IOException { +throw new UnsupportedOperationException("Read-only filesystem."); + } + + @Override + protected void rename( + List srcResourceIds, List destResourceIds) + throws IOException { +throw new UnsupportedOperationException("Read-only filesystem."); + } + + @Override + protected void delete(Collection resourceIds) throws IOException { +throw new UnsupportedOperationException("Read-only filesystem."); + } + + @Override + protected ClassLoaderResourceId matchNewResource(String path, boolean isDirectory) { +return new ClassLoaderResourceId(path); + } + + @Override + protected String getScheme() { +return SCHEMA; + } + + public static class ClassLoaderResourceId implements ResourceId { + +private final String path; + +private ClassLoaderResourceId(String path) { + checkArgument(path.startsWith(PREFIX), path); + this.path = path; +} + +@Override +public ResourceId resolve(String other, ResolveOptions resolveOptions) { Review comment: Can we add a couple trivial unit tests as sanity checks / documentation for `resolve` and `getCurrentDirectory`? ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/ClassLoaderFileSystem.java ## @@ -0,0 +1,154 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements.
[jira] [Work logged] (BEAM-9833) Add .asf.yaml file
[ https://issues.apache.org/jira/browse/BEAM-9833?focusedWorklogId=433368&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433368 ] ASF GitHub Bot logged work on BEAM-9833: Author: ASF GitHub Bot Created on: 14/May/20 21:17 Start Date: 14/May/20 21:17 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #11613: URL: https://github.com/apache/beam/pull/11613#issuecomment-628891422 Done the changes suggested by @robertwb let only rebase disabled. Can someone PTAL. Thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433368) Time Spent: 2h 10m (was: 2h) > Add .asf.yaml file > -- > > Key: BEAM-9833 > URL: https://issues.apache.org/jira/browse/BEAM-9833 > Project: Beam > Issue Type: New Feature > Components: build-system >Reporter: Kyle Weaver >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > Github links haven't been automatically added to Jira for a week or so. > According to comments on INFRA-20171 and related issues, we need to add a > .asf.yaml file to configure our notification settings. > https://s.apache.org/asfyaml-notify -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433367&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433367 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 21:17 Start Date: 14/May/20 21:17 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-628891444 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433367) Time Spent: 2h (was: 1h 50m) > Setting workerCacheMb to make its way to the WindmillStateCache Constructor > --- > > Key: BEAM-9964 > URL: https://issues.apache.org/jira/browse/BEAM-9964 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Omar Ismail >Assignee: Omar Ismail >Priority: Minor > Time Spent: 2h > Remaining Estimate: 0h > > Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, > the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to > make it allowable to change the cache value in Streaming when setting > -workerCacheMB. > I've never made changes to the Beam SDK, so I am super excited to work on > this! > > [[1] > https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433365&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433365 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 21:16 Start Date: 14/May/20 21:16 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-628890978 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433365) Time Spent: 1h 50m (was: 1h 40m) > Setting workerCacheMb to make its way to the WindmillStateCache Constructor > --- > > Key: BEAM-9964 > URL: https://issues.apache.org/jira/browse/BEAM-9964 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Omar Ismail >Assignee: Omar Ismail >Priority: Minor > Time Spent: 1h 50m > Remaining Estimate: 0h > > Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, > the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to > make it allowable to change the cache value in Streaming when setting > -workerCacheMB. > I've never made changes to the Beam SDK, so I am super excited to work on > this! > > [[1] > https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6950) Beam Dependency Update Request: com.github.spotbugs
[ https://issues.apache.org/jira/browse/BEAM-6950?focusedWorklogId=433364&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433364 ] ASF GitHub Bot logged work on BEAM-6950: Author: ASF GitHub Bot Created on: 14/May/20 21:12 Start Date: 14/May/20 21:12 Worklog Time Spent: 10m Work Description: iemejia opened a new pull request #11712: URL: https://github.com/apache/beam/pull/11712 R: @pabloem This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433364) Remaining Estimate: 0h Time Spent: 10m > Beam Dependency Update Request: com.github.spotbugs > --- > > Key: BEAM-6950 > URL: https://issues.apache.org/jira/browse/BEAM-6950 > Project: Beam > Issue Type: Bug > Components: dependencies >Reporter: Beam JIRA Bot >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > - 2019-04-01 12:15:04.215434 > - > Please consider upgrading the dependency com.github.spotbugs. > The current version is None. The latest version is None > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433363 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 21:11 Start Date: 14/May/20 21:11 Worklog Time Spent: 10m Work Description: omarismail94 commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-62729 New changes passed ./gradlew -p runners/google-cloud-dataflow-java check on my computer This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433363) Time Spent: 1h 40m (was: 1.5h) > Setting workerCacheMb to make its way to the WindmillStateCache Constructor > --- > > Key: BEAM-9964 > URL: https://issues.apache.org/jira/browse/BEAM-9964 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Omar Ismail >Assignee: Omar Ismail >Priority: Minor > Time Spent: 1h 40m > Remaining Estimate: 0h > > Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, > the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to > make it allowable to change the cache value in Streaming when setting > -workerCacheMB. > I've never made changes to the Beam SDK, so I am super excited to work on > this! > > [[1] > https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
[ https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433362&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433362 ] ASF GitHub Bot logged work on BEAM-9825: Author: ASF GitHub Bot Created on: 14/May/20 21:09 Start Date: 14/May/20 21:09 Worklog Time Spent: 10m Work Description: lukecwik commented on a change in pull request #11610: URL: https://github.com/apache/beam/pull/11610#discussion_r425432135 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/SetFns.java ## @@ -0,0 +1,528 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.transforms; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull; + +import java.util.ArrayList; +import java.util.List; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.transforms.join.CoGbkResult; +import org.apache.beam.sdk.transforms.join.CoGroupByKey; +import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollectionList; +import org.apache.beam.sdk.values.TupleTag; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Iterables; + +public class SetFns { Review comment: We should mention in the comments that we rely on the deterministic encoding of the coder, similar to how we do for GroupByKey. Also, this implementation assumes that there will only be a single firing of the trigger. If there are multiple firings, the results are likely undefined. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433362) Remaining Estimate: 89h (was: 89h 10m) Time Spent: 7h (was: 6h 50m) > Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll > -- > > Key: BEAM-9825 > URL: https://issues.apache.org/jira/browse/BEAM-9825 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Darshan Jani >Assignee: Darshan Jani >Priority: Major > Original Estimate: 96h > Time Spent: 7h > Remaining Estimate: 89h > > I'd like to propose the following new high-level transforms. > * Intersect > Compute the intersection between elements of two PCollections. 
> Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are common to both _leftCollection_ and > _rightCollection_ > > * Except > Compute the difference between elements of two PCollections. > Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are in _leftCollection_ but not in > _rightCollection_ > * Union > Find the elements that are in either of two PCollections. > Implement IntersectAll, ExceptAll and UnionAll variants of the transforms. -- This message was sent by Atlassian Jira (v8.3.4#803005)
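The set semantics described in BEAM-9825 above can be sketched with plain Java collections. This is only an illustration and not the SetFns code from pull request #11610: the class and method names (SetSemanticsSketch, intersectDistinct, intersectAll, exceptDistinct, exceptAll) are hypothetical, and the behaviour of the "All" variants is assumed to follow SQL INTERSECT ALL and EXCEPT ALL, which the issue text does not spell out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration only; hypothetical names, not the Beam SetFns API.
final class SetSemanticsSketch {

  // Intersect: distinct elements present in both inputs.
  static <T> List<T> intersectDistinct(List<T> left, List<T> right) {
    Set<T> result = new LinkedHashSet<>(left);
    result.retainAll(new LinkedHashSet<>(right));
    return new ArrayList<>(result);
  }

  // IntersectAll (assumed SQL semantics): each element is kept
  // min(leftCount, rightCount) times.
  static <T> List<T> intersectAll(List<T> left, List<T> right) {
    Map<T, Integer> rightCounts = countOccurrences(right);
    List<T> result = new ArrayList<>();
    for (T elem : left) {
      int remaining = rightCounts.getOrDefault(elem, 0);
      if (remaining > 0) {
        result.add(elem);
        rightCounts.put(elem, remaining - 1);
      }
    }
    return result;
  }

  // Except: distinct elements of left that do not occur in right.
  static <T> List<T> exceptDistinct(List<T> left, List<T> right) {
    Set<T> result = new LinkedHashSet<>(left);
    result.removeAll(new LinkedHashSet<>(right));
    return new ArrayList<>(result);
  }

  // ExceptAll (assumed SQL semantics): each element is kept
  // max(leftCount - rightCount, 0) times.
  static <T> List<T> exceptAll(List<T> left, List<T> right) {
    Map<T, Integer> rightCounts = countOccurrences(right);
    List<T> result = new ArrayList<>();
    for (T elem : left) {
      int remaining = rightCounts.getOrDefault(elem, 0);
      if (remaining > 0) {
        // This left occurrence is cancelled by one right occurrence.
        rightCounts.put(elem, remaining - 1);
      } else {
        result.add(elem);
      }
    }
    return result;
  }

  // Counts how many times each element occurs in the list.
  static <T> Map<T, Integer> countOccurrences(List<T> elems) {
    Map<T, Integer> counts = new LinkedHashMap<>();
    for (T elem : elems) {
      counts.merge(elem, 1, Integer::sum);
    }
    return counts;
  }
}
```

Under these assumptions, intersecting [a, a, b, c] with [a, a, a, b, d] gives [a, b] for the distinct variant and [a, a, b] for the All variant. A real PCollection is unordered and distributed, so the review comment about deterministic coders and single trigger firings still applies to the actual transform.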
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433359&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433359 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 21:07 Start Date: 14/May/20 21:07 Worklog Time Spent: 10m Work Description: omarismail94 commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-628886984 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433359) Time Spent: 1.5h (was: 1h 20m) > Setting workerCacheMb to make its way to the WindmillStateCache Constructor > --- > > Key: BEAM-9964 > URL: https://issues.apache.org/jira/browse/BEAM-9964 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Omar Ismail >Assignee: Omar Ismail >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > > Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, > the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to > make it allowable to change the cache value in Streaming when setting > -workerCacheMB. > I've never made changes to the Beam SDK, so I am super excited to work on > this! > > [[1] > https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-4596) Release Candidates' Maven Repository
[ https://issues.apache.org/jira/browse/BEAM-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107676#comment-17107676 ] Ismaël Mejía commented on BEAM-4596: Release candidates repositories are published at the moment of the vote and they are available until the release is done (or a new RC is out), so closing this one. > Release Candidates' Maven Repository > > > Key: BEAM-4596 > URL: https://issues.apache.org/jira/browse/BEAM-4596 > Project: Beam > Issue Type: Improvement > Components: build-system, website >Affects Versions: 2.4.0 >Reporter: Cemalettin Koç >Priority: Minor > > I would like to give a try with 2.5.0-RC2 release candidates but I could not > find anywhere these files. Please provide necessary repositories. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-4596) Release Candidates' Maven Repository
[ https://issues.apache.org/jira/browse/BEAM-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía resolved BEAM-4596. Fix Version/s: Not applicable Resolution: Invalid > Release Candidates' Maven Repository > > > Key: BEAM-4596 > URL: https://issues.apache.org/jira/browse/BEAM-4596 > Project: Beam > Issue Type: Improvement > Components: build-system, website >Affects Versions: 2.4.0 >Reporter: Cemalettin Koç >Priority: Minor > Fix For: Not applicable > > > I would like to give a try with 2.5.0-RC2 release candidates but I could not > find anywhere these files. Please provide necessary repositories. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433358&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433358 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 21:05 Start Date: 14/May/20 21:05 Worklog Time Spent: 10m Work Description: omarismail94 commented on a change in pull request #11710: URL: https://github.com/apache/beam/pull/11710#discussion_r425430039 ## File path: runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillStateCacheTest.java ## @@ -130,7 +133,8 @@ private static StateNamespace triggerNamespace(long start, int triggerIdx) { @Before public void setUp() { -cache = new WindmillStateCache(); +options = PipelineOptionsFactory.as(DataflowWorkerHarnessOptions.class); +cache = new WindmillStateCache(options.getWorkerCacheMb()); assertEquals(0, cache.getWeight()); Review comment: Fixed this by adding a new test method to the WindmillStateCacheTest class. I created a new getter in WindmillStateCache to retrieve the max weight in bytes, and compared it to the initial value that was set. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433358) Time Spent: 1h 20m (was: 1h 10m) > Setting workerCacheMb to make its way to the WindmillStateCache Constructor > --- > > Key: BEAM-9964 > URL: https://issues.apache.org/jira/browse/BEAM-9964 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Omar Ismail >Assignee: Omar Ismail >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > > Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, > the cache seems to be hardcoded to 100Mb [1]. 
If possible, I would like to > make it allowable to change the cache value in Streaming when setting > -workerCacheMB. > I've never made changes to the Beam SDK, so I am super excited to work on > this! > > [[1] > https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
[ https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=433355&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433355 ] ASF GitHub Bot logged work on BEAM-9825: Author: ASF GitHub Bot Created on: 14/May/20 20:59 Start Date: 14/May/20 20:59 Worklog Time Spent: 10m Work Description: amaliujia commented on pull request #11610: URL: https://github.com/apache/beam/pull/11610#issuecomment-628883607 Thank you! The Jira looks good to me! Will merge this PR tomorrow if there are no other comments. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433355) Remaining Estimate: 89h 10m (was: 89h 20m) Time Spent: 6h 50m (was: 6h 40m) > Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll > -- > > Key: BEAM-9825 > URL: https://issues.apache.org/jira/browse/BEAM-9825 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Darshan Jani >Assignee: Darshan Jani >Priority: Major > Original Estimate: 96h > Time Spent: 6h 50m > Remaining Estimate: 89h 10m > > I'd like to propose the following new high-level transforms. > * Intersect > Compute the intersection between elements of two PCollections. > Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are common to both _leftCollection_ and > _rightCollection_ > > * Except > Compute the difference between elements of two PCollections. > Given _leftCollection_ and _rightCollection_, this transform returns a > collection containing elements that are in _leftCollection_ but not in > _rightCollection_ > * Union > Find the elements that are in either of two PCollections. > Implement IntersectAll, ExceptAll and UnionAll variants of the transforms. 
[jira] [Updated] (BEAM-9992) Migrate BeamSQL's SET operators to SetFns transforms
[ https://issues.apache.org/jira/browse/BEAM-9992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated BEAM-9992: --- Status: Open (was: Triage Needed) > Migrate BeamSQL's SET operators to SetFns transforms > > > Key: BEAM-9992 > URL: https://issues.apache.org/jira/browse/BEAM-9992 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Darshan Jani >Assignee: Darshan Jani >Priority: Major > > As part of [BEAM-9946|https://issues.apache.org/jira/browse/BEAM-9946] we have > new SetFns transforms for intersect, union and except. > This Jira is to use them and remove the existing Set operators in the BeamSQL code. > Tasks: > # Remove: > [BeamSetOperatorRelBase.java|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamSetOperatorRelBase.java] > # Use SetFns transforms in > ## > [BeamIntersectRel.java|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamIntersectRel.java] > ## > [BeamMinusRel.java|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamMinusRel.java] > ## > [BeamMinusRel.java|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamMinusRel.java] > ## > [BeamUnionRel.java|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamUnionRel.java] > # > Remove: [BeamSetOperatorsTransforms.java|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/transform/BeamSetOperatorsTransforms.java] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9946) Enhance Partition transform to provide partitionfn with SideInputs
[ https://issues.apache.org/jira/browse/BEAM-9946?focusedWorklogId=433351&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433351 ] ASF GitHub Bot logged work on BEAM-9946: Author: ASF GitHub Bot Created on: 14/May/20 20:52 Start Date: 14/May/20 20:52 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #11682: URL: https://github.com/apache/beam/pull/11682#issuecomment-628880635 This LGTM and is well written. I will trigger the tests. Added @kennknowles in case I am missing something related to the Java APIs. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433351) Remaining Estimate: 95h 50m (was: 96h) Time Spent: 10m > Enhance Partition transform to provide partitionfn with SideInputs > -- > > Key: BEAM-9946 > URL: https://issues.apache.org/jira/browse/BEAM-9946 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Darshan Jani >Assignee: Darshan Jani >Priority: Major > Original Estimate: 96h > Time Spent: 10m > Remaining Estimate: 95h 50m > > Currently the _Partition_ transform can partition a collection into n collections > based only on the _element_ value passed to _PartitionFn_, which decides which partition a > particular element belongs to. > {code:java} > public interface PartitionFn<T> extends Serializable { > int partitionFor(T elem, int numPartitions); > } > public static <T> Partition<T> of(int numPartitions, PartitionFn<? super T> > partitionFn) { > return new Partition<>(new PartitionDoFn<T>(numPartitions, partitionFn)); > } > {code} > It would be useful to introduce a new API with additional _sideInputs_ provided > to the partition function. 
Users will be able to write logic that uses both the _element_ > value and _sideInputs_ to decide which partition a particular element > belongs to. > Option-1: Proposed new API: > {code:java} > public interface PartitionWithSideInputsFn<T> extends Serializable { > int partitionFor(T elem, int numPartitions, Context c); > } > public static <T> Partition<T> of(int numPartitions, > PartitionWithSideInputsFn<? super T> partitionFn, Requirements requirements) { > ... > } > {code} > Users can use either of the two APIs, depending on their partitioning logic. > Option-2: Redesign the old API with a builder pattern that can optionally take > a _Requirements_ with _sideInputs_. Deprecate the old API. > {code:java} > // using sideviews > Partition.into(numberOfPartitions).via( > fn( > (input,c) -> { > // use c.sideInput(view) > // use input > // return partitionnumber > },requiresSideInputs(view)) > ) > // without using sideviews > Partition.into(numberOfPartitions).via( > fn((input,c) -> { > // use input > // return partitionnumber > }) > ) > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
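The idea behind Option-1 in BEAM-9946 above, a partition function that sees both the element and a side input, can be sketched in plain Java without the Beam SDK. Everything here is an assumption made for illustration: PartitionSketch, its nested PartitionWithSideInputsFn interface, and the extra type parameter S that stands in for the proposal's Context and Requirements are hypothetical and are not the API proposed in pull request #11682.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; hypothetical names, not the Beam Partition API.
final class PartitionSketch {

  // Stand-in for the proposed function: the third argument plays the role of
  // a side input (the proposal passes a Context instead).
  interface PartitionWithSideInputsFn<T, S> {
    int partitionFor(T elem, int numPartitions, S sideInput);
  }

  // Splits the input into numPartitions lists, asking fn where each element goes.
  static <T, S> List<List<T>> partition(
      List<T> input, int numPartitions, PartitionWithSideInputsFn<T, S> fn, S sideInput) {
    List<List<T>> partitions = new ArrayList<>();
    for (int i = 0; i < numPartitions; i++) {
      partitions.add(new ArrayList<>());
    }
    for (T elem : input) {
      int index = fn.partitionFor(elem, numPartitions, sideInput);
      if (index < 0 || index >= numPartitions) {
        throw new IllegalArgumentException("Partition index out of range: " + index);
      }
      partitions.get(index).add(elem);
    }
    return partitions;
  }
}
```

For example, with a threshold of 4 as the side input and the function `(Integer score, int n, Integer threshold) -> score >= threshold ? 1 : 0`, the input [1, 5, 10, 3] is split into [1, 3] and [5, 10]. In a real pipeline the threshold would typically come from a PCollectionView computed elsewhere, which is what makes the side-input variant useful.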
[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors
[ https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433348&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433348 ] ASF GitHub Bot logged work on BEAM-9468: Author: ASF GitHub Bot Created on: 14/May/20 20:40 Start Date: 14/May/20 20:40 Worklog Time Spent: 10m Work Description: jaketf commented on pull request #11339: URL: https://github.com/apache/beam/pull/11339#issuecomment-628875209 @pabloem I'm not sure what's going on with the class loading IT issue but I think we could re-run the precommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433348) Time Spent: 41h 50m (was: 41h 40m) > Add Google Cloud Healthcare API IO Connectors > - > > Key: BEAM-9468 > URL: https://issues.apache.org/jira/browse/BEAM-9468 > Project: Beam > Issue Type: New Feature > Components: io-java-gcp >Reporter: Jacob Ferriero >Assignee: Jacob Ferriero >Priority: Minor > Time Spent: 41h 50m > Remaining Estimate: 0h > > Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud > Healthcare API|https://cloud.google.com/healthcare/docs/] > HL7v2IO > FHIRIO > DICOM -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9770) Add BigQuery DeadLetter pattern to Patterns Page
[ https://issues.apache.org/jira/browse/BEAM-9770?focusedWorklogId=433345&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433345 ] ASF GitHub Bot logged work on BEAM-9770: Author: ASF GitHub Bot Created on: 14/May/20 20:38 Start Date: 14/May/20 20:38 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #11437: URL: https://github.com/apache/beam/pull/11437#issuecomment-628874108 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433345) Time Spent: 2.5h (was: 2h 20m) > Add BigQuery DeadLetter pattern to Patterns Page > > > Key: BEAM-9770 > URL: https://issues.apache.org/jira/browse/BEAM-9770 > Project: Beam > Issue Type: New Feature > Components: website >Reporter: Reza ardeshir rokni >Assignee: Reza ardeshir rokni >Priority: Trivial > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors
[ https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433342&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433342 ] ASF GitHub Bot logged work on BEAM-9468: Author: ASF GitHub Bot Created on: 14/May/20 20:34 Start Date: 14/May/20 20:34 Worklog Time Spent: 10m Work Description: jaketf commented on pull request #11339: URL: https://github.com/apache/beam/pull/11339#issuecomment-628872612 - [x] Fixed javadoc checkstyle issue - [x] Added pubsub.close() to the @After in FhirIOReadIT (this should fix the pubsub not shutdown properly message) - [x] Added KV coders to FhirIO.Import see pre-commit output here. https://gist.github.com/jaketf/e7c9116daed60e7babffdb4c99ae1d54 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433342) Time Spent: 41h 40m (was: 41.5h) > Add Google Cloud Healthcare API IO Connectors > - > > Key: BEAM-9468 > URL: https://issues.apache.org/jira/browse/BEAM-9468 > Project: Beam > Issue Type: New Feature > Components: io-java-gcp >Reporter: Jacob Ferriero >Assignee: Jacob Ferriero >Priority: Minor > Time Spent: 41h 40m > Remaining Estimate: 0h > > Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud > Healthcare API|https://cloud.google.com/healthcare/docs/] > HL7v2IO > FHIRIO > DICOM -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9941) Add a test to prevent a regression in Dataflow when using a Flatten with different input/output coder followed by a GBK
[ https://issues.apache.org/jira/browse/BEAM-9941?focusedWorklogId=44&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-44 ] ASF GitHub Bot logged work on BEAM-9941: Author: ASF GitHub Bot Created on: 14/May/20 20:15 Start Date: 14/May/20 20:15 Worklog Time Spent: 10m Work Description: lukecwik merged pull request #11666: URL: https://github.com/apache/beam/pull/11666 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 44) Time Spent: 1.5h (was: 1h 20m) > Add a test to prevent a regression in Dataflow when using a Flatten with > different input/output coder followed by a GBK > --- > > Key: BEAM-9941 > URL: https://issues.apache.org/jira/browse/BEAM-9941 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Luke Cwik >Assignee: Craig Chambers >Priority: Minor > Fix For: 2.22.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Dataflow used to fail when having an input coder that differed from the > output coder when followed by a GBK because of an optimization it is > performing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9998) Figures and associated text in Windowing section of programming guide should be updated
[ https://issues.apache.org/jira/browse/BEAM-9998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Wrede updated BEAM-9998: -- Description: All figures and associated text in [section 8.1 of the programming guide|https://beam.apache.org/documentation/programming-guide/#windowing-basics] should be updated and clarified to be more intuitive (i.e. not sourcing from a database table, changing "Table rows" to some other intermediate PCollection, etc.). Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to TextIO and describes windowing with a bounded collection, so the figure should be updated to match the text and scenario described. was: All figures and associated text in [section 8.1 of the programming guide|#windowing-basics] should be updated and clarified to be more intuitive (i.e. not sourcing from a database table, changing "Table rows" to some other intermediate PCollection, etc.). Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to TextIO and describes windowing with a bounded collection, so the figure should be updated to match the text and scenario described. > Figures and associated text in Windowing section of programming guide should > be updated > --- > > Key: BEAM-9998 > URL: https://issues.apache.org/jira/browse/BEAM-9998 > Project: Beam > Issue Type: Bug > Components: website >Reporter: David Wrede >Priority: Major > > All figures and associated text in [section 8.1 of the programming > guide|https://beam.apache.org/documentation/programming-guide/#windowing-basics] > should be updated and clarified to be more intuitive (i.e. not sourcing from > a database table, changing "Table rows" to some other intermediate > PCollection, etc.). > Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to > TextIO and describes windowing with a bounded collection, so the figure > should be updated to match the text and scenario described. 
> > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9998) Figures and associated text in Windowing section of programming guide should be updated
[ https://issues.apache.org/jira/browse/BEAM-9998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Wrede updated BEAM-9998: -- Description: All figures and associated text in [section 8.1 of the programming guide|#windowing-basics] should be updated and clarified to be more intuitive (i.e. not sourcing from a database table, changing "Table rows" to some other intermediate PCollection, etc.). Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to TextIO and describes windowing with a bounded collection, so the figure should be updated to match the text and scenario described. was: All figures and associated text in [section 8.1 of the programming guide|[https://beam.apache.org/documentation/programming-guide/#windowing-basics]] should be updated and clarified to be more intuitive (i.e. not sourcing from a database table, changing "Table rows" to some other intermediate PCollection, etc.). Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to TextIO and describes windowing with a bounded collection, so the figure should be updated to match the text and scenario described. > Figures and associated text in Windowing section of programming guide should > be updated > --- > > Key: BEAM-9998 > URL: https://issues.apache.org/jira/browse/BEAM-9998 > Project: Beam > Issue Type: Bug > Components: website >Reporter: David Wrede >Priority: Major > > All figures and associated text in [section 8.1 of the programming > guide|#windowing-basics] should be updated and clarified to be more intuitive > (i.e. not sourcing from a database table, changing "Table rows" to some other > intermediate PCollection, etc.). > Also, Figure 4 in 8.1.2 uses KafkaIO, but the text in that section refers to > TextIO and describes windowing with a bounded collection, so the figure > should be updated to match the text and scenario described. > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8949) Add Spanner IO Integration Test for Python
[ https://issues.apache.org/jira/browse/BEAM-8949?focusedWorklogId=433324&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433324 ] ASF GitHub Bot logged work on BEAM-8949: Author: ASF GitHub Bot Created on: 14/May/20 19:46 Start Date: 14/May/20 19:46 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11210: URL: https://github.com/apache/beam/pull/11210#issuecomment-628850653 Run Python 3.7 PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433324) Time Spent: 12h 20m (was: 12h 10m) > Add Spanner IO Integration Test for Python > -- > > Key: BEAM-8949 > URL: https://issues.apache.org/jira/browse/BEAM-8949 > Project: Beam > Issue Type: Test > Components: io-py-gcp >Reporter: Shoaib Zafar >Assignee: Shoaib Zafar >Priority: Major > Time Spent: 12h 20m > Remaining Estimate: 0h > > Spanner IO (Python SDK) contains PTransform which uses the BatchAPI to read > from the spanner. Currently, it only contains direct runner unit tests. In > order to make this functionality available for the users, integration tests > also need to be added. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors
[ https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433323&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433323 ] ASF GitHub Bot logged work on BEAM-9468: Author: ASF GitHub Bot Created on: 14/May/20 19:46 Start Date: 14/May/20 19:46 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11339: URL: https://github.com/apache/beam/pull/11339#issuecomment-628850532 afaik, the test classes should be bundled together (see https://github.com/apache/beam/blob/4c7d5460ba1d643e7fd57fa8f2f8e6a87d80bee5/sdks/java/io/google-cloud-platform/build.gradle#L118-L122) @mwalenia do you know why the testutil classes could be missing from the IT test? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433323) Time Spent: 41.5h (was: 41h 20m) > Add Google Cloud Healthcare API IO Connectors > - > > Key: BEAM-9468 > URL: https://issues.apache.org/jira/browse/BEAM-9468 > Project: Beam > Issue Type: New Feature > Components: io-java-gcp >Reporter: Jacob Ferriero >Assignee: Jacob Ferriero >Priority: Minor > Time Spent: 41.5h > Remaining Estimate: 0h > > Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud > Healthcare API|https://cloud.google.com/healthcare/docs/] > HL7v2IO > FHIRIO > DICOM -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9821) SpannerIO does not include all batching parameters in DisplayData.
[ https://issues.apache.org/jira/browse/BEAM-9821?focusedWorklogId=433317&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433317 ] ASF GitHub Bot logged work on BEAM-9821: Author: ASF GitHub Bot Created on: 14/May/20 19:40 Start Date: 14/May/20 19:40 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on a change in pull request #11528: URL: https://github.com/apache/beam/pull/11528#discussion_r425385260
## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java ##
@@ -991,6 +1001,24 @@ public WriteGrouped(Write spec) {
       this.spec = spec;
     }
 
+    @Override
+    public void populateDisplayData(DisplayData.Builder builder) {
+      super.populateDisplayData(builder);
+      spec.getSpannerConfig().populateDisplayData(builder);
+      builder.add(
+          DisplayData.item("batchSizeBytes", spec.getBatchSizeBytes())
+              .withLabel("Max batch size in bytes"));
+      builder.add(
+          DisplayData.item("maxNumMutations", spec.getMaxNumMutations())
+              .withLabel("Max number of mutated cells in each batch"));
+      builder.add(
+          DisplayData.item("maxNumRows", spec.getMaxNumRows())
+              .withLabel("Max number of rows in each batch"));
+      builder.add(
+          DisplayData.item("groupingFactor", spec.getGroupingFactor())
+              .withLabel("Number of batches to sort over"));
+    }
+
Review comment: I added some commits to tweak these descriptions a bit. @nielm, can you confirm that they're still correct? Also, I wonder if you can reuse the implementation in `Write` by calling `spec.populateDisplayData(builder)` here instead?
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433317) Time Spent: 40m (was: 0.5h) > SpannerIO does not include all batching parameters in DisplayData. > -- > > Key: BEAM-9821 > URL: https://issues.apache.org/jira/browse/BEAM-9821 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.20.0, 2.21.0 >Reporter: Niel Markwick >Assignee: Niel Markwick >Priority: Minor > Labels: google-cloud-spanner > Fix For: 2.22.0 > > Time Spent: 40m > Remaining Estimate: 0h > > SpannerIO Write and WriteGrouped do not populate all of the batching/grouping > parameters in their DisplayData – they only show "batchSizeBytes" -- This message was sent by Atlassian Jira (v8.3.4#803005)
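The reuse suggested in the review comment above (having `WriteGrouped` call `spec.populateDisplayData(builder)` rather than repeating each item) can be sketched in isolation. This is only an illustration under stated assumptions: `Builder`, `Write` and `WriteGrouped` below are hypothetical stand-ins written against the JDK alone, not Beam's `DisplayData.Builder` or the real `SpannerIO` classes, and the parameter values are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DisplayDataDelegation {
    /** Hypothetical stand-in for a display-data builder: records key/value items. */
    public static class Builder {
        public final Map<String, Object> items = new LinkedHashMap<>();

        public Builder add(String key, Object value) {
            items.put(key, value);
            return this;
        }
    }

    /** Hypothetical stand-in for the Write spec holding the batching parameters. */
    public static class Write {
        private final long batchSizeBytes;
        private final long maxNumMutations;
        private final long maxNumRows;
        private final int groupingFactor;

        public Write(long batchSizeBytes, long maxNumMutations, long maxNumRows, int groupingFactor) {
            this.batchSizeBytes = batchSizeBytes;
            this.maxNumMutations = maxNumMutations;
            this.maxNumRows = maxNumRows;
            this.groupingFactor = groupingFactor;
        }

        /** The single place where the batching parameters are listed. */
        public void populateDisplayData(Builder builder) {
            builder.add("batchSizeBytes", batchSizeBytes)
                   .add("maxNumMutations", maxNumMutations)
                   .add("maxNumRows", maxNumRows)
                   .add("groupingFactor", groupingFactor);
        }
    }

    /** Hypothetical stand-in for WriteGrouped: delegates instead of duplicating. */
    public static class WriteGrouped {
        private final Write spec;

        public WriteGrouped(Write spec) {
            this.spec = spec;
        }

        public void populateDisplayData(Builder builder) {
            spec.populateDisplayData(builder);
        }
    }

    public static void main(String[] args) {
        Builder builder = new Builder();
        // Illustrative values only; they are not SpannerIO defaults.
        new WriteGrouped(new Write(1024L, 5000L, 500L, 1000)).populateDisplayData(builder);
        builder.items.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

With delegation, a parameter added to `Write` later shows up in both transforms without a second edit, which is the kind of drift the issue describes.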
[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors
[ https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433315&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433315 ] ASF GitHub Bot logged work on BEAM-9468: Author: ASF GitHub Bot Created on: 14/May/20 19:38 Start Date: 14/May/20 19:38 Worklog Time Spent: 10m Work Description: jaketf edited a comment on pull request #11339: URL: https://github.com/apache/beam/pull/11339#issuecomment-628842736
Yeah, all the FhirIO read tests are parameterized tests that are all failing like this:
```
WARNING: No terminal state was returned within allotted timeout. State value RUNNING
May 14, 2020 2:37:23 AM org.apache.beam.runners.dataflow.TestDataflowRunner waitForStreamingJobTermination
INFO: Dataflow job 2020-05-13_19_27_22-9842849986096969383 took longer than 600 seconds to complete, cancelling.
May 14, 2020 2:37:23 AM org.apache.beam.runners.dataflow.TestDataflowRunner run
WARNING: Dataflow job 2020-05-13_19_27_22-9842849986096969383 did not output a success or failure metric.
May 14, 2020 2:37:24 AM io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference cleanQueue
SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=15, target=pubsub.googleapis.com:443} was not shutdown properly!!! ~*~*~*
    Make sure to call shutdown()/shutdownNow() and wait until awaitTermination() returns true.
java.lang.RuntimeException: ManagedChannel allocation site
```
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433315) Time Spent: 41h 20m (was: 41h 10m) > Add Google Cloud Healthcare API IO Connectors > - > > Key: BEAM-9468 > URL: https://issues.apache.org/jira/browse/BEAM-9468 > Project: Beam > Issue Type: New Feature > Components: io-java-gcp >Reporter: Jacob Ferriero >Assignee: Jacob Ferriero >Priority: Minor > Time Spent: 41h 20m > Remaining Estimate: 0h > > Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud > Healthcare API|https://cloud.google.com/healthcare/docs/] > HL7v2IO > FHIRIO > DICOM -- This message was sent by Atlassian Jira (v8.3.4#803005)
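The SEVERE entry in the log above asks for `shutdown()`/`shutdownNow()` followed by a wait on `awaitTermination()`. A minimal sketch of that sequence follows. As an assumption made to keep it runnable without gRPC on the classpath, it uses the JDK's `ExecutorService`, which happens to expose methods with the same three names as `io.grpc.ManagedChannel`. It is not gRPC code and says nothing about where the channel in the failing test is actually created. The helper name `shutdownAndAwait` is hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShutdownSketch {
    /**
     * Orderly shutdown: stop accepting new work, wait for termination,
     * and fall back to a forced shutdown if the wait times out.
     * Returns true if the resource reported termination.
     */
    static boolean shutdownAndAwait(ExecutorService resource, long timeoutSeconds)
            throws InterruptedException {
        resource.shutdown();
        if (resource.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
            return true;
        }
        // Graceful shutdown timed out: cancel outstanding work and wait once more.
        resource.shutdownNow();
        return resource.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in resource; in the scenario from the log this would be a channel.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        pool.submit(() -> System.out.println("work done"));
        System.out.println("terminated: " + shutdownAndAwait(pool, 5));
    }
}
```

In a test class, a helper like this would typically run from a teardown method so that it executes even when the test body fails.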
[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors
[ https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433314&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433314 ] ASF GitHub Bot logged work on BEAM-9468: Author: ASF GitHub Bot Created on: 14/May/20 19:37 Start Date: 14/May/20 19:37 Worklog Time Spent: 10m Work Description: jaketf commented on pull request #11339: URL: https://github.com/apache/beam/pull/11339#issuecomment-628846183 There is a separate crop of issues in the execute bundles test: ```java.lang.NoClassDefFoundError: Could not initialize class org.apache.beam.sdk.io.gcp.healthcare.FhirIOTestUtil``` Are test utility classes not bundled up and sent to Dataflow? Should I move this under main in the source tree? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433314) Time Spent: 41h 10m (was: 41h) > Add Google Cloud Healthcare API IO Connectors > - > > Key: BEAM-9468 > URL: https://issues.apache.org/jira/browse/BEAM-9468 > Project: Beam > Issue Type: New Feature > Components: io-java-gcp >Reporter: Jacob Ferriero >Assignee: Jacob Ferriero >Priority: Minor > Time Spent: 41h 10m > Remaining Estimate: 0h > > Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud > Healthcare API|https://cloud.google.com/healthcare/docs/] > HL7v2IO > FHIRIO > DICOM -- This message was sent by Atlassian Jira (v8.3.4#803005)
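For context on the error text quoted above: on the JVM, a `NoClassDefFoundError` whose message begins with "Could not initialize class" is generally raised when a class was located but an earlier attempt to run its static initializer failed. A class that is missing from the classpath usually produces a different message. The sketch below reproduces that behaviour with a hypothetical `BrokenUtil` class. It is not `FhirIOTestUtil` and makes no claim about what actually fails in the PR's tests.

```java
public class InitFailureDemo {
    /** Hypothetical stand-in for a utility class whose static setup fails. */
    static class BrokenUtil {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final String VALUE = load();

        private static String load() {
            throw new IllegalStateException("simulated failure in static initializer");
        }
    }

    /** Touches BrokenUtil and reports the simple name of whatever was thrown. */
    static String touch() {
        try {
            return BrokenUtil.VALUE;
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // The first use runs the static initializer, which fails.
        System.out.println(touch());
        // Later uses find the class marked as unusable.
        System.out.println(touch());
    }
}
```

On a standard JVM the first call reports `ExceptionInInitializerError` and every later call reports `NoClassDefFoundError`, so the first failure in a worker log may be the more informative one.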
[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors
[ https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433313&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433313 ] ASF GitHub Bot logged work on BEAM-9468: Author: ASF GitHub Bot Created on: 14/May/20 19:35 Start Date: 14/May/20 19:35 Worklog Time Spent: 10m Work Description: jaketf commented on pull request #11339: URL: https://github.com/apache/beam/pull/11339#issuecomment-628845239 [FhirIOReadIT](https://github.com/apache/beam/pull/11339/files#diff-7a1359c60a094e73275769adb69b35d3R57) borrows the [TestPubsubSignal](https://github.com/apache/beam/blob/44f3d3f2d52e93a34b05068bf76f6f9d2611bf77/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/TestPubsubSignal.java) pattern from [PubsubReadIT](https://github.com/apache/beam/blob/a9e14ff7f4b1aafa9915cb95e1bb7e7d3ab6a28b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubReadIT.java#L41). This works fine on the direct runner for me, so it seems to be specific to the Dataflow runner or to TestPubsubSignal. I will check whether I'm failing to close a channel in my tests. @pabloem Have we seen any flakiness with this before? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433313) Time Spent: 41h (was: 40h 50m) > Add Google Cloud Healthcare API IO Connectors > - > > Key: BEAM-9468 > URL: https://issues.apache.org/jira/browse/BEAM-9468 > Project: Beam > Issue Type: New Feature > Components: io-java-gcp >Reporter: Jacob Ferriero >Assignee: Jacob Ferriero >Priority: Minor > Time Spent: 41h > Remaining Estimate: 0h > > Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud > Healthcare API|https://cloud.google.com/healthcare/docs/] > HL7v2IO > FHIRIO > DICOM -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9993) Add option defaults for Flink Python tests
[ https://issues.apache.org/jira/browse/BEAM-9993?focusedWorklogId=433310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433310 ] ASF GitHub Bot logged work on BEAM-9993: Author: ASF GitHub Bot Created on: 14/May/20 19:31 Start Date: 14/May/20 19:31 Worklog Time Spent: 10m Work Description: ibzib opened a new pull request #11711: URL: https://github.com/apache/beam/pull/11711 With this change, it is now possible to run Flink Python unit tests without any setup or options. R: @robertwb Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). 
[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors
[ https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=433309&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433309 ] ASF GitHub Bot logged work on BEAM-9468: Author: ASF GitHub Bot Created on: 14/May/20 19:29 Start Date: 14/May/20 19:29 Worklog Time Spent: 10m Work Description: jaketf commented on pull request #11339: URL: https://github.com/apache/beam/pull/11339#issuecomment-628842736
Yeah, they are parameterized tests that are all failing like this:
```
WARNING: No terminal state was returned within allotted timeout. State value RUNNING
May 14, 2020 2:37:23 AM org.apache.beam.runners.dataflow.TestDataflowRunner waitForStreamingJobTermination
INFO: Dataflow job 2020-05-13_19_27_22-9842849986096969383 took longer than 600 seconds to complete, cancelling.
May 14, 2020 2:37:23 AM org.apache.beam.runners.dataflow.TestDataflowRunner run
WARNING: Dataflow job 2020-05-13_19_27_22-9842849986096969383 did not output a success or failure metric.
May 14, 2020 2:37:24 AM io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference cleanQueue
SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=15, target=pubsub.googleapis.com:443} was not shutdown properly!!! ~*~*~*
    Make sure to call shutdown()/shutdownNow() and wait until awaitTermination() returns true.
java.lang.RuntimeException: ManagedChannel allocation site
```
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 433309) Time Spent: 40h 50m (was: 40h 40m) > Add Google Cloud Healthcare API IO Connectors > - > > Key: BEAM-9468 > URL: https://issues.apache.org/jira/browse/BEAM-9468 > Project: Beam > Issue Type: New Feature > Components: io-java-gcp >Reporter: Jacob Ferriero >Assignee: Jacob Ferriero >Priority: Minor > Time Spent: 40h 50m > Remaining Estimate: 0h > > Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud > Healthcare API|https://cloud.google.com/healthcare/docs/] > HL7v2IO > FHIRIO > DICOM -- This message was sent by Atlassian Jira (v8.3.4#803005)