[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-18 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=92220&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92220
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 18/Apr/18 19:30
Start Date: 18/Apr/18 19:30
Worklog Time Spent: 10m 
  Work Description: szewi commented on issue #5170: [BEAM-4056] Basic 
performance tests analysis added.
URL: https://github.com/apache/beam/pull/5170#issuecomment-382502384
 
 
   You are right. It was supposed to be
https://issues.apache.org/jira/browse/BEAM-4065 ,
   however this build contains a new Jenkins job I would like to test.
   Let me wait for `Jenkins: Seed Job` to finish and run the tests to make sure the
configuration here is valid, and then I will close this PR. For now I will update the
description.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 92220)
Time Spent: 3h  (was: 2h 50m)

> Identify Side Inputs by PTransform ID and local name
> 
>
>                 Key: BEAM-4056
>                 URL: https://issues.apache.org/jira/browse/BEAM-4056
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-core
>            Reporter: Ben Sidhom
>            Assignee: Ben Sidhom
>            Priority: Minor
>             Fix For: Not applicable
>
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> This is necessary in order to correctly identify side inputs during all 
> phases of portable pipeline execution (fusion, translation, and SDK 
> execution).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-18 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=92219&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92219
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 18/Apr/18 19:24
Start Date: 18/Apr/18 19:24
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on issue #5170: [BEAM-4056] Basic 
performance tests analysis added.
URL: https://github.com/apache/beam/pull/5170#issuecomment-382500871
 
 
   I assume you tagged this with BEAM-4056 by mistake. I'm planning to mark 
that bug as closed now.


Issue Time Tracking
---

Worklog Id: (was: 92219)
Time Spent: 2h 50m  (was: 2h 40m)



[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-18 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=92214&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92214
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 18/Apr/18 18:53
Start Date: 18/Apr/18 18:53
Worklog Time Spent: 10m 
  Work Description: szewi opened a new pull request #5170: [BEAM-4056] 
Basic performance tests analysis added.
URL: https://github.com/apache/beam/pull/5170
 
 
   DESCRIPTION HERE
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and easily:

- [ ] Make sure there is a [JIRA issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the change (usually before you start working on it).  Trivial changes like typos do not require a JIRA issue.  Your pull request should address just this issue, without pulling in other changes.
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue.
- [ ] Write a pull request description that is detailed enough to understand:
  - [ ] What the pull request does
  - [ ] Why it does it
  - [ ] How it does it
  - [ ] Why this approach
- [ ] Each commit in the pull request should have a meaningful subject line and body.
- [ ] Run `mvn clean verify` to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   


Issue Time Tracking
---

Worklog Id: (was: 92214)
Time Spent: 2.5h  (was: 2h 20m)



[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-18 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=92215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92215
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 18/Apr/18 18:53
Start Date: 18/Apr/18 18:53
Worklog Time Spent: 10m 
  Work Description: szewi commented on issue #5170: [BEAM-4056] Basic 
performance tests analysis added.
URL: https://github.com/apache/beam/pull/5170#issuecomment-382491998
 
 
   Run Seed Job


Issue Time Tracking
---

Worklog Id: (was: 92215)
Time Spent: 2h 40m  (was: 2.5h)



[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=91430&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91430
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 16/Apr/18 18:36
Start Date: 16/Apr/18 18:36
Worklog Time Spent: 10m 
  Work Description: tgroh closed pull request #5118: [BEAM-4056] Identify 
side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/model/pipeline/src/main/proto/beam_runner_api.proto b/model/pipeline/src/main/proto/beam_runner_api.proto
index 83bb8a29f45..2bc2e34951c 100644
--- a/model/pipeline/src/main/proto/beam_runner_api.proto
+++ b/model/pipeline/src/main/proto/beam_runner_api.proto
@@ -217,12 +217,12 @@ message ExecutableStagePayload {
   // PTransform the ExecutableStagePayload is the payload of.
   string input = 2;
 
-  // Side Input PCollection ids. Each must be present as a value in the inputs of
-  // any PTransform the ExecutableStagePayload is the payload of.
-  repeated string side_inputs = 3;
+  // The side inputs required for this executable stage. Each Side Input of each PTransform within
+  // this ExecutableStagePayload must be represented within this field.
+  repeated SideInputId side_inputs = 3;
 
   // PTransform ids contained within this executable stage. This must contain at least one
-  // PTransform ID.
+  // PTransform id.
   repeated string transforms = 4;
 
   // Output PCollection ids. This must be equal to the values of the outputs of any
@@ -232,6 +232,16 @@ message ExecutableStagePayload {
   // (Required) The components for the Executable Stage. This must contain all of the Transforms
   // in transforms, and the closure of all of the components they recognize.
   Components components = 6;
+
+  // A reference to a side input. Side inputs are uniquely identified by PTransform id and
+  // local name.
+  message SideInputId {
+    // (Required) The id of the PTransform that references this side input.
+    string transform_id = 1;
+
+    // (Required) The local name of this side input from the PTransform that references it.
+    string local_name = 2;
+  }
 }
 
 // The payload for the primitive ParDo transform.
diff --git a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/ExecutableStage.java b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/ExecutableStage.java
index c41d0b8b587..50a1c9e1539 100644
--- a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/ExecutableStage.java
+++ b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/ExecutableStage.java
@@ -25,6 +25,7 @@
 import org.apache.beam.model.pipeline.v1.RunnerApi.Components;
 import org.apache.beam.model.pipeline.v1.RunnerApi.Environment;
 import org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload;
+import org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.SideInputId;
 import org.apache.beam.model.pipeline.v1.RunnerApi.FunctionSpec;
 import org.apache.beam.model.pipeline.v1.RunnerApi.PCollection;
 import org.apache.beam.model.pipeline.v1.RunnerApi.PTransform;
@@ -77,7 +78,7 @@
    * Returns the set of {@link PCollectionNode PCollections} that will be accessed by this {@link
    * ExecutableStage} as side inputs.
    */
-  Collection<PCollectionNode> getSideInputPCollections();
+  Collection<SideInputReference> getSideInputs();
 
   /**
    * Returns the leaf {@link PCollectionNode PCollections} of this {@link ExecutableStage}.
@@ -122,11 +123,16 @@ default PTransform toPTransform() {
     pt.putInputs("input", getInputPCollection().getId());
     payload.setInput(input.getId());
 
-    int sideInputIndex = 0;
-    for (PCollectionNode sideInputNode : getSideInputPCollections()) {
-      pt.putInputs(String.format("side_input_%s", sideInputIndex), sideInputNode.getId());
-      payload.addSideInputs(sideInputNode.getId());
-      sideInputIndex++;
+    for (SideInputReference sideInput : getSideInputs()) {
+      // Side inputs of the ExecutableStage itself can be uniquely identified by inner PTransform
+      // name and local name.
+      String outerLocalName = String.format("%s:%s", sideInput.transform(), sideInput.localName());
+      pt.putInputs(outerLocalName, sideInput.collection().getId());
+      payload.addSideInputs(
+          SideInputId.newBuilder()
+              .setTransformId(sideInput.transform().getId())
+              .setLocalName(sideInput.localName())
+              .build());
     }
 
     int 

[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=91407&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91407
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 16/Apr/18 17:18
Start Date: 16/Apr/18 17:18
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on issue #5118: [BEAM-4056] Identify 
side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#issuecomment-381681942
 
 
   Done.


Issue Time Tracking
---

Worklog Id: (was: 91407)
Time Spent: 2h 10m  (was: 2h)



[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-13 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=91043&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91043
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 14/Apr/18 01:30
Start Date: 14/Apr/18 01:30
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #5118: 
[BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#discussion_r181537077
 
 

 ##
 File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/SideInputReference.java
 ##
 @@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.core.construction.graph;
+
+import com.google.auto.value.AutoValue;
+import org.apache.beam.model.pipeline.v1.RunnerApi;
+import org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.SideInputId;
+import org.apache.beam.model.pipeline.v1.RunnerApi.PCollection;
+import org.apache.beam.model.pipeline.v1.RunnerApi.PTransform;
+import org.apache.beam.runners.core.construction.graph.PipelineNode.PCollectionNode;
+import org.apache.beam.runners.core.construction.graph.PipelineNode.PTransformNode;
+
+/**
+ * A reference to a side input. This includes the PTransform that references the side input as well
+ * as the PCollection referenced. Both are necessary in order to fully resolve a view.
+ */
+@AutoValue
+public abstract class SideInputReference {
+
+  /** Create a side input reference. */
+  public static SideInputReference of(PTransformNode transform, String localName,
 
 Review comment:
   Changed. Let me know if this is any better. I'm sad that there's no Beam 
auto-formatter.  


Issue Time Tracking
---

Worklog Id: (was: 91043)
Time Spent: 2h  (was: 1h 50m)



[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-13 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=91042&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91042
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 14/Apr/18 01:30
Start Date: 14/Apr/18 01:30
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #5118: 
[BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#discussion_r181537025
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -217,12 +217,12 @@ message ExecutableStagePayload {
   // PTransform the ExecutableStagePayload is the payload of.
   string input = 2;
 
-  // Side Input PCollection ids. Each must be present as a value in the inputs of
-  // any PTransform the ExecutableStagePayload is the payload of.
-  repeated string side_inputs = 3;
+  // The side inputs required for this executable stage. Each must be present as a side input of
+  // exactly one PTransform within this ExecutableStagePayload.
 
 Review comment:
   Done.


Issue Time Tracking
---

Worklog Id: (was: 91042)
Time Spent: 1h 50m  (was: 1h 40m)



[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-13 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=90994&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90994
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 13/Apr/18 22:28
Start Date: 13/Apr/18 22:28
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #5118: 
[BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#discussion_r181523237
 
 

 ##
 File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/SideInputReference.java
 ##
 @@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.core.construction.graph;
+
+import com.google.auto.value.AutoValue;
+import org.apache.beam.model.pipeline.v1.RunnerApi;
+import org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.SideInputId;
+import org.apache.beam.model.pipeline.v1.RunnerApi.PCollection;
+import org.apache.beam.model.pipeline.v1.RunnerApi.PTransform;
+import org.apache.beam.runners.core.construction.graph.PipelineNode.PCollectionNode;
+import org.apache.beam.runners.core.construction.graph.PipelineNode.PTransformNode;
+
+/**
+ * A reference to a side input. This includes the PTransform that references the side input as well
+ * as the PCollection referenced. Both are necessary in order to fully resolve a view.
+ */
+@AutoValue
+public abstract class SideInputReference {
+
+  /** Create a side input reference. */
+  public static SideInputReference of(PTransformNode transform, String localName,
 
 Review comment:
   Formatting?


Issue Time Tracking
---

Worklog Id: (was: 90994)
Time Spent: 1h 40m  (was: 1.5h)



[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-13 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=90993&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90993
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 13/Apr/18 22:28
Start Date: 13/Apr/18 22:28
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #5118: 
[BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#discussion_r181523199
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -217,12 +217,12 @@ message ExecutableStagePayload {
   // PTransform the ExecutableStagePayload is the payload of.
   string input = 2;
 
-  // Side Input PCollection ids. Each must be present as a value in the inputs of
-  // any PTransform the ExecutableStagePayload is the payload of.
-  repeated string side_inputs = 3;
+  // The side inputs required for this executable stage. Each must be present as a side input of
+  // exactly one PTransform within this ExecutableStagePayload.
 
 Review comment:
   I would modify this spec a little - "Each Side Input of each PTransform 
within this ExecutableStagePayload must be represented within this field." or 
thereabouts.
   
   That way it represents the minimum contents instead of the maximum contents; 
if we make more side inputs available than required, I don't expect either 
harness to break, just to be very confused.
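
   The "minimum contents" reading can be sketched as a coverage check: every (transform id, local name) pair required by a transform in the stage must appear in the declared list, while surplus declared entries are tolerated. This is only an illustration in plain Java; `SideInputCoverageSketch`, `Id`, and `missing` are hypothetical names and are not part of Beam or of this pull request.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Illustrative only: plain-Java stand-ins, not Beam classes. */
final class SideInputCoverageSketch {

  /** Hypothetical value type mirroring the two fields of the proto SideInputId. */
  static final class Id {
    final String transformId;
    final String localName;

    Id(String transformId, String localName) {
      this.transformId = transformId;
      this.localName = localName;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Id)) {
        return false;
      }
      Id other = (Id) o;
      return transformId.equals(other.transformId) && localName.equals(other.localName);
    }

    @Override
    public int hashCode() {
      return Objects.hash(transformId, localName);
    }

    @Override
    public String toString() {
      return transformId + ":" + localName;
    }
  }

  /**
   * Returns the required ids that are absent from the declared set. An empty result means
   * the minimum is satisfied; extra declared ids are ignored rather than rejected.
   */
  static Set<Id> missing(Map<String, List<String>> sideInputNamesByTransform, Set<Id> declared) {
    Set<Id> result = new LinkedHashSet<>();
    for (Map.Entry<String, List<String>> entry : sideInputNamesByTransform.entrySet()) {
      for (String localName : entry.getValue()) {
        Id id = new Id(entry.getKey(), localName);
        if (!declared.contains(id)) {
          result.add(id);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, List<String>> required =
        Collections.singletonMap("parDoA", Arrays.asList("side0", "side1"));
    Set<Id> declared = new HashSet<>();
    declared.add(new Id("parDoA", "side0"));
    declared.add(new Id("unrelated", "extra")); // surplus entry, tolerated
    System.out.println(missing(required, declared)); // prints [parDoA:side1]
  }
}
```

   Under this reading a stage that declares more side inputs than it needs still passes, which matches the intent of specifying a minimum rather than an exact set.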


Issue Time Tracking
---

Worklog Id: (was: 90993)
Time Spent: 1.5h  (was: 1h 20m)



[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-13 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=90943&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90943
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 13/Apr/18 19:00
Start Date: 13/Apr/18 19:00
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #5118: 
[BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#discussion_r181474184
 
 

 ##
 File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/ExecutableStage.java
 ##
 @@ -122,11 +123,16 @@ default PTransform toPTransform() {
     pt.putInputs("input", getInputPCollection().getId());
     payload.setInput(input.getId());
 
-    int sideInputIndex = 0;
-    for (PCollectionNode sideInputNode : getSideInputPCollections()) {
-      pt.putInputs(String.format("side_input_%s", sideInputIndex), sideInputNode.getId());
-      payload.addSideInputs(sideInputNode.getId());
-      sideInputIndex++;
+    for (SideInputReference sideInput : getSideInputs()) {
+      // Side inputs of the ExecutableStage itself can be uniquely identified by inner PTransform
+      // name and local name.
+      String outerLocalName = String.format("%s:%s",
+          sideInput.transformId(), sideInput.localName());
+      pt.putInputs(outerLocalName, sideInput.getCollection().getId());
+      payload.addSideInputs(SideInputId.newBuilder()
 
 Review comment:
   Let me know if this is better.
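
   As a rough illustration of why the stage-level input name combines the inner transform id with the local name, the sketch below builds such keys for two transforms that use the same local name and read the same PCollection. It uses plain strings instead of Beam types; `OuterLocalNameSketch`, `putSideInput`, and the ids `parDoA`, `parDoB`, `side0`, and `pc1` are made up for the example. Only the `"%s:%s"` format comes from the hunk above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: plain strings stand in for Beam's transform and PCollection ids. */
final class OuterLocalNameSketch {

  /** Builds the stage-level input name using the same "%s:%s" format as the hunk above. */
  static String outerLocalName(String transformId, String localName) {
    return String.format("%s:%s", transformId, localName);
  }

  /** Registers one side input on the stage's input map, rejecting duplicate keys. */
  static void putSideInput(
      Map<String, String> stageInputs, String transformId, String localName, String pcollectionId) {
    String key = outerLocalName(transformId, localName);
    if (stageInputs.containsKey(key)) {
      throw new IllegalArgumentException("Duplicate side input key: " + key);
    }
    stageInputs.put(key, pcollectionId);
  }

  public static void main(String[] args) {
    Map<String, String> stageInputs = new LinkedHashMap<>();
    // Two transforms reuse the local name "side0" and read the same PCollection "pc1".
    putSideInput(stageInputs, "parDoA", "side0", "pc1");
    putSideInput(stageInputs, "parDoB", "side0", "pc1");
    System.out.println(stageInputs); // prints {parDoA:side0=pc1, parDoB:side0=pc1}
  }
}
```

   Keying by PCollection id alone would collapse the two entries into one, and keying by local name alone would collide; the combined key keeps both references distinct. The sketch assumes transform ids do not themselves contain `:`.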


Issue Time Tracking
---

Worklog Id: (was: 90943)
Time Spent: 1h  (was: 50m)



[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-13 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=90945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90945
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 13/Apr/18 19:00
Start Date: 13/Apr/18 19:00
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #5118: 
[BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#discussion_r181473382
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -217,12 +217,12 @@ message ExecutableStagePayload {
   // PTransform the ExecutableStagePayload is the payload of.
   string input = 2;
 
-  // Side Input PCollection ids. Each must be present as a value in the inputs of
-  // any PTransform the ExecutableStagePayload is the payload of.
-  repeated string side_inputs = 3;
+  // The side inputs required for this executable stage. Each must be prsent as a side input of
 
 Review comment:
   Aren't you familiar with that common abbreviation??? It shaves off a whole 
character!


Issue Time Tracking
---

Worklog Id: (was: 90945)
Time Spent: 1h 20m  (was: 1h 10m)



[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-13 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=90944&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90944
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 13/Apr/18 19:00
Start Date: 13/Apr/18 19:00
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #5118: 
[BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#discussion_r181474800
 
 

 ##
 File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/SideInputReference.java
 ##
 @@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.core.construction.graph;
+
+import com.google.auto.value.AutoValue;
+import org.apache.beam.model.pipeline.v1.RunnerApi;
+import org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.SideInputId;
+import org.apache.beam.runners.core.construction.graph.PipelineNode.PCollectionNode;
+
+/**
+ * A reference to a side input. This includes the PTransform that references the side input as well
+ * as the PCollection referenced. Both are necessary in order to fully resolve a view.
+ */
+@AutoValue
+public abstract class SideInputReference {
+
+  /** Create a side input reference. */
+  public static SideInputReference of(String transformId, String localName,
 
 Review comment:
   Yes, it's available. We already require components everywhere due to 
PCollectionNode. Done.




Issue Time Tracking
---

Worklog Id: (was: 90944)
Time Spent: 1h 10m  (was: 1h)

> Identify Side Inputs by PTransform ID and local name
> 
>
> Key: BEAM-4056
> URL: https://issues.apache.org/jira/browse/BEAM-4056
> Project: Beam
> Issue Type: New Feature
> Components: runner-core
> Reporter: Ben Sidhom
> Assignee: Ben Sidhom
> Priority: Minor
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> This is necessary in order to correctly identify side inputs during all 
> phases of portable pipeline execution (fusion, translation, and SDK 
> execution).





[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-12 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=90626&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90626
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 12/Apr/18 22:58
Start Date: 12/Apr/18 22:58
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #5118: [BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#discussion_r181237746
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -217,12 +217,12 @@ message ExecutableStagePayload {
   // PTransform the ExecutableStagePayload is the payload of.
   string input = 2;
 
-  // Side Input PCollection ids. Each must be present as a value in the inputs of
-  // any PTransform the ExecutableStagePayload is the payload of.
-  repeated string side_inputs = 3;
+  // The side inputs required for this executable stage. Each must be prsent as a side input of
 
 Review comment:
   spelling




Issue Time Tracking
---

Worklog Id: (was: 90626)
Time Spent: 0.5h  (was: 20m)

> Identify Side Inputs by PTransform ID and local name
> 
>
> Key: BEAM-4056
> URL: https://issues.apache.org/jira/browse/BEAM-4056
> Project: Beam
> Issue Type: New Feature
> Components: runner-core
> Reporter: Ben Sidhom
> Assignee: Ben Sidhom
> Priority: Minor
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> This is necessary in order to correctly identify side inputs during all 
> phases of portable pipeline execution (fusion, translation, and SDK 
> execution).





[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-12 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=90627&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90627
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 12/Apr/18 22:58
Start Date: 12/Apr/18 22:58
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #5118: [BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#discussion_r181238037
 
 

 ##
 File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/SideInputReference.java
 ##
 @@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.core.construction.graph;
+
+import com.google.auto.value.AutoValue;
+import org.apache.beam.model.pipeline.v1.RunnerApi;
+import org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload.SideInputId;
+import org.apache.beam.runners.core.construction.graph.PipelineNode.PCollectionNode;
+
+/**
+ * A reference to a side input. This includes the PTransform that references the side input as well
+ * as the PCollection referenced. Both are necessary in order to fully resolve a view.
+ */
+@AutoValue
+public abstract class SideInputReference {
+
+  /** Create a side input reference. */
+  public static SideInputReference of(String transformId, String localName,
 
 Review comment:
   Maybe a PTransformNode? Would that be available everywhere we're 
constructing this?




Issue Time Tracking
---

Worklog Id: (was: 90627)
Time Spent: 40m  (was: 0.5h)

> Identify Side Inputs by PTransform ID and local name
> 
>
> Key: BEAM-4056
> URL: https://issues.apache.org/jira/browse/BEAM-4056
> Project: Beam
> Issue Type: New Feature
> Components: runner-core
> Reporter: Ben Sidhom
> Assignee: Ben Sidhom
> Priority: Minor
> Time Spent: 40m
> Remaining Estimate: 0h
>
> This is necessary in order to correctly identify side inputs during all 
> phases of portable pipeline execution (fusion, translation, and SDK 
> execution).





[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-12 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=90628&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90628
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 12/Apr/18 22:58
Start Date: 12/Apr/18 22:58
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #5118: [BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#discussion_r181237867
 
 

 ##
 File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/ExecutableStage.java
 ##
 @@ -122,11 +123,16 @@ default PTransform toPTransform() {
 pt.putInputs("input", getInputPCollection().getId());
 payload.setInput(input.getId());
 
-int sideInputIndex = 0;
-for (PCollectionNode sideInputNode : getSideInputPCollections()) {
-  pt.putInputs(String.format("side_input_%s", sideInputIndex), sideInputNode.getId());
-  payload.addSideInputs(sideInputNode.getId());
-  sideInputIndex++;
+for (SideInputReference sideInput : getSideInputs()) {
+  // Side inputs of the ExecutableStage itself can be uniquely identified by inner PTransform
+  // name and local name.
+  String outerLocalName = String.format("%s:%s",
+  sideInput.transformId(), sideInput.localName());
+  pt.putInputs(outerLocalName, sideInput.getCollection().getId());
+  payload.addSideInputs(SideInputId.newBuilder()
 
 Review comment:
   Your formatting looks funky here
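
   For readers following the hunk above, here is a minimal, self-contained
   sketch of the naming scheme it introduces: the outer input name is built
   from the inner transform id and the local name, joined with a colon. It is
   an illustration only, not the Beam implementation. The classes
   SideInputNaming and SideInputRef, the method stageInputs, and the ids used
   in main are all hypothetical stand-ins; only the "%s:%s" format and the
   "input" key come from the diff.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of "transformId:localName" side input keys. */
final class SideInputNaming {

  /** Hypothetical stand-in for SideInputReference; not the Beam class. */
  static final class SideInputRef {
    final String transformId;   // id of the inner PTransform that reads the side input
    final String localName;     // name of the side input within that PTransform
    final String collectionId;  // id of the PCollection backing the side input

    SideInputRef(String transformId, String localName, String collectionId) {
      this.transformId = transformId;
      this.localName = localName;
      this.collectionId = collectionId;
    }
  }

  private SideInputNaming() {}

  /** Builds the outer input name the same way the diff does: "%s:%s". */
  static String outerLocalName(SideInputRef ref) {
    return String.format("%s:%s", ref.transformId, ref.localName);
  }

  /** Maps outer input names to PCollection ids; rejects duplicate names. */
  static Map<String, String> stageInputs(String mainInputId, List<SideInputRef> sideInputs) {
    Map<String, String> inputs = new LinkedHashMap<>();
    inputs.put("input", mainInputId);
    for (SideInputRef ref : sideInputs) {
      String name = outerLocalName(ref);
      // putIfAbsent returns the previous value when the key already exists.
      if (inputs.putIfAbsent(name, ref.collectionId) != null) {
        throw new IllegalArgumentException("Duplicate side input name: " + name);
      }
    }
    return inputs;
  }

  public static void main(String[] args) {
    // Two transforms read the same PCollection under the same local name.
    List<SideInputRef> refs = Arrays.asList(
        new SideInputRef("parDoA", "view", "pc1"),
        new SideInputRef("parDoB", "view", "pc1"));
    System.out.println(stageInputs("pc0", refs));
  }
}
```

   In this sketch the two references stay distinguishable even though both
   point at the same PCollection, which a PCollection id alone could not
   express.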




Issue Time Tracking
---

Worklog Id: (was: 90628)
Time Spent: 50m  (was: 40m)

> Identify Side Inputs by PTransform ID and local name
> 
>
> Key: BEAM-4056
> URL: https://issues.apache.org/jira/browse/BEAM-4056
> Project: Beam
> Issue Type: New Feature
> Components: runner-core
> Reporter: Ben Sidhom
> Assignee: Ben Sidhom
> Priority: Minor
> Time Spent: 50m
> Remaining Estimate: 0h
>
> This is necessary in order to correctly identify side inputs during all 
> phases of portable pipeline execution (fusion, translation, and SDK 
> execution).





[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-12 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=90607&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90607
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 12/Apr/18 22:12
Start Date: 12/Apr/18 22:12
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on issue #5118: [BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#issuecomment-380960266
 
 
   R: @tgroh 




Issue Time Tracking
---

Worklog Id: (was: 90607)
Time Spent: 20m  (was: 10m)

> Identify Side Inputs by PTransform ID and local name
> 
>
> Key: BEAM-4056
> URL: https://issues.apache.org/jira/browse/BEAM-4056
> Project: Beam
> Issue Type: New Feature
> Components: runner-core
> Reporter: Ben Sidhom
> Assignee: Ben Sidhom
> Priority: Minor
> Time Spent: 20m
> Remaining Estimate: 0h
>
> This is necessary in order to correctly identify side inputs during all 
> phases of portable pipeline execution (fusion, translation, and SDK 
> execution).





[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-12 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=90606&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90606
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 12/Apr/18 22:12
Start Date: 12/Apr/18 22:12
Worklog Time Spent: 10m 
  Work Description: bsidhom opened a new pull request #5118: [BEAM-4056] Identify side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118
 
 
   This is necessary to identify side inputs during portable pipeline 
translation and execution.
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and easily:
   
- [ ] Make sure there is a [JIRA issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue.
- [ ] Write a pull request description that is detailed enough to understand:
  - [ ] What the pull request does
  - [ ] Why it does it
  - [ ] How it does it
  - [ ] Why this approach
- [ ] Each commit in the pull request should have a meaningful subject line and body.
- [ ] Run `mvn clean verify` to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   




Issue Time Tracking
---

Worklog Id: (was: 90606)
Time Spent: 10m
Remaining Estimate: 0h

> Identify Side Inputs by PTransform ID and local name
> 
>
> Key: BEAM-4056
> URL: https://issues.apache.org/jira/browse/BEAM-4056
> Project: Beam
> Issue Type: New Feature
> Components: runner-core
> Reporter: Ben Sidhom
> Assignee: Ben Sidhom
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> This is necessary in order to correctly identify side inputs during all 
> phases of portable pipeline execution (fusion, translation, and SDK 
> execution).


