[jira] [Work logged] (BEAM-4076) Schema followups

2018-05-27 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=106248&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106248
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/May/18 06:03
Start Date: 28/May/18 06:03
Worklog Time Spent: 10m 
  Work Description: kennknowles opened a new pull request #5498: 
[BEAM-4076] Remove unsafe methods from Schema.TypeName and Schema.FieldType
URL: https://github.com/apache/beam/pull/5498
 
 
   The methods `TypeName.type()`, `FieldType.withMapType()`, etc. all perform 
operations on types and type constructors that are not well-defined for many 
inputs. Since these methods are also not needed, this PR deletes them.
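
To illustrate what "not well-defined" can mean here, the following is a minimal sketch using simplified, hypothetical stand-ins (`FieldTypeSketch`, `unsafeFromName`, `isWellFormed`) rather than Beam's real `Schema.TypeName` and `Schema.FieldType`. It assumes the problem is that a type constructor such as ARRAY or MAP needs arguments, so a type built from the name alone is incomplete, whereas factory methods that demand the arguments cannot produce such a value.

```java
import java.util.Objects;

/** Hypothetical, simplified stand-ins; not Beam's actual Schema classes. */
final class FieldTypeSketch {
  enum TypeName { INT32, STRING, ARRAY, MAP }

  static final class FieldType {
    final TypeName typeName;
    final FieldType elementType; // set only for ARRAY
    final FieldType keyType;     // set only for MAP
    final FieldType valueType;   // set only for MAP

    private FieldType(TypeName typeName, FieldType elementType,
        FieldType keyType, FieldType valueType) {
      this.typeName = typeName;
      this.elementType = elementType;
      this.keyType = keyType;
      this.valueType = valueType;
    }

    // Atomic types: constructors that take no arguments.
    static final FieldType INT32 = new FieldType(TypeName.INT32, null, null, null);
    static final FieldType STRING = new FieldType(TypeName.STRING, null, null, null);

    // Composite types: the factory forces the caller to supply the arguments.
    static FieldType array(FieldType elementType) {
      return new FieldType(TypeName.ARRAY, Objects.requireNonNull(elementType), null, null);
    }

    static FieldType map(FieldType keyType, FieldType valueType) {
      return new FieldType(TypeName.MAP, null,
          Objects.requireNonNull(keyType), Objects.requireNonNull(valueType));
    }

    /** True when every argument required by the type constructor is present. */
    boolean isWellFormed() {
      switch (typeName) {
        case ARRAY:
          return elementType != null;
        case MAP:
          return keyType != null && valueType != null;
        default:
          return true;
      }
    }
  }

  /** The style this sketch argues against: a type built from its name alone. */
  static FieldType unsafeFromName(TypeName name) {
    return new FieldType(name, null, null, null);
  }

  public static void main(String[] args) {
    System.out.println(FieldType.array(FieldType.INT32).isWellFormed());
    System.out.println(unsafeFromName(TypeName.ARRAY).isWellFormed());
  }
}
```

In this sketch, `unsafeFromName(TypeName.INT32)` is harmless but `unsafeFromName(TypeName.ARRAY)` yields a value that no consumer can interpret, which is one plausible reading of why the name-only methods were considered unsafe.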
   
   This is stacked on #5497
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 106248)
Time Spent: 10m
Remaining Estimate: 0h

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4076) Schema followups

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=107391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107391
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 30/May/18 21:38
Start Date: 30/May/18 21:38
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #5498: [BEAM-4076] 
Remove unsafe methods from Schema.TypeName and Schema.FieldType
URL: https://github.com/apache/beam/pull/5498#issuecomment-393328092
 
 
   R: @akedin 
   
   Last in the trilogy.




Issue Time Tracking
---

Worklog Id: (was: 107391)
Time Spent: 20m  (was: 10m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=107512&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107512
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 31/May/18 02:13
Start Date: 31/May/18 02:13
Worklog Time Spent: 10m 
  Work Description: kennknowles closed pull request #5498: [BEAM-4076] 
Remove unsafe methods from Schema.TypeName and Schema.FieldType
URL: https://github.com/apache/beam/pull/5498
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
index 69da5645e11..817e0248ac1 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
@@ -17,8 +17,6 @@
  */
 package org.apache.beam.sdk.schemas;
 
-import static com.google.common.base.Preconditions.checkArgument;
-
 import com.google.auto.value.AutoValue;
 import com.google.common.collect.BiMap;
 import com.google.common.collect.HashBiMap;
@@ -84,69 +82,69 @@ public Builder addNullableField(String name, FieldType type) {
 }
 
 public Builder addByteField(String name) {
-  fields.add(Field.of(name, TypeName.BYTE.type()));
+  fields.add(Field.of(name, FieldType.BYTE));
   return this;
 }
 
 public Builder addInt16Field(String name) {
-  fields.add(Field.of(name, TypeName.INT16.type()));
+  fields.add(Field.of(name, FieldType.INT16));
   return this;
 }
 
 public Builder addInt32Field(String name) {
-  fields.add(Field.of(name, TypeName.INT32.type()));
+  fields.add(Field.of(name, FieldType.INT32));
   return this;
 }
 
 public Builder addInt64Field(String name) {
-  fields.add(Field.of(name, TypeName.INT64.type()));
+  fields.add(Field.of(name, FieldType.INT64));
   return this;
 }
 
 public Builder addDecimalField(String name) {
-  fields.add(Field.of(name, TypeName.DECIMAL.type()));
+  fields.add(Field.of(name, FieldType.DECIMAL));
   return this;
 }
 
 public Builder addFloatField(String name) {
-  fields.add(Field.of(name, TypeName.FLOAT.type()));
+  fields.add(Field.of(name, FieldType.FLOAT));
   return this;
 }
 
 public Builder addDoubleField(String name) {
-  fields.add(Field.of(name, TypeName.DOUBLE.type()));
+  fields.add(Field.of(name, FieldType.DOUBLE));
   return this;
 }
 
 public Builder addStringField(String name) {
-  fields.add(Field.of(name, TypeName.STRING.type()));
+  fields.add(Field.of(name, FieldType.STRING));
   return this;
 }
 
 public Builder addDateTimeField(String name) {
-  fields.add(Field.of(name, TypeName.DATETIME.type()));
+  fields.add(Field.of(name, FieldType.DATETIME));
   return this;
 }
 
 public Builder addBooleanField(String name) {
-  fields.add(Field.of(name, TypeName.BOOLEAN.type()));
+  fields.add(Field.of(name, FieldType.BOOLEAN));
   return this;
 }
 
 public Builder addArrayField(String name, FieldType collectionElementType) {
   fields.add(
-  Field.of(name, TypeName.ARRAY.type().withCollectionElementType(collectionElementType)));
+  Field.of(name, FieldType.array(collectionElementType)));
   return this;
 }
 
 public Builder addRowField(String name, Schema fieldSchema) {
-  fields.add(Field.of(name, TypeName.ROW.type().withRowSchema(fieldSchema)));
+  fields.add(Field.of(name, FieldType.row(fieldSchema)));
   return this;
 }
 
 public Builder addMapField(
 String name, FieldType keyType, FieldType valueType) {
-  fields.add(Field.of(name, TypeName.MAP.type().withMapType(keyType, valueType)));
+  fields.add(Field.of(name, FieldType.map(keyType, valueType)));
   return this;
 }
 
@@ -201,8 +199,13 @@ public int hashCode() {
 return fields;
   }
 
-  /**
-   * An enumerated list of supported types.
+  /** An enumerated list of type constructors.
+   *
+   * 
+   *   Atomic types are built from type constructors that take no arguments
+   *   Arrays, rows, and maps are type constructors that take additional
+   *   arguments to form a valid {@link FieldType}.
+   * 
*/
   public enum TypeName {
 BYTE,// One-byte signed integer.
@@ -248,16 +251,6 @@ public boolean isMapType() {
 public boolean isCompositeType() {
   return COMPOSITE_TYPES.contains(this);
 }
-
-/**
- * Returns a {@link FieldType} representing this primitive type.
- *

[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=108391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108391
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 03/Jun/18 20:10
Start Date: 03/Jun/18 20:10
Worklog Time Spent: 10m 
  Work Description: reuvenlax opened a new pull request #5545: [BEAM-4076] 
Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545
 
 
   This is the import of the schema branch into master. This implements basic 
end-to-end support of Schemas, along with automatic inference of schemas.
   
   These APIs are not yet final, and are all marked Experimental.
   
   R: @akedin 




Issue Time Tracking
---

Worklog Id: (was: 108391)
Time Spent: 40m  (was: 0.5h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=108466&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108466
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 04/Jun/18 07:31
Start Date: 04/Jun/18 07:31
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-394260473
 
 
   Spurious test failures are fixed. Dataflow tests are expected to fail until 
the Dataflow worker has been updated.




Issue Time Tracking
---

Worklog Id: (was: 108466)
Time Spent: 50m  (was: 40m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=108571&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108571
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 04/Jun/18 13:33
Start Date: 04/Jun/18 13:33
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #5545: [BEAM-4076] 
Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-394356109
 
 
   This seems to have a bunch of commits that have already been moved and also 
some that are sort of intermediate fixups. Can you squash & curate a bit?




Issue Time Tracking
---

Worklog Id: (was: 108571)
Time Spent: 1h  (was: 50m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=108582&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108582
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 04/Jun/18 13:44
Start Date: 04/Jun/18 13:44
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-394359654
 
 
   Squashed a bunch of commits.
   
   Unfortunately, I'm not sure what to do about the commits that were already
   moved. I don't think they are all identical to the ones that ended up in
   master (as there was some editing of them in another branch before merging
   to master); rather, they are roughly similar to what's in master. Later
   commits in this branch bring it back in line with master.
   
   On Mon, Jun 4, 2018 at 4:33 PM Kenn Knowles 
   wrote:
   
   > This seems to have a bunch of commits that have already been moved and
   > also some that are sort of intermediate fixups. Can you squash & curate a
   > bit?
   




Issue Time Tracking
---

Worklog Id: (was: 108582)
Time Spent: 1h 10m  (was: 1h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=112916&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112916
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 18/Jun/18 21:15
Start Date: 18/Jun/18 21:15
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-398198230
 
 
   @akedin friendly ping. Any update on this review?




Issue Time Tracking
---

Worklog Id: (was: 112916)
Time Spent: 1h 20m  (was: 1h 10m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=115720&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-115720
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 25/Jun/18 23:56
Start Date: 25/Jun/18 23:56
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-400131758
 
 
   @chamikaramj has agreed to help review the Beam-core parts of this PR. These 
are largely the following files:
   DoFnRunners.java
   SimpleDoFnRunner.java
   ParDoEvaluator.java
   DataflowPipelineTranslator.java
   Pipeline.java
   Create.java
   DoFn.java
   DoFnOutputReceivers.java
   ParDo.java
   ByteBuddyDoFnInvokerFactory.java
   DoFnInvoker.java
   DoFnSignature.java
   DoFnSignatures.java
   DoFnInfo.java
   PCollection.java




Issue Time Tracking
---

Worklog Id: (was: 115720)
Time Spent: 1.5h  (was: 1h 20m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116287&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116287
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 03:31
Start Date: 27/Jun/18 03:31
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5545: [BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198357319
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
 ##
 @@ -101,19 +119,64 @@ public SimpleDoFnRunner(
   TupleTag mainOutputTag,
   List> additionalOutputTags,
   StepContext stepContext,
+  @Nullable Coder inputCoder,
+  Map, Coder> outputCoders,
   WindowingStrategy windowingStrategy) {
 this.options = options;
 this.fn = fn;
 this.signature = DoFnSignatures.getSignature(fn.getClass());
 this.observesWindow = signature.processElement().observesWindow() || !sideInputReader.isEmpty();
 this.invoker = DoFnInvokers.invokerFor(fn);
 this.sideInputReader = sideInputReader;
+this.schemaCoder = (inputCoder != null && inputCoder instanceof SchemaCoder)
+? (SchemaCoder) inputCoder : null;
+this.outputCoders = outputCoders;
+if (outputCoders != null) {
+  Coder outputCoder = (Coder) outputCoders.get(mainOutputTag);
+  mainOutputSchemaCoder = (outputCoder instanceof SchemaCoder)
+  ? (SchemaCoder) outputCoder : null;
+} else {
+  mainOutputSchemaCoder = null;
+}
 this.outputManager = outputManager;
 this.mainOutputTag = mainOutputTag;
 this.outputTags =
 
Sets.newHashSet(FluentIterable.>of(mainOutputTag).append(additionalOutputTags));
 this.stepContext = stepContext;
 
+// Currently we only support a single FieldAccess on a processElement. We should decide
 
 Review comment:
   s/FieldAccess/FieldAccessDescriptor/g
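
As a side note on the constructor hunk above: it selects a `SchemaCoder` with an explicit null check followed by `instanceof`. In Java, `instanceof` already evaluates to false for a null reference, so the two checks can be folded into one. The sketch below shows this with hypothetical stand-in classes (`NullSafeDowncast`, and local `Coder` and `SchemaCoder` types that are not Beam's); it is an illustration of the language rule, not a proposed change to the PR.

```java
/** Hypothetical stand-ins; not Beam's Coder or SchemaCoder classes. */
final class NullSafeDowncast {
  static class Coder {}

  static class SchemaCoder extends Coder {}

  /** Returns the coder as a SchemaCoder, or null when it is null or another kind of coder. */
  static SchemaCoder asSchemaCoder(Coder coder) {
    // instanceof is false for null, so no separate null check is needed.
    return (coder instanceof SchemaCoder) ? (SchemaCoder) coder : null;
  }

  public static void main(String[] args) {
    System.out.println(asSchemaCoder(null) == null);
    System.out.println(asSchemaCoder(new Coder()) == null);
    System.out.println(asSchemaCoder(new SchemaCoder()) != null);
  }
}
```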




Issue Time Tracking
---

Worklog Id: (was: 116287)
Time Spent: 2h 20m  (was: 2h 10m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116282&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116282
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 03:31
Start Date: 27/Jun/18 03:31
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5545: [BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198357273
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
 ##
 @@ -93,6 +100,17 @@
   // Because of setKey(Object), we really must refresh stateInternals() at 
each access
   private final StepContext stepContext;
 
+  @Nullable
+  private final SchemaCoder schemaCoder;
+
+  @Nullable final SchemaCoder mainOutputSchemaCoder;
 
 Review comment:
   Why is mainOutput special in this case? Don't we need schema coders for 
other outputs?




Issue Time Tracking
---

Worklog Id: (was: 116282)
Time Spent: 1h 50m  (was: 1h 40m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116285&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116285
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 03:31
Start Date: 27/Jun/18 03:31
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5545: [BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198357030
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ParDo.java
 ##
 @@ -479,6 +483,43 @@ private static void validateStateApplicableForInput(
 }
   }
 
+  private static void validateRowParameter(
+  RowParameter rowParameter,
+  Coder inputCoder,
+  Map fieldAccessDeclarations,
+  DoFn fn) {
+checkArgument(inputCoder instanceof SchemaCoder,
+"Cannot access object as a row if the input PCollection does not have 
a schema ."
++ " Coder " + inputCoder.getClass().getSimpleName());
+
+// Resolve the FieldAccessDescriptor against the Schema.
+// This will be resolved anyway by the runner, however we want any 
resolution errors
+// (i.e. caused by a FieldAccessDescriptor that references fields not in 
the schema) to
+// be caught and presented to the user at graph-construction time. 
Therefore we resolve
+// here as well to catch these errors.
+FieldAccessDescriptor fieldAccessDescriptor = null;
+String id = rowParameter.fieldAccessId();
+if (id == null) {
+  // This is the case where no FieldId is defined, just an @Element Row 
row. Default to all
+  // fields accessed.
+  fieldAccessDescriptor = FieldAccessDescriptor.withAllFields();
+} else {
+  // In this case, we expect to have a FieldAccessDescriptor defined in 
the class.
+  FieldAccessDeclaration fieldAccessDeclaration = 
fieldAccessDeclarations.get(id);
+  checkArgument(fieldAccessDeclaration != null,
+  "No FieldAccessDescriptor defined with id", id);
 
 Review comment:

   FieldAccessDeclaration ?
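
The code comment in the hunk above explains that field access is resolved at graph-construction time so that references to fields missing from the schema fail early rather than on the runner. A minimal sketch of that idea follows. The class `EagerFieldResolution`, the method `resolve`, and the representation of a schema as a list of field names are all hypothetical simplifications and do not reflect the real `FieldAccessDescriptor` API.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: check requested field names against a schema before execution. */
final class EagerFieldResolution {
  /** Returns the requested fields in order, or throws if any is absent from the schema. */
  static Set<String> resolve(List<String> requestedFields, List<String> schemaFields) {
    Set<String> known = new LinkedHashSet<>(schemaFields);
    Set<String> resolved = new LinkedHashSet<>();
    for (String field : requestedFields) {
      if (!known.contains(field)) {
        // Failing here surfaces the mistake while the pipeline is being built.
        throw new IllegalArgumentException(
            "Field '" + field + "' is not in schema " + schemaFields);
      }
      resolved.add(field);
    }
    return resolved;
  }

  public static void main(String[] args) {
    List<String> schema = Arrays.asList("userId", "country", "purchaseAmount");
    System.out.println(resolve(Arrays.asList("userId", "country"), schema));
    try {
      resolve(Arrays.asList("userId", "zipCode"), schema);
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected at construction time: " + e.getMessage());
    }
  }
}
```

Repeating the check later on the runner would be redundant for correctness; the point of doing it eagerly is only that the user sees the error before any work is submitted.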




Issue Time Tracking
---

Worklog Id: (was: 116285)
Time Spent: 2h 10m  (was: 2h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116289&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116289
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 03:31
Start Date: 27/Jun/18 03:31
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5545: [BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198356882
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnSignatures.java
 ##
 @@ -41,9 +41,11 @@
 import java.util.LinkedHashSet;
 import java.util.List;
 import java.util.Map;
+import java.util.Optional;
 import javax.annotation.Nullable;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.schemas.FieldAccessDescriptor;
 
 Review comment:
   Add unit tests to DoFnSignaturesTest to cover updates here?




Issue Time Tracking
---

Worklog Id: (was: 116289)
Time Spent: 2.5h  (was: 2h 20m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116283&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116283
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 03:31
Start Date: 27/Jun/18 03:31
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5545: [BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198357072
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ParDo.java
 ##
 @@ -778,6 +820,18 @@ public PCollectionTuple expand(PCollection input) {
 validateStateApplicableForInput(fn, input);
   }
 
+  DoFnSignature.ProcessElementMethod processElementMethod = 
signature.processElement();
+  RowParameter rowParameter = processElementMethod.getRowParameter();
+  // Can only as for a Row if a Schema was specified!
 
 Review comment:
   "ask for a"




Issue Time Tracking
---

Worklog Id: (was: 116283)
Time Spent: 2h  (was: 1h 50m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116288&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116288
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 03:31
Start Date: 27/Jun/18 03:31
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5545: [BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198357231
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
 ##
 @@ -376,7 +377,15 @@ public Duration getAllowedTimestampSkew() {
 
   /** Receives tagged output for a multi-output function. */
   public interface MultiOutputReceiver {
+/** Returns an {@link OutputReceiver} for the given tag. **/
  OutputReceiver get(TupleTag tag);
+
+/** Returns a {@link OutputReceiver} for publishing {@link Row} objects to 
the given tag.
+ *
+ * The {@link PCollection} representing this tag must have a schema 
registered in order to
+ * call this function.
+ */
+ OutputReceiver getRowReceiver(TupleTag tag);
 
 Review comment:
   @Experimental(Kind.SCHEMAS) ?



Issue Time Tracking
---

Worklog Id: (was: 116288)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116286&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116286
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 03:31
Start Date: 27/Jun/18 03:31
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5545: [BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198357420
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
 ##
 @@ -101,19 +119,64 @@ public SimpleDoFnRunner(
   TupleTag mainOutputTag,
   List<TupleTag<?>> additionalOutputTags,
   StepContext stepContext,
+  @Nullable Coder inputCoder,
+  Map<TupleTag<?>, Coder<?>> outputCoders,
   WindowingStrategy windowingStrategy) {
 this.options = options;
 this.fn = fn;
 this.signature = DoFnSignatures.getSignature(fn.getClass());
 this.observesWindow = signature.processElement().observesWindow() || !sideInputReader.isEmpty();
 this.invoker = DoFnInvokers.invokerFor(fn);
 this.sideInputReader = sideInputReader;
+this.schemaCoder = (inputCoder != null && inputCoder instanceof SchemaCoder)
+? (SchemaCoder) inputCoder : null;
+this.outputCoders = outputCoders;
+if (outputCoders != null) {
+  Coder outputCoder = (Coder) outputCoders.get(mainOutputTag);
+  mainOutputSchemaCoder = (outputCoder instanceof SchemaCoder)
+  ? (SchemaCoder) outputCoder : null;
+} else {
+  mainOutputSchemaCoder = null;
+}
 this.outputManager = outputManager;
 this.mainOutputTag = mainOutputTag;
 this.outputTags =
 Sets.newHashSet(FluentIterable.<TupleTag<?>>of(mainOutputTag).append(additionalOutputTags));
 this.stepContext = stepContext;
 
+// Currently we only support a single FieldAccess on a processElement. We should decide
+// whether to get rid of the FieldAccess ids, or find a use for multiple.
+DoFnSignature doFnSignature = DoFnSignatures.getSignature(fn.getClass());
+DoFnSignature.ProcessElementMethod processElementMethod = doFnSignature.processElement();
+RowParameter rowParameter = processElementMethod.getRowParameter();
+FieldAccessDescriptor fieldAccessDescriptor = null;
+if (rowParameter != null) {
+  checkArgument(schemaCoder != null,
+  "Cannot access object as a row if the input PCollection does not have a schema ."
+  + "DoFn " + fn.getClass() + " Coder " + inputCoder.getClass().getSimpleName());
+  String id = rowParameter.fieldAccessId();
+  if (id == null) {
+// This is the case where no FieldId is defined, just an @Element Row row. Default to all
+// fields accessed.
+fieldAccessDescriptor = FieldAccessDescriptor.withAllFields();
+  } else {
+// In this case, we expect to have a FieldAccessDescriptor defined in the class.
+FieldAccessDeclaration fieldAccessDeclaration =
+doFnSignature.fieldAccessDeclarations().get(id);
+checkArgument(fieldAccessDeclaration != null,
+"No FieldAccessDescriptor defined with id", id);
 
 Review comment:

   s/FieldAccessDescriptor/FieldAccessDeclaration/ ?



Issue Time Tracking
---

Worklog Id: (was: 116286)
Time Spent: 2h 20m  (was: 2h 10m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116284
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 03:31
Start Date: 27/Jun/18 03:31
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5545: [BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198357373
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
 ##
 @@ -101,19 +119,64 @@ public SimpleDoFnRunner(
   TupleTag mainOutputTag,
   List<TupleTag<?>> additionalOutputTags,
   StepContext stepContext,
+  @Nullable Coder inputCoder,
+  Map<TupleTag<?>, Coder<?>> outputCoders,
   WindowingStrategy windowingStrategy) {
 this.options = options;
 this.fn = fn;
 this.signature = DoFnSignatures.getSignature(fn.getClass());
 this.observesWindow = signature.processElement().observesWindow() || !sideInputReader.isEmpty();
 this.invoker = DoFnInvokers.invokerFor(fn);
 this.sideInputReader = sideInputReader;
+this.schemaCoder = (inputCoder != null && inputCoder instanceof SchemaCoder)
+? (SchemaCoder) inputCoder : null;
+this.outputCoders = outputCoders;
+if (outputCoders != null) {
+  Coder outputCoder = (Coder) outputCoders.get(mainOutputTag);
+  mainOutputSchemaCoder = (outputCoder instanceof SchemaCoder)
+  ? (SchemaCoder) outputCoder : null;
+} else {
+  mainOutputSchemaCoder = null;
+}
 this.outputManager = outputManager;
 this.mainOutputTag = mainOutputTag;
 this.outputTags =
 Sets.newHashSet(FluentIterable.<TupleTag<?>>of(mainOutputTag).append(additionalOutputTags));
 this.stepContext = stepContext;
 
+// Currently we only support a single FieldAccess on a processElement. We should decide
+// whether to get rid of the FieldAccess ids, or find a use for multiple.
+DoFnSignature doFnSignature = DoFnSignatures.getSignature(fn.getClass());
+DoFnSignature.ProcessElementMethod processElementMethod = doFnSignature.processElement();
+RowParameter rowParameter = processElementMethod.getRowParameter();
+FieldAccessDescriptor fieldAccessDescriptor = null;
+if (rowParameter != null) {
+  checkArgument(schemaCoder != null,
+  "Cannot access object as a row if the input PCollection does not have a schema ."
 
 Review comment:

   Does "no schema coder" imply "no schema", or should this exception message
   say that the schema coder is not defined?



Issue Time Tracking
---

Worklog Id: (was: 116284)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116281&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116281
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 03:31
Start Date: 27/Jun/18 03:31
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5545: [BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198356774
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java
 ##
 @@ -276,6 +299,18 @@ public String getName() {
 return this;
   }
 
+  /**
+   * Sets a {@link Schema} on this {@link PCollection}. This is a wrapper 
around
 
 Review comment:
   Why reveal implementation details here ?



Issue Time Tracking
---

Worklog Id: (was: 116281)
Time Spent: 1h 40m  (was: 1.5h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=125503&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-125503
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 20/Jul/18 14:08
Start Date: 20/Jul/18 14:08
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-406611581
 
 
   Run Spark ValidatesRunner



Issue Time Tracking
---

Worklog Id: (was: 125503)
Time Spent: 13h 20m  (was: 13h 10m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=125504&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-125504
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 20/Jul/18 14:08
Start Date: 20/Jul/18 14:08
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-406611673
 
 
   Run Flink Validates Runner



Issue Time Tracking
---

Worklog Id: (was: 125504)
Time Spent: 13.5h  (was: 13h 20m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=125507&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-125507
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 20/Jul/18 14:10
Start Date: 20/Jul/18 14:10
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-406612357
 
 
   Run Apex ValidatesRunner



Issue Time Tracking
---

Worklog Id: (was: 125507)
Time Spent: 13h 50m  (was: 13h 40m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=125506&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-125506
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 20/Jul/18 14:10
Start Date: 20/Jul/18 14:10
Worklog Time Spent: 10m 
  Work Description: echauchot edited a comment on issue #5955: [BEAM-4076] 
Enable schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-406611673
 
 
   Run Flink ValidatesRunner



Issue Time Tracking
---

Worklog Id: (was: 125506)
Time Spent: 13h 40m  (was: 13.5h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=125508&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-125508
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 20/Jul/18 14:11
Start Date: 20/Jul/18 14:11
Worklog Time Spent: 10m 
  Work Description: echauchot removed a comment on issue #5955: [BEAM-4076] 
Enable schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-406611673
 
 
   Run Flink ValidatesRunner



Issue Time Tracking
---

Worklog Id: (was: 125508)
Time Spent: 14h  (was: 13h 50m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=125509&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-125509
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 20/Jul/18 14:12
Start Date: 20/Jul/18 14:12
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-406612676
 
 
   Run Flink ValidatesRunner



Issue Time Tracking
---

Worklog Id: (was: 125509)
Time Spent: 14h 10m  (was: 14h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=125510&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-125510
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 20/Jul/18 14:13
Start Date: 20/Jul/18 14:13
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-406612946
 
 
   Run Samza ValidatesRunner



Issue Time Tracking
---

Worklog Id: (was: 125510)
Time Spent: 14h 20m  (was: 14h 10m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=125553&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-125553
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 20/Jul/18 16:25
Start Date: 20/Jul/18 16:25
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-406652348
 
 
   @echauchot thanks for the review! Some comments:
   
 1. Schemas are plumbed through the NewDoFn implementation, which is why we
    needed the input and output coders. Schemas are implemented as a special
    type of coder (SchemaCoder), so NewDoFn needs those coders to infer
    schemas. Other transforms such as CoGroupByKey are (currently) untouched by
    schemas. There will be a new schema-specific Join transform (allowing you
    to join by specific field names, etc.), but I don't currently plan on
    changing CoGroupByKey itself.
   
 2. I'll take a look at the apex runner failures, and rebase.
   
 3. I'll look at variable names. I tried to rename a number of them to be
    less confusing, but might've missed a few :)
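
   As a rough illustration of point 1, the sketch below models a schema as
   something carried by a special kind of coder, so that code handed the input
   coder can recover the schema with an instanceof check. Every type and method
   here (Coder, SchemaCoder, Schema, StringCoder, inferSchema) is a simplified
   stand-in invented for this sketch, not the Beam class of the same name, and
   the real plumbing may differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Stand-in types only; none of these are the real Beam classes. */
final class SchemaPlumbingSketch {
  /** Hypothetical stand-in for a coder of elements of type T. */
  interface Coder<T> {}

  /** Hypothetical stand-in for a schema: an ordered list of field names. */
  static final class Schema {
    final List<String> fieldNames;

    Schema(String... fieldNames) {
      this.fieldNames = Arrays.asList(fieldNames);
    }
  }

  /** A coder that additionally carries a schema (the idea behind a schema coder). */
  static final class SchemaCoder<T> implements Coder<T> {
    final Schema schema;

    SchemaCoder(Schema schema) {
      this.schema = schema;
    }
  }

  /** A plain coder with no schema attached. */
  static final class StringCoder implements Coder<String> {}

  /** Returns the schema if, and only if, the given coder is a SchemaCoder. */
  static Optional<Schema> inferSchema(Coder<?> coder) {
    if (coder instanceof SchemaCoder) {
      return Optional.of(((SchemaCoder<?>) coder).schema);
    }
    return Optional.empty(); // also covers a null coder
  }

  public static void main(String[] args) {
    Coder<Object> withSchema = new SchemaCoder<>(new Schema("userId", "country"));
    // A schema-carrying coder yields its field names.
    System.out.println(inferSchema(withSchema).map(s -> s.fieldNames));
    // A plain coder yields no schema.
    System.out.println(inferSchema(new StringCoder()).isPresent());
  }
}
```

   Under this assumption, a transform that never receives the element coders
   has nothing to inspect, which is consistent with the explanation above of
   why the input and output coders had to be passed in.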
   
   



Issue Time Tracking
---

Worklog Id: (was: 125553)
Time Spent: 14.5h  (was: 14h 20m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-24 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=126611&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-126611
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 24/Jul/18 13:39
Start Date: 24/Jul/18 13:39
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-407409970
 
 
   Run Samza ValidatesRunner



Issue Time Tracking
---

Worklog Id: (was: 126611)
Time Spent: 15h  (was: 14h 50m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-24 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=126610&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-126610
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 24/Jul/18 13:39
Start Date: 24/Jul/18 13:39
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-407409915
 
 
   Run Flink ValidatesRunner



Issue Time Tracking
---

Worklog Id: (was: 126610)
Time Spent: 14h 50m  (was: 14h 40m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-24 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=126612&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-126612
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 24/Jul/18 13:39
Start Date: 24/Jul/18 13:39
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-407410026
 
 
   Run Spark ValidatesRunner



Issue Time Tracking
---

Worklog Id: (was: 126612)
Time Spent: 15h 10m  (was: 15h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-24 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=126609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-126609
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 24/Jul/18 13:39
Start Date: 24/Jul/18 13:39
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-407409863
 
 
   Run Apex ValidatesRunner



Issue Time Tracking
---

Worklog Id: (was: 126609)
Time Spent: 14h 40m  (was: 14.5h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-24 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=126745&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-126745
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 24/Jul/18 17:19
Start Date: 24/Jul/18 17:19
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-407484184
 
 
   retest this please



Issue Time Tracking
---

Worklog Id: (was: 126745)
Time Spent: 15h 20m  (was: 15h 10m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-24 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=126910&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-126910
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 24/Jul/18 21:42
Start Date: 24/Jul/18 21:42
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-407561785
 
 
   retest this please.



Issue Time Tracking
---

Worklog Id: (was: 126910)
Time Spent: 15.5h  (was: 15h 20m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-24 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=126960&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-126960
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 24/Jul/18 23:44
Start Date: 24/Jul/18 23:44
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-407587091
 
 
   @echauchot all comments addressed. You were correct about the Apex failure: 
it was caused by a typo that confused inputCoder with windowedInputCoder.




Issue Time Tracking
---

Worklog Id: (was: 126960)
Time Spent: 15h 40m  (was: 15.5h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-24 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=126962&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-126962
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 24/Jul/18 23:45
Start Date: 24/Jul/18 23:45
Worklog Time Spent: 10m 
  Work Description: reuvenlax closed pull request #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/runners/apex/build.gradle b/runners/apex/build.gradle
index dbe19e8efa4..a2bfdec355a 100644
--- a/runners/apex/build.gradle
+++ b/runners/apex/build.gradle
@@ -93,7 +93,6 @@ task validatesRunnerBatch(type: Test) {
 excludeCategories 'org.apache.beam.sdk.testing.UsesCommittedMetrics'
 excludeCategories 'org.apache.beam.sdk.testing.UsesImpulse'
 excludeCategories 'org.apache.beam.sdk.testing.UsesParDoLifecycle'
-excludeCategories 'org.apache.beam.sdk.testing.UsesSchema'
 excludeCategories 'org.apache.beam.sdk.testing.UsesTestStream'
 excludeCategories 'org.apache.beam.sdk.testing.UsesTimersInParDo'
 excludeCategories 'org.apache.beam.sdk.testing.UsesMetricsPusher'
diff --git a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/ParDoTranslator.java b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/ParDoTranslator.java
index 32113a97630..d44d18c849c 100644
--- a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/ParDoTranslator.java
+++ b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/ParDoTranslator.java
@@ -28,9 +28,11 @@
 import java.util.List;
 import java.util.Map;
 import java.util.Map.Entry;
+import java.util.stream.Collectors;
 import org.apache.beam.runners.apex.ApexRunner;
 import org.apache.beam.runners.apex.translation.operators.ApexParDoOperator;
 import 
org.apache.beam.runners.core.SplittableParDoViaKeyedWorkItems.ProcessElements;
+import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.ParDo;
 import org.apache.beam.sdk.transforms.reflect.DoFnSignature;
@@ -76,6 +78,13 @@ public void translate(ParDo.MultiOutput 
transform, TranslationC
 PCollection input = context.getInput();
 List> sideInputs = transform.getSideInputs();
 
+Map, Coder> outputCoders =
+outputs
+.entrySet()
+.stream()
+.filter(e -> e.getValue() instanceof PCollection)
+.collect(
+Collectors.toMap(e -> e.getKey(), e -> ((PCollection) 
e.getValue()).getCoder()));
 ApexParDoOperator operator =
 new ApexParDoOperator<>(
 context.getPipelineOptions(),
@@ -85,6 +94,7 @@ public void translate(ParDo.MultiOutput 
transform, TranslationC
 input.getWindowingStrategy(),
 sideInputs,
 input.getCoder(),
+outputCoders,
 context.getStateBackend());
 
 Map, OutputPort> ports = 
Maps.newHashMapWithExpectedSize(outputs.size());
@@ -130,6 +140,14 @@ public void translate(
   PCollection input = context.getInput();
   List> sideInputs = transform.getSideInputs();
 
+  Map, Coder> outputCoders =
+  outputs
+  .entrySet()
+  .stream()
+  .filter(e -> e.getValue() instanceof PCollection)
+  .collect(
+  Collectors.toMap(e -> e.getKey(), e -> ((PCollection) 
e.getValue()).getCoder()));
+
   @SuppressWarnings({"rawtypes", "unchecked"})
   DoFn doFn = (DoFn) 
transform.newProcessFn(transform.getFn());
   ApexParDoOperator operator =
@@ -140,7 +158,8 @@ public void translate(
   transform.getAdditionalOutputTags().getAll(),
   input.getWindowingStrategy(),
   sideInputs,
-  null,
+  input.getCoder(),
+  outputCoders,
   context.getStateBackend());
 
   Map, OutputPort> ports = 
Maps.newHashMapWithExpectedSize(outputs.size());
diff --git a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/operators/ApexParDoOperator.java b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/operators/ApexParDoOperator.java
index f9d20520e2b..577835238e4 100644
--- a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/operators/ApexParDoOperator.java
+++ b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/operators/ApexParDoOperator.java
@@ -117,7 +117,13 @@
   private final List> sideInputs;
 
   @Bind(JavaSerializer.class)
-  private final Coder> inpu

[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=127667&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-127667
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 26/Jul/18 06:15
Start Date: 26/Jul/18 06:15
Worklog Time Spent: 10m 
  Work Description: reuvenlax opened a new pull request #6072: [BEAM-4076] 
Fix schemas on Dataflow and FnApi
URL: https://github.com/apache/beam/pull/6072
 
 
   Two issues:
   
   1. For Dataflow, the problem is that the Dataflow replacement for 
ParDo.Single incorrectly broke the correspondence between mainOutputTupleTag 
and the actual outputs. The replacement generated a new PCollection, which by 
default gets a new TupleTag, so it no longer matched the output TupleTag. Fix 
this by making sure that the new PCollection has the same tag.
   
   2. For FnApiRunner, the problem is that input/output coders might be 
WindowedValueCoders, preventing us from recognizing a SchemaCoder. Detect this 
case and extract the value coder.
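
A minimal sketch of the second fix, written against stand-in types rather than the real Beam classes: `Coder`, `SchemaCoder`, `PlainCoder`, `WindowedValueCoder`, and the helper `findSchemaCoder` below are all hypothetical and exist only for illustration. The idea is to look through a windowed wrapper before testing for a schema coder.

```java
import java.util.Optional;

/** Stand-in types only; these are not the real Beam classes. */
public class SchemaCoderDetection {
  /** Hypothetical minimal coder interface. */
  interface Coder<T> {}

  /** Stand-in for a coder that carries a schema. */
  static final class SchemaCoder<T> implements Coder<T> {}

  /** Stand-in for a coder without a schema. */
  static final class PlainCoder<T> implements Coder<T> {}

  /** Stand-in for a wrapper that adds windowing metadata around a value coder. */
  static final class WindowedValueCoder<T> implements Coder<T> {
    private final Coder<T> valueCoder;

    WindowedValueCoder(Coder<T> valueCoder) {
      this.valueCoder = valueCoder;
    }

    Coder<T> getValueCoder() {
      return valueCoder;
    }
  }

  /** Finds a schema coder, looking through one level of windowed wrapping. */
  static <T> Optional<SchemaCoder<T>> findSchemaCoder(Coder<T> coder) {
    Coder<T> candidate = coder;
    if (candidate instanceof WindowedValueCoder) {
      // Unwrap first; otherwise the instanceof test below never matches.
      candidate = ((WindowedValueCoder<T>) candidate).getValueCoder();
    }
    if (candidate instanceof SchemaCoder) {
      return Optional.of((SchemaCoder<T>) candidate);
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Coder<String> wrapped = new WindowedValueCoder<>(new SchemaCoder<String>());
    System.out.println(findSchemaCoder(wrapped).isPresent()); // prints "true"
    System.out.println(findSchemaCoder(new PlainCoder<String>()).isPresent()); // prints "false"
  }
}
```

Returning an `Optional` here is a choice made for the sketch; the code in the PR may represent the missing case differently.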




Issue Time Tracking
---

Worklog Id: (was: 127667)
Time Spent: 16h  (was: 15h 50m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=127917&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-127917
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 26/Jul/18 21:00
Start Date: 26/Jul/18 21:00
Worklog Time Spent: 10m 
  Work Description: aaltay closed pull request #6072: [BEAM-4076] Fix 
schemas on Dataflow and FnApi
URL: https://github.com/apache/beam/pull/6072
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/PrimitiveParDoSingleFactory.java b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/PrimitiveParDoSingleFactory.java
index 717585c4d7a..aeda31232a2 100644
--- a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/PrimitiveParDoSingleFactory.java
+++ b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/PrimitiveParDoSingleFactory.java
@@ -97,7 +97,11 @@ private ParDoSingle(
 @Override
 public PCollection expand(PCollection input) {
   return PCollection.createPrimitiveOutputInternal(
-  input.getPipeline(), input.getWindowingStrategy(), 
input.isBounded(), outputCoder);
+  input.getPipeline(),
+  input.getWindowingStrategy(),
+  input.isBounded(),
+  outputCoder,
+  onlyOutputTag);
 }
 
 public DoFn getFn() {
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java
index 8457e0b56df..20adbd567c7 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java
@@ -361,12 +361,21 @@ public IsBounded isBounded() {
   private IsBounded isBounded;
 
   /** A local {@link TupleTag} used in the expansion of this {@link 
PValueBase}. */
-  private final TupleTag tag = new TupleTag<>();
+  private final TupleTag tag;
 
   private PCollection(Pipeline p, WindowingStrategy windowingStrategy, 
IsBounded isBounded) {
 super(p);
 this.windowingStrategy = windowingStrategy;
 this.isBounded = isBounded;
+this.tag = new TupleTag<>();
+  }
+
+  private PCollection(
+  Pipeline p, WindowingStrategy windowingStrategy, IsBounded 
isBounded, TupleTag tag) {
+super(p);
+this.windowingStrategy = windowingStrategy;
+this.isBounded = isBounded;
+this.tag = tag;
   }
 
   /**
@@ -408,6 +417,21 @@ private PCollection(Pipeline p, WindowingStrategy 
windowingStrategy, IsBou
 return res;
   }
 
+  /** For internal use only; no backwards-compatibility 
guarantees. */
+  @Internal
+  public static  PCollection createPrimitiveOutputInternal(
+  Pipeline pipeline,
+  WindowingStrategy windowingStrategy,
+  IsBounded isBounded,
+  @Nullable Coder coder,
+  TupleTag tag) {
+PCollection res = new PCollection<>(pipeline, windowingStrategy, 
isBounded, tag);
+if (coder != null) {
+  res.setCoder(coder);
+}
+return res;
+  }
+
   private static class CoderOrFailure {
 @Nullable private final Coder coder;
 @Nullable private final String failure;
diff --git a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/DoFnPTransformRunnerFactory.java b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/DoFnPTransformRunnerFactory.java
index 61a341eaed2..92560232284 100644
--- a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/DoFnPTransformRunnerFactory.java
+++ b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/DoFnPTransformRunnerFactory.java
@@ -47,6 +47,7 @@
 import org.apache.beam.sdk.fn.data.FnDataReceiver;
 import org.apache.beam.sdk.fn.function.ThrowingRunnable;
 import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.schemas.SchemaCoder;
 import org.apache.beam.sdk.state.TimeDomain;
 import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.Materializations;
@@ -150,7 +151,9 @@ public final RunnerT createRunnerForPTransform(
 final DoFnSignature doFnSignature;
 final TupleTag mainOutputTag;
 final Coder inputCoder;
+final SchemaCoder schemaCoder;
 final Coder keyCoder;
+final SchemaCoder mainOutputSchemaCoder;
 final Coder windowCoder;
 final WindowingStrategy windowingStrategy;
 final Map, SideInputSpec> tagToSideInputSpecMap;
@@ -210,6 +213,17 @@ public final RunnerT createRunnerForPTransform(
 } else {
   this.keyCoder = null;
 }
+i

[jira] [Work logged] (BEAM-4076) Schema followups

2018-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=134099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-134099
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Aug/18 08:52
Start Date: 13/Aug/18 08:52
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5955: [BEAM-4076] Enable 
schemas for more runners
URL: https://github.com/apache/beam/pull/5955#issuecomment-412451096
 
 
   Thanks for that @reuvenlax !




Issue Time Tracking
---

Worklog Id: (was: 134099)
Time Spent: 16h 20m  (was: 16h 10m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-27 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116558&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116558
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 20:06
Start Date: 27/Jun/18 20:06
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198623418
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
 ##
 @@ -101,19 +119,64 @@ public SimpleDoFnRunner(
   TupleTag mainOutputTag,
   List> additionalOutputTags,
   StepContext stepContext,
+  @Nullable Coder inputCoder,
+  Map, Coder> outputCoders,
   WindowingStrategy windowingStrategy) {
 this.options = options;
 this.fn = fn;
 this.signature = DoFnSignatures.getSignature(fn.getClass());
 this.observesWindow = signature.processElement().observesWindow() || 
!sideInputReader.isEmpty();
 this.invoker = DoFnInvokers.invokerFor(fn);
 this.sideInputReader = sideInputReader;
+this.schemaCoder = (inputCoder != null && inputCoder instanceof 
SchemaCoder)
+? (SchemaCoder) inputCoder : null;
+this.outputCoders = outputCoders;
+if (outputCoders != null) {
+  Coder outputCoder = (Coder) 
outputCoders.get(mainOutputTag);
+  mainOutputSchemaCoder = (outputCoder instanceof SchemaCoder)
+  ? (SchemaCoder) outputCoder : null;
+} else {
+  mainOutputSchemaCoder = null;
+}
 this.outputManager = outputManager;
 this.mainOutputTag = mainOutputTag;
 this.outputTags =
 
Sets.newHashSet(FluentIterable.>of(mainOutputTag).append(additionalOutputTags));
 this.stepContext = stepContext;
 
+// Currently we only support a single FieldAccess on a processElement. We 
should decide
+// whether to get rid of the FieldAccess ids, or find a use for multiple.
+DoFnSignature doFnSignature = DoFnSignatures.getSignature(fn.getClass());
+DoFnSignature.ProcessElementMethod processElementMethod = 
doFnSignature.processElement();
+RowParameter rowParameter = processElementMethod.getRowParameter();
+FieldAccessDescriptor fieldAccessDescriptor = null;
+if (rowParameter != null) {
+  checkArgument(schemaCoder != null,
+  "Cannot access object as a row if the input PCollection does not 
have a schema ."
 
 Review comment:
   Schema is currently represented under the covers by SchemaCoder. So no 
schema coder === no schema.
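
The check under review can be sketched as follows, assuming stand-in types: `Coder`, `SchemaCoder`, `PlainCoder`, and the helpers `schemaCoderOrNull` and `validateRowAccess` are hypothetical names for illustration, not the SimpleDoFnRunner code itself. Since the schema lives in the coder, a missing schema coder is treated as a missing schema.

```java
/** Stand-in types only; these are not the real Beam classes. */
public class RowParameterCheck {
  /** Hypothetical minimal coder interface. */
  interface Coder<T> {}

  /** Stand-in for a coder that carries a schema. */
  static final class SchemaCoder<T> implements Coder<T> {}

  /** Stand-in for a coder without a schema. */
  static final class PlainCoder<T> implements Coder<T> {}

  /** Returns the input coder as a SchemaCoder, or null when the input has no schema. */
  static <T> SchemaCoder<T> schemaCoderOrNull(Coder<T> inputCoder) {
    // instanceof is false for null, so no separate null check is needed.
    return (inputCoder instanceof SchemaCoder) ? (SchemaCoder<T>) inputCoder : null;
  }

  /** Fails fast when a row view is requested for input that has no schema. */
  static <T> void validateRowAccess(boolean wantsRow, Coder<T> inputCoder) {
    if (wantsRow && schemaCoderOrNull(inputCoder) == null) {
      throw new IllegalArgumentException(
          "Cannot access object as a row if the input PCollection does not have a schema.");
    }
  }

  public static void main(String[] args) {
    validateRowAccess(true, new SchemaCoder<String>()); // passes: schema present
    validateRowAccess(false, new PlainCoder<String>()); // passes: no row view requested
    try {
      validateRowAccess(true, new PlainCoder<String>());
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```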
   




Issue Time Tracking
---

Worklog Id: (was: 116558)
Time Spent: 2h 50m  (was: 2h 40m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-27 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116560&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116560
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 20:06
Start Date: 27/Jun/18 20:06
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198624045
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
 ##
 @@ -101,19 +119,64 @@ public SimpleDoFnRunner(
   TupleTag mainOutputTag,
   List> additionalOutputTags,
   StepContext stepContext,
+  @Nullable Coder inputCoder,
+  Map, Coder> outputCoders,
   WindowingStrategy windowingStrategy) {
 this.options = options;
 this.fn = fn;
 this.signature = DoFnSignatures.getSignature(fn.getClass());
 this.observesWindow = signature.processElement().observesWindow() || 
!sideInputReader.isEmpty();
 this.invoker = DoFnInvokers.invokerFor(fn);
 this.sideInputReader = sideInputReader;
+this.schemaCoder = (inputCoder != null && inputCoder instanceof 
SchemaCoder)
+? (SchemaCoder) inputCoder : null;
+this.outputCoders = outputCoders;
+if (outputCoders != null) {
+  Coder outputCoder = (Coder) 
outputCoders.get(mainOutputTag);
+  mainOutputSchemaCoder = (outputCoder instanceof SchemaCoder)
+  ? (SchemaCoder) outputCoder : null;
+} else {
+  mainOutputSchemaCoder = null;
+}
 this.outputManager = outputManager;
 this.mainOutputTag = mainOutputTag;
 this.outputTags =
 
Sets.newHashSet(FluentIterable.>of(mainOutputTag).append(additionalOutputTags));
 this.stepContext = stepContext;
 
+// Currently we only support a single FieldAccess on a processElement. We 
should decide
+// whether to get rid of the FieldAccess ids, or find a use for multiple.
+DoFnSignature doFnSignature = DoFnSignatures.getSignature(fn.getClass());
+DoFnSignature.ProcessElementMethod processElementMethod = 
doFnSignature.processElement();
+RowParameter rowParameter = processElementMethod.getRowParameter();
+FieldAccessDescriptor fieldAccessDescriptor = null;
+if (rowParameter != null) {
+  checkArgument(schemaCoder != null,
+  "Cannot access object as a row if the input PCollection does not 
have a schema ."
 
 Review comment:
   Schema is currently represented under the covers by SchemaCoder. So no 
schema coder === no schema.




Issue Time Tracking
---

Worklog Id: (was: 116560)
Time Spent: 3h  (was: 2h 50m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-27 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116557&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116557
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 20:06
Start Date: 27/Jun/18 20:06
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198623373
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
 ##
 @@ -93,6 +100,17 @@
   // Because of setKey(Object), we really must refresh stateInternals() at 
each access
   private final StepContext stepContext;
 
+  @Nullable
+  private final SchemaCoder schemaCoder;
+
+  @Nullable final SchemaCoder mainOutputSchemaCoder;
 
 Review comment:
   It's used for the default output() method, which always produces to the main 
output.
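
As a rough illustration of why a dedicated main-output field is convenient, here is a sketch with stand-in types: `TupleTag`, `Coder`, `SchemaCoder`, `PlainCoder`, and `mainOutputSchemaCoder` are hypothetical and simplified, and tags compare by identity in this sketch. The coder for the main output is looked up once, so a default output path does not have to consult the whole map.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in types only; these are not the real Beam classes. */
public class MainOutputCoderLookup {
  /** Stand-in tag; instances compare by identity in this sketch. */
  static final class TupleTag<T> {}

  /** Hypothetical minimal coder interface. */
  interface Coder<T> {}

  /** Stand-in for a coder that carries a schema. */
  static final class SchemaCoder<T> implements Coder<T> {}

  /** Stand-in for a coder without a schema. */
  static final class PlainCoder<T> implements Coder<T> {}

  /** Returns the schema coder of the main output, or null if there is none. */
  static SchemaCoder<?> mainOutputSchemaCoder(
      Map<TupleTag<?>, Coder<?>> outputCoders, TupleTag<?> mainOutputTag) {
    if (outputCoders == null) {
      return null;
    }
    Coder<?> outputCoder = outputCoders.get(mainOutputTag);
    return (outputCoder instanceof SchemaCoder) ? (SchemaCoder<?>) outputCoder : null;
  }

  public static void main(String[] args) {
    TupleTag<String> mainTag = new TupleTag<>();
    TupleTag<String> additionalTag = new TupleTag<>();
    Map<TupleTag<?>, Coder<?>> coders = new HashMap<>();
    coders.put(mainTag, new SchemaCoder<String>());
    coders.put(additionalTag, new PlainCoder<String>());
    System.out.println(mainOutputSchemaCoder(coders, mainTag) != null); // prints "true"
    System.out.println(mainOutputSchemaCoder(coders, additionalTag) != null); // prints "false"
  }
}
```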
   
   




Issue Time Tracking
---

Worklog Id: (was: 116557)
Time Spent: 2h 40m  (was: 2.5h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-27 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116559&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116559
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 20:06
Start Date: 27/Jun/18 20:06
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198623651
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnSignatures.java
 ##
 @@ -41,9 +41,11 @@
 import java.util.LinkedHashSet;
 import java.util.List;
 import java.util.Map;
+import java.util.Optional;
 import javax.annotation.Nullable;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.schemas.FieldAccessDescriptor;
 
 Review comment:
   Done




Issue Time Tracking
---

Worklog Id: (was: 116559)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-27 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116561
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 20:06
Start Date: 27/Jun/18 20:06
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198624051
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
 ##
 @@ -93,6 +100,17 @@
   // Because of setKey(Object), we really must refresh stateInternals() at 
each access
   private final StepContext stepContext;
 
+  @Nullable
+  private final SchemaCoder schemaCoder;
+
+  @Nullable final SchemaCoder mainOutputSchemaCoder;
 
 Review comment:
   It's used for the default output() method, which always produces to the main 
output.




Issue Time Tracking
---

Worklog Id: (was: 116561)
Time Spent: 3h 10m  (was: 3h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-27 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116595&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116595
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 20:50
Start Date: 27/Jun/18 20:50
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-400824550
 
 
   @chamikaramj thank you. I have addressed your comments.




Issue Time Tracking
---

Worklog Id: (was: 116595)
Time Spent: 3h 20m  (was: 3h 10m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-27 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116608&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116608
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 27/Jun/18 21:11
Start Date: 27/Jun/18 21:11
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #5545: [BEAM-4076] 
Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-400830488
 
 
   Thanks. LGTM for the updates to the following classes:
   
   DoFnRunners.java
   SimpleDoFnRunner.java
   ParDoEvaluator.java
   DataflowPipelineTranslator.java
   Pipeline.java
   Create.java
   DoFn.java
   DoFnOutputReceivers.java
   ParDo.java
   ByteBuddyDoFnInvokerFactory.java
   DoFnInvoker.java
   DoFnSignature.java
   DoFnSignatures.java
   DoFnInfo.java
   PCollection.java




Issue Time Tracking
---

Worklog Id: (was: 116608)
Time Spent: 3.5h  (was: 3h 20m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116869&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116869
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 16:23
Start Date: 28/Jun/18 16:23
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #5545: [BEAM-4076] 
Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401092921
 
 
   We have turned on autoformatting of the codebase, which causes small 
conflicts across the board. You can probably safely rebase and just keep your 
changes. Like this:
   
   ```
   $ git rebase
   ... see some conflicts
   $ git diff
   ... confirmed that the conflicts are just autoformatting
   ... so we can just keep our changes and do our own autoformat
   $ git checkout --theirs --
   $ git add -u
   $ git rebase --continue
   $ ./gradlew spotlessJavaApply
   ```
   
   Please ping me if you run into any difficulty. 




Issue Time Tracking
---

Worklog Id: (was: 116869)
Time Spent: 3h 40m  (was: 3.5h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116908&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116908
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 17:05
Start Date: 28/Jun/18 17:05
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401105823
 
 
   This command sequence doesn't actually work :)
   
   git checkout --theirs --
   
   fatal: '--ours/--theirs' cannot be used with switching branches
   
   
   On Thu, Jun 28, 2018 at 9:24 AM Kenn Knowles 
   wrote:
   
   > We have turned on autoformatting of the codebase, which causes small
   > conflicts across the board. You can probably safely rebase and just keep
   > your changes. Like this:
   >
   > $ git rebase
   > ... see some conflicts
   > $ git diff
   > ... confirmed that the conflicts are just autoformatting
   > ... so we can just keep our changes and do our own autoformat
   > $ git checkout --theirs --
   > $ git add -u
   > $ git rebase --continue
   > $ ./gradlew spotlessJavaApply
   >
   > Please ping me if you run into any difficulty.
   >
   


Issue Time Tracking
---

Worklog Id: (was: 116908)
Time Spent: 3h 50m  (was: 3h 40m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116914&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116914
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 17:18
Start Date: 28/Jun/18 17:18
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #5545: [BEAM-4076] 
Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401109680
 
 
   It only works during a rebase? You might need to provide files or `'**'` 
quoted?


Issue Time Tracking
---

Worklog Id: (was: 116914)
Time Spent: 4h  (was: 3h 50m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116917&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116917
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 17:26
Start Date: 28/Jun/18 17:26
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-40981
 
 
   This was during a rebase. I ended up having to do the process manually in
   an editor for all files.
   
   On Thu, Jun 28, 2018 at 10:18 AM Kenn Knowles 
   wrote:
   
   > It only works during a rebase? You might need to provide files or '**'
   > quoted?
   >
   


Issue Time Tracking
---

Worklog Id: (was: 116917)
Time Spent: 4h 10m  (was: 4h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116929&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116929
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 18:04
Start Date: 28/Jun/18 18:04
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #5545: [BEAM-4076] 
Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401122933
 
 
   You really don't have to. I may have made a typo, but it definitely works 
to simply discard one side or the other of a conflict.


Issue Time Tracking
---

Worklog Id: (was: 116929)
Time Spent: 4h 20m  (was: 4h 10m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116953&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116953
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 18:35
Start Date: 28/Jun/18 18:35
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198938820
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldAccessDescriptor.java
 ##
 @@ -0,0 +1,265 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.io.Serializable;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.schemas.Schema.Field;
+import org.apache.beam.sdk.schemas.Schema.FieldType;
+import org.apache.beam.sdk.schemas.Schema.TypeName;
+
+/**
+ * Used inside of a {@link org.apache.beam.sdk.transforms.DoFn} to describe which fields in a schema
+ * type need to be accessed for processing.
+ */
+@Experimental(Kind.SCHEMAS)
+public class FieldAccessDescriptor implements Serializable {
+  private boolean allFields;
+  private Set<Integer> fieldIdsAccessed;
+  private Set<String> fieldNamesAccessed;
+  private Map<Integer, FieldAccessDescriptor> nestedFieldsAccessedById;
+  private Map<String, FieldAccessDescriptor> nestedFieldsAccessedByName;
+
+  FieldAccessDescriptor(
+      boolean allFields,
+      Set<Integer> fieldsIdsAccessed,
+      Set<String> fieldNamesAccessed,
+      Map<Integer, FieldAccessDescriptor> nestedFieldsAccessedById,
+      Map<String, FieldAccessDescriptor> nestedFieldsAccessedByName) {
+    this.allFields = allFields;
+    this.fieldIdsAccessed = fieldsIdsAccessed;
+    this.fieldNamesAccessed = fieldNamesAccessed;
+    this.nestedFieldsAccessedById = nestedFieldsAccessedById;
+    this.nestedFieldsAccessedByName = nestedFieldsAccessedByName;
+  }
+
+  // Return a descriptor that accesses all fields in a row.
+  public static FieldAccessDescriptor withAllFields() {
+    return new FieldAccessDescriptor(
+        true,
+        Collections.emptySet(),
+        Collections.emptySet(),
+        Collections.emptyMap(),
+        Collections.emptyMap());
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldNames(String... names) {
+    return withFieldNames(Arrays.asList(names));
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldNames(Iterable<String> fieldNames) {
+    return new FieldAccessDescriptor(
+        false,
+        Collections.emptySet(),
+        Sets.newHashSet(fieldNames),
+        Collections.emptyMap(),
+        Collections.emptyMap());
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldIds(Integer... ids) {
+    return withFieldIds(Arrays.asList(i

[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116948&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116948
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 18:35
Start Date: 28/Jun/18 18:35
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198931554
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldAccessDescriptor.java
 ##
 @@ -0,0 +1,265 @@

[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116951&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116951
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 18:35
Start Date: 28/Jun/18 18:35
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198939979
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaCoder.java
 ##
 @@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.coders.CustomCoder;
+import org.apache.beam.sdk.coders.RowCoder;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.values.Row;
+
+/** {@link SchemaCoder} is used as the coder for types that have schemas registered. */
+@Experimental(Kind.SCHEMAS)
+public class SchemaCoder extends CustomCoder {
+  private Schema schema;
 
 Review comment:
   This class also seems like a good candidate for AutoValue? If not, these 
should all be final.
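
To make the AutoValue suggestion concrete, here is a minimal hand-written sketch of the kind of value class AutoValue is designed to generate: final fields, accessors, and `equals`/`hashCode`/`toString` derived from those fields. The class `SchemaRef` and its two fields are hypothetical and chosen only for illustration; they are not part of Beam and do not mirror the actual fields of `SchemaCoder`.

```java
import java.util.Objects;

// Hypothetical value class for illustration only; not a Beam type.
final class SchemaRef {
  // Final fields: assigned once in the constructor and never changed.
  private final String name;
  private final int fieldCount;

  SchemaRef(String name, int fieldCount) {
    this.name = Objects.requireNonNull(name, "name");
    this.fieldCount = fieldCount;
  }

  String name() {
    return name;
  }

  int fieldCount() {
    return fieldCount;
  }

  // Two instances are equal when all of their fields are equal.
  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof SchemaRef)) {
      return false;
    }
    SchemaRef that = (SchemaRef) other;
    return fieldCount == that.fieldCount && name.equals(that.name);
  }

  // Must be consistent with equals: equal objects produce equal hash codes.
  @Override
  public int hashCode() {
    return Objects.hash(name, fieldCount);
  }

  @Override
  public String toString() {
    return "SchemaRef{name=" + name + ", fieldCount=" + fieldCount + "}";
  }
}
```

With AutoValue, the same behavior would come from an abstract class with abstract accessors, and the annotation processor would generate the boilerplate above.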


Issue Time Tracking
---

Worklog Id: (was: 116951)
Time Spent: 5h  (was: 4h 50m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116952&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116952
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 18:35
Start Date: 28/Jun/18 18:35
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198939391
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldAccessDescriptor.java
 ##
 @@ -0,0 +1,265 @@

[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116950&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116950
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 18:35
Start Date: 28/Jun/18 18:35
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198939116
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldAccessDescriptor.java
 ##
 @@ -0,0 +1,265 @@

[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116949&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116949
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 18:35
Start Date: 28/Jun/18 18:35
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198929107
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldAccessDescriptor.java
 ##
 @@ -0,0 +1,265 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.io.Serializable;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.schemas.Schema.Field;
+import org.apache.beam.sdk.schemas.Schema.FieldType;
+import org.apache.beam.sdk.schemas.Schema.TypeName;
+
+/**
+ * Used inside of a {@link org.apache.beam.sdk.transforms.DoFn} to describe which fields in a schema
+ * type need to be accessed for processing.
+ */
+@Experimental(Kind.SCHEMAS)
+public class FieldAccessDescriptor implements Serializable {
+  private boolean allFields;
 
 Review comment:
   This class seems like a good candidate for `AutoValue`. If not, these fields 
should all be `final`.
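
   A minimal sketch of the `final` alternative, assuming a simplified stand-in class. `FieldAccessSketch` and its two fields are hypothetical and only illustrate the pattern; this is not the Beam implementation and it leaves out the nested-field maps.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified stand-in for the class under review (not Beam code).
final class FieldAccessSketch {
  // Every field is final, so an instance cannot be reassigned after construction.
  private final boolean allFields;
  private final Set<String> fieldNamesAccessed;

  private FieldAccessSketch(boolean allFields, Set<String> fieldNamesAccessed) {
    this.allFields = allFields;
    // Defensive copy wrapped as unmodifiable, so later changes to the caller's
    // set cannot leak into this instance.
    this.fieldNamesAccessed =
        Collections.unmodifiableSet(new HashSet<>(fieldNamesAccessed));
  }

  static FieldAccessSketch withAllFields() {
    return new FieldAccessSketch(true, Collections.emptySet());
  }

  static FieldAccessSketch withFieldNames(Set<String> names) {
    return new FieldAccessSketch(false, names);
  }

  boolean allFields() {
    return allFields;
  }

  Set<String> fieldNamesAccessed() {
    return fieldNamesAccessed;
  }
}
```

   `AutoValue` would generate an equivalent immutable class (plus `equals`, `hashCode`, and `toString`) from abstract accessors, at the cost of an annotation-processor dependency.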




Issue Time Tracking
---

Worklog Id: (was: 116949)
Time Spent: 4h 40m  (was: 4.5h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116947&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116947
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 18:35
Start Date: 28/Jun/18 18:35
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198941209
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaRegistry.java
 ##
 @@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import java.util.List;
+import java.util.Map;
+import java.util.function.Function;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.sdk.values.TypeDescriptor;
+
+/**
+ * A {@link SchemaRegistry} allows registering {@link Schema}s for a given Java {@link Class} or a
+ * {@link TypeDescriptor}.
+ *
+ * Types registered in a pipeline's schema registry will automatically be discovered by any
+ * {@link org.apache.beam.sdk.values.PCollection} that uses {@link SchemaCoder}. This allows users
+ * to write pipelines in terms of their own Java types, yet still register schemas for these types.
+ *
+ * TODO: Provide support for schemas registered via a ServiceLoader interface. This will allow
+ * optional modules to register schemas as well.
+ */
+@Experimental(Kind.SCHEMAS)
+public class SchemaRegistry {
+  private static class SchemaEntry {
+Schema schema;
+SerializableFunction toRow;
+SerializableFunction fromRow;
+
+SchemaEntry(
+Schema schema, SerializableFunction toRow, SerializableFunction fromRow) {
+  this.schema = schema;
+  this.toRow = toRow;
+  this.fromRow = fromRow;
+}
+  }
+
+  Map entries = Maps.newHashMap();
 
 Review comment:
   Can both of these fields be `final`?
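
   A rough sketch of what `final` buys here, assuming a stripped-down registry. `RegistrySketch`, `Entry`, `register`, and `lookup` are hypothetical names, not Beam's API. The point is that a `final` field pins the map reference while the map itself stays mutable for registration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, stripped-down registry; names are illustrative only.
final class RegistrySketch {
  // Immutable value holder: the field is assigned once in the constructor.
  private static final class Entry {
    final String schemaName;

    Entry(String schemaName) {
      this.schemaName = schemaName;
    }
  }

  // final fixes the reference; entries can still be added to the map.
  private final Map<Class<?>, Entry> entries = new HashMap<>();

  void register(Class<?> type, String schemaName) {
    entries.put(type, new Entry(schemaName));
  }

  // Returns the registered schema name, or null if the type is unknown.
  String lookup(Class<?> type) {
    Entry entry = entries.get(type);
    return entry == null ? null : entry.schemaName;
  }
}
```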




Issue Time Tracking
---

Worklog Id: (was: 116947)
Time Spent: 4.5h  (was: 4h 20m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=116954&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116954
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/Jun/18 18:35
Start Date: 28/Jun/18 18:35
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r198940797
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaRegistry.java
 ##
 @@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import java.util.List;
+import java.util.Map;
+import java.util.function.Function;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.sdk.values.TypeDescriptor;
+
+/**
+ * A {@link SchemaRegistry} allows registering {@link Schema}s for a given Java {@link Class} or a
+ * {@link TypeDescriptor}.
+ *
+ * Types registered in a pipeline's schema registry will automatically be discovered by any
+ * {@link org.apache.beam.sdk.values.PCollection} that uses {@link SchemaCoder}. This allows users
+ * to write pipelines in terms of their own Java types, yet still register schemas for these types.
+ *
+ * TODO: Provide support for schemas registered via a ServiceLoader interface. This will allow
+ * optional modules to register schemas as well.
+ */
+@Experimental(Kind.SCHEMAS)
+public class SchemaRegistry {
+  private static class SchemaEntry {
+Schema schema;
 
 Review comment:
   The same `AutoValue`-or-`final` comment applies here.




Issue Time Tracking
---

Worklog Id: (was: 116954)
Time Spent: 5h 20m  (was: 5h 10m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117153&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117153
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 00:51
Start Date: 29/Jun/18 00:51
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401215439
 
 
   retest this please




Issue Time Tracking
---

Worklog Id: (was: 117153)
Time Spent: 5.5h  (was: 5h 20m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117187&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117187
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 04:07
Start Date: 29/Jun/18 04:07
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r199048514
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaRegistry.java
 ##
 @@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import java.util.List;
+import java.util.Map;
+import java.util.function.Function;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.sdk.values.TypeDescriptor;
+
+/**
+ * A {@link SchemaRegistry} allows registering {@link Schema}s for a given Java {@link Class} or a
+ * {@link TypeDescriptor}.
+ *
+ * Types registered in a pipeline's schema registry will automatically be discovered by any
+ * {@link org.apache.beam.sdk.values.PCollection} that uses {@link SchemaCoder}. This allows users
+ * to write pipelines in terms of their own Java types, yet still register schemas for these types.
+ *
+ * TODO: Provide support for schemas registered via a ServiceLoader interface. This will allow
+ * optional modules to register schemas as well.
+ */
+@Experimental(Kind.SCHEMAS)
+public class SchemaRegistry {
+  private static class SchemaEntry {
+Schema schema;
+SerializableFunction toRow;
+SerializableFunction fromRow;
+
+SchemaEntry(
+Schema schema, SerializableFunction toRow, SerializableFunction fromRow) {
+  this.schema = schema;
+  this.toRow = toRow;
+  this.fromRow = fromRow;
+}
+  }
+
+  Map entries = Maps.newHashMap();
 
 Review comment:
   done




Issue Time Tracking
---

Worklog Id: (was: 117187)
Time Spent: 5h 40m  (was: 5.5h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117190&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117190
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 04:07
Start Date: 29/Jun/18 04:07
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r199048525
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldAccessDescriptor.java
 ##
 @@ -0,0 +1,265 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.io.Serializable;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.schemas.Schema.Field;
+import org.apache.beam.sdk.schemas.Schema.FieldType;
+import org.apache.beam.sdk.schemas.Schema.TypeName;
+
+/**
+ * Used inside of a {@link org.apache.beam.sdk.transforms.DoFn} to describe which fields in a schema
+ * type need to be accessed for processing.
+ */
+@Experimental(Kind.SCHEMAS)
+public class FieldAccessDescriptor implements Serializable {
+  private boolean allFields;
+  private Set fieldIdsAccessed;
+  private Set fieldNamesAccessed;
+  private Map nestedFieldsAccessedById;
+  private Map nestedFieldsAccessedByName;
+
+  FieldAccessDescriptor(
+  boolean allFields,
+  Set fieldsIdsAccessed,
+  Set fieldNamesAccessed,
+  Map nestedFieldsAccessedById,
+  Map nestedFieldsAccessedByName) {
+this.allFields = allFields;
+this.fieldIdsAccessed = fieldsIdsAccessed;
+this.fieldNamesAccessed = fieldNamesAccessed;
+this.nestedFieldsAccessedById = nestedFieldsAccessedById;
+this.nestedFieldsAccessedByName = nestedFieldsAccessedByName;
+  }
+
+  // Return a descriptor that accesses all fields in a row.
+  public static FieldAccessDescriptor withAllFields() {
+return new FieldAccessDescriptor(
+true,
+Collections.emptySet(),
+Collections.emptySet(),
+Collections.emptyMap(),
+Collections.emptyMap());
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldNames(String... names) {
+return withFieldNames(Arrays.asList(names));
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldNames(Iterable fieldNames) {
+return new FieldAccessDescriptor(
+false,
+Collections.emptySet(),
+Sets.newHashSet(fieldNames),
+Collections.emptyMap(),
+Collections.emptyMap());
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldIds(Integer... ids) {
+return withFieldIds(Arrays.asList(

[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117191&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117191
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 04:07
Start Date: 29/Jun/18 04:07
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r199048530
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldAccessDescriptor.java
 ##
 @@ -0,0 +1,265 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.io.Serializable;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.schemas.Schema.Field;
+import org.apache.beam.sdk.schemas.Schema.FieldType;
+import org.apache.beam.sdk.schemas.Schema.TypeName;
+
+/**
+ * Used inside of a {@link org.apache.beam.sdk.transforms.DoFn} to describe which fields in a schema
+ * type need to be accessed for processing.
+ */
+@Experimental(Kind.SCHEMAS)
+public class FieldAccessDescriptor implements Serializable {
+  private boolean allFields;
+  private Set fieldIdsAccessed;
+  private Set fieldNamesAccessed;
+  private Map nestedFieldsAccessedById;
+  private Map nestedFieldsAccessedByName;
+
+  FieldAccessDescriptor(
+  boolean allFields,
+  Set fieldsIdsAccessed,
+  Set fieldNamesAccessed,
+  Map nestedFieldsAccessedById,
+  Map nestedFieldsAccessedByName) {
+this.allFields = allFields;
+this.fieldIdsAccessed = fieldsIdsAccessed;
+this.fieldNamesAccessed = fieldNamesAccessed;
+this.nestedFieldsAccessedById = nestedFieldsAccessedById;
+this.nestedFieldsAccessedByName = nestedFieldsAccessedByName;
+  }
+
+  // Return a descriptor that accesses all fields in a row.
+  public static FieldAccessDescriptor withAllFields() {
+return new FieldAccessDescriptor(
+true,
+Collections.emptySet(),
+Collections.emptySet(),
+Collections.emptyMap(),
+Collections.emptyMap());
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldNames(String... names) {
+return withFieldNames(Arrays.asList(names));
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldNames(Iterable fieldNames) {
+return new FieldAccessDescriptor(
+false,
+Collections.emptySet(),
+Sets.newHashSet(fieldNames),
+Collections.emptyMap(),
+Collections.emptyMap());
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldIds(Integer... ids) {
+return withFieldIds(Arrays.asList(

[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117193&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117193
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 04:07
Start Date: 29/Jun/18 04:07
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r199048553
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldAccessDescriptor.java
 ##
 @@ -0,0 +1,265 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.io.Serializable;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.schemas.Schema.Field;
+import org.apache.beam.sdk.schemas.Schema.FieldType;
+import org.apache.beam.sdk.schemas.Schema.TypeName;
+
+/**
+ * Used inside of a {@link org.apache.beam.sdk.transforms.DoFn} to describe which fields in a schema
+ * type need to be accessed for processing.
+ */
+@Experimental(Kind.SCHEMAS)
+public class FieldAccessDescriptor implements Serializable {
+  private boolean allFields;
 
 Review comment:
   Made the class an `AutoValue`.




Issue Time Tracking
---

Worklog Id: (was: 117193)
Time Spent: 6h 40m  (was: 6.5h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117192&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117192
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 04:07
Start Date: 29/Jun/18 04:07
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r199048533
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldAccessDescriptor.java
 ##
 @@ -0,0 +1,265 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
+import java.io.Serializable;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.schemas.Schema.Field;
+import org.apache.beam.sdk.schemas.Schema.FieldType;
+import org.apache.beam.sdk.schemas.Schema.TypeName;
+
+/**
+ * Used inside of a {@link org.apache.beam.sdk.transforms.DoFn} to describe which fields in a schema
+ * type need to be accessed for processing.
+ */
+@Experimental(Kind.SCHEMAS)
+public class FieldAccessDescriptor implements Serializable {
+  private boolean allFields;
+  private Set fieldIdsAccessed;
+  private Set fieldNamesAccessed;
+  private Map nestedFieldsAccessedById;
+  private Map nestedFieldsAccessedByName;
+
+  FieldAccessDescriptor(
+  boolean allFields,
+  Set fieldsIdsAccessed,
+  Set fieldNamesAccessed,
+  Map nestedFieldsAccessedById,
+  Map nestedFieldsAccessedByName) {
+this.allFields = allFields;
+this.fieldIdsAccessed = fieldsIdsAccessed;
+this.fieldNamesAccessed = fieldNamesAccessed;
+this.nestedFieldsAccessedById = nestedFieldsAccessedById;
+this.nestedFieldsAccessedByName = nestedFieldsAccessedByName;
+  }
+
+  // Return a descriptor that accesses all fields in a row.
+  public static FieldAccessDescriptor withAllFields() {
+return new FieldAccessDescriptor(
+true,
+Collections.emptySet(),
+Collections.emptySet(),
+Collections.emptyMap(),
+Collections.emptyMap());
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldNames(String... names) {
+return withFieldNames(Arrays.asList(names));
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldNames(Iterable fieldNames) {
+return new FieldAccessDescriptor(
+false,
+Collections.emptySet(),
+Sets.newHashSet(fieldNames),
+Collections.emptyMap(),
+Collections.emptyMap());
+  }
+
+  /**
+   * Return a descriptor that accesses the specified fields.
+   *
+   * By default, if the field is a nested row (or a container containing a row), all fields of
+   * said rows are accessed. For finer-grained access to nested rows, call withNestedField and pass
+   * in a recursive {@link FieldAccessDescriptor}.
+   */
+  public static FieldAccessDescriptor withFieldIds(Integer... ids) {
+return withFieldIds(Arrays.asList(

[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117188&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117188
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 04:07
Start Date: 29/Jun/18 04:07
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r199048517
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaRegistry.java
 ##
 @@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import java.util.List;
+import java.util.Map;
+import java.util.function.Function;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.sdk.values.TypeDescriptor;
+
+/**
+ * A {@link SchemaRegistry} allows registering {@link Schema}s for a given Java {@link Class} or a
+ * {@link TypeDescriptor}.
+ *
+ * Types registered in a pipeline's schema registry will automatically be discovered by any
+ * {@link org.apache.beam.sdk.values.PCollection} that uses {@link SchemaCoder}. This allows users
+ * to write pipelines in terms of their own Java types, yet still register schemas for these types.
+ *
+ * TODO: Provide support for schemas registered via a ServiceLoader interface. This will allow
+ * optional modules to register schemas as well.
+ */
+@Experimental(Kind.SCHEMAS)
+public class SchemaRegistry {
+  private static class SchemaEntry {
+Schema schema;
 
 Review comment:
   AutoValue is overkill here; made the fields final instead.
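   
   For reference, a minimal sketch of the alternative this comment describes:
   a plain nested holder whose fields are final and assigned once in the
   constructor, with no AutoValue-generated class. This is an illustration
   only. `RegistrySketch`, the `toRow` field, and the use of `String` in
   place of Beam's `Schema` are assumptions; the quoted diff shows only the
   `schema` field.

```java
import java.util.function.Function;

// Hypothetical stand-in for the registry class; not the actual Beam code.
final class RegistrySketch {
  // Plain immutable holder: final fields assigned once in the constructor,
  // so no AutoValue-generated class is needed.
  static final class SchemaEntry<T> {
    final String schema;              // stand-in for Beam's Schema type
    final Function<T, String> toRow;  // hypothetical conversion function

    SchemaEntry(String schema, Function<T, String> toRow) {
      this.schema = schema;
      this.toRow = toRow;
    }
  }

  public static void main(String[] args) {
    SchemaEntry<Integer> entry =
        new SchemaEntry<Integer>("id:INT32", i -> "Row(" + i + ")");
    System.out.println(entry.schema + " -> " + entry.toRow.apply(7));
  }
}
```

   For a private holder with two or three fields, final fields give the same
   immutability guarantee with less machinery; AutoValue tends to pay off
   once equals, hashCode, or builders are needed.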




Issue Time Tracking
---

Worklog Id: (was: 117188)
Time Spent: 5h 50m  (was: 5h 40m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117189&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117189
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 04:07
Start Date: 29/Jun/18 04:07
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5545: 
[BEAM-4076] Import Schema branch into master
URL: https://github.com/apache/beam/pull/5545#discussion_r199048520
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaCoder.java
 ##
 @@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.coders.CustomCoder;
+import org.apache.beam.sdk.coders.RowCoder;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.values.Row;
+
+/** {@link SchemaCoder} is used as the coder for types that have schemas registered. */
+@Experimental(Kind.SCHEMAS)
+public class SchemaCoder extends CustomCoder {
+  private Schema schema;
 
 Review comment:
   feels less appropriate here; made variables final




Issue Time Tracking
---

Worklog Id: (was: 117189)
Time Spent: 6h  (was: 5h 50m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117474
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 17:01
Start Date: 29/Jun/18 17:01
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401414659
 
 
   retest this please




Issue Time Tracking
---

Worklog Id: (was: 117474)
Time Spent: 6h 50m  (was: 6h 40m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117535
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 18:45
Start Date: 29/Jun/18 18:45
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401441075
 
 
   retest this please




Issue Time Tracking
---

Worklog Id: (was: 117535)
Time Spent: 7h  (was: 6h 50m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117536&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117536
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 18:45
Start Date: 29/Jun/18 18:45
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401441099
 
 
   run Dataflow PostCommit




Issue Time Tracking
---

Worklog Id: (was: 117536)
Time Spent: 7h 10m  (was: 7h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117538
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 18:47
Start Date: 29/Jun/18 18:47
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401441734
 
 
   Run Dataflow ValidatesRunner




Issue Time Tracking
---

Worklog Id: (was: 117538)
Time Spent: 7h 20m  (was: 7h 10m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117559&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117559
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 19:26
Start Date: 29/Jun/18 19:26
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401451225
 
 
   Run Spark ValidatesRunner




Issue Time Tracking
---

Worklog Id: (was: 117559)
Time Spent: 7.5h  (was: 7h 20m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117560&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117560
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 19:26
Start Date: 29/Jun/18 19:26
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401451258
 
 
   Run Flink ValidatesRunner




Issue Time Tracking
---

Worklog Id: (was: 117560)
Time Spent: 7h 40m  (was: 7.5h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117578&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117578
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 20:28
Start Date: 29/Jun/18 20:28
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401465337
 
 
   Run Dataflow ValidatesRunner




Issue Time Tracking
---

Worklog Id: (was: 117578)
Time Spent: 7h 50m  (was: 7h 40m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117621&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117621
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 29/Jun/18 22:04
Start Date: 29/Jun/18 22:04
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401485569
 
 
   Run Dataflow ValidatesRunner




Issue Time Tracking
---

Worklog Id: (was: 117621)
Time Spent: 8h  (was: 7h 50m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117887&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117887
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 30/Jun/18 18:03
Start Date: 30/Jun/18 18:03
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401556941
 
 
   I'm afraid after merging this the build started breaking: 
   
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_GradleBuild/1002/console
 
   
   Better (more verbose) logs: 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_XmlIOIT/453/console
   




Issue Time Tracking
---

Worklog Id: (was: 117887)
Time Spent: 8h 20m  (was: 8h 10m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117890&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117890
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 30/Jun/18 18:10
Start Date: 30/Jun/18 18:10
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401557318
 
 
   Looking. I explicitly triggered a bunch of PostCommit tests before merging
   this, so might have been one I missed.
   
   On Sat, Jun 30, 2018 at 11:03 AM Łukasz Gajowy 
   wrote:
   
   > I'm afraid after merging this the build started breaking:
   >
   > 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_GradleBuild/1002/console
   >
   > Better (more verbose) logs:
   > 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_XmlIOIT/453/console
   >
   




Issue Time Tracking
---

Worklog Id: (was: 117890)
Time Spent: 8.5h  (was: 8h 20m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117891&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117891
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 30/Jun/18 18:26
Start Date: 30/Jun/18 18:26
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401558091
 
 
   There is a compilation error, introduced by a conflict with
   
https://github.com/apache/beam/commit/34edbd5cc38c1eab0307569da518bceef325e696#diff-6f8f4b5d3d2bc0ef3cc57759b224bc36.
   Sending you a PR to fix this.
   
   On Sat, Jun 30, 2018 at 11:10 AM Reuven Lax  wrote:
   
   > Looking. I explicitly triggered a bunch of PostCommit tests before merging
   > this, so might have been one I missed.
   >
   > On Sat, Jun 30, 2018 at 11:03 AM Łukasz Gajowy 
   > wrote:
   >
   >> I'm afraid after merging this the build started breaking:
   >>
   >> 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_GradleBuild/1002/console
   >>
   >> Better (more verbose) logs:
   >> 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_XmlIOIT/453/console
   >>
   >
   




Issue Time Tracking
---

Worklog Id: (was: 117891)
Time Spent: 8h 40m  (was: 8.5h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117893&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117893
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 30/Jun/18 18:32
Start Date: 30/Jun/18 18:32
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401558495
 
 
   More specifically what happened: it appears that the conflicting PR was
   merged after regular tests had triggered for this PR but before I had
   merged my PR. The fact that I made sure to run ValidatesRunner tests before
   merging my PR ironically caused this, as it widened the window between
   tests running and PR merging (since it took many hours to run the
   ValidatesRunner tests).
   
   I sent you https://github.com/apache/beam/pull/5845 to fix this.
   
   On Sat, Jun 30, 2018 at 11:25 AM Reuven Lax  wrote:
   
   > There is a compilation error, introduced by a conflict with
   > 
https://github.com/apache/beam/commit/34edbd5cc38c1eab0307569da518bceef325e696#diff-6f8f4b5d3d2bc0ef3cc57759b224bc36.
   > Sending you a PR to fix this.
   >
   > On Sat, Jun 30, 2018 at 11:10 AM Reuven Lax  wrote:
   >
   >> Looking. I explicitly triggered a bunch of PostCommit tests before
   >> merging this, so might have been one I missed.
   >>
   >> On Sat, Jun 30, 2018 at 11:03 AM Łukasz Gajowy 
   >> wrote:
   >>
   >>> I'm afraid after merging this the build started breaking:
   >>>
   >>> 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_GradleBuild/1002/console
   >>>
   >>> Better (more verbose) logs:
   >>> 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_XmlIOIT/453/console
   >>>
   >>
   




Issue Time Tracking
---

Worklog Id: (was: 117893)
Time Spent: 8h 50m  (was: 8h 40m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-06-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=117896&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117896
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 30/Jun/18 18:40
Start Date: 30/Jun/18 18:40
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5545: [BEAM-4076] Import 
Schema branch into master
URL: https://github.com/apache/beam/pull/5545#issuecomment-401558917
 
 
   Fix merged.
   
   On Sat, Jun 30, 2018 at 11:32 AM Reuven Lax  wrote:
   
   > More specifically what happened: it appears that the conflicting PR was
   > merged after regular tests had triggered for this PR but before I had
   > merged my PR. The fact that I made sure to run ValidatesRunner tests before
   > merging my PR ironically caused this, as it widened the window between
   > tests running and PR merging (since it took many hours to run the
   > ValidatesRunner tests).
   >
   > I sent you https://github.com/apache/beam/pull/5845 to fix this.
   >
   > On Sat, Jun 30, 2018 at 11:25 AM Reuven Lax  wrote:
   >
   >> There is a compilation error, introduced by a conflict with
   >> 
https://github.com/apache/beam/commit/34edbd5cc38c1eab0307569da518bceef325e696#diff-6f8f4b5d3d2bc0ef3cc57759b224bc36.
   >> Sending you a PR to fix this.
   >>
   >> On Sat, Jun 30, 2018 at 11:10 AM Reuven Lax  wrote:
   >>
   >>> Looking. I explicitly triggered a bunch of PostCommit tests before
   >>> merging this, so might have been one I missed.
   >>>
   >>> On Sat, Jun 30, 2018 at 11:03 AM Łukasz Gajowy 
   >>> wrote:
   >>>
   >>>> I'm afraid after merging this the build started breaking:
   >>>>
   >>>> 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_GradleBuild/1002/console
   >>>>
   >>>> Better (more verbose) logs:
   >>>> 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_XmlIOIT/453/console
   >>>>
   >>>
   




Issue Time Tracking
---

Worklog Id: (was: 117896)
Time Spent: 9h  (was: 8h 50m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=122528&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-122528
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 12/Jul/18 22:47
Start Date: 12/Jul/18 22:47
Worklog Time Spent: 10m 
  Work Description: reuvenlax opened a new pull request #5941: [BEAM-4076] 
Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941
 
 
   If two types have the same or equivalent schemas, we can automatically 
convert between them. This PR provides a utility to convert between any schema 
types.
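   
   As a rough illustration of the idea (not the utility added by this PR), the
   conversion can be sketched as going through a shared row form: one type's
   to-row function followed by the other type's from-row function. Every name
   below (`ConvertSketch`, `Binding`, `converter`, `PointA`, `PointB`) is
   hypothetical, a `Map` stands in for Beam's `Row`, and schema equivalence is
   simplified to equality of field-name lists.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from the PR.
final class ConvertSketch {
  // What a type needs for conversion: its field names (a stand-in for a
  // schema) and functions to and from a generic row representation.
  static final class Binding<T> {
    final List<String> fieldNames;
    final Function<T, Map<String, Object>> toRow;
    final Function<Map<String, Object>, T> fromRow;

    Binding(
        List<String> fieldNames,
        Function<T, Map<String, Object>> toRow,
        Function<Map<String, Object>, T> fromRow) {
      this.fieldNames = fieldNames;
      this.toRow = toRow;
      this.fromRow = fromRow;
    }
  }

  // Converts S to T through the shared row form; rejects bindings whose
  // field names differ.
  static <S, T> Function<S, T> converter(Binding<S> from, Binding<T> to) {
    if (!from.fieldNames.equals(to.fieldNames)) {
      throw new IllegalArgumentException(
          "Schemas differ: " + from.fieldNames + " vs " + to.fieldNames);
    }
    return from.toRow.andThen(to.fromRow);
  }

  static final class PointA {
    final int x;
    final int y;
    PointA(int x, int y) { this.x = x; this.y = y; }
  }

  static final class PointB {
    final int x;
    final int y;
    PointB(int x, int y) { this.x = x; this.y = y; }
  }

  // Builds the generic row shared by both point types.
  static Map<String, Object> row(int x, int y) {
    Map<String, Object> r = new LinkedHashMap<>();
    r.put("x", x);
    r.put("y", y);
    return r;
  }

  static final Binding<PointA> BINDING_A =
      new Binding<PointA>(
          Arrays.asList("x", "y"),
          p -> row(p.x, p.y),
          r -> new PointA((Integer) r.get("x"), (Integer) r.get("y")));

  static final Binding<PointB> BINDING_B =
      new Binding<PointB>(
          Arrays.asList("x", "y"),
          p -> row(p.x, p.y),
          r -> new PointB((Integer) r.get("x"), (Integer) r.get("y")));

  public static void main(String[] args) {
    PointB b = converter(BINDING_A, BINDING_B).apply(new PointA(1, 2));
    System.out.println(b.x + "," + b.y);
  }
}
```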
   
   This is a preparatory PR for the one that converts SQL to use Schemas.
   
   R: @apilloud 




Issue Time Tracking
---

Worklog Id: (was: 122528)
Time Spent: 9h 10m  (was: 9h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=122943&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-122943
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Jul/18 16:39
Start Date: 13/Jul/18 16:39
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5941: 
[BEAM-4076] Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#discussion_r202406951
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java
 ##
 @@ -301,6 +301,18 @@ public String getName() {
     return setCoder(SchemaCoder.of(schema, toRowFunction, fromRowFunction));
   }
 
+  /** Returns whether this {@link PCollection} has an attached schema. */
+  @Experimental(Kind.SCHEMAS)
+  public boolean hasSchema() {
+    return getCoder() instanceof SchemaCoder;
+  }
+
+  /** Returns the attached schema, or null if there is none. */
 
 Review comment:
   I'm new to Java, but I believe throwing an exception (possibly 
`IllegalStateException`) might be more idiomatic here. If not that, this 
function needs a `@Nullable` annotation.
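
   To make the first suggestion concrete, here is a minimal, self-contained 
sketch of the throwing variant. `SchemaHolder` and its `String` schema are 
hypothetical stand-ins, not the actual `PCollection` API, and the exception 
message is only an assumption.

```java
/** Hypothetical stand-in for a collection that may or may not carry a schema. */
final class SchemaHolder {
  // null means "no schema attached"; a plain String stands in for a real schema type.
  private final String schema;

  SchemaHolder(String schema) {
    this.schema = schema;
  }

  /** Returns whether a schema is attached. */
  boolean hasSchema() {
    return schema != null;
  }

  /** Returns the attached schema, or throws IllegalStateException if there is none. */
  String getSchema() {
    if (!hasSchema()) {
      throw new IllegalStateException("Cannot call getSchema when there is no schema");
    }
    return schema;
  }
}
```

   With this shape, callers check `hasSchema()` first and never have to handle 
a null return value, so no `@Nullable` annotation is needed.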




Issue Time Tracking
---

Worklog Id: (was: 122943)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=122944&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-122944
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Jul/18 16:39
Start Date: 13/Jul/18 16:39
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5941: 
[BEAM-4076] Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#discussion_r202401770
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
 ##
 @@ -182,6 +184,33 @@ public boolean equals(Object o) {
         && Objects.equals(getFields(), other.getFields());
   }
 
+  /** Returns true if two Schemas have the same fields, but possibly in different orders. */
+  public boolean equivalent(Schema other) {
+    List<Field> otherFields =
+        other
+            .getFields()
+            .stream()
+            .sorted(Comparator.comparing(Field::getName))
+            .collect(Collectors.toList());
+    List<Field> actualFields =
+        getFields()
+            .stream()
+            .sorted(Comparator.comparing(Field::getName))
+            .collect(Collectors.toList());
+    if (otherFields.size() != actualFields.size()) {
 
 Review comment:
   I'm pretty sure I've seen this function before on the whiteboard in an 
interview. It would be good to move this check before the sorts.
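
   As a rough illustration of the reordering, the sketch below performs the 
size check before either sort. `SimpleSchema` and its nested `Field` are 
hypothetical stand-ins for the real `Schema` classes, reduced to what the 
comparison needs; this is not the code that was merged.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical stand-in for a schema: an ordered list of named, typed fields. */
final class SimpleSchema {
  /** Hypothetical stand-in for a schema field: a name plus a type name. */
  static final class Field {
    private final String name;
    private final String type;

    Field(String name, String type) {
      this.name = name;
      this.type = type;
    }

    String getName() {
      return name;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Field)) {
        return false;
      }
      Field other = (Field) o;
      return name.equals(other.name) && type.equals(other.type);
    }

    @Override
    public int hashCode() {
      return Objects.hash(name, type);
    }
  }

  private final List<Field> fields;

  SimpleSchema(List<Field> fields) {
    this.fields = fields;
  }

  List<Field> getFields() {
    return fields;
  }

  /** Returns true if both schemas have the same fields, possibly in different orders. */
  boolean equivalent(SimpleSchema other) {
    // Cheap check first: schemas with different field counts can never be
    // equivalent, so neither list needs to be sorted in that case.
    if (other.getFields().size() != getFields().size()) {
      return false;
    }
    return sortedByName(getFields()).equals(sortedByName(other.getFields()));
  }

  private static List<Field> sortedByName(List<Field> fields) {
    return fields.stream()
        .sorted(Comparator.comparing(Field::getName))
        .collect(Collectors.toList());
  }
}
```

   The early return skips two O(n log n) sorts whenever the field counts 
differ; the result for equal-sized schemas is unchanged.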




Issue Time Tracking
---

Worklog Id: (was: 122944)
Time Spent: 9.5h  (was: 9h 20m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=122945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-122945
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Jul/18 16:39
Start Date: 13/Jul/18 16:39
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5941: 
[BEAM-4076] Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#discussion_r202404465
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Convert.java
 ##
 @@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas.transforms;
+
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.schemas.NoSuchSchemaException;
+import org.apache.beam.sdk.schemas.SchemaCoder;
+import org.apache.beam.sdk.schemas.SchemaRegistry;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunctions;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.sdk.values.TypeDescriptor;
+
+/** A set of utilities for converting between different objects supporting schemas. */
+@Experimental(Kind.SCHEMAS)
+public class Convert {
+  /**
+   * Convert a {@link PCollection}{@literal <InputT>} into a {@link PCollection}{@literal <Row>}.
+   *
+   * The input {@link PCollection} must have a schema attached. The output collection will have
+   * the same schema as the input.
+   */
+  public static <InputT> PTransform<PCollection<InputT>, PCollection<Row>> toRows() {
+    return to(Row.class);
+  }
+
+  /**
+   * Convert a {@link PCollection}{@literal <Row>} into a {@link PCollection}{@literal <OutputT>}.
+   *
+   * The output schema will be inferred using the schema registry. A schema must be registered
+   * for this type, or the conversion will fail.
+   */
+  public static <OutputT> PTransform<PCollection<Row>, PCollection<OutputT>> fromRows(
+      Class<OutputT> clazz) {
+    return to(clazz);
+  }
+
+  /**
+   * Convert a {@link PCollection}{@literal <Row>} into a {@link PCollection}{@literal <OutputT>}.
+   *
+   * The output schema will be inferred using the schema registry. A schema must be registered
+   * for this type, or the conversion will fail.
+   */
+  public static <OutputT> PTransform<PCollection<Row>, PCollection<OutputT>> fromRows(
+      TypeDescriptor<OutputT> typeDescriptor) {
+    return to(typeDescriptor);
+  }
+
+  /**
+   * Convert a {@link PCollection}{@literal <InputT>} to a {@link PCollection}{@literal <OutputT>}.
+   *
+   * This function allows converting between two types as long as the two types have
+   * compatible schemas. Two schemas are said to be compatible if they recursively
+   * have fields with the same names, but possibly different orders.
+   */
+  public static <InputT, OutputT> PTransform<PCollection<InputT>, PCollection<OutputT>> to(
+      Class<OutputT> clazz) {
+    return to(TypeDescriptor.of(clazz));
+  }
+
+  /**
+   * Convert a {@link PCollection}{@literal <InputT>} to a {@link PCollection}{@literal <OutputT>}.
+   *
+   * This function allows converting between two types as long as the two types have
+   * compatible schemas. Two schemas are said to be compatible if they recursively
+   * have fields with the same names, but possibly different orders.
+   */
+  public static <InputT, OutputT> PTransform<PCollection<InputT>, PCollection<OutputT>> to(
+      TypeDescriptor<OutputT> typeDescriptor) {
+    return new ConvertTransform<>(typeDescriptor);
+  }
+
+  private static class ConvertTransform<InputT, OutputT>
+      extends PTransform<PCollection<InputT>, PCollection<OutputT>> {
+    @Nullable TypeDescriptor<OutputT> outputTypeDescriptor = null;
+    SchemaCoder<OutputT> outputSchemaCoder;
 
 Review comment:
   Looks like this should just be a local variable in `expand`?




Issue Time Tracking
---

Worklog Id: (was: 122945)

[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=122942&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-122942
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Jul/18 16:39
Start Date: 13/Jul/18 16:39
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5941: 
[BEAM-4076] Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#discussion_r202404887
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Convert.java
 ##
 @@ -0,0 +1,162 @@
+  private static class ConvertTransform<InputT, OutputT>
+      extends PTransform<PCollection<InputT>, PCollection<OutputT>> {
+    @Nullable TypeDescriptor<OutputT> outputTypeDescriptor = null;
+    SchemaCoder<OutputT> outputSchemaCoder;
+
+    ConvertTransform(Class<OutputT> outputClass) {
+      this(TypeDescriptor.of(outputClass));
+    }
+
+    ConvertTransform(TypeDescriptor<OutputT> outputTypeDescriptor) {
+      this.outputTypeDescriptor = outputTypeDescriptor;
+    }
+
+    @Override
+    @SuppressWarnings("unchecked")
 
 Review comment:
   This seems related to the incorrect `@Nullable` annotation above.



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=122946&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-122946
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Jul/18 16:39
Start Date: 13/Jul/18 16:39
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5941: 
[BEAM-4076] Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#discussion_r202404761
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Convert.java
 ##
 @@ -0,0 +1,162 @@
+  private static class ConvertTransform<InputT, OutputT>
+      extends PTransform<PCollection<InputT>, PCollection<OutputT>> {
+    @Nullable TypeDescriptor<OutputT> outputTypeDescriptor = null;
 
 Review comment:
   Is this really `@Nullable`? It looks like it is always set in the constructor, 
so you should leave off the default value and make it final.
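
   A minimal sketch of that suggestion, under stated assumptions: 
`ConvertSketch` is a hypothetical class, and a plain `Class<OutputT>` stands 
in for `TypeDescriptor`. The `requireNonNull` guard is an addition for 
illustration, not something the review asked for.

```java
import java.util.Objects;

/** Hypothetical transform-like class; Class<OutputT> stands in for a type descriptor. */
final class ConvertSketch<OutputT> {
  // Assigned exactly once in the constructor, so it needs neither a default
  // value nor a @Nullable annotation.
  private final Class<OutputT> outputType;

  ConvertSketch(Class<OutputT> outputType) {
    // Illustrative guard: fail fast instead of carrying a null around.
    this.outputType = Objects.requireNonNull(outputType, "outputType");
  }

  Class<OutputT> getOutputType() {
    return outputType;
  }
}
```

   Declaring the field `final` lets the compiler enforce that every 
constructor assigns it.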




[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123068&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123068
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Jul/18 22:20
Start Date: 13/Jul/18 22:20
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5941: 
[BEAM-4076] Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#discussion_r202486266
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
 ##
 @@ -182,6 +184,33 @@ public boolean equals(Object o) {
         && Objects.equals(getFields(), other.getFields());
   }
 
+  /** Returns true if two Schemas have the same fields, but possibly in different orders. */
+  public boolean equivalent(Schema other) {
+    List<Field> otherFields =
+        other
+            .getFields()
+            .stream()
+            .sorted(Comparator.comparing(Field::getName))
+            .collect(Collectors.toList());
+    List<Field> actualFields =
+        getFields()
+            .stream()
+            .sorted(Comparator.comparing(Field::getName))
+            .collect(Collectors.toList());
+    if (otherFields.size() != actualFields.size()) {
 
 Review comment:
   Good call, done




Issue Time Tracking
---

Worklog Id: (was: 123068)
Time Spent: 10h  (was: 9h 50m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123069&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123069
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Jul/18 22:20
Start Date: 13/Jul/18 22:20
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5941: 
[BEAM-4076] Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#discussion_r202486278
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Convert.java
 ##
 @@ -0,0 +1,162 @@
+  private static class ConvertTransform<InputT, OutputT>
+      extends PTransform<PCollection<InputT>, PCollection<OutputT>> {
+    @Nullable TypeDescriptor<OutputT> outputTypeDescriptor = null;
+    SchemaCoder<OutputT> outputSchemaCoder;
 
 Review comment:
   moved




Issue Time Tracking
---

Worklog Id: (was: 123069)
Time Spent: 10h 10m  (was: 10h)


[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123070&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123070
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Jul/18 22:20
Start Date: 13/Jul/18 22:20
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5941: 
[BEAM-4076] Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#discussion_r202486286
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Convert.java
 ##
 @@ -0,0 +1,162 @@
+  private static class ConvertTransform<InputT, OutputT>
+      extends PTransform<PCollection<InputT>, PCollection<OutputT>> {
+    @Nullable TypeDescriptor<OutputT> outputTypeDescriptor = null;
 
 Review comment:
   Fine to change for now. In the future, might have to make this nullable 
though.




Issue Time Tracking
---

Worklog Id: (was: 123070)
Time Spent: 10h

[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123071&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123071
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Jul/18 22:20
Start Date: 13/Jul/18 22:20
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5941: 
[BEAM-4076] Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#discussion_r202486289
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Convert.java
 ##
 @@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.schemas.transforms;
+
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.schemas.NoSuchSchemaException;
+import org.apache.beam.sdk.schemas.SchemaCoder;
+import org.apache.beam.sdk.schemas.SchemaRegistry;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunctions;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.sdk.values.TypeDescriptor;
+
+/** A set of utilities for converting between different objects supporting schemas. */
+@Experimental(Kind.SCHEMAS)
+public class Convert {
+  /**
+   * Convert a {@link PCollection} into a {@link PCollection}.
+   *
+   * The input {@link PCollection} must have a schema attached. The output collection will have
+   * the same schema as the input.
+   */
+  public static <InputT> PTransform<PCollection<InputT>, PCollection<Row>> toRows() {
+    return to(Row.class);
+  }
+
+  /**
+   * Convert a {@link PCollection} into a {@link PCollection}.
+   *
+   * The output schema will be inferred using the schema registry. A schema must be registered
+   * for this type, or the conversion will fail.
+   */
+  public static <OutputT> PTransform<PCollection<Row>, PCollection<OutputT>> fromRows(
+      Class<OutputT> clazz) {
+    return to(clazz);
+  }
+
+  /**
+   * Convert a {@link PCollection} into a {@link PCollection}.
+   *
+   * The output schema will be inferred using the schema registry. A schema must be registered
+   * for this type, or the conversion will fail.
+   */
+  public static <OutputT> PTransform<PCollection<Row>, PCollection<OutputT>> fromRows(
+      TypeDescriptor<OutputT> typeDescriptor) {
+    return to(typeDescriptor);
+  }
+
+  /**
+   * Convert a {@link PCollection} to a {@link PCollection}.
+   *
+   * This function allows converting between two types as long as the two types have
+   * compatible schemas. Two schemas are said to be compatible if they recursively
+   * have fields with the same names, but possibly different orders.
+   */
+  public static <InputT, OutputT> PTransform<PCollection<InputT>, PCollection<OutputT>> to(
+      Class<OutputT> clazz) {
+    return to(TypeDescriptor.of(clazz));
+  }
+
+  /**
+   * Convert a {@link PCollection} to a {@link PCollection}.
+   *
+   * This function allows converting between two types as long as the two types have
+   * compatible schemas. Two schemas are said to be compatible if they recursively
+   * have fields with the same names, but possibly different orders.
+   */
+  public static <InputT, OutputT> PTransform<PCollection<InputT>, PCollection<OutputT>> to(
+      TypeDescriptor<OutputT> typeDescriptor) {
+    return new ConvertTransform<>(typeDescriptor);
+  }
+
+  private static class ConvertTransform<InputT, OutputT>
+      extends PTransform<PCollection<InputT>, PCollection<OutputT>> {
+    @Nullable TypeDescriptor<OutputT> outputTypeDescriptor = null;
+    SchemaCoder<OutputT> outputSchemaCoder;
+
+    ConvertTransform(Class<OutputT> outputClass) {
+      this(TypeDescriptor.of(outputClass));
+    }
+
+    ConvertTransform(TypeDescriptor<OutputT> outputTypeDescriptor) {
+      this.outputTypeDescriptor = outputTypeDescriptor;
+    }
+
+    @Override
+    @SuppressWarnings("unchecked")
 
 Review comment:
   no, it's due to the cast in the toRow branch



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123072&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123072
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Jul/18 22:20
Start Date: 13/Jul/18 22:20
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #5941: 
[BEAM-4076] Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#discussion_r202486290
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollection.java
 ##
 @@ -301,6 +301,18 @@ public String getName() {
 return setCoder(SchemaCoder.of(schema, toRowFunction, fromRowFunction));
   }
 
+  /** Returns whether this {@link PCollection} has an attached schema. */
+  @Experimental(Kind.SCHEMAS)
+  public boolean hasSchema() {
+return getCoder() instanceof SchemaCoder;
+  }
+
+  /** Returns the attached schema, or null if there is none. */
 
 Review comment:
   done
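
   A minimal sketch of the coder-based check in the `hasSchema()` hunk above, written with hypothetical stand-in types (`SketchCoder`, `SketchSchemaCoder`, `SketchCollection`) rather than Beam's own classes. The null-returning accessor follows the javadoc line quoted in the hunk; whether the merged code keeps that behavior is an assumption.

```java
/** Hypothetical stand-in for a coder; not Beam's Coder. */
interface SketchCoder {}

/** Hypothetical coder that carries a schema description. */
final class SketchSchemaCoder implements SketchCoder {
  final String schema;

  SketchSchemaCoder(String schema) {
    this.schema = schema;
  }
}

/** Hypothetical collection whose schema, if any, lives on its coder. */
final class SketchCollection {
  private final SketchCoder coder;

  SketchCollection(SketchCoder coder) {
    this.coder = coder;
  }

  /** Returns whether a schema is attached, judged only by the coder's type. */
  boolean hasSchema() {
    return coder instanceof SketchSchemaCoder;
  }

  /** Returns the attached schema, or null if there is none. */
  String getSchemaOrNull() {
    return hasSchema() ? ((SketchSchemaCoder) coder).schema : null;
  }
}
```

   Keeping the schema on the coder means no extra field has to be kept in sync; callers that must not see null can test `hasSchema()` first.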




Issue Time Tracking
---

Worklog Id: (was: 123072)
Time Spent: 10h 40m  (was: 10.5h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123098&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123098
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Jul/18 23:23
Start Date: 13/Jul/18 23:23
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5941: [BEAM-4076] Schema 
utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#issuecomment-404977688
 
 
   @apilloud new tests are coming, I just replied first to the comments I could 
fix quickly.




Issue Time Tracking
---

Worklog Id: (was: 123098)
Time Spent: 10h 50m  (was: 10h 40m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123131&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123131
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 14/Jul/18 00:16
Start Date: 14/Jul/18 00:16
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5941: [BEAM-4076] Schema 
utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#issuecomment-404983697
 
 
   @apilloud  new tests have been pushed.




Issue Time Tracking
---

Worklog Id: (was: 123131)
Time Spent: 11h  (was: 10h 50m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123132&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123132
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 14/Jul/18 00:17
Start Date: 14/Jul/18 00:17
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #5941: [BEAM-4076] Schema 
utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#issuecomment-404983816
 
 
   Thanks! LGTM!




Issue Time Tracking
---

Worklog Id: (was: 123132)
Time Spent: 11h 10m  (was: 11h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123178&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123178
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 14/Jul/18 03:09
Start Date: 14/Jul/18 03:09
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5941: [BEAM-4076] Schema 
utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#issuecomment-404994501
 
 
   retest this please




Issue Time Tracking
---

Worklog Id: (was: 123178)
Time Spent: 11h 20m  (was: 11h 10m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-14 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123311&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123311
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 14/Jul/18 12:25
Start Date: 14/Jul/18 12:25
Worklog Time Spent: 10m 
  Work Description: reuvenlax closed pull request #5941: [BEAM-4076] Schema 
utilities for converting between types
URL: https://github.com/apache/beam/pull/5941
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
index c6d2bbd5041..4d2d7fc0b25 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
@@ -26,6 +26,7 @@
 import java.nio.charset.StandardCharsets;
 import java.util.ArrayList;
 import java.util.Arrays;
+import java.util.Comparator;
 import java.util.List;
 import java.util.Objects;
 import java.util.Set;
@@ -172,6 +173,7 @@ public static Schema of(Field... fields) {
 return Schema.builder().addFields(fields).build();
   }
 
+  /** Returns true if two Schemas have the same fields in the same order. */
   @Override
   public boolean equals(Object o) {
 if (!(o instanceof Schema)) {
@@ -182,6 +184,34 @@ public boolean equals(Object o) {
 && Objects.equals(getFields(), other.getFields());
   }
 
+  /** Returns true if two Schemas have the same fields, but possibly in different orders. */
+  public boolean equivalent(Schema other) {
+    if (other.getFieldCount() != getFieldCount()) {
+      return false;
+    }
+
+    List<Field> otherFields =
+        other
+            .getFields()
+            .stream()
+            .sorted(Comparator.comparing(Field::getName))
+            .collect(Collectors.toList());
+    List<Field> actualFields =
+        getFields()
+            .stream()
+            .sorted(Comparator.comparing(Field::getName))
+            .collect(Collectors.toList());
+
+    for (int i = 0; i < otherFields.size(); ++i) {
+      Field otherField = otherFields.get(i);
+      Field actualField = actualFields.get(i);
+      if (!otherField.equivalent(actualField)) {
+        return false;
+      }
+    }
+    return true;
+  }
+
   @Override
   public String toString() {
 StringBuilder builder = new StringBuilder();
@@ -399,6 +429,33 @@ public boolean equals(Object o) {
   && Arrays.equals(getMetadata(), other.getMetadata());
 }
 
+    private boolean equivalent(FieldType other) {
+      if (!other.getTypeName().equals(getTypeName())) {
+        return false;
+      }
+      switch (getTypeName()) {
+        case ROW:
+          if (!other.getRowSchema().equivalent(getRowSchema())) {
+            return false;
+          }
+          break;
+        case ARRAY:
+          if (!other.getCollectionElementType().equivalent(getCollectionElementType())) {
+            return false;
+          }
+          break;
+        case MAP:
+          if (!other.getMapKeyType().equivalent(getMapKeyType())
+              || !other.getMapValueType().equivalent(getMapValueType())) {
+            return false;
+          }
+          break;
+        default:
+          return other.equals(this);
+      }
+      return true;
+    }
+
 @Override
 public int hashCode() {
   return Arrays.deepHashCode(
@@ -495,6 +552,12 @@ public boolean equals(Object o) {
   && Objects.equals(getNullable(), other.getNullable());
 }
 
+private boolean equivalent(Field otherField) {
+  return otherField.getName().equals(getName())
+  && otherField.getNullable().equals(getNullable())
+  && getType().equivalent(otherField.getType());
+}
+
 @Override
 public int hashCode() {
   return Objects.hash(getName(), getDescription(), getType(), 
getNullable());
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaProvider.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaProvider.java
index 167e39ad213..faf269fd7b4 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaProvider.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaProvider.java
@@ -18,6 +18,7 @@
 
 package org.apache.beam.sdk.schemas;
 
+import java.io.Serializable;
 import javax.annotation.Nullable;
 import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.annotations.Experimental.Kind;
@@ -31,7 +32,7 @@
  * contacts an external schema-registry service to determine the schema for a 
type.
  */
 @Experimental(K
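
The order-insensitive comparison added to `Schema.equivalent` in the diff above can be sketched outside Beam as follows. This is only an illustration: `SketchField` and `EquivalenceSketch` are hypothetical stand-ins, and field types are modeled as plain strings instead of Beam's recursive `FieldType`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a schema field; not Beam's Schema.Field. */
final class SketchField {
  final String name;
  final String typeName;
  final boolean nullable;

  SketchField(String name, String typeName, boolean nullable) {
    this.name = name;
    this.typeName = typeName;
    this.nullable = nullable;
  }

  /** Two fields match when name, type, and nullability all agree. */
  boolean equivalent(SketchField other) {
    return name.equals(other.name)
        && typeName.equals(other.typeName)
        && nullable == other.nullable;
  }
}

final class EquivalenceSketch {
  /** Returns true if both lists hold equivalent fields, ignoring their order. */
  static boolean equivalent(List<SketchField> left, List<SketchField> right) {
    if (left.size() != right.size()) {
      return false;
    }
    // Sort copies by field name so that position no longer matters.
    Comparator<SketchField> byName = Comparator.comparing((SketchField f) -> f.name);
    List<SketchField> sortedLeft = new ArrayList<>(left);
    List<SketchField> sortedRight = new ArrayList<>(right);
    sortedLeft.sort(byName);
    sortedRight.sort(byName);
    // After sorting, equivalent schemas must agree pairwise.
    for (int i = 0; i < sortedLeft.size(); i++) {
      if (!sortedLeft.get(i).equivalent(sortedRight.get(i))) {
        return false;
      }
    }
    return true;
  }
}
```

Sorting copies leaves the callers' lists untouched, which matters if field order is still meaningful elsewhere, as it is for the order-sensitive `equals` in the diff.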

[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-14 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123352&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123352
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 14/Jul/18 19:36
Start Date: 14/Jul/18 19:36
Worklog Time Spent: 10m 
  Work Description: reuvenlax opened a new pull request #5953: [BEAM-4076] 
Fix bugs in generated schema code
URL: https://github.com/apache/beam/pull/5953
 
 
   Fix handling of null fields in generated Row classes: properly infer 
nullable from @Nullable annotations, and don't try to dereference legal null 
values.
   
   Fix handling of time variables. Row's internal storage is Instant, and the 
generated code assumed it was a DateTime
   
   R: @apilloud 
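
   The two fixes described in this PR can be sketched outside Beam as follows. This is an illustration under assumptions: `java.time.Instant` and `java.util.Date` stand in for Joda-Time's `Instant` and other `ReadableInstant` implementations, and `toStoredInstant` is a hypothetical helper, not code from the PR.

```java
import java.time.Instant;
import java.util.Date;

final class NullSafeInstantSketch {
  /** Normalizes a time-valued field to an Instant; a null field stays null. */
  static Instant toStoredInstant(Object fieldValue) {
    if (fieldValue == null) {
      // A nullable field may legally hold null, so return before dereferencing it.
      return null;
    }
    if (fieldValue instanceof Instant) {
      // Already in the storage type: pass it through unchanged.
      return (Instant) fieldValue;
    }
    if (fieldValue instanceof Date) {
      // Any other supported time type is rebuilt from its epoch millis.
      return Instant.ofEpochMilli(((Date) fieldValue).getTime());
    }
    throw new IllegalArgumentException("Unsupported time type: " + fieldValue.getClass());
  }
}
```

   Normalizing to one storage type at the boundary means later reads can cast to that type without guessing which time class a user originally supplied.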




Issue Time Tracking
---

Worklog Id: (was: 123352)
Time Spent: 11h 40m  (was: 11.5h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-14 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123361&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123361
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 14/Jul/18 22:53
Start Date: 14/Jul/18 22:53
Worklog Time Spent: 10m 
  Work Description: reuvenlax closed pull request #5953: [BEAM-4076] Fix 
bugs in generated schema code
URL: https://github.com/apache/beam/pull/5953
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/GetterBasedSchemaProvider.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/GetterBasedSchemaProvider.java
index 108c74601ce..7d0856bab17 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/GetterBasedSchemaProvider.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/GetterBasedSchemaProvider.java
@@ -26,6 +26,7 @@
 import java.lang.reflect.Type;
 import java.util.List;
 import java.util.Map;
+import javax.annotation.Nullable;
 import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.annotations.Experimental.Kind;
 import org.apache.beam.sdk.schemas.Schema.FieldType;
@@ -108,8 +109,13 @@
   }
 
   @SuppressWarnings("unchecked")
+  @Nullable
   private <T> T fromValue(
       FieldType type, T value, Type fieldType, Type elemenentType, Type keyType, Type valueType) {
+    if (value == null) {
+      return null;
+    }
+
 if (TypeName.ROW.equals(type.getTypeName())) {
   return (T) fromRow((Row) value, (Class) fieldType);
 } else if (TypeName.ARRAY.equals(type.getTypeName())) {
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/ByteBuddyUtils.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/ByteBuddyUtils.java
index 1593fbb5fab..cdbb7c71af7 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/ByteBuddyUtils.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/ByteBuddyUtils.java
@@ -51,8 +51,7 @@
 import org.apache.beam.sdk.values.reflect.FieldValueSetter;
 import org.apache.commons.lang3.ArrayUtils;
 import org.apache.commons.lang3.ClassUtils;
-import org.joda.time.DateTime;
-import org.joda.time.ReadableDateTime;
+import org.joda.time.Instant;
 import org.joda.time.ReadableInstant;
 
 class ByteBuddyUtils {
@@ -61,7 +60,7 @@
   private static final ForLoadedType BYTE_ARRAY_TYPE = new ForLoadedType(byte[].class);
   private static final ForLoadedType BYTE_BUFFER_TYPE = new ForLoadedType(ByteBuffer.class);
   private static final ForLoadedType CHAR_SEQUENCE_TYPE = new ForLoadedType(CharSequence.class);
-  private static final ForLoadedType DATE_TIME_TYPE = new ForLoadedType(DateTime.class);
+  private static final ForLoadedType INSTANT_TYPE = new ForLoadedType(Instant.class);
   private static final ForLoadedType LIST_TYPE = new ForLoadedType(List.class);
   private static final ForLoadedType READABLE_INSTANT_TYPE =
   new ForLoadedType(ReadableInstant.class);
@@ -166,7 +165,7 @@ protected Type convertMap(TypeDescriptor type) {
 
 @Override
 protected Type convertDateTime(TypeDescriptor type) {
-  return ReadableInstant.class;
+  return Instant.class;
 }
 
 @Override
@@ -258,16 +257,16 @@ protected StackManipulation convertMap(TypeDescriptor 
type) {
 
 @Override
 protected StackManipulation convertDateTime(TypeDescriptor type) {
-  // If the class type is a ReadableDateTime, then return it.
-  if (ReadableDateTime.class.isAssignableFrom(type.getRawType())) {
+  // If the class type is an Instant, then return it.
+  if (Instant.class.isAssignableFrom(type.getRawType())) {
 return readValue;
   }
   // Otherwise, generate the following code:
-  //   return new DateTime(value.getMillis());
+  //   return new Instant(value.getMillis());
 
   return new StackManipulation.Compound(
   // Create a new instance of the target type.
-  TypeCreation.of(DATE_TIME_TYPE),
+  TypeCreation.of(INSTANT_TYPE),
   Duplication.SINGLE,
   readValue,
   TypeCasting.to(READABLE_INSTANT_TYPE),
@@ -279,7 +278,7 @@ protected StackManipulation 
convertDateTime(TypeDescriptor type) {
   .getOnly()),
   // Construct a DateTime object containing the millis.
   MethodInvocation.invoke(
-  DATE_TIME_TYPE
+  INSTANT_TYPE
   .getDeclaredMethods()
   .filter(
   ElementMatchers.isConstructor()
diff --git 
a/sdks/java/core/src/main/java/org/a

[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=123471&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-123471
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 16/Jul/18 03:39
Start Date: 16/Jul/18 03:39
Worklog Time Spent: 10m 
  Work Description: reuvenlax opened a new pull request #5955: [BEAM-4076] 
Enable schemas for more runners
URL: https://github.com/apache/beam/pull/5955
 
 
   This PR enables schemas for all runners except for gearpump (gearpump is 
more complicated due to the Scala indirection).
   
   Schemas still are not enabled for the Dataflow runner, as that requires a 
change on the internal Dataflow side.




Issue Time Tracking
---

Worklog Id: (was: 123471)
Time Spent: 12h  (was: 11h 50m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=124235&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-124235
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 17/Jul/18 22:28
Start Date: 17/Jul/18 22:28
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5941: [BEAM-4076] 
Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#issuecomment-405748874
 
 
   ./gradlew :beam-sdks-java-core:javadoc fails, with errors like:
   
   
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Convert.java:82:
 error: type arguments not allowed here
  * Convert a {@link PCollection} to a {@link PCollection}.
  ^
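
   The error arises because `{@link}` takes a reference to a class or member, not a parameterized type; the type arguments that triggered it appear to have been stripped from this archived copy of the message. A hedged sketch of the usual workaround follows, using `java.util.List` as a stand-in for `PCollection` and a hypothetical `toStrings` helper.

```java
import java.util.ArrayList;
import java.util.List;

final class JavadocLinkSketch {
  // Rejected by the javadoc tool: {@link List<String>} ("type arguments not allowed here").
  // Accepted: link the raw type, or show the parameterized type as literal text via {@code}.

  /**
   * Convert a {@link List} of values to a {@code List<String>}.
   *
   * @param input the values to render; elements may be null
   * @return a new list holding the string form of each value
   */
  static <T> List<String> toStrings(List<T> input) {
    List<String> out = new ArrayList<>();
    for (T value : input) {
      out.add(String.valueOf(value));
    }
    return out;
  }
}
```

   `{@code ...}` prints its content verbatim, so angle brackets are safe there, while `{@link ...}` still gives readers a clickable reference to the raw type.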




Issue Time Tracking
---

Worklog Id: (was: 124235)
Time Spent: 12h 10m  (was: 12h)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=124237&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-124237
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 17/Jul/18 22:32
Start Date: 17/Jul/18 22:32
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #5941: [BEAM-4076] Schema 
utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#issuecomment-405749782
 
 
   This did not fail on Jenkins for this PR, and I've never seen this
   error before. Did something change with Javadoc?
   
   On Tue, Jul 17, 2018 at 3:28 PM Alan Myrvold 
   wrote:
   
   > ./gradlew :beam-sdks-java-core:javadoc fails, with errors like:
   >
   > 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Convert.java:82:
   > error: type arguments not allowed here
   >
   >- Convert a {@link  PCollection} to a {@link
   > PCollection}.
   >^
   >
   




Issue Time Tracking
---

Worklog Id: (was: 124237)
Time Spent: 12h 20m  (was: 12h 10m)



[jira] [Work logged] (BEAM-4076) Schema followups

2018-07-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=124248&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-124248
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 17/Jul/18 23:13
Start Date: 17/Jul/18 23:13
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5941: [BEAM-4076] 
Schema utilities for converting between types
URL: https://github.com/apache/beam/pull/5941#issuecomment-405757440
 
 
   Javadoc is not built on preCommit or postCommit. The problem started 
happening with the nightly build. 
https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/101/
   
   Sent https://github.com/apache/beam/pull/5970 to add javadoc to 
:javaPreCommit and :javaPostCommit




Issue Time Tracking
---

Worklog Id: (was: 124248)
Time Spent: 12.5h  (was: 12h 20m)


