[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=296551=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296551
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 16/Aug/19 20:31
Start Date: 16/Aug/19 20:31
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296551)
Time Spent: 6.5h  (was: 6h 20m)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> AvroIO can infer the Beam schema for an Avro-based PCollection via the 
> `withBeamSchemas` method. However, if the user created a PCollection of Avro 
> objects or IndexedRecord/GenericRecord by other means, they need to set the 
> schema (or coder) manually. The idea is to expose a method in 
> schemas.utils.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=296550=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296550
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 16/Aug/19 20:30
Start Date: 16/Aug/19 20:30
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-522141914
 
 
   Merging since it is green now. Thanks for the review @kanterov and 
@RyanSkraba 
 



Issue Time Tracking
---

Worklog Id: (was: 296550)
Time Spent: 6h 20m  (was: 6h 10m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=296549=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296549
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 16/Aug/19 20:30
Start Date: 16/Aug/19 20:30
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-522124180
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 296549)
Time Spent: 6h 10m  (was: 6h)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=296516=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296516
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 16/Aug/19 19:26
Start Date: 16/Aug/19 19:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-522124180
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 296516)
Time Spent: 5h 50m  (was: 5h 40m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=296515=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296515
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 16/Aug/19 19:26
Start Date: 16/Aug/19 19:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-522124042
 
 
   Run Java PreCommit
   
 



Issue Time Tracking
---

Worklog Id: (was: 296515)
Time Spent: 5h 40m  (was: 5.5h)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=296517=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296517
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 16/Aug/19 19:26
Start Date: 16/Aug/19 19:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-522124042
 
 
   Run Java PreCommit
   
 



Issue Time Tracking
---

Worklog Id: (was: 296517)
Time Spent: 6h  (was: 5h 50m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=296167=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296167
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 16/Aug/19 09:04
Start Date: 16/Aug/19 09:04
Worklog Time Spent: 10m 
  Work Description: kanterov commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-521941906
 
 
   @iemejia great, agree, please feel free to merge when PreCommit check passes
 



Issue Time Tracking
---

Worklog Id: (was: 296167)
Time Spent: 5.5h  (was: 5h 20m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=295780=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-295780
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 15/Aug/19 21:38
Start Date: 15/Aug/19 21:38
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-521806974
 
 
   @kanterov I restored the access modifiers and left everything as suggested.
   
   I think we should merge this as it is and open the discussion on dev@. Since 
the classes are still experimental, we can still adapt to whatever we conclude 
from the discussion, and ‘experimental’ users can benefit from having this 
available in the meantime. WDYT?
 



Issue Time Tracking
---

Worklog Id: (was: 295780)
Time Spent: 5h 20m  (was: 5h 10m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=295765=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-295765
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 15/Aug/19 21:07
Start Date: 15/Aug/19 21:07
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r314499407
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
 ##
 @@ -127,7 +127,7 @@ public static FixedBytesField withSize(int size) {
 
 /** Create a {@link FixedBytesField} from a Beam {@link FieldType}. */
 @Nullable
-public static FixedBytesField fromBeamFieldType(FieldType fieldType) {
+static FixedBytesField fromBeamFieldType(FieldType fieldType) {
 
 Review comment:
   I am leaving things as they were and resolving this one.
 



Issue Time Tracking
---

Worklog Id: (was: 295765)
Time Spent: 5h 10m  (was: 5h)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=295759=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-295759
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 15/Aug/19 20:58
Start Date: 15/Aug/19 20:58
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r314495688
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
 ##
 @@ -875,6 +926,13 @@ public void populateDisplayData(DisplayData.Builder builder) {
   recordClass,
   schemaSupplier.get());
 }
+
+private static class JsonToSchema implements Function, Serializable {
 
 Review comment:
   :) yaaayy !
 



Issue Time Tracking
---

Worklog Id: (was: 295759)
Time Spent: 5h  (was: 4h 50m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=295756=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-295756
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 15/Aug/19 20:57
Start Date: 15/Aug/19 20:57
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r314495186
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
 ##
 @@ -186,6 +188,49 @@
  * scalability. Note that it may decrease performance if the filepattern matches only a small number
  * of files.
  *
+ * Inferring Beam schemas from Avro files
+ *
+ * If you want to use SQL or schema based operations on an Avro-based PCollection. You must
+ * configure the read transform to infer the Beam schema and automatically setup the Beam related
+ * coders by doing:
+ *
+ * {@code
+ * PCollection records =
+ * p.apply(AvroIO.read(...).from(...).withBeamSchemas(true);
+ * }
+ *
+ * Inferring Beam schemas from Avro PCollections
+ *
+ * If you created an Avro-based PCollection by other means e.g. reading records from Kafka or as
+ * the output of another PTransform. You may be interested on making your PCollection schema-aware
 
 Review comment:
   done
 



Issue Time Tracking
---

Worklog Id: (was: 295756)
Time Spent: 4h 40m  (was: 4.5h)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=295758=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-295758
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 15/Aug/19 20:57
Start Date: 15/Aug/19 20:57
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r314495461
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
 ##
 @@ -186,6 +188,49 @@
  * scalability. Note that it may decrease performance if the filepattern matches only a small number
  * of files.
  *
+ * Inferring Beam schemas from Avro files
+ *
+ * If you want to use SQL or schema based operations on an Avro-based PCollection. You must
+ * configure the read transform to infer the Beam schema and automatically setup the Beam related
+ * coders by doing:
+ *
+ * {@code
+ * PCollection records =
+ * p.apply(AvroIO.read(...).from(...).withBeamSchemas(true);
+ * }
+ *
+ * Inferring Beam schemas from Avro PCollections
+ *
+ * If you created an Avro-based PCollection by other means e.g. reading records from Kafka or as
+ * the output of another PTransform. You may be interested on making your PCollection schema-aware
+ * so you can use the Schema-based APIs or Beam's SqlTransform.
+ *
+ * If you are using Avro specific records (generated classes from an Avro schema), you can
+ * register a schema provider for the specific Avro class to make any PCollection of these objects
+ * schema-aware.
+ *
+ * {@code
+ * pipeline.getSchemaRegistry().registerSchemaProvider(AvroAutoGenClass.class, new AvroRecordSchema());
 
 Review comment:
   good one, definitely simpler
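
The registration idea in the Javadoc under review (register a provider once per class, and every later lookup for that class finds a schema) can be sketched without Beam. This is a toy, JDK-only illustration and not Beam's SchemaRegistry: the `SchemaProvider` interface, `ReflectiveProvider`, `SchemaRegistry`, and `User` types below are all hypothetical stand-ins, and the "schema" is just a map of field names to type names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SchemaRegistrySketch {

  /** Hypothetical provider: maps a class to a simple "schema" (field name -> type name). */
  public interface SchemaProvider {
    Map<String, String> schemaFor(Class<?> type);
  }

  /** Toy provider that reads instance fields by reflection; sorted by name for determinism. */
  public static final class ReflectiveProvider implements SchemaProvider {
    @Override
    public Map<String, String> schemaFor(Class<?> type) {
      Map<String, String> fields = new TreeMap<>();
      for (Field field : type.getDeclaredFields()) {
        if (!Modifier.isStatic(field.getModifiers()) && !field.isSynthetic()) {
          fields.put(field.getName(), field.getType().getSimpleName());
        }
      }
      return fields;
    }
  }

  /** Toy registry: one registration per class serves every later lookup for that class. */
  public static final class SchemaRegistry {
    private final Map<Class<?>, SchemaProvider> providers = new HashMap<>();

    public void registerSchemaProvider(Class<?> type, SchemaProvider provider) {
      providers.put(type, provider);
    }

    public Map<String, String> getSchema(Class<?> type) {
      SchemaProvider provider = providers.get(type);
      if (provider == null) {
        throw new IllegalArgumentException("No schema provider registered for " + type.getName());
      }
      return provider.schemaFor(type);
    }
  }

  /** Hypothetical record class standing in for an Avro-generated class. */
  public static final class User {
    String name;
    int age;
  }

  public static void main(String[] args) {
    SchemaRegistry registry = new SchemaRegistry();
    registry.registerSchemaProvider(User.class, new ReflectiveProvider());
    System.out.println(registry.getSchema(User.class)); // prints {age=int, name=String}
  }
}
```

In the real SDK the provider would be derived from the Avro schema of the generated class rather than from reflection over fields; the sketch only shows why a per-class registration is simpler than setting a coder on each collection.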
 



Issue Time Tracking
---

Worklog Id: (was: 295758)
Time Spent: 4h 50m  (was: 4h 40m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=295755=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-295755
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 15/Aug/19 20:56
Start Date: 15/Aug/19 20:56
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r314494996
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
 ##
 @@ -186,6 +188,49 @@
  * scalability. Note that it may decrease performance if the filepattern matches only a small number
  * of files.
  *
+ * Inferring Beam schemas from Avro files
+ *
+ * If you want to use SQL or schema based operations on an Avro-based PCollection. You must
 
 Review comment:
   done
 



Issue Time Tracking
---

Worklog Id: (was: 295755)
Time Spent: 4.5h  (was: 4h 20m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-14 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294571=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294571
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 14/Aug/19 07:44
Start Date: 14/Aug/19 07:44
Worklog Time Spent: 10m 
  Work Description: RyanSkraba commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r313742447
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
 ##
 @@ -875,6 +926,13 @@ public void populateDisplayData(DisplayData.Builder builder) {
   recordClass,
   schemaSupplier.get());
 }
+
+private static class JsonToSchema implements Function, Serializable {
 
 Review comment:
   AVRO-1852 to the rescue!  Just for reference: schemas are directly 
serializable from Avro 1.9.1+ (sharing the happiness!)
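
For context on why a helper like `JsonToSchema` exists at all: when a schema object is not itself Java-serializable, a common workaround is to ship its JSON text and re-parse it on the receiving side. The sketch below shows that pattern with the JDK only. `ParsedSchema` and `SchemaHolder` are hypothetical stand-ins (not Avro or Beam classes), and `java.util.function.Function` is an assumption here, since the quoted diff does not show which `Function` type or type parameters the PR uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class JsonToSchemaSketch {

  /** Hypothetical stand-in for a schema type that is NOT Serializable. */
  public static final class ParsedSchema {
    private final String json;

    ParsedSchema(String json) {
      this.json = json;
    }

    public String toJson() {
      return json;
    }
  }

  /** Stateless, serializable parser from JSON text to the schema object. */
  public static final class JsonToSchema implements Function<String, ParsedSchema>, Serializable {
    @Override
    public ParsedSchema apply(String json) {
      return new ParsedSchema(json);
    }
  }

  /** Ships only the JSON string; the schema is rebuilt lazily after deserialization. */
  public static final class SchemaHolder implements Serializable {
    private final String json;
    private final JsonToSchema parser = new JsonToSchema();
    private transient ParsedSchema cached; // never written to the stream

    public SchemaHolder(String json) {
      this.json = json;
    }

    public ParsedSchema get() {
      if (cached == null) {
        cached = parser.apply(json);
      }
      return cached;
    }
  }

  /** Serializes and deserializes a value with plain Java serialization. */
  public static <T extends Serializable> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        @SuppressWarnings("unchecked")
        T copy = (T) in.readObject();
        return copy;
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Java serialization round trip failed", e);
    }
  }

  public static void main(String[] args) {
    SchemaHolder holder = new SchemaHolder("{\"type\":\"string\"}");
    holder.get(); // populate the transient cache before shipping
    SchemaHolder copy = roundTrip(holder);
    System.out.println(copy.get().toJson());
  }
}
```

Once the schema type itself is serializable, as the comment above says is the case in newer Avro releases, the holder can carry the schema directly and the JSON detour becomes unnecessary.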
 



Issue Time Tracking
---

Worklog Id: (was: 294571)
Time Spent: 4h  (was: 3h 50m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-14 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294573=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294573
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 14/Aug/19 07:44
Start Date: 14/Aug/19 07:44
Worklog Time Spent: 10m 
  Work Description: RyanSkraba commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r313740939
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
 ##
 @@ -186,6 +188,49 @@
  * scalability. Note that it may decrease performance if the filepattern matches only a small number
  * of files.
  *
+ * Inferring Beam schemas from Avro files
+ *
+ * If you want to use SQL or schema based operations on an Avro-based PCollection. You must
 
 Review comment:
   ```suggestion
* If you want to use SQL or schema based operations on an Avro-based PCollection, you must
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 294573)
Time Spent: 4h 10m  (was: 4h)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-14 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294574=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294574
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 14/Aug/19 07:44
Start Date: 14/Aug/19 07:44
Worklog Time Spent: 10m 
  Work Description: RyanSkraba commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r313741174
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
 ##
 @@ -186,6 +188,49 @@
  * scalability. Note that it may decrease performance if the filepattern matches only a small number
  * of files.
  *
+ * Inferring Beam schemas from Avro files
+ *
+ * If you want to use SQL or schema based operations on an Avro-based PCollection. You must
+ * configure the read transform to infer the Beam schema and automatically setup the Beam related
+ * coders by doing:
+ *
+ * {@code
+ * PCollection records =
+ * p.apply(AvroIO.read(...).from(...).withBeamSchemas(true);
+ * }
+ *
+ * Inferring Beam schemas from Avro PCollections
+ *
+ * If you created an Avro-based PCollection by other means e.g. reading records from Kafka or as
+ * the output of another PTransform. You may be interested on making your PCollection schema-aware
 
 Review comment:
   ```suggestion
* the output of another PTransform, you may be interested on making your 
PCollection schema-aware
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 294574)
Time Spent: 4h 20m  (was: 4h 10m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-14 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294572&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294572
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 14/Aug/19 07:44
Start Date: 14/Aug/19 07:44
Worklog Time Spent: 10m 
  Work Description: RyanSkraba commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r312856270
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
 ##
 @@ -186,6 +188,49 @@
  * scalability. Note that it may decrease performance if the filepattern 
matches only a small number
  * of files.
  *
+ * Inferring Beam schemas from Avro files
+ *
+ * If you want to use SQL or schema based operations on an Avro-based 
PCollection. You must
+ * configure the read transform to infer the Beam schema and automatically 
setup the Beam related
+ * coders by doing:
+ *
+ * {@code
+ * PCollection records =
+ * p.apply(AvroIO.read(...).from(...).withBeamSchemas(true);
+ * }
+ *
+ * Inferring Beam schemas from Avro PCollections
+ *
+ * If you created an Avro-based PCollection by other means e.g. reading 
records from Kafka or as
+ * the output of another PTransform. You may be interested on making your 
PCollection schema-aware
+ * so you can use the Schema-based APIs or Beam's SqlTransform.
+ *
+ * If you are using Avro specific records (generated classes from an Avro 
schema), you can
+ * register a schema provider for the specific Avro class to make any 
PCollection of these objects
+ * schema-aware.
+ *
+ * {@code
+ * pipeline.getSchemaRegistry().registerSchemaProvider(AvroAutoGenClass.class, 
new AvroRecordSchema());
 
 Review comment:
   ```suggestion
* 
pipeline.getSchemaRegistry().registerSchemaProvider(AvroAutoGenClass.class, 
AvroAutoGenClass.getClassSchema());
   ```
   Better matches the way autogenerated schemas can be discovered.
 



Issue Time Tracking
---

Worklog Id: (was: 294572)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294162&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294162
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 13/Aug/19 20:22
Start Date: 13/Aug/19 20:22
Worklog Time Spent: 10m 
  Work Description: kanterov commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r313594719
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroUtils.java
 ##
 @@ -1,40 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.io;
-
-import java.io.Serializable;
-import org.apache.avro.Schema;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Function;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Suppliers;
-
-/** Helpers for working with Avro. */
-class AvroUtils {
 
 Review comment:
   I asked on dev@, but we have probably already broken it a couple of times.
 



Issue Time Tracking
---

Worklog Id: (was: 294162)
Time Spent: 3h 50m  (was: 3h 40m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294159&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294159
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 13/Aug/19 20:20
Start Date: 13/Aug/19 20:20
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r313593893
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
 ##
 @@ -127,7 +127,7 @@ public static FixedBytesField withSize(int size) {
 
 /** Create a {@link FixedBytesField} from a Beam {@link FieldType}. */
 @Nullable
-public static FixedBytesField fromBeamFieldType(FieldType fieldType) {
+static FixedBytesField fromBeamFieldType(FieldType fieldType) {
 
 Review comment:
   Yes, you are right, but that class does a lot of other magic too. We should 
probably decide on that later, but I think it is important because these are 
user-friendly fixes and the current uber class hides multiple things.
 



Issue Time Tracking
---

Worklog Id: (was: 294159)
Time Spent: 3h 40m  (was: 3.5h)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294144&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294144
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 13/Aug/19 20:04
Start Date: 13/Aug/19 20:04
Worklog Time Spent: 10m 
  Work Description: kanterov commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r313587439
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroUtils.java
 ##
 @@ -1,40 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.io;
-
-import java.io.Serializable;
-import org.apache.avro.Schema;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Function;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Suppliers;
-
-/** Helpers for working with Avro. */
-class AvroUtils {
 
 Review comment:
   Actually, I don't know how it works now, but it uses classes from vendored 
guava, whose namespace changes each time we change the guava version. I'm 
wondering if we already broke it without noticing.
 



Issue Time Tracking
---

Worklog Id: (was: 294144)
Time Spent: 3.5h  (was: 3h 20m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294142&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294142
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 13/Aug/19 20:03
Start Date: 13/Aug/19 20:03
Worklog Time Spent: 10m 
  Work Description: kanterov commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r313587102
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroUtils.java
 ##
 @@ -1,40 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.io;
-
-import java.io.Serializable;
-import org.apache.avro.Schema;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Function;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Suppliers;
-
-/** Helpers for working with Avro. */
-class AvroUtils {
 
 Review comment:
   But we can't change it, or it will break Java serialization.
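
   To illustrate the serialization concern: Java's built-in serialization
   writes the fully qualified name of each serialized class into the stream,
   so moving or renaming a serializable class (for example when a relocated
   package name changes) can make previously written streams unreadable. The
   following is a minimal, stdlib-only sketch; `SerializedNameDemo` and
   `SchemaHolder` are hypothetical names used for illustration and are not
   Beam classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

final class SerializedNameDemo {

  /** Hypothetical stand-in for a serializable helper nested in a utility class. */
  static final class SchemaHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    final String schemaJson;

    SchemaHolder(String schemaJson) {
      this.schemaJson = schemaJson;
    }
  }

  /** Serializes a value with standard Java serialization and returns the raw bytes. */
  static byte[] serialize(Object value) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    return bytes.toByteArray();
  }

  /** Returns true if the serialized stream embeds the given (ASCII) class name. */
  static boolean streamContainsName(byte[] stream, String className) {
    // ISO-8859-1 maps every byte to exactly one char, so ASCII names survive intact.
    String text = new String(stream, StandardCharsets.ISO_8859_1);
    return text.contains(className);
  }

  public static void main(String[] args) throws IOException {
    byte[] stream = serialize(new SchemaHolder("{\"type\":\"string\"}"));
    // The class name is part of the wire format, which is why renaming is risky.
    System.out.println(
        "stream embeds class name: "
            + streamContainsName(stream, SchemaHolder.class.getName()));
  }
}
```

   Because the name travels with the data, a reader that only knows the class
   under a different name cannot resolve it, which is the compatibility risk
   the comment above points at.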
 



Issue Time Tracking
---

Worklog Id: (was: 294142)
Time Spent: 3h 20m  (was: 3h 10m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294141&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294141
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 13/Aug/19 20:03
Start Date: 13/Aug/19 20:03
Worklog Time Spent: 10m 
  Work Description: kanterov commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r313586988
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroUtils.java
 ##
 @@ -1,40 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.io;
-
-import java.io.Serializable;
-import org.apache.avro.Schema;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Function;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Suppliers;
-
-/** Helpers for working with Avro. */
-class AvroUtils {
 
 Review comment:
   `AvroCoder.SerializableSchemaSupplier`
 



Issue Time Tracking
---

Worklog Id: (was: 294141)
Time Spent: 3h 10m  (was: 3h)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294140&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294140
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 13/Aug/19 20:01
Start Date: 13/Aug/19 20:01
Worklog Time Spent: 10m 
  Work Description: kanterov commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r313586430
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
 ##
 @@ -127,7 +127,7 @@ public static FixedBytesField withSize(int size) {
 
 /** Create a {@link FixedBytesField} from a Beam {@link FieldType}. */
 @Nullable
-public static FixedBytesField fromBeamFieldType(FieldType fieldType) {
+static FixedBytesField fromBeamFieldType(FieldType fieldType) {
 
 Review comment:
   If you ask me, I would rather make the whole class protected (or delete it), 
hiding everything, and move all user-facing functionality to AvroCoder.
 



Issue Time Tracking
---

Worklog Id: (was: 294140)
Time Spent: 3h  (was: 2h 50m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294137
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 13/Aug/19 19:59
Start Date: 13/Aug/19 19:59
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r313585482
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
 ##
 @@ -127,7 +127,7 @@ public static FixedBytesField withSize(int size) {
 
 /** Create a {@link FixedBytesField} from a Beam {@link FieldType}. */
 @Nullable
-public static FixedBytesField fromBeamFieldType(FieldType fieldType) {
+static FixedBytesField fromBeamFieldType(FieldType fieldType) {
 
 Review comment:
   I am going to revert the access commit then, but I still don't see the 
point. The complete `AvroUtils` class is `@Experimental`, which means this is 
the perfect moment to refine those aspects. We will have a harder time doing 
so in the future, but I suppose that's not a big deal.
 



Issue Time Tracking
---

Worklog Id: (was: 294137)
Time Spent: 2h 50m  (was: 2h 40m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294121
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 13/Aug/19 19:40
Start Date: 13/Aug/19 19:40
Worklog Time Spent: 10m 
  Work Description: kanterov commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-520978853
 
 
   Changing AvroCoder will definitely break compatibility, especially for streaming 
pipelines reading from PubSub or Kafka. In addition, SchemaCoder for Avro isn't 
as good (yet) as AvroCoder. As an example, it would serialize enums as strings, 
which is very inefficient when shuffling data. Another source of problems is 
that it doesn't support all Avro features. I believe that once it matures it 
could be the default, but we aren't there yet. In any case, I think it's a good 
exercise to think about where we want to put SchemaCoder and how we are going to 
evolve AvroCoder, so we should probably start a thread on dev@.
   
   The code looks good. I agree with and support your motivation for making fewer 
things public, but I don't find it practical to break it now, given that we 
know for sure that there are codebases relying on it being public to avoid 
limitations of existing APIs, so I propose to postpone this until things 
stabilize.
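
   As a rough illustration of the enum point above, the sketch below compares
   the size of a length-prefixed symbol name with the size of a one-byte
   symbol index. It is a simplified stand-in and not the actual wire format of
   SchemaCoder or AvroCoder; `EnumEncodingDemo`, `Color`, and both encoders
   are hypothetical, and the single-byte prefixes assume fewer than 128
   symbols and names shorter than 128 bytes.

```java
import java.nio.charset.StandardCharsets;

final class EnumEncodingDemo {

  /** Hypothetical enum used only for this size comparison. */
  enum Color { RED, GREEN, ULTRAMARINE }

  /** Encodes the symbol as a one-byte length followed by its UTF-8 name. */
  static byte[] encodeAsString(Color color) {
    byte[] name = color.name().getBytes(StandardCharsets.UTF_8);
    byte[] out = new byte[name.length + 1];
    out[0] = (byte) name.length; // assumes name.length < 128
    System.arraycopy(name, 0, out, 1, name.length);
    return out;
  }

  /** Encodes the symbol as its zero-based position in the enum declaration. */
  static byte[] encodeAsOrdinal(Color color) {
    return new byte[] {(byte) color.ordinal()}; // assumes fewer than 128 symbols
  }

  public static void main(String[] args) {
    for (Color color : Color.values()) {
      System.out.println(
          color
              + ": string=" + encodeAsString(color).length + " bytes"
              + ", ordinal=" + encodeAsOrdinal(color).length + " byte");
    }
  }
}
```

   The string form grows with the symbol name, while the index form stays
   constant, so the gap compounds across every record moved through a shuffle.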
 



Issue Time Tracking
---

Worklog Id: (was: 294121)
Time Spent: 2h 40m  (was: 2.5h)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=294114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-294114
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 13/Aug/19 19:24
Start Date: 13/Aug/19 19:24
Worklog Time Spent: 10m 
  Work Description: kanterov commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r313571429
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
 ##
 @@ -127,7 +127,7 @@ public static FixedBytesField withSize(int size) {
 
 /** Create a {@link FixedBytesField} from a Beam {@link FieldType}. */
 @Nullable
-public static FixedBytesField fromBeamFieldType(FieldType fieldType) {
+static FixedBytesField fromBeamFieldType(FieldType fieldType) {
 
 Review comment:
   I agree, but at this stage it doesn't seem practical to me, because by 
reducing access we are going to break existing code and make it problematic to 
upgrade it to the next Beam version.
 



Issue Time Tracking
---

Worklog Id: (was: 294114)
Time Spent: 2.5h  (was: 2h 20m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=291538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-291538
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 08/Aug/19 19:42
Start Date: 08/Aug/19 19:42
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-518675242
 
 
   I was starting to wonder whether we should somehow make every AvroCoder 
PCollection a SchemaCoder for user friendliness, but I have my doubts about taking 
this 'implicit' approach. I am a bit worried about breaking backwards 
compatibility, but somehow it makes sense too, mmm... hard to decide/know.
 



Issue Time Tracking
---

Worklog Id: (was: 291538)
Time Spent: 2h 20m  (was: 2h 10m)



[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=289659&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-289659
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 06/Aug/19 13:43
Start Date: 06/Aug/19 13:43
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-518675242
 
 
   I was starting to wonder whether we should make every AvroCoder-based 
PCollection use a SchemaCoder for user friendliness, but I have my doubts about 
taking this 'implicit' approach, partly out of worry about breaking backwards 
compatibility. Still, it makes sense in some ways, so it is hard to decide.
 



Issue Time Tracking
---

Worklog Id: (was: 289659)
Time Spent: 2h 10m  (was: 2h)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=289658=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-289658
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 06/Aug/19 13:41
Start Date: 06/Aug/19 13:41
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-518674333
 
 
   I made the suggested updates, PTAL @kanterov. I added `schemaCoder` methods 
to match those of `AvroCoder.of`, plus one additional overload that builds the 
coder from an existing `AvroCoder`. That one is simple compared with the others 
and I can remove it if you prefer, but I ended up rewriting these two lines 
multiple times.
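
The overload family described here can be sketched roughly as follows. All types (`ToyAvroSchema`, `ToyAvroCoder`, `ToySchemaCoder`, `ToyAvroUtils`) are hypothetical stand-ins rather than the Beam classes; the sketch only shows how a factory that takes an existing coder can delegate to the class-plus-schema factory.

```java
import java.util.Objects;

// Hypothetical stand-in for an Avro schema; it only carries a name.
final class ToyAvroSchema {
  final String name;

  ToyAvroSchema(String name) {
    this.name = Objects.requireNonNull(name, "name");
  }
}

// Hypothetical stand-in for an Avro coder with two `of` factories.
final class ToyAvroCoder<T> {
  final Class<T> type;
  final ToyAvroSchema schema;

  private ToyAvroCoder(Class<T> type, ToyAvroSchema schema) {
    this.type = type;
    this.schema = schema;
  }

  static <T> ToyAvroCoder<T> of(Class<T> type) {
    // Assumption: the schema is derived from the class name.
    return of(type, new ToyAvroSchema(type.getSimpleName()));
  }

  static <T> ToyAvroCoder<T> of(Class<T> type, ToyAvroSchema schema) {
    return new ToyAvroCoder<>(type, schema);
  }
}

// Hypothetical stand-in for a schema-aware coder.
final class ToySchemaCoder<T> {
  final Class<T> type;
  final ToyAvroSchema schema;

  ToySchemaCoder(Class<T> type, ToyAvroSchema schema) {
    this.type = type;
    this.schema = schema;
  }
}

final class ToyAvroUtils {
  private ToyAvroUtils() {}

  // Mirrors ToyAvroCoder.of(Class).
  static <T> ToySchemaCoder<T> schemaCoder(Class<T> type) {
    return schemaCoder(type, new ToyAvroSchema(type.getSimpleName()));
  }

  // Mirrors ToyAvroCoder.of(Class, schema); the other overloads end up here.
  static <T> ToySchemaCoder<T> schemaCoder(Class<T> type, ToyAvroSchema schema) {
    return new ToySchemaCoder<>(type, schema);
  }

  // Convenience overload: reuse the type and schema of an existing coder.
  static <T> ToySchemaCoder<T> schemaCoder(ToyAvroCoder<T> coder) {
    return schemaCoder(coder.type, coder.schema);
  }
}
```

In this sketch the coder-based overload is a one-line delegation, which is why it is cheap to keep and equally cheap to drop.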
 



Issue Time Tracking
---

Worklog Id: (was: 289658)
Time Spent: 2h  (was: 1h 50m)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=289241=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-289241
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 05/Aug/19 22:46
Start Date: 05/Aug/19 22:46
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r310821130
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
 ##
 @@ -309,6 +310,18 @@ public static GenericRecord toGenericRecord(
 return g -> toGenericRecord(g, avroSchema);
   }
 
+  /** Transform an existing Avro-based PCollection input into an Schema-based PCollection. */
+  public static <T> PCollection<T> asSchemaPCollection(
+      PCollection<T> pc, Class<T> clazz, @Nullable org.apache.avro.Schema schema) {
+    if (!pc.hasSchema()) {
+      Schema beamSchema = getSchema(clazz, schema);
+      if (beamSchema != null) {
+        pc.setSchema(beamSchema, getToRowFunction(clazz, schema), getFromRowFunction(clazz));
 
 Review comment:
   Yes, good idea, that looks cleaner. I will add the `schemaCoder` method to 
AvroUtils; it should cover all the mentioned cases. I will ping you when it is 
done. Thanks for the review ideas.
 



Issue Time Tracking
---

Worklog Id: (was: 289241)
Time Spent: 1h 50m  (was: 1h 40m)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=289160=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-289160
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 05/Aug/19 19:30
Start Date: 05/Aug/19 19:30
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r310757382
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
 ##
 @@ -127,7 +127,7 @@ public static FixedBytesField withSize(int size) {
 
 /** Create a {@link FixedBytesField} from a Beam {@link FieldType}. */
 @Nullable
-public static FixedBytesField fromBeamFieldType(FieldType fieldType) {
+static FixedBytesField fromBeamFieldType(FieldType fieldType) {
 
 Review comment:
   Well, because `@Experimental` is applied to the whole class, I think we will 
probably end up exposing more than we want once we remove the annotation. 
Keeping the lowest possible access level is probably wiser and still lets us 
widen access later if needed (the other way around won't work). But I can 
change it if you prefer.
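
The point about access levels can be illustrated with a small sketch. It uses a hypothetical `FieldHelpers` class (not Beam's `AvroUtils`) and reflection only to report which modifier each method carries; the names `describe`, `describePublic`, and `AccessLevelDemo` are assumptions made for illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical utility class used only for illustration; not Beam's AvroUtils.
class FieldHelpers {
  // Package-private: code outside this package cannot call it, so the
  // modifier can be widened to public later without breaking any caller.
  static String describe(int size) {
    return "fixed(" + size + ")";
  }

  // Public: external code may already depend on it, so narrowing the
  // modifier later would break those callers.
  public static String describePublic(int size) {
    return describe(size);
  }
}

class AccessLevelDemo {
  /** Reports the access level of a one-int-argument method declared on FieldHelpers. */
  static String visibilityOf(String methodName) {
    Method method;
    try {
      method = FieldHelpers.class.getDeclaredMethod(methodName, int.class);
    } catch (NoSuchMethodException e) {
      throw new IllegalArgumentException("No such method: " + methodName, e);
    }
    int modifiers = method.getModifiers();
    if (Modifier.isPublic(modifiers)) {
      return "public";
    } else if (Modifier.isProtected(modifiers)) {
      return "protected";
    } else if (Modifier.isPrivate(modifiers)) {
      return "private";
    }
    return "package-private";
  }

  public static void main(String[] args) {
    System.out.println("describe: " + visibilityOf("describe"));
    System.out.println("describePublic: " + visibilityOf("describePublic"));
  }
}
```

Starting from the narrower modifier keeps both options open; starting from `public` leaves only the option of keeping it public.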
 



Issue Time Tracking
---

Worklog Id: (was: 289160)
Time Spent: 1h 40m  (was: 1.5h)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=289157=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-289157
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 05/Aug/19 19:28
Start Date: 05/Aug/19 19:28
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r310756762
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroUtils.java
 ##
 @@ -1,40 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.io;
-
-import java.io.Serializable;
-import org.apache.avro.Schema;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Function;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Suppliers;
-
-/** Helpers for working with Avro. */
-class AvroUtils {
 
 Review comment:
   OK, just out of curiosity, which class is it?
 



Issue Time Tracking
---

Worklog Id: (was: 289157)
Time Spent: 1.5h  (was: 1h 20m)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=288830=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-288830
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 05/Aug/19 09:49
Start Date: 05/Aug/19 09:49
Worklog Time Spent: 10m 
  Work Description: kanterov commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r310517412
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
 ##
 @@ -309,6 +310,18 @@ public static GenericRecord toGenericRecord(
 return g -> toGenericRecord(g, avroSchema);
   }
 
+  /** Transform an existing Avro-based PCollection input into an Schema-based PCollection. */
+  public static <T> PCollection<T> asSchemaPCollection(
+      PCollection<T> pc, Class<T> clazz, @Nullable org.apache.avro.Schema schema) {
+    if (!pc.hasSchema()) {
+      Schema beamSchema = getSchema(clazz, schema);
+      if (beamSchema != null) {
+        pc.setSchema(beamSchema, getToRowFunction(clazz, schema), getFromRowFunction(clazz));
 
 Review comment:
   `pc.setSchema` is equivalent to `pc.setCoder(SchemaCoder.of(...))`
   
   What if we just add a static method that creates a `SchemaCoder` instead? 
Then the user-facing API would look like:
   ```
   KafkaIO.read(MyRecord.class)
 .setCoder(AvroUtils.schemaCoder(MyRecord.class))
   ```
   
   or, it can be done implicitly by registering classes in advance:
   
   ```
   p.getSchemaRegistry().registerSchemaProvider(MyRecord.class, new AvroRecordSchema());
   ```
   
   or for every `SpecificRecord` (didn't try this, but it should work as well):
   ```
   p.getSchemaRegistry().registerSchemaProvider(SpecificRecord.class, new AvroRecordSchema());
   ```
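
The registry-based variants above rely on a provider registered for a supertype also being found for its subtypes. A minimal sketch of that lookup idea in plain Java follows; `ToySchemaRegistry`, `ToySpecificRecord`, `MyRecord`, and the string provider names are hypothetical stand-ins, and this is not a description of how Beam's `SchemaRegistry` is implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-ins; none of these are Beam or Avro types.
interface ToySpecificRecord {}

class MyRecord implements ToySpecificRecord {}

class Unrelated {}

class ToySchemaRegistry {
  private final Map<Class<?>, String> providers = new HashMap<>();

  void registerSchemaProvider(Class<?> type, String providerName) {
    providers.put(type, providerName);
  }

  /**
   * Looks for a provider registered for the type itself, one of its
   * superclasses, or an interface declared directly on one of those classes.
   * Interfaces inherited only through other interfaces are not searched.
   */
  Optional<String> findProvider(Class<?> type) {
    for (Class<?> current = type; current != null; current = current.getSuperclass()) {
      String direct = providers.get(current);
      if (direct != null) {
        return Optional.of(direct);
      }
      for (Class<?> iface : current.getInterfaces()) {
        String viaInterface = providers.get(iface);
        if (viaInterface != null) {
          return Optional.of(viaInterface);
        }
      }
    }
    return Optional.empty();
  }
}

class RegistryDemo {
  public static void main(String[] args) {
    ToySchemaRegistry registry = new ToySchemaRegistry();
    // One registration for the interface covers every implementing class.
    registry.registerSchemaProvider(ToySpecificRecord.class, "avro");
    System.out.println(registry.findProvider(MyRecord.class).orElse("none"));
    System.out.println(registry.findProvider(Unrelated.class).orElse("none"));
  }
}
```

The trade-off in this sketch is the same as in the comment: the explicit coder call is visible at the call site, while a registration is written once and applies wherever the type appears.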
 



Issue Time Tracking
---

Worklog Id: (was: 288830)
Time Spent: 1h 20m  (was: 1h 10m)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=288829=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-288829
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 05/Aug/19 09:49
Start Date: 05/Aug/19 09:49
Worklog Time Spent: 10m 
  Work Description: kanterov commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r310511199
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
 ##
 @@ -127,7 +127,7 @@ public static FixedBytesField withSize(int size) {
 
 /** Create a {@link FixedBytesField} from a Beam {@link FieldType}. */
 @Nullable
-public static FixedBytesField fromBeamFieldType(FieldType fieldType) {
+static FixedBytesField fromBeamFieldType(FieldType fieldType) {
 
 Review comment:
   I see that you want to hide internal APIs, but I'm not sure it's worth it at 
the moment. Everything in this file is an unstable internal API, and it's used 
outside of the Beam codebase to work around existing limitations. At Spotify, we 
use a few of these methods, and it seems other contributors follow this pattern 
(for instance, 
https://github.com/apache/beam/commit/172b563fdd36019d3284139417808681314ac364#diff-d7153fa055ebd85f76700232cb4e5cce).
 I would say that for now it's completely fine to keep everything public as is, 
since the class is annotated with `@Experimental`.
 



Issue Time Tracking
---

Worklog Id: (was: 288829)
Time Spent: 1h 10m  (was: 1h)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-05 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=288828=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-288828
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 05/Aug/19 09:49
Start Date: 05/Aug/19 09:49
Worklog Time Spent: 10m 
  Work Description: kanterov commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#discussion_r310508952
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroUtils.java
 ##
 @@ -1,40 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.io;
-
-import java.io.Serializable;
-import org.apache.avro.Schema;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Function;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Supplier;
-import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Suppliers;
-
-/** Helpers for working with Avro. */
-class AvroUtils {
 
 Review comment:
   There is one more similar class in `AvroCoder`, but I guess changing it would 
introduce an incompatible change into `AvroCoder` and break streaming pipelines, 
so let's keep it for now.
 



Issue Time Tracking
---

Worklog Id: (was: 288828)
Time Spent: 1h  (was: 50m)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-08-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=287935=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-287935
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 02/Aug/19 13:33
Start Date: 02/Aug/19 13:33
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-517702453
 
 
   Rebased to fix a merge issue; PTAL when you have some time. Note that there 
are other Avro / Schema / SQL improvements as part of this PR, but each is 
isolated in its own commit to ease the review.
 



Issue Time Tracking
---

Worklog Id: (was: 287935)
Time Spent: 50m  (was: 40m)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-07-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=281294=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-281294
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 23/Jul/19 20:21
Start Date: 23/Jul/19 20:21
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #9130: [BEAM-7802] Expose 
a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-514368800
 
 
   Very cool!
   
   On Tue, Jul 23, 2019 at 12:34 PM Ismaël Mejía wrote:
   
   > This is just my first PR while playing a bit with Schemas and SQL. I found
   > that it was not straightforward to transform an Avro-based PCollection
   > into a Beam schema one, so I did this. It also does some minor cleanups in
   > AvroUtils and the SQL example.
   >
   > R: @reuvenlax 
   > --
   > You can view, comment on, or merge this pull request online at:
   >
   >   https://github.com/apache/beam/pull/9130
   > Commit Summary
   >
   >- [BEAM-7802] Make SQL example slightly simpler
   >- [BEAM-7802] Inline AvroUtils methods to have only one public AvroUtils class in core SDK
   >- [BEAM-7802] Expose a method to make an Avro-based PCollection into an Schema-based one
   >- [BEAM-7802] Fix minor issues (access modifiers + static) in AvroUtils
   >
   > File Changes
   >
   >- *M* sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java (55)
   >- *D* sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroUtils.java (40)
   >- *M* sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java (28)
   >- *M* sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/utils/AvroUtilsTest.java (62)
   >- *M* sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/example/BeamSqlExample.java (15)
   >- *M* sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/example/BeamSqlPojoExample.java (17)
   >
   > Patch Links:
   >
   >- https://github.com/apache/beam/pull/9130.patch
   >- https://github.com/apache/beam/pull/9130.diff
   >
   
 



Issue Time Tracking
---

Worklog Id: (was: 281294)
Time Spent: 40m  (was: 0.5h)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-07-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=281285=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-281285
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 23/Jul/19 20:06
Start Date: 23/Jul/19 20:06
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-514352634
 
 
   Run JavaPortabilityApi PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 281285)
Time Spent: 0.5h  (was: 20m)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-07-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=281259=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-281259
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 23/Jul/19 19:35
Start Date: 23/Jul/19 19:35
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9130: [BEAM-7802] Expose a 
method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130#issuecomment-514352634
 
 
   Run JavaPortabilityApi PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 281259)
Time Spent: 20m  (was: 10m)

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (BEAM-7802) Expose a method to make an Avro-based PCollection into an Schema-based one

2019-07-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7802?focusedWorklogId=281257=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-281257
 ]

ASF GitHub Bot logged work on BEAM-7802:


Author: ASF GitHub Bot
Created on: 23/Jul/19 19:34
Start Date: 23/Jul/19 19:34
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #9130: [BEAM-7802] 
Expose a method to make an Avro-based PCollection into an Schema-based one
URL: https://github.com/apache/beam/pull/9130
 
 
   This is just my first PR while playing a bit with Schemas and SQL. I found 
that it was not straightforward to transform an Avro-based PCollection into a 
Beam schema one, so I did this. It also does some minor cleanups in AvroUtils 
and the SQL example.
   
   R: @reuvenlax 
 



Issue Time Tracking
---

Worklog Id: (was: 281257)
Time Spent: 10m
Remaining Estimate: 0h

> Expose a method to make an Avro-based PCollection into an Schema-based one
> --
>
> Key: BEAM-7802
> URL: https://issues.apache.org/jira/browse/BEAM-7802
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Avro can infer the Schema for an Avro based PCollection by using the 
> `withBeamSchemas` method, however if the user created a PCollection with Avro 
> objects or IndexedRecord/GenericRecord, he needs to manually set the schema 
> (or coder). The idea is to expose a method in schema.AvroUtils to ease this.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)