[jira] [Work started] (BEAM-9275) BIP-1: Beam Schema Options

2020-06-01 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-9275 started by Alex Van Boxel.

> BIP-1: Beam Schema Options
> --
>
> Key: BEAM-9275
> URL: https://issues.apache.org/jira/browse/BEAM-9275
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: P2
>  Labels: stale-P2
>
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to the current Beam metadata, which is present 
> in a FieldType, options would be added to fields, logical types and schemas. 
> The schema converters (e.g. Avro, Proto, …) can use these options to carry 
> the options/annotations/decorators of the original schema over to the Beam 
> schema. These options, which add contextual metadata, can be used in the 
> pipeline for specific transformations or to augment the end schema in the 
> target output.
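
The idea of options as key/typed-value pairs attached to both a schema and its fields can be sketched as follows. This is a minimal, library-free Python illustration, not Beam's API: the `Options`, `Field` and `Schema` classes, the type labels, and the option names `source` and `primary_key` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Options:
    """Hypothetical container: option name -> (type label, value)."""
    values: Dict[str, Tuple[str, Any]] = field(default_factory=dict)

    def set_option(self, name: str, type_name: str, value: Any) -> "Options":
        self.values[name] = (type_name, value)
        return self

    def get_value(self, name: str) -> Any:
        return self.values[name][1]

    def has_option(self, name: str) -> bool:
        return name in self.values


@dataclass
class Field:
    name: str
    type_name: str
    options: Options = field(default_factory=Options)


@dataclass
class Schema:
    fields: List[Field]
    options: Options = field(default_factory=Options)


# Options can sit on the schema as a whole and on individual fields.
schema = Schema(
    fields=[
        Field("id", "INT64", Options().set_option("primary_key", "BOOLEAN", True)),
        Field("name", "STRING"),
    ],
    options=Options().set_option("source", "STRING", "orders.proto"),
)
```

A later pipeline step could then check `schema.fields[0].options.has_option("primary_key")` to decide how to treat that field in the output.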



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9275) BIP-1: Beam Schema Options

2020-06-01 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9275:
-
Status: Open  (was: Triage Needed)

> BIP-1: Beam Schema Options
> --
>
> Key: BEAM-9275
> URL: https://issues.apache.org/jira/browse/BEAM-9275
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Priority: P2
>  Labels: stale-P2
>
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to the current Beam metadata, which is present 
> in a FieldType, options would be added to fields, logical types and schemas. 
> The schema converters (e.g. Avro, Proto, …) can use these options to carry 
> the options/annotations/decorators of the original schema over to the Beam 
> schema. These options, which add contextual metadata, can be used in the 
> pipeline for specific transformations or to augment the end schema in the 
> target output.





[jira] [Assigned] (BEAM-9275) BIP-1: Beam Schema Options

2020-06-01 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel reassigned BEAM-9275:


Assignee: Alex Van Boxel

> BIP-1: Beam Schema Options
> --
>
> Key: BEAM-9275
> URL: https://issues.apache.org/jira/browse/BEAM-9275
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: P2
>  Labels: stale-P2
>
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to the current Beam metadata, which is present 
> in a FieldType, options would be added to fields, logical types and schemas. 
> The schema converters (e.g. Avro, Proto, …) can use these options to carry 
> the options/annotations/decorators of the original schema over to the Beam 
> schema. These options, which add contextual metadata, can be used in the 
> pipeline for specific transformations or to augment the end schema in the 
> target output.





[jira] [Updated] (BEAM-9416) BIP-1: Convert avro metadata to Schema options

2020-04-08 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9416:
-
Fix Version/s: (was: 2.21.0)
   2.22.0

> BIP-1: Convert avro metadata to Schema options
> --
>
> Key: BEAM-9416
> URL: https://issues.apache.org/jira/browse/BEAM-9416
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Avro has some metadata that can be added to the normal type information. It 
> is based on JSON typing, so the conversion will be best effort (we can 
> probably get int, string and float out of it).
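
The "best effort" conversion described above could look roughly like the sketch below: JSON-decoded property values are mapped to a typed option when they are an int, float or string, and skipped otherwise. `to_typed_option`, the type labels and the sample property names are hypothetical; this is not the actual Beam Avro converter.

```python
import json
from typing import Any, Optional, Tuple


def to_typed_option(value: Any) -> Optional[Tuple[str, Any]]:
    """Map a JSON-decoded property value to a (type label, value) pair.

    Returns None for values this sketch does not try to convert.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool):
        return None
    if isinstance(value, int):
        return ("INT64", value)
    if isinstance(value, float):
        return ("DOUBLE", value)
    if isinstance(value, str):
        return ("STRING", value)
    return None


# Hypothetical extra properties attached to an Avro field definition.
props = json.loads('{"doc-version": 3, "unit": "ms", "scale": 0.5, "tags": ["a"]}')
typed = {key: to_typed_option(val) for key, val in props.items()}
# Keep only the properties that could be given a type.
options = {key: val for key, val in typed.items() if val is not None}
```

Dropping unconvertible values silently is one possible policy; a converter could equally keep them as raw JSON strings.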





[jira] [Work started] (BEAM-9704) BIP-1: Deprecate and remove FieldType metadata

2020-04-06 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-9704 started by Alex Van Boxel.

> BIP-1: Deprecate and remove FieldType metadata
> --
>
> Key: BEAM-9704
> URL: https://issues.apache.org/jira/browse/BEAM-9704
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Affects Versions: 2.21.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.23.0
>
>
> Deprecate and remove getMetadata on the FieldType.
>  * Add deprecation notice on the getMetadata field in version 2.21.0
>  * Remove the getMetadata field in 2.23.0
> By 2.23.0, all usages of metadata should be replaced with the portable Beam 
> schema options.
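
The first step of the plan above (keep the accessor, but warn callers before it is removed) can be illustrated with a small sketch. The `FieldType` class and `get_metadata` method here are hypothetical Python stand-ins, not the Beam Java classes, and the sample key is only for illustration.

```python
import warnings
from typing import Dict


class FieldType:
    """Hypothetical stand-in for a field type that still carries metadata."""

    def __init__(self, metadata: Dict[str, str]):
        self._metadata = dict(metadata)

    def get_metadata(self, key: str) -> str:
        # Deprecation phase: the accessor still works but warns its callers.
        warnings.warn(
            "get_metadata is deprecated; use schema options instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._metadata[key]


# Capture the warning so the behaviour can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = FieldType({"proto_number": "1"}).get_metadata("proto_number")
```

In the removal phase the method would simply be deleted, so any remaining caller fails loudly instead of silently reading stale metadata.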





[jira] [Created] (BEAM-9704) BIP-1: Deprecate and remove FieldType metadata

2020-04-06 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9704:


 Summary: BIP-1: Deprecate and remove FieldType metadata
 Key: BEAM-9704
 URL: https://issues.apache.org/jira/browse/BEAM-9704
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-java-core
Affects Versions: 2.21.0
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel
 Fix For: 2.23.0


Deprecate and remove getMetadata on the FieldType.
 * Add deprecation notice on the getMetadata field in version 2.21.0
 * Remove the getMetadata field in 2.23.0

By 2.23.0, all usages of metadata should be replaced with the portable Beam 
schema options.





[jira] [Updated] (BEAM-9704) BIP-1: Deprecate and remove FieldType metadata

2020-04-06 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9704:
-
Status: Open  (was: Triage Needed)

> BIP-1: Deprecate and remove FieldType metadata
> --
>
> Key: BEAM-9704
> URL: https://issues.apache.org/jira/browse/BEAM-9704
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Affects Versions: 2.21.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.23.0
>
>
> Deprecate and remove getMetadata on the FieldType.
>  * Add deprecation notice on the getMetadata field in version 2.21.0
>  * Remove the getMetadata field in 2.23.0
> By 2.23.0, all usages of metadata should be replaced with the portable Beam 
> schema options.





[jira] [Resolved] (BEAM-9604) BIP-1: Remove schema metadata usage for Protobuf extension

2020-04-03 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-9604.
--
Fix Version/s: 2.21.0
   Resolution: Fixed

This was part of https://github.com/apache/beam/pull/10529

> BIP-1: Remove schema metadata usage for Protobuf extension
> --
>
> Key: BEAM-9604
> URL: https://issues.apache.org/jira/browse/BEAM-9604
> Project: Beam
>  Issue Type: Sub-task
>  Components: extensions-java-protobuf
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.21.0
>
>
> Replace the schema metadata usage with options. This will probably mean:
>  * Moving the message_name metadata to a Schema option (for field, map key 
> and value)
>  * Replacing the proto_number metadata with a Field option
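
As a rough sketch of the two moves listed above, the helper below splits old-style metadata into a schema-level option (`message_name`) and a field-level option (`proto_number`). It assumes, purely for illustration, that metadata is a plain string-to-string map; `split_metadata` and the type labels are hypothetical and do not reflect the real protobuf extension code.

```python
from typing import Any, Dict, Tuple

TypedOptions = Dict[str, Tuple[str, Any]]


def split_metadata(metadata: Dict[str, str]) -> Tuple[TypedOptions, TypedOptions]:
    """Split string metadata into (schema options, field options).

    Hypothetical helper; the type labels are illustrative only.
    """
    schema_options: TypedOptions = {}
    field_options: TypedOptions = {}
    if "message_name" in metadata:
        schema_options["message_name"] = ("STRING", metadata["message_name"])
    if "proto_number" in metadata:
        # Metadata held the number as text; a typed option can keep it as an int.
        field_options["proto_number"] = ("INT32", int(metadata["proto_number"]))
    return schema_options, field_options


schema_opts, field_opts = split_metadata(
    {"message_name": "example.Order", "proto_number": "3"}
)
```

The typed option lets the field number stay an integer, which string-only metadata could not express.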





[jira] [Closed] (BEAM-9605) BIP-1: Rename setRowOption to setOption on Option builder

2020-04-03 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-9605.


> BIP-1: Rename setRowOption to setOption on Option builder 
> --
>
> Key: BEAM-9605
> URL: https://issues.apache.org/jira/browse/BEAM-9605
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Rename setRowOption to setOption on the Option builder, as the name 
> setRowOption is too confusing. 
> It sets an option as a Row, not an option on a Row. Using setOption is better 
> and doesn't conflict with the other setOption with 3 parameters and explicit 
> type.
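
To show why a single method name can cover both call shapes without ambiguity, here is a small sketch. Python has no overloading, so the two forms are told apart by argument count and type; `Row`, `OptionsBuilder` and the option names are hypothetical and only mimic the Java builder the ticket refers to.

```python
from typing import Any, Dict, Tuple


class Row:
    """Hypothetical row value that knows its own schema label."""

    def __init__(self, schema_name: str, values: Dict[str, Any]):
        self.schema_name = schema_name
        self.values = values


class OptionsBuilder:
    """Hypothetical builder; one method name covers both call shapes."""

    def __init__(self) -> None:
        self.options: Dict[str, Tuple[str, Any]] = {}

    def set_option(self, name: str, *args: Any) -> "OptionsBuilder":
        if len(args) == 1 and isinstance(args[0], Row):
            # Two-argument form: the option value is itself a Row.
            row = args[0]
            self.options[name] = ("ROW<" + row.schema_name + ">", row)
        elif len(args) == 2:
            # Three-argument form: explicit type label plus value.
            type_name, value = args
            self.options[name] = (type_name, value)
        else:
            raise TypeError("expected (name, row) or (name, type, value)")
        return self


builder = (
    OptionsBuilder()
    .set_option("owner", Row("Contact", {"email": "team@example.com"}))
    .set_option("retention_days", "INT32", 30)
)
```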





[jira] [Closed] (BEAM-9604) BIP-1: Remove schema metadata usage for Protobuf extension

2020-04-03 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-9604.


> BIP-1: Remove schema metadata usage for Protobuf extension
> --
>
> Key: BEAM-9604
> URL: https://issues.apache.org/jira/browse/BEAM-9604
> Project: Beam
>  Issue Type: Sub-task
>  Components: extensions-java-protobuf
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.21.0
>
>
> Replace the schema metadata usage with options. This will probably mean:
>  * Moving the message_name metadata to a Schema option (for field, map key 
> and value)
>  * Replacing the proto_number metadata with a Field option





[jira] [Closed] (BEAM-9044) BIP-1: Convert protobuf options to Schema options

2020-04-03 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-9044.


> BIP-1: Convert protobuf options to Schema options
> -
>
> Key: BEAM-9044
> URL: https://issues.apache.org/jira/browse/BEAM-9044
> Project: Beam
>  Issue Type: Sub-task
>  Components: extensions-java-protobuf
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.21.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Protobuf has a rich metadata system called options. This system is fully 
> typed and matches Beam's Schema Option system. For now we can only convert the 
> following protobuf options:
>  * File Options -> _Beam doesn't have this concept_
>  * Message Options -> *Beam Schema Options*
>  * Field Options -> *Beam Schema Options*
>  * Enum Options -> _This can only be done when logical type options are 
> available_
>  * EnumValue Options -> _This can only be done when logical type options are 
> available_
>  * Service Options -> _Beam doesn't have this concept_
>  * Method Options -> _Beam doesn't have this concept_
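
The list above can be read as a lookup table. The sketch below encodes it as one and filters the option kinds that are convertible for now. The `CONVERSION` table and the `convertible` helper are hypothetical names used only for illustration.

```python
from typing import Dict, List, Tuple

# Protobuf option kind -> (convertible for now?, note taken from the list above).
CONVERSION: Dict[str, Tuple[bool, str]] = {
    "File": (False, "Beam doesn't have this concept"),
    "Message": (True, "Beam Schema Options"),
    "Field": (True, "Beam Schema Options"),
    "Enum": (False, "needs logical type options"),
    "EnumValue": (False, "needs logical type options"),
    "Service": (False, "Beam doesn't have this concept"),
    "Method": (False, "Beam doesn't have this concept"),
}


def convertible(kinds: List[str]) -> List[str]:
    """Return the option kinds that can be converted, in input order."""
    return [kind for kind in kinds if CONVERSION[kind][0]]


supported = convertible(list(CONVERSION))
```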





[jira] [Closed] (BEAM-9035) BIP-1: Typed options for Row Schema and Fields

2020-04-03 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-9035.


> BIP-1: Typed options for Row Schema and Fields
> --
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> This is the first issue of a multipart commit: this ticket implements the 
> basic infrastructure of options on row and field.
> Full explanation:
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to metadata, options would be added to 
> fields, logical types and rows. Schema converters can use options to carry 
> over the options/annotations/decorators that were in the original schema; 
> this context can be used in the rest of the pipeline for specific 
> transformations or to augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of having typed 
> values are: 
>  * Having strongly typed options would give a *portable way for Logical Types* 
> to have structured information that could be shared across different languages.
>  * This could keep the type intact when mapping from formats that have 
> strongly typed options (example: Protobuf).
> This is part of a multi ticket implementation. The following tickets are 
> related:
>  # Typed options for Row Schema and Fields
>  # Convert Proto Options to Beam Schema options
>  # Convert Avro extra information for Beam string options
>  # Replace meta data with Logical Type options
>  # Extract meta data in Calcite SQL to Beam options
>  # Extract meta data in Zeta SQL to Beam options
>  # Add java example of using option in a transform 
> This feature is discussed with Reuven Lax, Brian Hulette
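
One of the uses listed above, an option that drives a decision later in the pipeline, can be sketched as a rename step. Rows are modeled as plain dicts; the `rename_to` and `primary_key` option names and the `apply_rename` function are hypothetical, not part of Beam.

```python
from typing import Any, Dict

# Hypothetical per-field options: field name -> {option name: value}.
field_options: Dict[str, Dict[str, Any]] = {
    "cust_id": {"rename_to": "customer_id", "primary_key": True},
    "amount": {},
}


def apply_rename(row: Dict[str, Any], options: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Rename every field that carries a 'rename_to' option; keep the rest."""
    out: Dict[str, Any] = {}
    for name, value in row.items():
        target = options.get(name, {}).get("rename_to", name)
        out[target] = value
    return out


renamed = apply_rename({"cust_id": 7, "amount": 12.5}, field_options)
```

The transform itself stays generic: it knows nothing about specific fields and only reacts to the options that a schema converter attached upstream.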





[jira] [Commented] (BEAM-9456) Upgrade to gradle 6.2

2020-03-31 Thread Alex Van Boxel (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17072406#comment-17072406
 ] 

Alex Van Boxel commented on BEAM-9456:
--

It's a lot more involved than that. I already got protobuf, net.ltgt.gradle.*

[~dschmitt] Are you planning the upgrade? I would hate to duplicate effort.

> Upgrade to gradle 6.2
> -
>
> Key: BEAM-9456
> URL: https://issues.apache.org/jira/browse/BEAM-9456
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>






[jira] [Updated] (BEAM-9605) BIP-1: Rename setRowOption to setOption on Option builder

2020-03-25 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9605:
-
Status: Open  (was: Triage Needed)

> BIP-1: Rename setRowOption to setOption on Option builder 
> --
>
> Key: BEAM-9605
> URL: https://issues.apache.org/jira/browse/BEAM-9605
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Rename setRowOption to setOption on the Option builder, as the name 
> setRowOption is too confusing. 
> It sets an option as a Row, not an option on a Row. Using setOption is better 
> and doesn't conflict with the other setOption with 3 parameters and explicit 
> type.





[jira] [Resolved] (BEAM-9605) BIP-1: Rename setRowOption to setOption on Option builder

2020-03-25 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-9605.
--
Fix Version/s: 2.21.0
   Resolution: Fixed

> BIP-1: Rename setRowOption to setOption on Option builder 
> --
>
> Key: BEAM-9605
> URL: https://issues.apache.org/jira/browse/BEAM-9605
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Rename setRowOption to setOption on the Option builder, as the name 
> setRowOption is too confusing. 
> It sets an option as a Row, not an option on a Row. Using setOption is better 
> and doesn't conflict with the other setOption with 3 parameters and explicit 
> type.





[jira] [Created] (BEAM-9605) BIP-1: Rename setRowOption to setOption on Option builder

2020-03-25 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9605:


 Summary: BIP-1: Rename setRowOption to setOption on Option builder 
 Key: BEAM-9605
 URL: https://issues.apache.org/jira/browse/BEAM-9605
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-java-core
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel


Rename setRowOption to setOption on the Option builder, as the name 
setRowOption is too confusing. 

It sets an option as a Row, not an option on a Row. Using setOption is better 
and doesn't conflict with the other setOption with 3 parameters and explicit 
type.





[jira] [Updated] (BEAM-9604) BIP-1: Remove schema metadata usage for Protobuf extension

2020-03-25 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9604:
-
Parent: BEAM-9275
Issue Type: Sub-task  (was: Task)

> BIP-1: Remove schema metadata usage for Protobuf extension
> --
>
> Key: BEAM-9604
> URL: https://issues.apache.org/jira/browse/BEAM-9604
> Project: Beam
>  Issue Type: Sub-task
>  Components: extensions-java-protobuf
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>
> Replace the schema metadata usage with options. This will probably mean:
>  * Moving the message_name metadata to a Schema option (for field, map key 
> and value)
>  * Replacing the proto_number metadata with a Field option





[jira] [Created] (BEAM-9604) BIP-1: Remove schema metadata usage for Protobuf extension

2020-03-25 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9604:


 Summary: BIP-1: Remove schema metadata usage for Protobuf extension
 Key: BEAM-9604
 URL: https://issues.apache.org/jira/browse/BEAM-9604
 Project: Beam
  Issue Type: Task
  Components: extensions-java-protobuf
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel


Replace the schema metadata usage with options. This will probably mean:
 * Moving the message_name metadata to a Schema option (for field, map key and 
value)
 * Replacing the proto_number metadata with a Field option





[jira] [Resolved] (BEAM-9044) BIP-1: Convert protobuf options to Schema options

2020-03-25 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-9044.
--
Fix Version/s: 2.21.0
   Resolution: Fixed

> BIP-1: Convert protobuf options to Schema options
> -
>
> Key: BEAM-9044
> URL: https://issues.apache.org/jira/browse/BEAM-9044
> Project: Beam
>  Issue Type: Sub-task
>  Components: extensions-java-protobuf
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.21.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Protobuf has a rich metadata system called options. This system is fully 
> typed and matches Beam's Schema Option system. For now we can only convert the 
> following protobuf options:
>  * File Options -> _Beam doesn't have this concept_
>  * Message Options -> *Beam Schema Options*
>  * Field Options -> *Beam Schema Options*
>  * Enum Options -> _This can only be done when logical type options are 
> available_
>  * EnumValue Options -> _This can only be done when logical type options are 
> available_
>  * Service Options -> _Beam doesn't have this concept_
>  * Method Options -> _Beam doesn't have this concept_





[jira] [Resolved] (BEAM-9416) BIP-1: Convert avro metadata to Schema options

2020-03-25 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-9416.
--
Resolution: Fixed

> BIP-1: Convert avro metadata to Schema options
> --
>
> Key: BEAM-9416
> URL: https://issues.apache.org/jira/browse/BEAM-9416
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Avro has some metadata that can be added to the normal type information. It 
> is based on JSON typing, so the conversion will be best effort (we can 
> probably get int, string and float out of it).





[jira] [Updated] (BEAM-9035) BIP-1: Typed options for Row Schema and Fields

2020-03-25 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9035:
-
Fix Version/s: (was: 2.20.0)
   2.21.0

> BIP-1: Typed options for Row Schema and Fields
> --
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> This is the first issue of a multipart commit: this ticket implements the 
> basic infrastructure of options on row and field.
> Full explanation:
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to metadata, options would be added to 
> fields, logical types and rows. Schema converters can use options to carry 
> over the options/annotations/decorators that were in the original schema; 
> this context can be used in the rest of the pipeline for specific 
> transformations or to augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of having typed 
> values are: 
>  * Having strongly typed options would give a *portable way for Logical Types* 
> to have structured information that could be shared across different languages.
>  * This could keep the type intact when mapping from formats that have 
> strongly typed options (example: Protobuf).
> This is part of a multi ticket implementation. The following tickets are 
> related:
>  # Typed options for Row Schema and Fields
>  # Convert Proto Options to Beam Schema options
>  # Convert Avro extra information for Beam string options
>  # Replace meta data with Logical Type options
>  # Extract meta data in Calcite SQL to Beam options
>  # Extract meta data in Zeta SQL to Beam options
>  # Add java example of using option in a transform 
> This feature is discussed with Reuven Lax, Brian Hulette





[jira] [Updated] (BEAM-9035) BIP-1: Typed options for Row Schema and Fields

2020-03-25 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9035:
-
Fix Version/s: (was: 2.19.0)
   2.20.0

> BIP-1: Typed options for Row Schema and Fields
> --
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> This is the first issue of a multipart commit: this ticket implements the 
> basic infrastructure of options on row and field.
> Full explanation:
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to metadata, options would be added to 
> fields, logical types and rows. Schema converters can use options to carry 
> over the options/annotations/decorators that were in the original schema; 
> this context can be used in the rest of the pipeline for specific 
> transformations or to augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of having typed 
> values are: 
>  * Having strongly typed options would give a *portable way for Logical Types* 
> to have structured information that could be shared across different languages.
>  * This could keep the type intact when mapping from formats that have 
> strongly typed options (example: Protobuf).
> This is part of a multi ticket implementation. The following tickets are 
> related:
>  # Typed options for Row Schema and Fields
>  # Convert Proto Options to Beam Schema options
>  # Convert Avro extra information for Beam string options
>  # Replace meta data with Logical Type options
>  # Extract meta data in Calcite SQL to Beam options
>  # Extract meta data in Zeta SQL to Beam options
>  # Add java example of using option in a transform 
> This feature is discussed with Reuven Lax, Brian Hulette





[jira] [Work started] (BEAM-9044) BIP-1: Convert protobuf options to Schema options

2020-03-25 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-9044 started by Alex Van Boxel.

> BIP-1: Convert protobuf options to Schema options
> -
>
> Key: BEAM-9044
> URL: https://issues.apache.org/jira/browse/BEAM-9044
> Project: Beam
>  Issue Type: Sub-task
>  Components: extensions-java-protobuf
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Protobuf has a rich metadata system called options. This system is fully 
> typed and matches Beam's Schema Option system. For now we can only convert the 
> following protobuf options:
>  * File Options -> _Beam doesn't have this concept_
>  * Message Options -> *Beam Schema Options*
>  * Field Options -> *Beam Schema Options*
>  * Enum Options -> _This can only be done when logical type options are 
> available_
>  * EnumValue Options -> _This can only be done when logical type options are 
> available_
>  * Service Options -> _Beam doesn't have this concept_
>  * Method Options -> _Beam doesn't have this concept_





[jira] [Commented] (BEAM-8218) Implement Apache PulsarIO

2020-03-08 Thread Alex Van Boxel (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054385#comment-17054385
 ] 

Alex Van Boxel commented on BEAM-8218:
--

Thanks, I appreciate the update. I'll assign it to myself and will start on 
it immediately.

> Implement Apache PulsarIO
> -
>
> Key: BEAM-8218
> URL: https://issues.apache.org/jira/browse/BEAM-8218
> Project: Beam
>  Issue Type: Task
>  Components: io-ideas
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>
> Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
> could be beneficial.
> [https://pulsar.apache.org/|https://pulsar.apache.org/en/]





[jira] [Assigned] (BEAM-8218) Implement Apache PulsarIO

2020-03-08 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel reassigned BEAM-8218:


Assignee: Alex Van Boxel  (was: Taher Koitawala)

> Implement Apache PulsarIO
> -
>
> Key: BEAM-8218
> URL: https://issues.apache.org/jira/browse/BEAM-8218
> Project: Beam
>  Issue Type: Task
>  Components: io-ideas
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>
> Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
> could be beneficial.
> [https://pulsar.apache.org/|https://pulsar.apache.org/en/]





[jira] [Commented] (BEAM-8218) Implement Apache PulsarIO

2020-03-08 Thread Alex Van Boxel (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054372#comment-17054372
 ] 

Alex Van Boxel commented on BEAM-8218:
--

[~taherk77] If we don't hear any updates on this, I will consider it 
abandoned and I will take over. We need to get this moving.

> Implement Apache PulsarIO
> -
>
> Key: BEAM-8218
> URL: https://issues.apache.org/jira/browse/BEAM-8218
> Project: Beam
>  Issue Type: Task
>  Components: io-ideas
>Reporter: Alex Van Boxel
>Assignee: Taher Koitawala
>Priority: Minor
>
> Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
> could be beneficial.
> [https://pulsar.apache.org/|https://pulsar.apache.org/en/]





[jira] [Updated] (BEAM-9456) Upgrade to gradle 6.2

2020-03-05 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9456:
-
Status: Open  (was: Triage Needed)

> Upgrade to gradle 6.2
> -
>
> Key: BEAM-9456
> URL: https://issues.apache.org/jira/browse/BEAM-9456
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>






[jira] [Created] (BEAM-9456) Upgrade to gradle 6.2

2020-03-05 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9456:


 Summary: Upgrade to gradle 6.2
 Key: BEAM-9456
 URL: https://issues.apache.org/jira/browse/BEAM-9456
 Project: Beam
  Issue Type: Task
  Components: build-system
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel








[jira] [Updated] (BEAM-9035) BIP-1: Typed options for Row Schema and Fields

2020-02-29 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9035:
-
Summary: BIP-1: Typed options for Row Schema and Fields  (was: Typed 
options for Row Schema and Fields)

> BIP-1: Typed options for Row Schema and Fields
> --
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> This is the first issue of a multipart commit: this ticket implements the 
> basic infrastructure of options on row and field.
> Full explanation:
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to metadata, options would be added to 
> fields, logical types and rows. Schema converters can use options to carry 
> over the options/annotations/decorators that were in the original schema; 
> this context can be used in the rest of the pipeline for specific 
> transformations or to augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of having typed
> values are:
>  * Strongly typed options would give Logical Types a *portable way* to
> hold structured information that could be shared across languages.
>  * The type could be kept intact when mapping from formats that have
> strongly typed options (example: Protobuf).
> This is part of a multi ticket implementation. The following tickets are 
> related:
>  # Typed options for Row Schema and Fields
>  # Convert Proto Options to Beam Schema options
>  # Convert Avro extra information for Beam string options
>  # Replace meta data with Logical Type options
>  # Extract meta data in Calcite SQL to Beam options
>  # Extract meta data in Zeta SQL to Beam options
>  # Add java example of using option in a transform 
> This feature is discussed with Reuven Lax, Brian Hulette
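
To make the "key/typed value" idea concrete, here is a minimal sketch in plain Java. It is not the Beam Schema API: the class TypedOptions, its OptionType enum and the option names used with it are all hypothetical, chosen only to show how a value can be checked against its declared type when it is stored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a typed option store; not the Beam Schema API.
final class TypedOptions {
  enum OptionType { STRING, INT64, DOUBLE, BOOLEAN }

  // One option: a declared type plus a value that must match it.
  static final class Option {
    final OptionType type;
    final Object value;

    Option(OptionType type, Object value) {
      this.type = type;
      this.value = value;
    }
  }

  private final Map<String, Option> options = new LinkedHashMap<>();

  // Stores the option after checking that the value matches the declared type.
  TypedOptions set(String name, OptionType type, Object value) {
    if (!matches(type, value)) {
      throw new IllegalArgumentException(name + " is not of type " + type);
    }
    options.put(name, new Option(type, value));
    return this;
  }

  boolean has(String name) {
    return options.containsKey(name);
  }

  OptionType typeOf(String name) {
    return options.get(name).type;
  }

  Object valueOf(String name) {
    return options.get(name).value;
  }

  private static boolean matches(OptionType type, Object value) {
    switch (type) {
      case STRING:
        return value instanceof String;
      case INT64:
        return value instanceof Long;
      case DOUBLE:
        return value instanceof Double;
      case BOOLEAN:
        return value instanceof Boolean;
      default:
        return false;
    }
  }
}
```

Because the type travels with the value, a consumer in another language could rebuild the same option without guessing, which is the portability argument made in the ticket.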





[jira] [Updated] (BEAM-9416) BIP-1: Convert avro metadata to Schema options

2020-02-29 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9416:
-
Summary: BIP-1: Convert avro metadata to Schema options  (was: Convert avro 
metadata to Schema options)

> BIP-1: Convert avro metadata to Schema options
> --
>
> Key: BEAM-9416
> URL: https://issues.apache.org/jira/browse/BEAM-9416
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.21.0
>
>
> Avro has some metadata that can be added to the normal type information. It
> is based on JSON typing, so the conversion will be best effort (we can
> probably get int, string and float out of it).





[jira] [Updated] (BEAM-9044) BIP-1: Convert protobuf options to Schema options

2020-02-29 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9044:
-
Summary: BIP-1: Convert protobuf options to Schema options  (was: Convert 
protobuf options to Schema options)

> BIP-1: Convert protobuf options to Schema options
> -
>
> Key: BEAM-9044
> URL: https://issues.apache.org/jira/browse/BEAM-9044
> Project: Beam
>  Issue Type: Sub-task
>  Components: extensions-java-protobuf
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Protobuf has a rich metadata system called options. This system is fully
> typed and matches Beam's Schema Option system. For now we can only convert
> some of the following protobuf options:
>  * File Options -> _Beam doesn't have this concept_
>  * Message Options -> *Beam Schema Options*
>  * Field Options -> *Beam Schema Options*
>  * Enum Options -> _This can only be done when logical type options are 
> available_
>  * EnumValue Options -> _This can only be done when logical type options are 
> available_
>  * Service Options -> _Beam doesn't have this concept_
>  * Method Options -> _Beam doesn't have this concept_
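
The mapping in the list above can be restated as a small lookup. This is only a sketch: ProtoOptionMapping and both enums are hypothetical names that mirror the bullets, not types from the protobuf extension.

```java
// Hypothetical names; this only restates the mapping from the ticket.
final class ProtoOptionMapping {
  enum ProtoOptionKind { FILE, MESSAGE, FIELD, ENUM, ENUM_VALUE, SERVICE, METHOD }

  enum Target { BEAM_SCHEMA_OPTIONS, NEEDS_LOGICAL_TYPE_OPTIONS, NO_BEAM_CONCEPT }

  static Target targetFor(ProtoOptionKind kind) {
    switch (kind) {
      case MESSAGE:
      case FIELD:
        // Convertible today.
        return Target.BEAM_SCHEMA_OPTIONS;
      case ENUM:
      case ENUM_VALUE:
        // Blocked until logical type options are available.
        return Target.NEEDS_LOGICAL_TYPE_OPTIONS;
      default:
        // File, service and method options have no Beam counterpart.
        return Target.NO_BEAM_CONCEPT;
    }
  }
}
```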





[jira] [Created] (BEAM-9416) Convert avro metadata to Schema options

2020-02-29 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9416:


 Summary: Convert avro metadata to Schema options
 Key: BEAM-9416
 URL: https://issues.apache.org/jira/browse/BEAM-9416
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-java-core
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel
 Fix For: 2.21.0


Avro has some metadata that can be added to the normal type information. It is
based on JSON typing, so the conversion will be best effort (we can probably
get int, string and float out of it).





[jira] [Closed] (BEAM-7518) Protobuf Schema: Introduce logical type for Timestamp, Duration and other

2020-02-27 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-7518.


> Protobuf Schema: Introduce logical type for Timestamp, Duration and other
> -
>
> Key: BEAM-7518
> URL: https://issues.apache.org/jira/browse/BEAM-7518
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>
> The Protobuf Schema provider has lossy conversions for some Proto types.
> Introduce Logical Types for:
> Timestamp, Duration and Unsigned Int64





[jira] [Resolved] (BEAM-7518) Protobuf Schema: Introduce logical type for Timestamp, Duration and other

2020-02-27 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-7518.
--
Fix Version/s: 2.20.0
   Resolution: Fixed

> Protobuf Schema: Introduce logical type for Timestamp, Duration and other
> -
>
> Key: BEAM-7518
> URL: https://issues.apache.org/jira/browse/BEAM-7518
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>
> The Protobuf Schema provider has lossy conversions for some Proto types.
> Introduce Logical Types for:
> Timestamp, Duration and Unsigned Int64





[jira] [Resolved] (BEAM-9394) DynamicMessage handling of empty map violates schema nullability

2020-02-26 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-9394.
--
Resolution: Fixed

> DynamicMessage handling of empty map violates schema nullability
> 
>
> Key: BEAM-9394
> URL: https://issues.apache.org/jira/browse/BEAM-9394
> Project: Beam
>  Issue Type: Bug
>  Components: extensions-java-protobuf
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>
> DynamicMessage handling of empty map violates nullability. It should return 
> an empty map at the Row level.
> Add tests for nullable map and array to verify behaviour.
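
The intended behaviour can be illustrated with a tiny helper. This is a sketch under the assumption that the unset map arrives as null somewhere in the conversion. EmptyMapDefault.orEmpty is a hypothetical name and not part of the protobuf extension.

```java
import java.util.Collections;
import java.util.Map;

final class EmptyMapDefault {
  // Hypothetical helper: a non-nullable map field never surfaces as null
  // at the Row level; an unset map becomes an empty map instead.
  static <K, V> Map<K, V> orEmpty(Map<K, V> value) {
    return value == null ? Collections.<K, V>emptyMap() : value;
  }
}
```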





[jira] [Updated] (BEAM-9394) DynamicMessage handling of empty map violates schema nullability

2020-02-26 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9394:
-
Status: Open  (was: Triage Needed)

> DynamicMessage handling of empty map violates schema nullability
> 
>
> Key: BEAM-9394
> URL: https://issues.apache.org/jira/browse/BEAM-9394
> Project: Beam
>  Issue Type: Bug
>  Components: extensions-java-protobuf
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>
> DynamicMessage handling of empty map violates nullability. It should return 
> an empty map at the Row level.
> Add tests for nullable map and array to verify behaviour.





[jira] [Created] (BEAM-9394) DynamicMessage handling of empty map violates schema nullability

2020-02-26 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9394:


 Summary: DynamicMessage handling of empty map violates schema 
nullability
 Key: BEAM-9394
 URL: https://issues.apache.org/jira/browse/BEAM-9394
 Project: Beam
  Issue Type: Bug
  Components: extensions-java-protobuf
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel
 Fix For: 2.20.0


DynamicMessage handling of empty map violates nullability. It should return an 
empty map at the Row level.

Add tests for nullable map and array to verify behaviour.





[jira] [Closed] (BEAM-7274) Protobuf Beam Schema support

2020-02-26 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-7274.


> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.20.0
>
>  Time Spent: 26h 40m
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Resolved] (BEAM-7274) Protobuf Beam Schema support

2020-02-26 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-7274.
--
Fix Version/s: (was: 2.21.0)
   2.20.0
   Resolution: Fixed

Moving back to 2.20 as it's merged into master

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.20.0
>
>  Time Spent: 26h 40m
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Commented] (BEAM-7274) Protobuf Beam Schema support

2020-02-25 Thread Alex Van Boxel (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044920#comment-17044920
 ] 

Alex Van Boxel commented on BEAM-7274:
--

Moved to 2.21

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.21.0
>
>  Time Spent: 26h 20m
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Updated] (BEAM-7274) Protobuf Beam Schema support

2020-02-25 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-7274:
-
Fix Version/s: (was: 2.20.0)
   2.21.0

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.21.0
>
>  Time Spent: 26h 20m
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Assigned] (BEAM-9360) Schema FieldType should not consider metadata for equivalence

2020-02-22 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel reassigned BEAM-9360:


Assignee: Jozef Vilcek

> Schema FieldType should not consider metadata for equivalence
> -
>
> Key: BEAM-9360
> URL: https://issues.apache.org/jira/browse/BEAM-9360
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.19.0
>Reporter: Jozef Vilcek
>Assignee: Jozef Vilcek
>Priority: Major
>
> FieldType `equivalent()` check should not require exact match in fields 
> metadata.
> Discussion in dev mailing list:
> [https://lists.apache.org/list.html?d...@beam.apache.org:lte=1M:Schema%20Convert%20transform%20fails%20on%20type%20metadata]
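
The difference between strict equality and a metadata-agnostic equivalence check can be sketched as follows. SimpleFieldType is a hypothetical, much reduced model of a field type and does not reproduce how Beam's FieldType is implemented. It only shows that `equals` compares metadata while `equivalent` ignores it.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical, reduced model of a field type; not Beam's FieldType.
final class SimpleFieldType {
  final String typeName;
  final boolean nullable;
  final Map<String, String> metadata;

  SimpleFieldType(String typeName, boolean nullable, Map<String, String> metadata) {
    this.typeName = typeName;
    this.nullable = nullable;
    this.metadata = metadata;
  }

  // Strict equality: metadata must match exactly.
  @Override
  public boolean equals(Object o) {
    if (!(o instanceof SimpleFieldType)) {
      return false;
    }
    SimpleFieldType other = (SimpleFieldType) o;
    return equivalent(other) && metadata.equals(other.metadata);
  }

  @Override
  public int hashCode() {
    return Objects.hash(typeName, nullable, metadata);
  }

  // Equivalence: same shape of data, metadata deliberately ignored.
  boolean equivalent(SimpleFieldType other) {
    return typeName.equals(other.typeName) && nullable == other.nullable;
  }
}
```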





[jira] [Resolved] (BEAM-9241) Fix inconsistent nullability mapping for Protobuf to Schema

2020-02-21 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-9241.
--
Resolution: Fixed

> Fix inconsistent nullability mapping for Protobuf to Schema
> ---
>
> Key: BEAM-9241
> URL: https://issues.apache.org/jira/browse/BEAM-9241
> Project: Beam
>  Issue Type: Bug
>  Components: extensions-java-protobuf
>Affects Versions: 2.18.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Fix the nullability issues with protobuf to schema mapping
>  * Proto3 primitive types should be *not* nullable.
>  * Proto2 required types should be *not* nullable.
>  * Proto2 optional should also be *not* nullable, as being optional doesn't
> mean the field has no value. The spec states that an unset optional field
> has its default value.
>  * Arrays should be *not* nullable, as proto arrays always have an empty 
> array when no value is set.
>  * Maps should be *not* nullable, as proto maps always have an empty map when 
> no value is set.
>  * Elements in an array should be *not* nullable, as nulls are not allowed in 
> an array.
>  * Names and Values should be *not* nullable, as nulls are not allowed.
>  * Rows are nullable, as messages are nullable.
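
The rules above reduce to a single decision: only message-typed fields (rows) are nullable. A minimal sketch, with a hypothetical Kind enum whose constants are named after the bullets in the ticket rather than after any class in the protobuf extension:

```java
final class ProtoNullability {
  // Hypothetical field kinds, named after the bullets in the ticket.
  enum Kind {
    PROTO3_PRIMITIVE,
    PROTO2_REQUIRED,
    PROTO2_OPTIONAL,
    ARRAY,
    MAP,
    ARRAY_ELEMENT,
    NAME,
    VALUE,
    MESSAGE
  }

  // Only messages, which become rows, map to a nullable schema field.
  static boolean isNullable(Kind kind) {
    return kind == Kind.MESSAGE;
  }
}
```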





[jira] [Updated] (BEAM-9275) BIP-1: Beam Schema Options

2020-02-09 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9275:
-
Description: Introduce the concept of Options in Beam Schemas to add extra
context to fields and schemas. In contrast to the current Beam metadata that is
present in a FieldType, options would be added to fields, logical types and
schemas. The schema convertors (e.g. Avro, Proto, …) can use these options to
add the options/annotations/decorators that were in the original schema to the
Beam schema. These options, which add contextual metadata, can be used in the
pipeline for specific transformations or to augment the end schema in the
target output.

> BIP-1: Beam Schema Options
> --
>
> Key: BEAM-9275
> URL: https://issues.apache.org/jira/browse/BEAM-9275
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Priority: Major
>
> Introduce the concept of Options in Beam Schemas to add extra context to
> fields and schemas. In contrast to the current Beam metadata that is present
> in a FieldType, options would be added to fields, logical types and schemas.
> The schema convertors (e.g. Avro, Proto, …) can use these options to add the
> options/annotations/decorators that were in the original schema to the Beam
> schema. These options, which add contextual metadata, can be used in the
> pipeline for specific transformations or to augment the end schema in the
> target output.





[jira] [Updated] (BEAM-9044) Convert protobuf options to Schema options

2020-02-09 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9044:
-
Parent: BEAM-9275
Issue Type: Sub-task  (was: Task)

> Convert protobuf options to Schema options
> --
>
> Key: BEAM-9044
> URL: https://issues.apache.org/jira/browse/BEAM-9044
> Project: Beam
>  Issue Type: Sub-task
>  Components: extensions-java-protobuf
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Protobuf has a rich metadata system called options. This system is fully
> typed and matches Beam's Schema Option system. For now we can only convert
> some of the following protobuf options:
>  * File Options -> _Beam doesn't have this concept_
>  * Message Options -> *Beam Schema Options*
>  * Field Options -> *Beam Schema Options*
>  * Enum Options -> _This can only be done when logical type options are 
> available_
>  * EnumValue Options -> _This can only be done when logical type options are 
> available_
>  * Service Options -> _Beam doesn't have this concept_
>  * Method Options -> _Beam doesn't have this concept_





[jira] [Updated] (BEAM-9035) Typed options for Row Schema and Fields

2020-02-09 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9035:
-
Parent: BEAM-9275
Issue Type: Sub-task  (was: Task)

> Typed options for Row Schema and Fields
> ---
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> This is the first issue of a multipart commit: this ticket implements the 
> basic infrastructure of options on row and field.
> Full explanation:
> Introduce the concept of Options in Beam Schemas to add extra context to
> fields and schemas. In contrast to metadata, options would be added to
> fields, logical types and rows. Schema convertors can use options to carry
> over the options/annotations/decorators that were in the original schema.
> This context can then be used in the rest of the pipeline for specific
> transformations, or to augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of having typed
> values are:
>  * Strongly typed options would give Logical Types a *portable way* to
> hold structured information that could be shared across languages.
>  * The type could be kept intact when mapping from formats that have
> strongly typed options (example: Protobuf).
> This is part of a multi ticket implementation. The following tickets are 
> related:
>  # Typed options for Row Schema and Fields
>  # Convert Proto Options to Beam Schema options
>  # Convert Avro extra information for Beam string options
>  # Replace meta data with Logical Type options
>  # Extract meta data in Calcite SQL to Beam options
>  # Extract meta data in Zeta SQL to Beam options
>  # Add java example of using option in a transform 
> This feature is discussed with Reuven Lax, Brian Hulette





[jira] [Created] (BEAM-9275) BIP-1: Beam Schema Options

2020-02-09 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9275:


 Summary: BIP-1: Beam Schema Options
 Key: BEAM-9275
 URL: https://issues.apache.org/jira/browse/BEAM-9275
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Alex Van Boxel








[jira] [Closed] (BEAM-9037) Instant and duration as logical type

2020-02-05 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-9037.


> Instant and duration as logical type 
> -
>
> Key: BEAM-9037
> URL: https://issues.apache.org/jira/browse/BEAM-9037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> The proto schema includes Timestamp and Duration with nano precision. The 
> logical types should be promoted to the core logical types, so they can be 
> handled on various IO's as standard mandatory conversions.
> This means that the logical type should not use the proto-specific Timestamp
> and Duration, but the Java 8 Instant and Duration.
> See discussion in the design document:
> [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]
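
As an illustration of why the Java 8 types fit, both java.time.Instant and java.time.Duration can be built directly from a seconds/nanos pair, which is how proto Timestamp and Duration carry their values. The wrapper class NanoTime below is hypothetical. The two java.time factory methods are standard JDK calls.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical wrapper; only the java.time calls are real JDK API.
final class NanoTime {
  // Seconds since the epoch plus a nanosecond adjustment, kept at nano precision.
  static Instant toInstant(long seconds, int nanos) {
    return Instant.ofEpochSecond(seconds, nanos);
  }

  // A span of seconds plus a nanosecond adjustment, kept at nano precision.
  static Duration toDuration(long seconds, int nanos) {
    return Duration.ofSeconds(seconds, nanos);
  }
}
```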





[jira] [Commented] (BEAM-4457) Analyze FieldAccessDescriptors and drop fields that are never accessed

2020-02-04 Thread Alex Van Boxel (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030397#comment-17030397
 ] 

Alex Van Boxel commented on BEAM-4457:
--

I remember from the days when I wrote Apache Pig that it had a similar concept
as well. Another place that could benefit is the ToRow function, where the row
gets materialized in a RowWithStorage. Only the fields that are accessed should
be materialized.

> Analyze FieldAccessDescriptors and drop fields that are never accessed
> --
>
> Key: BEAM-4457
> URL: https://issues.apache.org/jira/browse/BEAM-4457
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-java-gcp
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>
> We can walk backwards through the graph, analyzing which fields are accessed. 
> When we find paths where many fields are never accessed, we can insert a 
> projection transform to drop those fields preemptively. This can save a lot 
> of resources in the case where many fields in the input are never accessed.
> To do this, the FieldAccessDescriptor information must be added to the 
> portability protos. 
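
The backward walk can be sketched for the simplest case: a linear chain of steps where every field passes through unchanged. That is a strong simplification of a real pipeline graph, and the class FieldPruning and its methods are hypothetical names, not the FieldAccessDescriptor API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch for a linear pipeline; not the FieldAccessDescriptor API.
final class FieldPruning {
  // Walks the steps from last to first and collects every field any step reads.
  static Set<String> accessedFields(List<Set<String>> accessPerStep) {
    Set<String> needed = new LinkedHashSet<>();
    for (int i = accessPerStep.size() - 1; i >= 0; i--) {
      needed.addAll(accessPerStep.get(i));
    }
    return needed;
  }

  // Fields the source produces that no later step reads: candidates for a
  // projection that drops them right after the source.
  static Set<String> droppable(Set<String> sourceFields, List<Set<String>> accessPerStep) {
    Set<String> unused = new LinkedHashSet<>(sourceFields);
    unused.removeAll(accessedFields(accessPerStep));
    return unused;
  }
}
```

In a real graph with branches, the set of needed fields would have to be merged across all consumers of a collection before deciding what to drop.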





[jira] [Work started] (BEAM-9241) Fix inconsistent nullability mapping for Protobuf to Schema

2020-02-01 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-9241 started by Alex Van Boxel.

> Fix inconsistent nullability mapping for Protobuf to Schema
> ---
>
> Key: BEAM-9241
> URL: https://issues.apache.org/jira/browse/BEAM-9241
> Project: Beam
>  Issue Type: Bug
>  Components: extensions-java-protobuf
>Affects Versions: 2.18.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>
> Fix the nullability issues with protobuf to schema mapping
>  * Proto3 primitive types should be *not* nullable.
>  * Proto2 required types should be *not* nullable.
>  * Proto2 optional should also be *not* nullable, as being optional doesn't
> mean the field has no value. The spec states that an unset optional field
> has its default value.
>  * Arrays should be *not* nullable, as proto arrays always have an empty 
> array when no value is set.
>  * Maps should be *not* nullable, as proto maps always have an empty map when 
> no value is set.
>  * Elements in an array should be *not* nullable, as nulls are not allowed in 
> an array.
>  * Names and Values should be *not* nullable, as nulls are not allowed.
>  * Rows are nullable, as messages are nullable.





[jira] [Updated] (BEAM-9241) Fix inconsistent nullability mapping for Protobuf to Schema

2020-02-01 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9241:
-
Status: Open  (was: Triage Needed)

> Fix inconsistent nullability mapping for Protobuf to Schema
> ---
>
> Key: BEAM-9241
> URL: https://issues.apache.org/jira/browse/BEAM-9241
> Project: Beam
>  Issue Type: Bug
>  Components: extensions-java-protobuf
>Affects Versions: 2.18.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>
> Fix the nullability issues with protobuf to schema mapping
>  * Proto3 primitive types should be *not* nullable.
>  * Proto2 required types should be *not* nullable.
>  * Proto2 optional should also be *not* nullable, as being optional doesn't
> mean the field has no value. The spec states that an unset optional field
> has its default value.
>  * Arrays should be *not* nullable, as proto arrays always have an empty 
> array when no value is set.
>  * Maps should be *not* nullable, as proto maps always have an empty map when 
> no value is set.
>  * Elements in an array should be *not* nullable, as nulls are not allowed in 
> an array.
>  * Names and Values should be *not* nullable, as nulls are not allowed.
>  * Rows are nullable, as messages are nullable.





[jira] [Created] (BEAM-9241) Fix inconsistent nullability mapping for Protobuf to Schema

2020-02-01 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9241:


 Summary: Fix inconsistent nullability mapping for Protobuf to 
Schema
 Key: BEAM-9241
 URL: https://issues.apache.org/jira/browse/BEAM-9241
 Project: Beam
  Issue Type: Bug
  Components: extensions-java-protobuf
Affects Versions: 2.18.0
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel
 Fix For: 2.20.0


Fix the nullability issues with protobuf to schema mapping
 * Proto3 primitive types should be *not* nullable.
 * Proto2 required types should be *not* nullable.
 * Proto2 optional should also be *not* nullable, as being optional doesn't
mean the field has no value. The spec states that an unset optional field has
its default value.
 * Arrays should be *not* nullable, as proto arrays always have an empty array 
when no value is set.
 * Maps should be *not* nullable, as proto maps always have an empty map when 
no value is set.
 * Elements in an array should be *not* nullable, as nulls are not allowed in 
an array.
 * Names and Values should be *not* nullable, as nulls are not allowed.
 * Rows are nullable, as messages are nullable.





[jira] [Updated] (BEAM-9037) Instant and duration as logical type

2020-01-31 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9037:
-
Fix Version/s: (was: 2.19.0)
   2.20.0

> Instant and duration as logical type 
> -
>
> Key: BEAM-9037
> URL: https://issues.apache.org/jira/browse/BEAM-9037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> The proto schema includes Timestamp and Duration with nano precision. The 
> logical types should be promoted to the core logical types, so they can be 
> handled on various IO's as standard mandatory conversions.
> This means that the logical type should not use the proto-specific Timestamp
> and Duration, but the Java 8 Instant and Duration.
> See discussion in the design document:
> [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]





[jira] [Closed] (BEAM-9113) Protobuf NanosType serialisation issues

2020-01-31 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-9113.


> Protobuf NanosType serialisation issues
> --
>
> Key: BEAM-9113
> URL: https://issues.apache.org/jira/browse/BEAM-9113
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The NanosType has 2 known issues:
>  * Schema serialisation expects the getArgument to not return a null value
>  * UUID of the base type will not be (de)serialised as it is static





[jira] [Resolved] (BEAM-9113) Protobuf NanosType serialisation issues

2020-01-31 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-9113.
--
Fix Version/s: 2.20.0
   Resolution: Fixed

Resolved by the general logical type for Instant and Duration

> Protobuf NanosType serialisation issues
> --
>
> Key: BEAM-9113
> URL: https://issues.apache.org/jira/browse/BEAM-9113
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The NanosType has 2 known issues:
>  * Schema serialisation expects the getArgument to not return a null value
>  * UUID of the base type will not be (de)serialised as it is static





[jira] [Created] (BEAM-9113) Protobuf NanosType serialisation issues

2020-01-13 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9113:


 Summary: Protobuf NanosType serialisation issues
 Key: BEAM-9113
 URL: https://issues.apache.org/jira/browse/BEAM-9113
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel


The NanosType has 2 known issues:
 * Schema serialisation expects the getArgument to not return a null value
 * UUID of the base type will not be (de)serialised as it is static





[jira] [Commented] (BEAM-9054) Row.toString with Logical Type are different for RowWithGetters and RowWithStorage

2020-01-04 Thread Alex Van Boxel (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008131#comment-17008131
 ] 

Alex Van Boxel commented on BEAM-9054:
--

[~reuvenlax]: I could change this, but I'm indifferent as to which is the best
representation for the logical type: the base type or the logical type
itself.

> Row.toString with Logical Type are different for RowWithGetters and 
> RowWithStorage
> --
>
> Key: BEAM-9054
> URL: https://issues.apache.org/jira/browse/BEAM-9054
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Priority: Major
>
> Row.toString output for Logical Types differs between RowWithGetters and
> RowWithStorage with equivalent schemas. Behaviour:
>  * RowWithGetters will show the .toString() representation of the logical type
>  * RowWithStorage will show the base type
> This should be one or the other.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9054) Row.toString with Logical Type are different for RowWithGetters and RowWithStorage

2020-01-04 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9054:
-
Issue Type: Bug  (was: Task)

> Row.toString with Logical Type are different for RowWithGetters and 
> RowWithStorage
> --
>
> Key: BEAM-9054
> URL: https://issues.apache.org/jira/browse/BEAM-9054
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Priority: Major
>
> Row.toString output for Logical Types differs between RowWithGetters and
> RowWithStorage with equivalent schemas. Behaviour:
>  * RowWithGetters will show the .toString() representation of the logical type
>  * RowWithStorage will show the base type
> This should be one or the other.





[jira] [Created] (BEAM-9054) Row.toString with Logical Type are different for RowWithGetters and RowWithStorage

2020-01-04 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9054:


 Summary: Row.toString with Logical Type are different for 
RowWithGetters and RowWithStorage
 Key: BEAM-9054
 URL: https://issues.apache.org/jira/browse/BEAM-9054
 Project: Beam
  Issue Type: Task
  Components: sdk-java-core
Reporter: Alex Van Boxel


Row.toString output for a Logical Type differs between RowWithGetters and 
RowWithStorage with equivalent schemas. Behaviour:
 * RowWithGetters will show the .toString() representation of the logical type
 * RowWithStorage will show the base type

This should be one or the other
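
One way to make the two representations agree is to route both row flavours through a single rendering step. The sketch below is a minimal plain-Java illustration, not Beam code: `RowToStringSketch`, `renderLogical` and `renderStored` are hypothetical names, and it assumes the base type is picked as the common representation (the opposite choice would work equally well).

```java
import java.time.Duration;
import java.util.function.Function;

/** Hypothetical sketch (not Beam API): render every value through its base type. */
final class RowToStringSketch {

  /** Getter-style row: holds the logical value, so convert it to the base type first. */
  static <L, B> String renderLogical(L logicalValue, Function<L, B> toBaseType) {
    return String.valueOf(toBaseType.apply(logicalValue));
  }

  /** Storage-style row: already holds the base value, so print it directly. */
  static <B> String renderStored(B baseValue) {
    return String.valueOf(baseValue);
  }

  public static void main(String[] args) {
    // Assumption for illustration: a duration logical type whose base type is millis as a Long.
    String fromGetters = renderLogical(Duration.ofSeconds(2), Duration::toMillis);
    String fromStorage = renderStored(2000L);
    System.out.println(fromGetters.equals(fromStorage));
  }
}
```

With a shared rule like this, equivalent schemas would print the same text whichever row implementation backs them.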





[jira] [Created] (BEAM-9044) Convert protobuf options to Schema options

2020-01-01 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9044:


 Summary: Convert protobuf options to Schema options
 Key: BEAM-9044
 URL: https://issues.apache.org/jira/browse/BEAM-9044
 Project: Beam
  Issue Type: Task
  Components: extensions-java-protobuf
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel


Protobuf has a rich metadata system called options. This system is fully typed 
and matches Beam's Schema Option system. For now, only some protobuf option 
kinds can be converted:
 * File Options -> _Beam doesn't have this concept_
 * Message Options -> *Beam Schema Options*
 * Field Options -> *Beam Schema Options*
 * Enum Options -> _This can only be done when logical type options are 
available_
 * EnumValue Options -> _This can only be done when logical type options are 
available_
 * Service Options -> _Beam doesn't have this concept_
 * Method Options -> _Beam doesn't have this concept_





[jira] [Resolved] (BEAM-9037) Instant and duration as logical type

2019-12-31 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-9037.
--
Fix Version/s: 2.19.0
   Resolution: Fixed

> Instant and duration as logical type 
> -
>
> Key: BEAM-9037
> URL: https://issues.apache.org/jira/browse/BEAM-9037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The proto schema includes Timestamp and Duration with nanosecond precision. 
> The logical types should be promoted to the core logical types, so they can 
> be handled by the various IOs as standard mandatory conversions.
> This means that the logical type should not use the proto-specific Timestamp 
> and Duration but the Java 8 Instant and Duration.
> See discussion in the design document:
> [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]
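
As a rough illustration of why the Java 8 types fit here: both `java.time.Instant` and `java.time.Duration` carry a seconds part plus a nanosecond adjustment, so a (seconds, nanos) pair can be held without loss. The sketch below assumes the proto values are already available as plain `long seconds` and `int nanos`; `NanoTimeSketch` and its methods are hypothetical helpers, not Beam code.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch: holding (seconds, nanos) pairs in java.time types. */
final class NanoTimeSketch {

  /** Builds an Instant from epoch seconds plus a nanosecond adjustment. */
  static Instant toInstant(long seconds, int nanos) {
    return Instant.ofEpochSecond(seconds, nanos);
  }

  /** Builds a Duration from whole seconds plus a nanosecond adjustment. */
  static Duration toDuration(long seconds, int nanos) {
    return Duration.ofSeconds(seconds, nanos);
  }

  public static void main(String[] args) {
    Instant instant = toInstant(1L, 500);
    // The nanosecond part survives in the Instant ...
    System.out.println(instant.getNano());
    // ... but is dropped when going through a millisecond representation.
    System.out.println(instant.toEpochMilli());
    System.out.println(toDuration(2L, 7).toNanos());
  }
}
```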





[jira] [Work started] (BEAM-9037) Instant and duration as logical type

2019-12-31 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-9037 started by Alex Van Boxel.

> Instant and duration as logical type 
> -
>
> Key: BEAM-9037
> URL: https://issues.apache.org/jira/browse/BEAM-9037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The proto schema includes Timestamp and Duration with nanosecond precision. 
> The logical types should be promoted to the core logical types, so they can 
> be handled by the various IOs as standard mandatory conversions.
> This means that the logical type should not use the proto-specific Timestamp 
> and Duration but the Java 8 Instant and Duration.
> See discussion in the design document:
> [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]





[jira] [Updated] (BEAM-9037) Instant and duration as logical type

2019-12-31 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9037:
-
Summary: Instant and duration as logical type   (was: Promote proto logical 
type and duration to the core logical types)

> Instant and duration as logical type 
> -
>
> Key: BEAM-9037
> URL: https://issues.apache.org/jira/browse/BEAM-9037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>
> The proto schema includes Timestamp and Duration with nanosecond precision. 
> The logical types should be promoted to the core logical types, so they can 
> be handled by the various IOs as standard mandatory conversions.
> This means that the logical type should not use the proto-specific Timestamp 
> and Duration but the Java 8 Instant and Duration.
> See discussion in the design document:
> [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]





[jira] [Updated] (BEAM-9037) Promote proto logical type and duration to the core logical types

2019-12-29 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9037:
-
Description: 
The proto schema includes Timestamp and Duration with nanosecond precision. The 
logical types should be promoted to the core logical types, so they can be 
handled by the various IOs as standard mandatory conversions.

This means that the logical type should not use the proto-specific Timestamp 
and Duration but the Java 8 Instant and Duration.

See discussion in the design document:

[https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]

  was:
The proto schema includes Timestamp and Duration with nanosecond precision. The 
logical types should be promoted to the core logical types, so they can be 
handled by the various IOs as standard mandatory conversions.

See discussion in the design document:

[https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]


> Promote proto logical type and duration to the core logical types
> -
>
> Key: BEAM-9037
> URL: https://issues.apache.org/jira/browse/BEAM-9037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>
> The proto schema includes Timestamp and Duration with nanosecond precision. 
> The logical types should be promoted to the core logical types, so they can 
> be handled by the various IOs as standard mandatory conversions.
> This means that the logical type should not use the proto-specific Timestamp 
> and Duration but the Java 8 Instant and Duration.
> See discussion in the design document:
> [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]





[jira] [Created] (BEAM-9037) Promote proto logical type and duration to the core logical types

2019-12-29 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9037:


 Summary: Promote proto logical type and duration to the core 
logical types
 Key: BEAM-9037
 URL: https://issues.apache.org/jira/browse/BEAM-9037
 Project: Beam
  Issue Type: Task
  Components: sdk-java-core
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel


The proto schema includes Timestamp and Duration with nanosecond precision. The 
logical types should be promoted to the core logical types, so they can be 
handled by the various IOs as standard mandatory conversions.

See discussion in the design document:

[https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]





[jira] [Resolved] (BEAM-9035) Typed options for Row Schema and Fields

2019-12-28 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-9035.
--
Fix Version/s: 2.19.0
   Resolution: Fixed

Ready for review

> Typed options for Row Schema and Fields
> ---
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This is the first issue of a multipart commit: this ticket implements the 
> basic infrastructure of options on row and field.
> Full explanation:
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to metadata, options would be added to 
> fields, logical types and rows. Schema converters can store the 
> options/annotations/decorators that were in the original schema as options; 
> this context can be used in the rest of the pipeline for specific 
> transformations or to augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of typed values 
> are:
>  * Strongly typed options would give Logical Types a *portable way* to have 
> structured information that could be shared across different languages.
>  * This could keep the type intact when mapping from formats that have 
> strongly typed options (example: Protobuf).
> This is part of a multi ticket implementation. The following tickets are 
> related:
>  # Typed options for Row Schema and Fields
>  # Convert Proto Options to Beam Schema options
>  # Convert Avro extra information for Beam string options
>  # Replace meta data with Logical Type options
>  # Extract meta data in Calcite SQL to Beam options
>  # Extract meta data in Zeta SQL to Beam options
>  # Add java example of using option in a transform 
> This feature is discussed with Reuven Lax, Brian Hulette
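
To make the "key/typed value" idea concrete, here is a minimal plain-Java sketch. It is not the Beam implementation: `TypedOptionsSketch`, `set` and `get` are hypothetical names, and the option keys are just the examples mentioned in the description. Each value is stored together with its declared type, and a lookup has to name that same type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch (not Beam API): options as key -> (declared type, value). */
final class TypedOptionsSketch {

  /** A value paired with the type it was declared with. */
  private static final class Entry {
    final Class<?> type;
    final Object value;

    Entry(Class<?> type, Object value) {
      this.type = type;
      this.value = value;
    }
  }

  private final Map<String, Entry> entries = new LinkedHashMap<>();

  /** Stores a value under the key, remembering its declared type. */
  <T> TypedOptionsSketch set(String key, Class<T> type, T value) {
    entries.put(key, new Entry(type, value));
    return this;
  }

  /** Reads a value back; fails if the key is missing or the type does not match. */
  <T> T get(String key, Class<T> type) {
    Entry entry = entries.get(key);
    if (entry == null) {
      throw new IllegalArgumentException("No option named " + key);
    }
    if (!entry.type.equals(type)) {
      throw new IllegalArgumentException(
          "Option " + key + " is declared as " + entry.type.getSimpleName());
    }
    return type.cast(entry.value);
  }

  boolean has(String key) {
    return entries.containsKey(key);
  }

  public static void main(String[] args) {
    TypedOptionsSketch fieldOptions = new TypedOptionsSketch()
        .set("primary_key", Boolean.class, true)
        .set("cluster_index", Integer.class, 2);
    System.out.println(fieldOptions.get("cluster_index", Integer.class));
  }
}
```

Because the declared type travels with the value, a converter from a strongly typed source such as Protobuf would not have to flatten everything to strings.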





[jira] [Work started] (BEAM-9035) Typed options for Row Schema and Fields

2019-12-27 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-9035 started by Alex Van Boxel.

> Typed options for Row Schema and Fields
> ---
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is the first issue of a multipart commit: this ticket implements the 
> basic infrastructure of options on row and field.
> Full explanation:
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to metadata, options would be added to 
> fields, logical types and rows. Schema converters can store the 
> options/annotations/decorators that were in the original schema as options; 
> this context can be used in the rest of the pipeline for specific 
> transformations or to augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of typed values 
> are:
>  * Strongly typed options would give Logical Types a *portable way* to have 
> structured information that could be shared across different languages.
>  * This could keep the type intact when mapping from formats that have 
> strongly typed options (example: Protobuf).
> This is part of a multi ticket implementation. The following tickets are 
> related:
>  # Typed options for Row Schema and Fields
>  # Convert Proto Options to Beam Schema options
>  # Convert Avro extra information for Beam string options
>  # Replace meta data with Logical Type options
>  # Extract meta data in Calcite SQL to Beam options
>  # Extract meta data in Zeta SQL to Beam options
>  # Add java example of using option in a transform 
> This feature is discussed with Reuven Lax, Brian Hulette





[jira] [Updated] (BEAM-9035) Typed options for Row Schema and Fields

2019-12-27 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9035:
-
Status: Open  (was: Triage Needed)

> Typed options for Row Schema and Fields
> ---
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is the first issue of a multipart commit: this ticket implements the 
> basic infrastructure of options on row and field.
> Full explanation:
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to metadata, options would be added to 
> fields, logical types and rows. Schema converters can store the 
> options/annotations/decorators that were in the original schema as options; 
> this context can be used in the rest of the pipeline for specific 
> transformations or to augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of typed values 
> are:
>  * Strongly typed options would give Logical Types a *portable way* to have 
> structured information that could be shared across different languages.
>  * This could keep the type intact when mapping from formats that have 
> strongly typed options (example: Protobuf).
> This is part of a multi ticket implementation. The following tickets are 
> related:
>  # Typed options for Row Schema and Fields
>  # Convert Proto Options to Beam Schema options
>  # Convert Avro extra information for Beam string options
>  # Replace meta data with Logical Type options
>  # Extract meta data in Calcite SQL to Beam options
>  # Extract meta data in Zeta SQL to Beam options
>  # Add java example of using option in a transform 
> This feature is discussed with Reuven Lax, Brian Hulette





[jira] [Updated] (BEAM-9035) Typed options for Row Schema and Fields

2019-12-27 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9035:
-
Description: 
This is the first issue of a multipart commit: this ticket implements the basic 
infrastructure of options on row and field.

Full explanation:

Introduce the concept of Options in Beam Schemas to add extra context to 
fields and schemas. In contrast to metadata, options would be added to fields, 
logical types and rows. Schema converters can store the 
options/annotations/decorators that were in the original schema as options; 
this context can be used in the rest of the pipeline for specific 
transformations or to augment the end schema in the target output.

Examples of options are:
 * informational: like the source of the data, ...
 * drive decisions further in the pipeline: flatten a row into another, rename 
a field, ...
 * influence something in the output: like cluster index, primary key, ...
 * logical type information

An option is a key/typed value combination. The advantages of typed values 
are:
 * Strongly typed options would give Logical Types a *portable way* to have 
structured information that could be shared across different languages.
 * This could keep the type intact when mapping from formats that have 
strongly typed options (example: Protobuf).

This is part of a multi ticket implementation. The following tickets are 
related:
 # Typed options for Row Schema and Fields
 # Convert Proto Options to Beam Schema options
 # Convert Avro extra information for Beam string options
 # Replace meta data with Logical Type options
 # Extract meta data in Calcite SQL to Beam options
 # Extract meta data in Zeta SQL to Beam options
 # Add java example of using option in a transform 

This feature is discussed with Reuven Lax, Brian Hulette

  was:
This is the first issue of a multipart commit: 

 

Introduce the concept of Options in Beam Schemas to add extra context to 
fields and schemas. In contrast to metadata, options would be added to fields, 
logical types and rows. Schema converters can store the 
options/annotations/decorators that were in the original schema as options; 
this context can be used in the rest of the pipeline for specific 
transformations or to augment the end schema in the target output.

Examples of options are:
 * informational: like the source of the data, ...
 * drive decisions further in the pipeline: flatten a row into another, rename 
a field, ...
 * influence something in the output: like cluster index, primary key, ...
 * logical type information

An option is a key/typed value combination. The advantages of typed values 
are:
 * Strongly typed options would give Logical Types a *portable way* to have 
structured information that could be shared across different languages.
 * This could keep the type intact when mapping from formats that have 
strongly typed options (example: Protobuf).

This is part of a multi ticket implementation. The following tickets are 
related:
 # Typed options for Row Schema and Fields
 # Convert Proto Options to Beam Schema options
 # Convert Avro extra information for Beam string options
 # Replace meta data with Logical Type options
 # Extract meta data in Calcite SQL to Beam options
 # Extract meta data in Zeta SQL to Beam options
 # Add java example of using option in a transform 

This feature is discussed with Reuven Lax, Brian Hulette


> Typed options for Row Schema and Fields
> ---
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>
> This is the first issue of a multipart commit: this ticket implements the 
> basic infrastructure of options on row and field.
> Full explanation:
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to metadata, options would be added to 
> fields, logical types and rows. Schema converters can store the 
> options/annotations/decorators that were in the original schema as options; 
> this context can be used in the rest of the pipeline for specific 
> transformations or to augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of typed values 
> are:
>  * Having strongly typed options would give a *portable way of Logical Types* 
> to have structured information that could be s

[jira] [Updated] (BEAM-9035) Typed options for Row Schema and Fields

2019-12-27 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9035:
-
Description: 
This is the first issue of a multipart commit: 

 

Introduce the concept of Options in Beam Schemas to add extra context to 
fields and schemas. In contrast to metadata, options would be added to fields, 
logical types and rows. Schema converters can store the 
options/annotations/decorators that were in the original schema as options; 
this context can be used in the rest of the pipeline for specific 
transformations or to augment the end schema in the target output.

Examples of options are:
 * informational: like the source of the data, ...
 * drive decisions further in the pipeline: flatten a row into another, rename 
a field, ...
 * influence something in the output: like cluster index, primary key, ...
 * logical type information

An option is a key/typed value combination. The advantages of typed values 
are:
 * Strongly typed options would give Logical Types a *portable way* to have 
structured information that could be shared across different languages.
 * This could keep the type intact when mapping from formats that have 
strongly typed options (example: Protobuf).

This is part of a multi ticket implementation. The following tickets are 
related:
 # Typed options for Row Schema and Fields
 # Convert Proto Options to Beam Schema options
 # Convert Avro extra information for Beam string options
 # Replace meta data with Logical Type options
 # Extract meta data in Calcite SQL to Beam options
 # Extract meta data in Zeta SQL to Beam options
 # Add java example of using option in a transform 

This feature is discussed with Reuven Lax, Brian Hulette

  was:
Introduce the concept of Options in Beam Schemas to add extra context to 
fields and schemas. In contrast to metadata, options would be added to fields, 
logical types and rows. Schema converters can store the 
options/annotations/decorators that were in the original schema as options; 
this context can be used in the rest of the pipeline for specific 
transformations or to augment the end schema in the target output.

Examples of options are:
 * informational: like the source of the data, ...
 * drive decisions further in the pipeline: flatten a row into another, rename 
a field, ...
 * influence something in the output: like cluster index, primary key, ...
 * logical type information

An option is a key/typed value combination. The advantages of typed values 
are:
 * Strongly typed options would give Logical Types a *portable way* to have 
structured information that could be shared across different languages.
 * This could keep the type intact when mapping from formats that have 
strongly typed options (example: Protobuf).

This is part of a multi ticket implementation. The following tickets are 
related:
 # Typed options for Row Schema and Fields
 # Convert Proto Options to Beam Schema options
 # Convert Avro extra information for Beam string options
 # Replace meta data with Logical Type options
 # Extract meta data in Calcite SQL to Beam options
 # Extract meta data in Zeta SQL to Beam options
 # Add java example of using option in a transform 

This feature is discussed with Reuven Lax, Brian Hulette


> Typed options for Row Schema and Fields
> ---
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>
> This is the first issue of a multipart commit: 
>  
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to metadata, options would be added to 
> fields, logical types and rows. Schema converters can store the 
> options/annotations/decorators that were in the original schema as options; 
> this context can be used in the rest of the pipeline for specific 
> transformations or to augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of typed values 
> are:
>  * Strongly typed options would give Logical Types a *portable way* to have 
> structured information that could be shared across different languages.
>  * This could keep the type intact when mapping from formats that have 
> strongly typed options (example: Protobuf).
> This is part of a multi ticket implementation. The following tickets are 
> related:

[jira] [Updated] (BEAM-9035) Typed options for Row Schema and Fields

2019-12-27 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-9035:
-
Description: 
Introduce the concept of Options in Beam Schemas to add extra context to 
fields and schemas. In contrast to metadata, options would be added to fields, 
logical types and rows. Schema converters can store the 
options/annotations/decorators that were in the original schema as options; 
this context can be used in the rest of the pipeline for specific 
transformations or to augment the end schema in the target output.

Examples of options are:
 * informational: like the source of the data, ...
 * drive decisions further in the pipeline: flatten a row into another, rename 
a field, ...
 * influence something in the output: like cluster index, primary key, ...
 * logical type information

An option is a key/typed value combination. The advantages of typed values 
are:
 * Strongly typed options would give Logical Types a *portable way* to have 
structured information that could be shared across different languages.
 * This could keep the type intact when mapping from formats that have 
strongly typed options (example: Protobuf).

This is part of a multi ticket implementation. The following tickets are 
related:
 # Typed options for Row Schema and Fields
 # Convert Proto Options to Beam Schema options
 # Convert Avro extra information for Beam string options
 # Replace meta data with Logical Type options
 # Extract meta data in Calcite SQL to Beam options
 # Extract meta data in Zeta SQL to Beam options
 # Add java example of using option in a transform 

This feature is discussed with Reuven Lax, Brian Hulette

  was:
Introduce the concept of Options in Beam Schemas to add extra context to 
fields and schemas. In contrast to metadata, options would be added to fields, 
logical types and rows. Schema converters can store the 
options/annotations/decorators that were in the original schema as options; 
this context can be used in the rest of the pipeline for specific 
transformations or to augment the end schema in the target output.

Examples of options are:
 * informational: like the source of the data, ...
 * drive decisions further in the pipeline: flatten a row into another, rename 
a field, ...
 * influence something in the output: like cluster index, primary key, ...

An option is a key/typed value combination. The advantages of typed values 
are:
 * Strongly typed options would give Logical Types a portable way to have 
structured information that could be shared across different languages.
 * This could keep the type intact when mapping from formats that have 
strongly typed options (example: Protobuf).

This is part of a multi ticket implementation. The following tickets are 
related:
 # Typed options for Row Schema and Fields
 # Convert Proto Options to Beam Schema options
 # Convert Avro extra information for Beam string options
 # Replace meta data with Logical Type options
 # Extract meta data in Calcite SQL to Beam options
 # Extract meta data in Zeta SQL to Beam options

This feature is discussed with Reuven Lax, Brian Hulette


> Typed options for Row Schema and Fields
> ---
>
> Key: BEAM-9035
> URL: https://issues.apache.org/jira/browse/BEAM-9035
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>
> Introduce the concept of Options in Beam Schemas to add extra context to 
> fields and schemas. In contrast to metadata, options would be added to 
> fields, logical types and rows. Schema converters can store the 
> options/annotations/decorators that were in the original schema as options; 
> this context can be used in the rest of the pipeline for specific 
> transformations or to augment the end schema in the target output.
> Examples of options are:
>  * informational: like the source of the data, ...
>  * drive decisions further in the pipeline: flatten a row into another, 
> rename a field, ...
>  * influence something in the output: like cluster index, primary key, ...
>  * logical type information
> An option is a key/typed value combination. The advantages of typed values 
> are:
>  * Strongly typed options would give Logical Types a *portable way* to have 
> structured information that could be shared across different languages.
>  * This could keep the type intact when mapping from formats that have 
> strongly typed options (example: Protobuf).
> This is part of a multi ticket implementation. The following tickets are 
> related:
>  # Typed options for Row Schema and Fields
>  # Convert Proto Options to Beam Schema options
>  # Convert Avro extra information for Beam string options
>  # Replace meta data with Logi

[jira] [Created] (BEAM-9035) Typed options for Row Schema and Fields

2019-12-27 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-9035:


 Summary: Typed options for Row Schema and Fields
 Key: BEAM-9035
 URL: https://issues.apache.org/jira/browse/BEAM-9035
 Project: Beam
  Issue Type: Task
  Components: sdk-java-core
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel


Introduce the concept of Options in Beam Schemas to add extra context to 
fields and schemas. In contrast to metadata, options would be added to fields, 
logical types and rows. Schema converters can store the 
options/annotations/decorators that were in the original schema as options; 
this context can be used in the rest of the pipeline for specific 
transformations or to augment the end schema in the target output.

Examples of options are:
 * informational: like the source of the data, ...
 * drive decisions further in the pipeline: flatten a row into another, rename 
a field, ...
 * influence something in the output: like cluster index, primary key, ...

An option is a key/typed value combination. The advantages of typed values 
are:
 * Strongly typed options would give Logical Types a portable way to have 
structured information that could be shared across different languages.
 * This could keep the type intact when mapping from formats that have 
strongly typed options (example: Protobuf).

This is part of a multi-ticket implementation. The following tickets are 
related:
 # Typed options for Row Schema and Fields
 # Convert Proto Options to Beam Schema options
 # Convert Avro extra information for Beam string options
 # Replace meta data with Logical Type options
 # Extract meta data in Calcite SQL to Beam options
 # Extract meta data in Zeta SQL to Beam options

This feature was discussed with Reuven Lax and Brian Hulette.





[jira] [Commented] (BEAM-8174) BigQueryIO clustering documentation is incorrect and lacking

2019-11-19 Thread Alex Van Boxel (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978167#comment-16978167
 ] 

Alex Van Boxel commented on BEAM-8174:
--

I've removed the fix version on this issue.

> BigQueryIO clustering documentation is incorrect and lacking
> 
>
> Key: BEAM-8174
> URL: https://issues.apache.org/jira/browse/BEAM-8174
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.15.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Trivial
>  Labels: documentation
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> I noticed that the Javadoc of the clustering feature in BigQueryIO is largely 
> a copy/paste from the timestamp method. This needs to be corrected.
> The Clustering option should also be added to the BigQueryIO page.





[jira] [Updated] (BEAM-8174) BigQueryIO clustering documentation is incorrect and lacking

2019-11-19 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-8174:
-
Fix Version/s: (was: 2.17.0)

> BigQueryIO clustering documentation is incorrect and lacking
> 
>
> Key: BEAM-8174
> URL: https://issues.apache.org/jira/browse/BEAM-8174
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.15.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Trivial
>  Labels: documentation
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> I noticed that the Javadoc of the clustering feature in BigQueryIO is largely 
> a copy/paste from the timestamp method. This needs to be corrected.
> The Clustering option should also be added to the BigQueryIO page.





[jira] [Updated] (BEAM-7274) Protobuf Beam Schema support

2019-11-10 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-7274:
-
Status: Open  (was: Triage Needed)

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.17.0
>
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Updated] (BEAM-7274) Protobuf Beam Schema support

2019-11-10 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-7274:
-
Fix Version/s: (was: 2.17.0)

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Work started] (BEAM-7274) Protobuf Beam Schema support

2019-11-10 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-7274 started by Alex Van Boxel.

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.17.0
>
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Reopened] (BEAM-7274) Protobuf Beam Schema support

2019-11-10 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel reopened BEAM-7274:
--

Reopening, as new comments on the PR still need to be resolved.

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.17.0
>
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Created] (BEAM-8218) Implement Apache PulsarIO

2019-09-11 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-8218:


 Summary: Implement Apache PulsarIO
 Key: BEAM-8218
 URL: https://issues.apache.org/jira/browse/BEAM-8218
 Project: Beam
  Issue Type: Task
  Components: io-ideas
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel


Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
could be beneficial.

[https://pulsar.apache.org/|https://pulsar.apache.org/en/]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-7274) Protobuf Beam Schema support

2019-09-11 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-7274:
-
Fix Version/s: (was: 2.16.0)
   2.17.0

Moved to 2.17.0

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.17.0
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Updated] (BEAM-5967) ProtoCoder doesn't support DynamicMessage

2019-09-11 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-5967:
-
Fix Version/s: (was: 2.16.0)
   2.17.0

Moved to 2.17.0

> ProtoCoder doesn't support DynamicMessage
> -
>
> Key: BEAM-5967
> URL: https://issues.apache.org/jira/browse/BEAM-5967
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The ProtoCoder makes some assumptions about static messages being 
> available. DynamicMessage doesn't satisfy some of them, mainly because its 
> proto schema is defined at runtime and not at compile time.
> Does it make sense to make a special coder for DynamicMessage, or to build 
> it into the normal ProtoCoder?
> Here is an example of an assumption being made in the current coder:
> {code:java}
> try {
>   @SuppressWarnings("unchecked")
>   T protoMessageInstance =
>       (T) protoMessageClass.getMethod("getDefaultInstance").invoke(null);
>   @SuppressWarnings("unchecked")
>   Parser<T> tParser = (Parser<T>) protoMessageInstance.getParserForType();
>   memoizedParser = tParser;
> } catch (IllegalAccessException | InvocationTargetException
>     | NoSuchMethodException e) {
>   throw new IllegalArgumentException(e);
> }
> {code}
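
The reflective lookup quoted above can be reproduced without protobuf on the classpath. The sketch below uses two stand-in classes (`GeneratedLikeMessage` and `DynamicLikeMessage`, both hypothetical) to show why a no-argument static `getDefaultInstance()` lookup works for compile-time classes and fails for a type whose shape is only known at runtime. It illustrates the assumption only; it is not the ProtoCoder code.

```java
import java.lang.reflect.InvocationTargetException;

/** Stand-in classes only; no protobuf dependency is used in this sketch. */
public class DefaultInstanceLookupSketch {

  /** Mimics a generated message class, which exposes a static getDefaultInstance(). */
  public static class GeneratedLikeMessage {
    private static final GeneratedLikeMessage DEFAULT = new GeneratedLikeMessage();

    public static GeneratedLikeMessage getDefaultInstance() {
      return DEFAULT;
    }
  }

  /** Mimics a runtime-described message: its shape is only known per instance. */
  public static class DynamicLikeMessage {
    final String descriptorName;

    public DynamicLikeMessage(String descriptorName) {
      this.descriptorName = descriptorName;
    }
  }

  /** Same reflective lookup as in the quoted snippet, reduced to its core. */
  static Object lookupDefaultInstance(Class<?> messageClass) {
    try {
      return messageClass.getMethod("getDefaultInstance").invoke(null);
    } catch (IllegalAccessException | InvocationTargetException | NoSuchMethodException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    // Works: the class has a public static no-argument factory.
    System.out.println(lookupDefaultInstance(GeneratedLikeMessage.class) != null);

    // Fails: there is no such method on the runtime-described stand-in.
    try {
      lookupDefaultInstance(DynamicLikeMessage.class);
    } catch (IllegalArgumentException e) {
      System.out.println("lookup failed: " + e.getCause().getClass().getSimpleName());
    }
  }
}
```

A coder for runtime-defined messages therefore needs some other way to obtain a parser, for example from a descriptor that is passed in explicitly.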





[jira] [Updated] (BEAM-8174) BigQueryIO clustering documentation is incorrect and lacking

2019-09-07 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-8174:
-
Status: Open  (was: Triage Needed)

> BigQueryIO clustering documentation is incorrect and lacking
> 
>
> Key: BEAM-8174
> URL: https://issues.apache.org/jira/browse/BEAM-8174
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.15.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Trivial
>  Labels: documentation
> Fix For: 2.17.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> I noticed that the Javadoc of the clustering feature in BigQueryIO is largely 
> a copy/paste from the timestamp method. This needs to be corrected.
> The Clustering option should also be added to the BigQueryIO page.





[jira] [Work started] (BEAM-8174) BigQueryIO clustering documentation is incorrect and lacking

2019-09-07 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-8174 started by Alex Van Boxel.

> BigQueryIO clustering documentation is incorrect and lacking
> 
>
> Key: BEAM-8174
> URL: https://issues.apache.org/jira/browse/BEAM-8174
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.15.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Trivial
>  Labels: documentation
> Fix For: 2.17.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> I noticed that the Javadoc of the clustering feature in BigQueryIO is largely 
> a copy/paste from the timestamp method. This needs to be corrected.
> The Clustering option should also be added to the BigQueryIO page.





[jira] [Created] (BEAM-8174) BigQueryIO clustering documentation is incorrect and lacking

2019-09-07 Thread Alex Van Boxel (Jira)
Alex Van Boxel created BEAM-8174:


 Summary: BigQueryIO clustering documentation is incorrect and 
lacking
 Key: BEAM-8174
 URL: https://issues.apache.org/jira/browse/BEAM-8174
 Project: Beam
  Issue Type: Bug
  Components: io-java-gcp
Affects Versions: 2.15.0
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel
 Fix For: 2.17.0


I noticed that the Javadoc of the clustering feature in BigQueryIO is largely 
a copy/paste from the timestamp method. This needs to be corrected.

The Clustering option should also be added to the BigQueryIO page.





[jira] [Resolved] (BEAM-7274) Protobuf Beam Schema support

2019-09-04 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-7274.
--
Fix Version/s: 2.16.0
   Resolution: Fixed

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Commented] (BEAM-7274) Protobuf Beam Schema support

2019-08-30 Thread Alex Van Boxel (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920033#comment-16920033
 ] 

Alex Van Boxel commented on BEAM-7274:
--

PR ready for review

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Resolved] (BEAM-5967) ProtoCoder doesn't support DynamicMessage

2019-08-30 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-5967.
--
Fix Version/s: 2.16.0
   Resolution: Fixed

Object equality is now handled by ProtoDomain. Upgradability is tested from 
2.14.0 to 2.16.0-SNAPSHOT. Waiting for reviewers.

> ProtoCoder doesn't support DynamicMessage
> -
>
> Key: BEAM-5967
> URL: https://issues.apache.org/jira/browse/BEAM-5967
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The ProtoCoder makes some assumptions about static messages being 
> available. DynamicMessage doesn't satisfy some of them, mainly because its 
> proto schema is defined at runtime and not at compile time.
> Does it make sense to make a special coder for DynamicMessage, or to build 
> it into the normal ProtoCoder?
> Here is an example of an assumption being made in the current coder:
> {code:java}
> try {
>   @SuppressWarnings("unchecked")
>   T protoMessageInstance =
>       (T) protoMessageClass.getMethod("getDefaultInstance").invoke(null);
>   @SuppressWarnings("unchecked")
>   Parser<T> tParser = (Parser<T>) protoMessageInstance.getParserForType();
>   memoizedParser = tParser;
> } catch (IllegalAccessException | InvocationTargetException
>     | NoSuchMethodException e) {
>   throw new IllegalArgumentException(e);
> }
> {code}





[jira] [Closed] (BEAM-7312) SchemaProvider can't be used with dynamic types

2019-08-30 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-7312.

Fix Version/s: 2.14.0
   Resolution: Won't Fix

This is not the right mechanism for handling dynamic types. Closing with Won't 
Fix.

> SchemaProvider can't be used with dynamic types
> ---
>
> Key: BEAM-7312
> URL: https://issues.apache.org/jira/browse/BEAM-7312
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.14.0
>
>
> The Javadoc comment of SchemaProvider hints at getting schemas from an 
> external system. But as the provider only has access to the type, this is 
> in general impossible:
> Say you have two dynamic types, for example Avro; as a Java type both are 
> GenericRecord. Using the current interface it's impossible to tell the two 
> dynamic types apart.
> To support getting information from an external system, I propose extending 
> the Provider interface by adding an extra parameter. It would be a string 
> with a URN.
> The URN could indicate, for example:
>  * Pub/Sub subscription/topic
>  * Kafka topic
>  * whatever... 
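
A rough sketch of the proposed shape, in plain Java. Every name here (`UrnAwareSchemaProvider`, `InMemoryProvider`, the `urn:example:...` strings and the schema strings) is hypothetical and chosen only for illustration. The point is that two lookups with the same Java type can return different schemas once a URN is part of the request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of the proposal; none of these types are Beam's. */
public class UrnSchemaProviderSketch {

  /** Proposed shape: the lookup receives a URN in addition to the Java type. */
  interface UrnAwareSchemaProvider {
    Optional<String> schemaFor(Class<?> javaType, String urn);
  }

  /** Toy registry standing in for an external system. */
  static final class InMemoryProvider implements UrnAwareSchemaProvider {
    private final Map<String, String> schemasByUrn = new HashMap<>();

    InMemoryProvider register(String urn, String schema) {
      schemasByUrn.put(urn, schema);
      return this;
    }

    @Override
    public Optional<String> schemaFor(Class<?> javaType, String urn) {
      // The Java type alone is ambiguous for dynamic records; the URN disambiguates.
      return Optional.ofNullable(schemasByUrn.get(urn));
    }
  }

  public static void main(String[] args) {
    InMemoryProvider provider = new InMemoryProvider()
        .register("urn:example:kafka:orders", "order_id:INT64,total:DOUBLE")
        .register("urn:example:kafka:users", "user_id:INT64,name:STRING");

    // Both lookups use the same Java type (Map stands in for a generic record).
    System.out.println(provider.schemaFor(Map.class, "urn:example:kafka:orders").orElse("none"));
    System.out.println(provider.schemaFor(Map.class, "urn:example:kafka:users").orElse("none"));
  }
}
```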





[jira] [Resolved] (BEAM-7999) BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly

2019-08-27 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-7999.
--
Fix Version/s: 2.16.0
   Resolution: Fixed

PR merged

> BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
> ---
>
> Key: BEAM-7999
> URL: https://issues.apache.org/jira/browse/BEAM-7999
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.14.0, 2.15.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When using the new readTableRowsWithSchema to make a copy of a table (a 
> simple operation), parsing the timestamp in the table doesn't work because 
> it assumes a Double value. BigQuery outputs a string like 
> "2019-08-16 00:12:00.123456 UTC". This isn't handled.
> *Reproducible:*
> with this table
> {code:sql}
> INSERT `research.alex.in1` (row_id, f_int64, f_timestamp)
> VALUES
>   (1, 1, '2019-08-16 00:12:00 UTC'),
>   (2, 2, '2019-08-16 00:12:00.123 UTC'),
>   (3, 3, '2019-08-16 00:12:00.123456 UTC')
> {code}
> do a copy operation:
> {code:java}
> pipeline
>     .apply(
>         BigQueryIO.readTableRowsWithSchema()
>             .from("research:alex.in1")
>         // .withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ)
>     )
>     .apply(ParDo.of(new Inspect()))
>     .apply(
>         BigQueryIO.writeTableRows()
>             .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>             .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
>             .useBeamSchema()
>             .to("research:alex.out4"));
> {code}
>  





[jira] [Closed] (BEAM-7999) BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly

2019-08-27 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-7999.


> BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
> ---
>
> Key: BEAM-7999
> URL: https://issues.apache.org/jira/browse/BEAM-7999
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.14.0, 2.15.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When using the new readTableRowsWithSchema to make a copy of a table (a 
> simple operation), parsing the timestamp in the table doesn't work because 
> it assumes a Double value. BigQuery outputs a string like 
> "2019-08-16 00:12:00.123456 UTC". This isn't handled.
> *Reproducible:*
> with this table
> {code:sql}
> INSERT `research.alex.in1` (row_id, f_int64, f_timestamp)
> VALUES
>   (1, 1, '2019-08-16 00:12:00 UTC'),
>   (2, 2, '2019-08-16 00:12:00.123 UTC'),
>   (3, 3, '2019-08-16 00:12:00.123456 UTC')
> {code}
> do a copy operation:
> {code:java}
> pipeline
>     .apply(
>         BigQueryIO.readTableRowsWithSchema()
>             .from("research:alex.in1")
>         // .withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ)
>     )
>     .apply(ParDo.of(new Inspect()))
>     .apply(
>         BigQueryIO.writeTableRows()
>             .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>             .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
>             .useBeamSchema()
>             .to("research:alex.out4"));
> {code}
>  





[jira] [Closed] (BEAM-7426) FieldSpecifierNotationLexer should support underscore as field character

2019-08-23 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-7426.


Part of 2.14 release

> FieldSpecifierNotationLexer should support underscore as field character
> 
>
> Key: BEAM-7426
> URL: https://issues.apache.org/jira/browse/BEAM-7426
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Underscore is a commonly used word delimiter in field names, but the current 
> FieldSpecifierNotationLexer only supports alphanumeric characters in field 
> names. 
> The upcoming Protobuf schema support will emit underscores in field 
> names, so field names should support underscores.





[jira] [Resolved] (BEAM-7426) FieldSpecifierNotationLexer should support underscore as field character

2019-08-23 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-7426.
--
Fix Version/s: 2.14.0
   Resolution: Fixed

> FieldSpecifierNotationLexer should support underscore as field character
> 
>
> Key: BEAM-7426
> URL: https://issues.apache.org/jira/browse/BEAM-7426
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Underscore is a commonly used word delimiter in field names, but the current 
> FieldSpecifierNotationLexer only supports alphanumeric characters in field 
> names. 
> The upcoming Protobuf schema support will emit underscores in field 
> names, so field names should support underscores.





[jira] [Commented] (BEAM-7274) Protobuf Beam Schema support

2019-08-22 Thread Alex Van Boxel (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913097#comment-16913097
 ] 

Alex Van Boxel commented on BEAM-7274:
--

Well, picking it up again as I'm back from holidays. Hopefully we can then get 
the pull request out in a reasonable time.

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.





[jira] [Work started] (BEAM-7999) BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly

2019-08-20 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-7999 started by Alex Van Boxel.

> BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
> ---
>
> Key: BEAM-7999
> URL: https://issues.apache.org/jira/browse/BEAM-7999
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.14.0, 2.15.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When using the new readTableRowsWithSchema to make a copy of a table (a 
> simple operation), parsing the timestamp in the table doesn't work because 
> it assumes a Double value. BigQuery outputs a string like 
> "2019-08-16 00:12:00.123456 UTC". This isn't handled.
> *Reproducible:*
> with this table
> {code:sql}
> INSERT `research.alex.in1` (row_id, f_int64, f_timestamp)
> VALUES
>   (1, 1, '2019-08-16 00:12:00 UTC'),
>   (2, 2, '2019-08-16 00:12:00.123 UTC'),
>   (3, 3, '2019-08-16 00:12:00.123456 UTC')
> {code}
> do a copy operation:
> {code:java}
> pipeline
>     .apply(
>         BigQueryIO.readTableRowsWithSchema()
>             .from("research:alex.in1")
>         // .withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ)
>     )
>     .apply(ParDo.of(new Inspect()))
>     .apply(
>         BigQueryIO.writeTableRows()
>             .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>             .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
>             .useBeamSchema()
>             .to("research:alex.out4"));
> {code}
>  





[jira] [Updated] (BEAM-7999) BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly

2019-08-17 Thread Alex Van Boxel (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-7999:
-
Issue Type: Bug  (was: Task)

> BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
> ---
>
> Key: BEAM-7999
> URL: https://issues.apache.org/jira/browse/BEAM-7999
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.14.0, 2.15.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>
> When using the new readTableRowsWithSchema to make a copy of a table (a 
> simple operation), parsing the timestamp in the table doesn't work because 
> it assumes a Double value. BigQuery outputs a string like 
> "2019-08-16 00:12:00.123456 UTC". This isn't handled.
> *Reproducible:*
> with this table
> {code:sql}
> INSERT `research.alex.in1` (row_id, f_int64, f_timestamp)
> VALUES
>   (1, 1, '2019-08-16 00:12:00 UTC'),
>   (2, 2, '2019-08-16 00:12:00.123 UTC'),
>   (3, 3, '2019-08-16 00:12:00.123456 UTC')
> {code}
> do a copy operation:
> {code:java}
> pipeline
>     .apply(
>         BigQueryIO.readTableRowsWithSchema()
>             .from("research:alex.in1")
>         // .withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ)
>     )
>     .apply(ParDo.of(new Inspect()))
>     .apply(
>         BigQueryIO.writeTableRows()
>             .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>             .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
>             .useBeamSchema()
>             .to("research:alex.out4"));
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (BEAM-7999) BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly

2019-08-17 Thread Alex Van Boxel (JIRA)
Alex Van Boxel created BEAM-7999:


 Summary: BigQueryIO.readTableRowsWithSchema() doesn't handle 
timestamp correctly
 Key: BEAM-7999
 URL: https://issues.apache.org/jira/browse/BEAM-7999
 Project: Beam
  Issue Type: Task
  Components: io-java-gcp
Affects Versions: 2.14.0, 2.15.0
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel


When using the new readTableRowsWithSchema to make a copy of a table (a simple 
operation), parsing the timestamp in the table doesn't work because it assumes 
a Double value. BigQuery outputs a string like "2019-08-16 00:12:00.123456 UTC". 
This isn't handled.

*Reproducible:*

with this table
{code:sql}
INSERT `research.alex.in1` (row_id, f_int64, f_timestamp)
VALUES
  (1, 1, '2019-08-16 00:12:00 UTC'),
  (2, 2, '2019-08-16 00:12:00.123 UTC'),
  (3, 3, '2019-08-16 00:12:00.123456 UTC')
{code}
do a copy operation:
{code:java}
pipeline
    .apply(
        BigQueryIO.readTableRowsWithSchema()
            .from("research:alex.in1")
        // .withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ)
    )
    .apply(ParDo.of(new Inspect()))
    .apply(
        BigQueryIO.writeTableRows()
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
            .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
            .useBeamSchema()
            .to("research:alex.out4"));
{code}
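
For illustration, the three timestamp strings from the table above can be parsed with `java.time` alone. This is a minimal sketch, assuming the value always ends in " UTC" and carries zero to six fractional digits. The class name `BigQueryTimestampSketch` is hypothetical, and this is not the fix that went into BigQueryIO.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

/** Hypothetical helper; parses strings such as "2019-08-16 00:12:00.123456 UTC". */
public class BigQueryTimestampSketch {

  // "yyyy-MM-dd HH:mm:ss", then an optional fraction of 1 to 6 digits, then " UTC".
  private static final DateTimeFormatter FORMAT =
      new DateTimeFormatterBuilder()
          .appendPattern("yyyy-MM-dd HH:mm:ss")
          .optionalStart()
          .appendFraction(ChronoField.NANO_OF_SECOND, 1, 6, true)
          .optionalEnd()
          .appendLiteral(" UTC")
          .toFormatter();

  /** Interprets the local date-time as UTC, as the suffix says. */
  static Instant parse(String value) {
    return LocalDateTime.parse(value, FORMAT).toInstant(ZoneOffset.UTC);
  }

  public static void main(String[] args) {
    String[] samples = {
      "2019-08-16 00:12:00 UTC",
      "2019-08-16 00:12:00.123 UTC",
      "2019-08-16 00:12:00.123456 UTC"
    };
    for (String sample : samples) {
      System.out.println(sample + " -> " + parse(sample));
    }
  }
}
```

A reader that receives either a numeric epoch value or this string form would need to try both representations rather than assume a Double.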
 





[jira] [Created] (BEAM-7518) Protobuf Schema: Introduce logical type for Timestamp, Duration and other

2019-06-08 Thread Alex Van Boxel (JIRA)
Alex Van Boxel created BEAM-7518:


 Summary: Protobuf Schema: Introduce logical type for Timestamp, 
Duration and other
 Key: BEAM-7518
 URL: https://issues.apache.org/jira/browse/BEAM-7518
 Project: Beam
  Issue Type: Task
  Components: sdk-java-core
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel


The Protobuf Schema provider has lossy conversions for some Proto types. 
Introduce Logical Types for:

Timestamp, Duration and Unsigned Int64
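
To show what a lossy conversion looks like, here is a small sketch using only the JDK. It assumes, purely for illustration, that the target type keeps milliseconds and that unsigned 64-bit values are stored in a signed `long`. The class name `LossyProtoConversionSketch` is hypothetical.

```java
import java.time.Instant;

/** Hypothetical illustration of precision and sign loss; not Beam or protobuf code. */
public class LossyProtoConversionSketch {

  /** Truncates a (seconds, nanos) pair to millisecond precision. */
  static Instant toMillisPrecision(long seconds, int nanos) {
    return Instant.ofEpochMilli(Instant.ofEpochSecond(seconds, nanos).toEpochMilli());
  }

  public static void main(String[] args) {
    // A timestamp with nanosecond precision loses its last six digits.
    Instant full = Instant.ofEpochSecond(1565914320L, 123456789);
    System.out.println(full + " -> " + toMillisPrecision(1565914320L, 123456789));

    // An unsigned 64-bit value above Long.MAX_VALUE shows up as a negative signed long.
    long raw = Long.parseUnsignedLong("18446744073709551615");
    System.out.println(raw + " vs " + Long.toUnsignedString(raw));
  }
}
```

A logical type can keep the original representation (seconds plus nanos, or the unsigned interpretation) so that nothing is lost on the way through the pipeline.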



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (BEAM-4455) Provide automatic schema registration for Protos

2019-05-25 Thread Alex Van Boxel (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated BEAM-4455:
-
Comment: was deleted

(was: Protobuf support is almost ready for PR. I'll be submitting it under this 
ticket and close BEAM-7274 as duplicate. That means assigning this ticket to me 
though. Agreed?)

> Provide automatic schema registration for Protos
> 
>
> Key: BEAM-4455
> URL: https://issues.apache.org/jira/browse/BEAM-4455
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>
> Need to make sure this is a compatible change





[jira] [Commented] (BEAM-4455) Provide automatic schema registration for Protos

2019-05-25 Thread Alex Van Boxel (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848189#comment-16848189
 ] 

Alex Van Boxel commented on BEAM-4455:
--

BEAM-7274 will be used for the implementation of the schema support, this 
ticket for the integration. Best to split the two concerns.

> Provide automatic schema registration for Protos
> 
>
> Key: BEAM-4455
> URL: https://issues.apache.org/jira/browse/BEAM-4455
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>
> Need to make sure this is a compatible change





[jira] [Created] (BEAM-7426) FieldSpecifierNotationLexer should support underscore as field character

2019-05-25 Thread Alex Van Boxel (JIRA)
Alex Van Boxel created BEAM-7426:


 Summary: FieldSpecifierNotationLexer should support underscore as 
field character
 Key: BEAM-7426
 URL: https://issues.apache.org/jira/browse/BEAM-7426
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel


Underscore is a commonly used word delimiter in field names, but the current 
FieldSpecifierNotationLexer only supports alphanumeric characters in field 
names. 

The upcoming Protobuf schema support will emit underscores in field names, 
so field names should support underscores.
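
The difference between the two rules can be sketched with `java.util.regex`. The real lexer is a generated grammar, so the two patterns below are assumptions that only mirror the described behaviour; `FieldNameSketch` and its members are hypothetical names.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration of the field-name rule before and after the change. */
public class FieldNameSketch {

  // Assumed "before" rule: letters and digits only.
  static final Pattern ALPHANUMERIC_ONLY = Pattern.compile("[A-Za-z0-9]+");

  // Assumed "after" rule: underscore is also a legal field-name character.
  static final Pattern WITH_UNDERSCORE = Pattern.compile("[A-Za-z0-9_]+");

  /** True if the whole candidate string matches the given rule. */
  static boolean isFieldName(Pattern rule, String candidate) {
    return rule.matcher(candidate).matches();
  }

  public static void main(String[] args) {
    String protoStyleName = "user_id";
    System.out.println(isFieldName(ALPHANUMERIC_ONLY, protoStyleName));
    System.out.println(isFieldName(WITH_UNDERSCORE, protoStyleName));
  }
}
```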




