[jira] [Work started] (BEAM-9275) BIP-1: Beam Schema Options
[ https://issues.apache.org/jira/browse/BEAM-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-9275 started by Alex Van Boxel. > BIP-1: Beam Schema Options > -- > > Key: BEAM-9275 > URL: https://issues.apache.org/jira/browse/BEAM-9275 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: P2 > Labels: stale-P2 > > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to the current Beam metadata that is present > in a FieldType, options would be added to fields, logical types and schemas. > The schema converters (e.g. Avro, Proto, …) can add > options/annotations/decorators that were in the original schema to the Beam > schema with these options. These options, which add contextual metadata, can > be used in the pipeline for specific transformations or to augment the end > schema in the target output. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9275) BIP-1: Beam Schema Options
[ https://issues.apache.org/jira/browse/BEAM-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9275: - Status: Open (was: Triage Needed) > BIP-1: Beam Schema Options > -- > > Key: BEAM-9275 > URL: https://issues.apache.org/jira/browse/BEAM-9275 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Priority: P2 > Labels: stale-P2 > > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to the current Beam metadata that is present > in a FieldType, options would be added to fields, logical types and schemas. > The schema converters (e.g. Avro, Proto, …) can add > options/annotations/decorators that were in the original schema to the Beam > schema with these options. These options, which add contextual metadata, can > be used in the pipeline for specific transformations or to augment the end > schema in the target output. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-9275) BIP-1: Beam Schema Options
[ https://issues.apache.org/jira/browse/BEAM-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel reassigned BEAM-9275: Assignee: Alex Van Boxel > BIP-1: Beam Schema Options > -- > > Key: BEAM-9275 > URL: https://issues.apache.org/jira/browse/BEAM-9275 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: P2 > Labels: stale-P2 > > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to the current Beam metadata that is present > in a FieldType, options would be added to fields, logical types and schemas. > The schema converters (e.g. Avro, Proto, …) can add > options/annotations/decorators that were in the original schema to the Beam > schema with these options. These options, which add contextual metadata, can > be used in the pipeline for specific transformations or to augment the end > schema in the target output. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9416) BIP-1: Convert avro metadata to Schema options
[ https://issues.apache.org/jira/browse/BEAM-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9416: - Fix Version/s: (was: 2.21.0) 2.22.0 > BIP-1: Convert avro metadata to Schema options > -- > > Key: BEAM-9416 > URL: https://issues.apache.org/jira/browse/BEAM-9416 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.22.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Avro has some metadata that can be added to the normal type information. It > is based on JSON typing, so the conversion will be best effort (probably we > can get int, string and float out of it). -- This message was sent by Atlassian Jira (v8.3.4#803005)
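Editor's sketch for the "best effort" conversion described in BEAM-9416 above. It is not the Beam converter: it only assumes that the extra Avro properties have already been read into a Map of names to JSON-derived Java objects, and it shows how int, float and string values could be recognised while everything else is skipped. The class name AvroPropsSketch, the method inferOptionTypes and the InferredType labels are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: best-effort typing of JSON-style properties. Not Beam's converter. */
final class AvroPropsSketch {

  /** Hypothetical type labels used by this sketch. */
  enum InferredType { INT, FLOAT, STRING }

  /** Returns a type for every property whose JSON value is an integer, a float or a string. */
  static Map<String, InferredType> inferOptionTypes(Map<String, Object> props) {
    Map<String, InferredType> inferred = new LinkedHashMap<>();
    for (Map.Entry<String, Object> entry : props.entrySet()) {
      Object value = entry.getValue();
      if (value instanceof Integer || value instanceof Long) {
        inferred.put(entry.getKey(), InferredType.INT);
      } else if (value instanceof Float || value instanceof Double) {
        inferred.put(entry.getKey(), InferredType.FLOAT);
      } else if (value instanceof String) {
        inferred.put(entry.getKey(), InferredType.STRING);
      }
      // Anything else (nested objects, arrays, booleans, null) is skipped: best effort.
    }
    return inferred;
  }
}
```

Skipping unknown shapes rather than failing is one reading of "best effort"; a real converter might instead keep such values as strings.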
[jira] [Work started] (BEAM-9704) BIP-1: Deprecate and remove FieldType metadata
[ https://issues.apache.org/jira/browse/BEAM-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-9704 started by Alex Van Boxel. > BIP-1: Deprecate and remove FieldType metadata > -- > > Key: BEAM-9704 > URL: https://issues.apache.org/jira/browse/BEAM-9704 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Affects Versions: 2.21.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.23.0 > > > Deprecate and remove getMetadata on the FieldType. > * Add deprecation notice on the getMetadata field in version 2.21.0 > * Remove the getMetadata field in 2.23.0 > All usage of metadata should be replaced by 2.23.0 and use the portable beam > schema options. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9704) BIP-1: Deprecate and remove FieldType metadata
Alex Van Boxel created BEAM-9704: Summary: BIP-1: Deprecate and remove FieldType metadata Key: BEAM-9704 URL: https://issues.apache.org/jira/browse/BEAM-9704 Project: Beam Issue Type: Sub-task Components: sdk-java-core Affects Versions: 2.21.0 Reporter: Alex Van Boxel Assignee: Alex Van Boxel Fix For: 2.23.0 Deprecate and remove getMetadata on the FieldType. * Add deprecation notice on the getMetadata field in version 2.21.0 * Remove the getMetadata field in 2.23.0 All usage of metadata should be replaced by 2.23.0 and use the portable beam schema options. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9704) BIP-1: Deprecate and remove FieldType metadata
[ https://issues.apache.org/jira/browse/BEAM-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9704: - Status: Open (was: Triage Needed) > BIP-1: Deprecate and remove FieldType metadata > -- > > Key: BEAM-9704 > URL: https://issues.apache.org/jira/browse/BEAM-9704 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Affects Versions: 2.21.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.23.0 > > > Deprecate and remove getMetadata on the FieldType. > * Add deprecation notice on the getMetadata field in version 2.21.0 > * Remove the getMetadata field in 2.23.0 > All usage of metadata should be replaced by 2.23.0 and use the portable beam > schema options. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9604) BIP-1: Remove schema metadata usage for Protobuf extension
[ https://issues.apache.org/jira/browse/BEAM-9604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-9604. -- Fix Version/s: 2.21.0 Resolution: Fixed This was part of https://github.com/apache/beam/pull/10529 > BIP-1: Remove schema metadata usage for Protobuf extension > -- > > Key: BEAM-9604 > URL: https://issues.apache.org/jira/browse/BEAM-9604 > Project: Beam > Issue Type: Sub-task > Components: extensions-java-protobuf >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.21.0 > > > Replace the schema metadata usage with options. This > will probably mean: > * Moving the message_name metadata to a Schema option (for field, map key > and value) > * Replacing the proto_number metadata with a Field option -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-9605) BIP-1: Rename setRowOption to setOption on Option builder
[ https://issues.apache.org/jira/browse/BEAM-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel closed BEAM-9605. > BIP-1: Rename setRowOption to setOption on Option builder > -- > > Key: BEAM-9605 > URL: https://issues.apache.org/jira/browse/BEAM-9605 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.21.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Rename setRowOption to setOption on Option builder as setRowOption name is > too confusing. > It sets an option as a Row, not an option on a Row. Using setOption is better > and doesn't conflict with the other setOption with 3 parameters and explicit > type. -- This message was sent by Atlassian Jira (v8.3.4#803005)
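Editor's sketch of the naming point made in BEAM-9605 above: a two-parameter setOption that takes a row as the value can coexist with a three-parameter setOption that takes an explicit type, because Java resolves the overloads by their parameter lists. The classes below are stand-ins written for this illustration (OptionBuilderSketch and RowValue are hypothetical); they are not Beam's Options builder or Row.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative stand-ins only; these are not Beam's Row or Options classes. */
final class OptionBuilderSketch {

  /** Hypothetical minimal "row": a structured value that describes its own type. */
  static final class RowValue {
    final String schemaDescription;
    final Map<String, Object> fields;

    RowValue(String schemaDescription, Map<String, Object> fields) {
      this.schemaDescription = schemaDescription;
      this.fields = Collections.unmodifiableMap(new LinkedHashMap<>(fields));
    }
  }

  private final Map<String, String> types = new LinkedHashMap<>();
  private final Map<String, Object> values = new LinkedHashMap<>();

  /** Three parameters: the caller states the type explicitly. */
  OptionBuilderSketch setOption(String name, String type, Object value) {
    types.put(name, type);
    values.put(name, value);
    return this;
  }

  /** Two parameters: the option value is a row, so the type is taken from the row. */
  OptionBuilderSketch setOption(String name, RowValue value) {
    return setOption(name, "ROW<" + value.schemaDescription + ">", value);
  }

  String typeOf(String name) {
    return types.get(name);
  }

  Object valueOf(String name) {
    return values.get(name);
  }
}
```

With this shape, a call such as setOption("origin", row) reads as "set the option to this row", which is the meaning the ticket says setRowOption obscured.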
[jira] [Closed] (BEAM-9604) BIP-1: Remove schema metadata usage for Protobuf extension
[ https://issues.apache.org/jira/browse/BEAM-9604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel closed BEAM-9604. > BIP-1: Remove schema metadata usage for Protobuf extension > -- > > Key: BEAM-9604 > URL: https://issues.apache.org/jira/browse/BEAM-9604 > Project: Beam > Issue Type: Sub-task > Components: extensions-java-protobuf >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.21.0 > > > Replace the schema metadata usage with options. This > will probably mean: > * Moving the message_name metadata to a Schema option (for field, map key > and value) > * Replacing the proto_number metadata with a Field option -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-9044) BIP-1: Convert protobuf options to Schema options
[ https://issues.apache.org/jira/browse/BEAM-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel closed BEAM-9044. > BIP-1: Convert protobuf options to Schema options > - > > Key: BEAM-9044 > URL: https://issues.apache.org/jira/browse/BEAM-9044 > Project: Beam > Issue Type: Sub-task > Components: extensions-java-protobuf >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.21.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > Protobuf has a rich metadata system called options. This system is fully > typed and matches Beam's Schema Option system. For now we can only convert the > following protobuf options: > * File Options -> _Beam doesn't have this concept_ > * Message Options -> *Beam Schema Options* > * Field Options -> *Beam Schema Options* > * Enum Options -> _This can only be done when logical type options are > available_ > * EnumValue Options -> _This can only be done when logical type options are > available_ > * Service Options -> _Beam doesn't have this concept_ > * Method Options -> _Beam doesn't have this concept_ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-9035) BIP-1: Typed options for Row Schema and Fields
[ https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel closed BEAM-9035. > BIP-1: Typed options for Row Schema and Fields > -- > > Key: BEAM-9035 > URL: https://issues.apache.org/jira/browse/BEAM-9035 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.21.0 > > Time Spent: 8h 40m > Remaining Estimate: 0h > > This is the first issue of a multipart commit: this ticket implements the > basic infrastructure of options on row and field. > Full explanation: > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to metadata, options would be added to > fields, logical types and rows. Through options, schema converters can add > options/annotations/decorators that were in the original schema; this context > can be used in the rest of the pipeline for specific transformations or to > augment the end schema in the target output. > Examples of options are: > * informational: like the source of the data, ... > * drive decisions further in the pipeline: flatten a row into another, > rename a field, ... > * influence something in the output: like cluster index, primary key, ... > * logical type information > An option is a key/typed value combination. The advantages of having typed > values are: > * Having strongly typed options would give a *portable way of Logical Types* > to have structured information that could be shared over different languages. > * This could keep the type intact when mapping from formats that have > strongly typed options (example: Protobuf). > This is part of a multi-ticket implementation. 
The following tickets are > related: > # Typed options for Row Schema and Fields > # Convert Proto Options to Beam Schema options > # Convert Avro extra information for Beam string options > # Replace meta data with Logical Type options > # Extract meta data in Calcite SQL to Beam options > # Extract meta data in Zeta SQL to Beam options > # Add java example of using option in a transform > This feature is discussed with Reuven Lax, Brian Hulette -- This message was sent by Atlassian Jira (v8.3.4#803005)
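Editor's sketch of the "key/typed value combination" described in BEAM-9035 above. It is a small stand-alone model, not Beam's implementation: each option stores its declared type next to its value, so a consumer later in the pipeline can check the type before reading. TypedOptionsSketch, OptionType and the four supported types are assumptions made for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a tiny model of "key / typed value" options. Not Beam's API. */
final class TypedOptionsSketch {

  /** The small set of value types this sketch supports (an assumption). */
  enum OptionType {
    STRING(String.class), INT64(Long.class), DOUBLE(Double.class), BOOLEAN(Boolean.class);

    final Class<?> javaClass;

    OptionType(Class<?> javaClass) {
      this.javaClass = javaClass;
    }
  }

  /** One option: the declared type travels together with the value. */
  private static final class Option {
    final OptionType type;
    final Object value;

    Option(OptionType type, Object value) {
      if (!type.javaClass.isInstance(value)) {
        throw new IllegalArgumentException("Value " + value + " is not of type " + type);
      }
      this.type = type;
      this.value = value;
    }
  }

  private final Map<String, Option> options = new LinkedHashMap<>();

  /** Adds or replaces an option; rejects values that do not match the declared type. */
  TypedOptionsSketch set(String name, OptionType type, Object value) {
    options.put(name, new Option(type, value));
    return this;
  }

  boolean has(String name) {
    return options.containsKey(name);
  }

  OptionType typeOf(String name) {
    return require(name).type;
  }

  /** Reads a value as the expected Java class; throws ClassCastException on a mismatch. */
  <T> T get(String name, Class<T> expected) {
    return expected.cast(require(name).value);
  }

  private Option require(String name) {
    Option option = options.get(name);
    if (option == null) {
      throw new IllegalArgumentException("No option named " + name);
    }
    return option;
  }
}
```

A field could then carry, for example, a BOOLEAN option named primary_key (one of the "influence something in the output" examples above), and a sink could look it up by name and type instead of parsing a free-form string.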
[jira] [Commented] (BEAM-9456) Upgrade to gradle 6.2
[ https://issues.apache.org/jira/browse/BEAM-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17072406#comment-17072406 ] Alex Van Boxel commented on BEAM-9456: -- It's a lot more involved than that. I already got protobuf, net.ltgt.gradle.* [~dschmitt] Are you planning the upgrade? I'd hate to duplicate effort. > Upgrade to gradle 6.2 > - > > Key: BEAM-9456 > URL: https://issues.apache.org/jira/browse/BEAM-9456 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9605) BIP-1: Rename setRowOption to setOption on Option builder
[ https://issues.apache.org/jira/browse/BEAM-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9605: - Status: Open (was: Triage Needed) > BIP-1: Rename setRowOption to setOption on Option builder > -- > > Key: BEAM-9605 > URL: https://issues.apache.org/jira/browse/BEAM-9605 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Rename setRowOption to setOption on Option builder as setRowOption name is > too confusing. > It sets an option as a Row, not an option on a Row. Using setOption is better > and doesn't conflict with the other setOption with 3 parameters and explicit > type. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9605) BIP-1: Rename setRowOption to setOption on Option builder
[ https://issues.apache.org/jira/browse/BEAM-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-9605. -- Fix Version/s: 2.21.0 Resolution: Fixed > BIP-1: Rename setRowOption to setOption on Option builder > -- > > Key: BEAM-9605 > URL: https://issues.apache.org/jira/browse/BEAM-9605 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.21.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Rename setRowOption to setOption on Option builder as setRowOption name is > too confusing. > It sets an option as a Row, not an option on a Row. Using setOption is better > and doesn't conflict with the other setOption with 3 parameters and explicit > type. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9605) BIP-1: Rename setRowOption to setOption on Option builder
Alex Van Boxel created BEAM-9605: Summary: BIP-1: Rename setRowOption to setOption on Option builder Key: BEAM-9605 URL: https://issues.apache.org/jira/browse/BEAM-9605 Project: Beam Issue Type: Sub-task Components: sdk-java-core Reporter: Alex Van Boxel Assignee: Alex Van Boxel Rename setRowOption to setOption on Option builder as setRowOption name is too confusing. It sets an option as a Row, not an option on a Row. Using setOption is better and doesn't conflict with the other setOption with 3 parameters and explicit type. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9604) BIP-1: Remove schema metadata usage for Protobuf extension
[ https://issues.apache.org/jira/browse/BEAM-9604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9604: - Parent: BEAM-9275 Issue Type: Sub-task (was: Task) > BIP-1: Remove schema metadata usage for Protobuf extension > -- > > Key: BEAM-9604 > URL: https://issues.apache.org/jira/browse/BEAM-9604 > Project: Beam > Issue Type: Sub-task > Components: extensions-java-protobuf >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > > Replace the schema metadata usage with options. This > will probably mean: > * Moving the message_name metadata to a Schema option (for field, map key > and value) > * Replacing the proto_number metadata with a Field option -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9604) BIP-1: Remove schema metadata usage for Protobuf extension
Alex Van Boxel created BEAM-9604: Summary: BIP-1: Remove schema metadata usage for Protobuf extension Key: BEAM-9604 URL: https://issues.apache.org/jira/browse/BEAM-9604 Project: Beam Issue Type: Task Components: extensions-java-protobuf Reporter: Alex Van Boxel Assignee: Alex Van Boxel Replace the schema metadata usage with options. This will probably mean: * Moving the message_name metadata to a Schema option (for field, map key and value) * Replacing the proto_number metadata with a Field option -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9044) BIP-1: Convert protobuf options to Schema options
[ https://issues.apache.org/jira/browse/BEAM-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-9044. -- Fix Version/s: 2.21.0 Resolution: Fixed > BIP-1: Convert protobuf options to Schema options > - > > Key: BEAM-9044 > URL: https://issues.apache.org/jira/browse/BEAM-9044 > Project: Beam > Issue Type: Sub-task > Components: extensions-java-protobuf >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.21.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Protobuf has a rich metadata system called options. This system is fully > typed and matches Beam's Schema Option system. For now we can only convert the > following protobuf options: > * File Options -> _Beam doesn't have this concept_ > * Message Options -> *Beam Schema Options* > * Field Options -> *Beam Schema Options* > * Enum Options -> _This can only be done when logical type options are > available_ > * EnumValue Options -> _This can only be done when logical type options are > available_ > * Service Options -> _Beam doesn't have this concept_ > * Method Options -> _Beam doesn't have this concept_ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9416) BIP-1: Convert avro metadata to Schema options
[ https://issues.apache.org/jira/browse/BEAM-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-9416. -- Resolution: Fixed > BIP-1: Convert avro metadata to Schema options > -- > > Key: BEAM-9416 > URL: https://issues.apache.org/jira/browse/BEAM-9416 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.21.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Avro has some metadata that can be added to the normal type information. It > is based on JSON typing, so the conversion will be best effort (probably we > can get int, string and float out of it). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9035) BIP-1: Typed options for Row Schema and Fields
[ https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9035: - Fix Version/s: (was: 2.20.0) 2.21.0 > BIP-1: Typed options for Row Schema and Fields > -- > > Key: BEAM-9035 > URL: https://issues.apache.org/jira/browse/BEAM-9035 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.21.0 > > Time Spent: 8h 40m > Remaining Estimate: 0h > > This is the first issue of a multipart commit: this ticket implements the > basic infrastructure of options on row and field. > Full explanation: > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to metadata, options would be added to > fields, logical types and rows. Through options, schema converters can add > options/annotations/decorators that were in the original schema; this context > can be used in the rest of the pipeline for specific transformations or to > augment the end schema in the target output. > Examples of options are: > * informational: like the source of the data, ... > * drive decisions further in the pipeline: flatten a row into another, > rename a field, ... > * influence something in the output: like cluster index, primary key, ... > * logical type information > An option is a key/typed value combination. The advantages of having typed > values are: > * Having strongly typed options would give a *portable way of Logical Types* > to have structured information that could be shared over different languages. > * This could keep the type intact when mapping from formats that have > strongly typed options (example: Protobuf). > This is part of a multi-ticket implementation. 
The following tickets are > related: > # Typed options for Row Schema and Fields > # Convert Proto Options to Beam Schema options > # Convert Avro extra information for Beam string options > # Replace meta data with Logical Type options > # Extract meta data in Calcite SQL to Beam options > # Extract meta data in Zeta SQL to Beam options > # Add java example of using option in a transform > This feature is discussed with Reuven Lax, Brian Hulette -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9035) BIP-1: Typed options for Row Schema and Fields
[ https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9035: - Fix Version/s: (was: 2.19.0) 2.20.0 > BIP-1: Typed options for Row Schema and Fields > -- > > Key: BEAM-9035 > URL: https://issues.apache.org/jira/browse/BEAM-9035 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > Time Spent: 8h 40m > Remaining Estimate: 0h > > This is the first issue of a multipart commit: this ticket implements the > basic infrastructure of options on row and field. > Full explanation: > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to metadata, options would be added to > fields, logical types and rows. Through options, schema converters can add > options/annotations/decorators that were in the original schema; this context > can be used in the rest of the pipeline for specific transformations or to > augment the end schema in the target output. > Examples of options are: > * informational: like the source of the data, ... > * drive decisions further in the pipeline: flatten a row into another, > rename a field, ... > * influence something in the output: like cluster index, primary key, ... > * logical type information > An option is a key/typed value combination. The advantages of having typed > values are: > * Having strongly typed options would give a *portable way of Logical Types* > to have structured information that could be shared over different languages. > * This could keep the type intact when mapping from formats that have > strongly typed options (example: Protobuf). > This is part of a multi-ticket implementation. 
The following tickets are > related: > # Typed options for Row Schema and Fields > # Convert Proto Options to Beam Schema options > # Convert Avro extra information for Beam string options > # Replace meta data with Logical Type options > # Extract meta data in Calcite SQL to Beam options > # Extract meta data in Zeta SQL to Beam options > # Add java example of using option in a transform > This feature is discussed with Reuven Lax, Brian Hulette -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (BEAM-9044) BIP-1: Convert protobuf options to Schema options
[ https://issues.apache.org/jira/browse/BEAM-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-9044 started by Alex Van Boxel. > BIP-1: Convert protobuf options to Schema options > - > > Key: BEAM-9044 > URL: https://issues.apache.org/jira/browse/BEAM-9044 > Project: Beam > Issue Type: Sub-task > Components: extensions-java-protobuf >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > > Protobuf has a rich metadata system called options. This system is fully > typed and matches Beam's Schema Option system. For now we can only convert the > following protobuf options: > * File Options -> _Beam doesn't have this concept_ > * Message Options -> *Beam Schema Options* > * Field Options -> *Beam Schema Options* > * Enum Options -> _This can only be done when logical type options are > available_ > * EnumValue Options -> _This can only be done when logical type options are > available_ > * Service Options -> _Beam doesn't have this concept_ > * Method Options -> _Beam doesn't have this concept_ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8218) Implement Apache PulsarIO
[ https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054385#comment-17054385 ] Alex Van Boxel commented on BEAM-8218: -- Thanks, I appreciate the update. I'll assign it to myself and will start on it immediately. > Implement Apache PulsarIO > - > > Key: BEAM-8218 > URL: https://issues.apache.org/jira/browse/BEAM-8218 > Project: Beam > Issue Type: Task > Components: io-ideas >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > > Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO > could be beneficial. > [https://pulsar.apache.org/|https://pulsar.apache.org/en/] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-8218) Implement Apache PulsarIO
[ https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel reassigned BEAM-8218: Assignee: Alex Van Boxel (was: Taher Koitawala) > Implement Apache PulsarIO > - > > Key: BEAM-8218 > URL: https://issues.apache.org/jira/browse/BEAM-8218 > Project: Beam > Issue Type: Task > Components: io-ideas >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > > Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO > could be beneficial. > [https://pulsar.apache.org/|https://pulsar.apache.org/en/] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8218) Implement Apache PulsarIO
[ https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054372#comment-17054372 ] Alex Van Boxel commented on BEAM-8218: -- [~taherk77] If we don't hear any updates on this, I will consider it abandoned and take over. We need to get this moving. > Implement Apache PulsarIO > - > > Key: BEAM-8218 > URL: https://issues.apache.org/jira/browse/BEAM-8218 > Project: Beam > Issue Type: Task > Components: io-ideas >Reporter: Alex Van Boxel >Assignee: Taher Koitawala >Priority: Minor > > Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO > could be beneficial. > [https://pulsar.apache.org/|https://pulsar.apache.org/en/] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9456) Upgrade to gradle 6.2
[ https://issues.apache.org/jira/browse/BEAM-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9456: - Status: Open (was: Triage Needed) > Upgrade to gradle 6.2 > - > > Key: BEAM-9456 > URL: https://issues.apache.org/jira/browse/BEAM-9456 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9456) Upgrade to gradle 6.2
Alex Van Boxel created BEAM-9456: Summary: Upgrade to gradle 6.2 Key: BEAM-9456 URL: https://issues.apache.org/jira/browse/BEAM-9456 Project: Beam Issue Type: Task Components: build-system Reporter: Alex Van Boxel Assignee: Alex Van Boxel -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9035) BIP-1: Typed options for Row Schema and Fields
[ https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9035: - Summary: BIP-1: Typed options for Row Schema and Fields (was: Typed options for Row Schema and Fields) > BIP-1: Typed options for Row Schema and Fields > -- > > Key: BEAM-9035 > URL: https://issues.apache.org/jira/browse/BEAM-9035 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.19.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > This is the first issue of a multipart commit: this ticket implements the > basic infrastructure of options on row and field. > Full explanation: > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to metadata, options would be added to > fields, logical types and rows. Through options, schema converters can add > options/annotations/decorators that were in the original schema; this context > can be used in the rest of the pipeline for specific transformations or to > augment the end schema in the target output. > Examples of options are: > * informational: like the source of the data, ... > * drive decisions further in the pipeline: flatten a row into another, > rename a field, ... > * influence something in the output: like cluster index, primary key, ... > * logical type information > An option is a key/typed value combination. The advantages of having typed > values are: > * Having strongly typed options would give a *portable way of Logical Types* > to have structured information that could be shared over different languages. > * This could keep the type intact when mapping from formats that have > strongly typed options (example: Protobuf). > This is part of a multi-ticket implementation. 
The following tickets are > related: > # Typed options for Row Schema and Fields > # Convert Proto Options to Beam Schema options > # Convert Avro extra information for Beam string options > # Replace meta data with Logical Type options > # Extract meta data in Calcite SQL to Beam options > # Extract meta data in Zeta SQL to Beam options > # Add java example of using option in a transform > This feature is discussed with Reuven Lax, Brian Hulette -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9416) BIP-1: Convert avro metadata to Schema options
[ https://issues.apache.org/jira/browse/BEAM-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9416: - Summary: BIP-1: Convert avro metadata to Schema options (was: Convert avro metadata to Schema options) > BIP-1: Convert avro metadata to Schema options > -- > > Key: BEAM-9416 > URL: https://issues.apache.org/jira/browse/BEAM-9416 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.21.0 > > > Avro has some metadata that can be added to the normal type information. It > is based on JSON typing, so the conversion will be best effort (probably we > can get int, string and float out of it). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9044) BIP-1: Convert protobuf options to Schema options
[ https://issues.apache.org/jira/browse/BEAM-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9044: - Summary: BIP-1: Convert protobuf options to Schema options (was: Convert protobuf options to Schema options) > BIP-1: Convert protobuf options to Schema options > - > > Key: BEAM-9044 > URL: https://issues.apache.org/jira/browse/BEAM-9044 > Project: Beam > Issue Type: Sub-task > Components: extensions-java-protobuf >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Protobuf has a rich metadata system called options. This system is fully > typed and matches Beam's Schema Option system. For now we can only convert the > following protobuf options: > * File Options -> _Beam doesn't have this concept_ > * Message Options -> *Beam Schema Options* > * Field Options -> *Beam Schema Options* > * Enum Options -> _This can only be done when logical type options are > available_ > * EnumValue Options -> _This can only be done when logical type options are > available_ > * Service Options -> _Beam doesn't have this concept_ > * Method Options -> _Beam doesn't have this concept_ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9416) Convert avro metadata to Schema options
Alex Van Boxel created BEAM-9416: Summary: Convert avro metadata to Schema options Key: BEAM-9416 URL: https://issues.apache.org/jira/browse/BEAM-9416 Project: Beam Issue Type: Sub-task Components: sdk-java-core Reporter: Alex Van Boxel Assignee: Alex Van Boxel Fix For: 2.21.0 Avro has some metadata that can be added to the normal type information. It is based on JSON typing, so the conversion will be best effort (we can probably get int, string and float out of it). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-7518) Protobuf Schema: Introduce logical type for Timestamp, Duration and other
[ https://issues.apache.org/jira/browse/BEAM-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel closed BEAM-7518. > Protobuf Schema: Introduce logical type for Timestamp, Duration and other > - > > Key: BEAM-7518 > URL: https://issues.apache.org/jira/browse/BEAM-7518 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > > The Protobuf Schema provider has some lossy conversions from some Proto types. > Introduce Logical Types for: > Timestamp, Duration and Unsigned Int64 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-7518) Protobuf Schema: Introduce logical type for Timestamp, Duration and other
[ https://issues.apache.org/jira/browse/BEAM-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-7518. -- Fix Version/s: 2.20.0 Resolution: Fixed > Protobuf Schema: Introduce logical type for Timestamp, Duration and other > - > > Key: BEAM-7518 > URL: https://issues.apache.org/jira/browse/BEAM-7518 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > > The Protobuf Schema provider has some lossy conversions from some Proto types. > Introduce Logical Types for: > Timestamp, Duration and Unsigned Int64 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9394) DynamicMessage handling of empty map violates schema nullability
[ https://issues.apache.org/jira/browse/BEAM-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-9394. -- Resolution: Fixed > DynamicMessage handling of empty map violates schema nullability > > > Key: BEAM-9394 > URL: https://issues.apache.org/jira/browse/BEAM-9394 > Project: Beam > Issue Type: Bug > Components: extensions-java-protobuf >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > > DynamicMessage handling of empty map violates nullability. It should return > an empty map at the Row level. > Add tests for nullable map and array to verify behaviour. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9394) DynamicMessage handling of empty map violates schema nullability
[ https://issues.apache.org/jira/browse/BEAM-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9394: - Status: Open (was: Triage Needed) > DynamicMessage handling of empty map violates schema nullability > > > Key: BEAM-9394 > URL: https://issues.apache.org/jira/browse/BEAM-9394 > Project: Beam > Issue Type: Bug > Components: extensions-java-protobuf >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > > DynamicMessage handling of empty map violates nullability. It should return > an empty map at the Row level. > Add tests for nullable map and array to verify behaviour. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9394) DynamicMessage handling of empty map violates schema nullability
Alex Van Boxel created BEAM-9394: Summary: DynamicMessage handling of empty map violates schema nullability Key: BEAM-9394 URL: https://issues.apache.org/jira/browse/BEAM-9394 Project: Beam Issue Type: Bug Components: extensions-java-protobuf Reporter: Alex Van Boxel Assignee: Alex Van Boxel Fix For: 2.20.0 DynamicMessage handling of empty map violates nullability. It should return an empty map at the Row level. Add tests for nullable map and array to verify behaviour. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel closed BEAM-7274. > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.20.0 > > Time Spent: 26h 40m > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-7274. -- Fix Version/s: (was: 2.21.0) 2.20.0 Resolution: Fixed Moving back to 2.20 as it's merged into master > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.20.0 > > Time Spent: 26h 40m > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044920#comment-17044920 ] Alex Van Boxel commented on BEAM-7274: -- Moved to 2.21 > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.21.0 > > Time Spent: 26h 20m > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-7274: - Fix Version/s: (was: 2.20.0) 2.21.0 > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.21.0 > > Time Spent: 26h 20m > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-9360) Schema FieldType should not consider metadata for equivalence
[ https://issues.apache.org/jira/browse/BEAM-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel reassigned BEAM-9360: Assignee: Jozef Vilcek > Schema FieldType should not consider metadata for equivalence > - > > Key: BEAM-9360 > URL: https://issues.apache.org/jira/browse/BEAM-9360 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.19.0 >Reporter: Jozef Vilcek >Assignee: Jozef Vilcek >Priority: Major > > FieldType `equivalent()` check should not require exact match in fields > metadata. > Discussion in dev mailing list: > [https://lists.apache.org/list.html?d...@beam.apache.org:lte=1M:Schema%20Convert%20transform%20fails%20on%20type%20metadata] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9241) Fix inconsistent nullability mapping for Protobuf to Schema
[ https://issues.apache.org/jira/browse/BEAM-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-9241. -- Resolution: Fixed > Fix inconsistent nullability mapping for Protobuf to Schema > --- > > Key: BEAM-9241 > URL: https://issues.apache.org/jira/browse/BEAM-9241 > Project: Beam > Issue Type: Bug > Components: extensions-java-protobuf >Affects Versions: 2.18.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Fix the nullability issues with protobuf to schema mapping > * Proto3 primitive types should be *not* nullable. > * Proto2 required types should be *not* nullable. > * Proto2 optional should also be *not* nullable as having an optional value > doesn't mean it has no value. The spec states it has the optional value. > * Arrays should be *not* nullable, as proto arrays always have an empty > array when no value is set. > * Maps should be *not* nullable, as proto maps always have an empty map when > no value is set. > * Elements in an array should be *not* nullable, as nulls are not allowed in > an array. > * Names and Values should be *not* nullable, as nulls are not allowed. > * Rows are nullable, as messages are nullable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
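The nullability rules listed in BEAM-9241 can be summarised as a small lookup. The sketch below is only a restatement of that list in Java, not the code from the Beam protobuf extension: `ProtoFieldKind` and `ProtoNullability` are hypothetical names, and reading "Names and Values" as map keys and map values is an assumption.

```java
/** Hypothetical categories for the cases listed in BEAM-9241 (names invented here). */
enum ProtoFieldKind {
  PROTO3_PRIMITIVE,
  PROTO2_REQUIRED,
  PROTO2_OPTIONAL,
  ARRAY,
  MAP,
  ARRAY_ELEMENT,
  MAP_KEY,   // assumption: "Names" in the ticket means map keys
  MAP_VALUE, // assumption: "Values" in the ticket means map values
  MESSAGE
}

/** Restates the ticket's rule list; not the actual Beam implementation. */
final class ProtoNullability {
  private ProtoNullability() {}

  /** Returns whether the Beam schema field for this kind should be nullable. */
  static boolean isNullable(ProtoFieldKind kind) {
    switch (kind) {
      case MESSAGE:
        // Rows are nullable, as messages are nullable.
        return true;
      case PROTO3_PRIMITIVE:
      case PROTO2_REQUIRED:
      case PROTO2_OPTIONAL:
        // Primitives and proto2 required/optional fields always yield a value.
        return false;
      case ARRAY:
      case MAP:
        // Unset repeated and map fields read back as empty, never as null.
        return false;
      case ARRAY_ELEMENT:
      case MAP_KEY:
      case MAP_VALUE:
        // Nulls are not allowed inside arrays or maps.
        return false;
      default:
        throw new IllegalArgumentException("Unknown kind: " + kind);
    }
  }
}
```

Under these rules only nested messages map to nullable Row fields; everything else is non-nullable.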
[jira] [Updated] (BEAM-9275) BIP-1: Beam Schema Options
[ https://issues.apache.org/jira/browse/BEAM-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9275: - Description: Introduce the concept of Options in Beam Schema’s to add extra context to fields and schemas. In contrast to the current Beam metadata that is present in a FieldType, options would be added to fields, logical types and schemas. The schema convertors (ex. Avro, Proto, …) can add options/annotations/decorators that were in the original schema to the Beam schema with these options. These options, that add contextual metadata, can be used in the pipeline for specific transformations or augment the end schema in the target output. > BIP-1: Beam Schema Options > -- > > Key: BEAM-9275 > URL: https://issues.apache.org/jira/browse/BEAM-9275 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Priority: Major > > Introduce the concept of Options in Beam Schema’s to add extra context to > fields and schemas. In contrast to the current Beam metadata that is present > in a FieldType, options would be added to fields, logical types and schemas. > The schema convertors (ex. Avro, Proto, …) can add > options/annotations/decorators that were in the original schema to the Beam > schema with these options. These options, that add contextual metadata, can > be used in the pipeline for specific transformations or augment the end > schema in the target output. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9044) Convert protobuf options to Schema options
[ https://issues.apache.org/jira/browse/BEAM-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9044: - Parent: BEAM-9275 Issue Type: Sub-task (was: Task) > Convert protobuf options to Schema options > -- > > Key: BEAM-9044 > URL: https://issues.apache.org/jira/browse/BEAM-9044 > Project: Beam > Issue Type: Sub-task > Components: extensions-java-protobuf >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Protobuf has a rich metadata system called options. This system is fully > typed and matches Beam's Schema Option system. For now we can only convert the > following protobuf options: > * File Options -> _Beam doesn't have this concept_ > * Message Options -> *Beam Schema Options* > * Field Options -> *Beam Schema Options* > * Enum Options -> _This can only be done when logical type options are > available_ > * EnumValue Options -> _This can only be done when logical type options are > available_ > * Service Options -> _Beam doesn't have this concept_ > * Method Options -> _Beam doesn't have this concept_ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9035) Typed options for Row Schema and Fields
[ https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9035: - Parent: BEAM-9275 Issue Type: Sub-task (was: Task) > Typed options for Row Schema and Fields > --- > > Key: BEAM-9035 > URL: https://issues.apache.org/jira/browse/BEAM-9035 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.19.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > This is the first issue of a multipart commit: this ticket implements the > basic infrastructure of options on row and field. > Full explanation: > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to metadata, options would be added to > fields, logical types and rows. Schema converters can add the > options/annotations/decorators that were in the original schema as options; this context > can be used in the rest of the pipeline for specific transformations or to > augment the end schema in the target output. > Examples of options are: > * informational: like the source of the data, ... > * drive decisions further in the pipeline: flatten a row into another, > rename a field, ... > * influence something in the output: like cluster index, primary key, ... > * logical type information > An option is a key/typed value combination. The advantages of having > typed values are: > * Strongly typed options would give *Logical Types a portable way* > to carry structured information that could be shared across different languages. > * This could keep the type intact when mapping from formats that have > strongly typed options (example: Protobuf). > This is part of a multi-ticket implementation. 
The following tickets are > related: > # Typed options for Row Schema and Fields > # Convert Proto Options to Beam Schema options > # Convert Avro extra information for Beam string options > # Replace meta data with Logical Type options > # Extract meta data in Calcite SQL to Beam options > # Extract meta data in Zeta SQL to Beam options > # Add java example of using option in a transform > This feature is discussed with Reuven Lax, Brian Hulette -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9275) BIP-1: Beam Schema Options
Alex Van Boxel created BEAM-9275: Summary: BIP-1: Beam Schema Options Key: BEAM-9275 URL: https://issues.apache.org/jira/browse/BEAM-9275 Project: Beam Issue Type: Improvement Components: sdk-java-core Reporter: Alex Van Boxel -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-9037) Instant and duration as logical type
[ https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel closed BEAM-9037. > Instant and duration as logical type > - > > Key: BEAM-9037 > URL: https://issues.apache.org/jira/browse/BEAM-9037 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > Time Spent: 5h > Remaining Estimate: 0h > > The proto schema includes Timestamp and Duration with nano precision. The > logical types should be promoted to the core logical types, so they can be > handled on various IOs as standard mandatory conversions. > This means that the logical type should not use the proto-specific Timestamp and > Duration but the Java 8 Instant and Duration. > See discussion in the design document: > [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr] -- This message was sent by Atlassian Jira (v8.3.4#803005)
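To illustrate why the Java 8 time types suit nano-precision values: a proto Timestamp or Duration carries a seconds field and a nanos field, and `java.time.Instant` and `java.time.Duration` can hold both without loss. The sketch below uses only the JDK; the helper class `NanoTime` and its method names are hypothetical and are not taken from the Beam code base.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper (not Beam code): seconds + nanos pairs to java.time types. */
final class NanoTime {
  private NanoTime() {}

  /** Builds an Instant from epoch seconds plus a nanosecond adjustment. */
  static Instant toInstant(long seconds, int nanos) {
    return Instant.ofEpochSecond(seconds, nanos);
  }

  /** Builds a Duration from whole seconds plus a nanosecond adjustment. */
  static Duration toDuration(long seconds, int nanos) {
    return Duration.ofSeconds(seconds, nanos);
  }

  /** Shows the precision lost when a value is squeezed through epoch milliseconds. */
  static Instant truncatedToMillis(Instant instant) {
    return Instant.ofEpochMilli(instant.toEpochMilli());
  }
}
```

For example, `NanoTime.toInstant(1, 500)` keeps its 500 nanoseconds, while `NanoTime.truncatedToMillis(...)` of the same value drops them, which is the kind of loss a millisecond-based representation would introduce.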
[jira] [Commented] (BEAM-4457) Analyze FieldAccessDescriptors and drop fields that are never accessed
[ https://issues.apache.org/jira/browse/BEAM-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030397#comment-17030397 ] Alex Van Boxel commented on BEAM-4457: -- I remember from my days writing Apache Pig that it had a similar concept as well. Another place that could benefit is the ToRow function, where the row gets materialized in a RowWithStorage: only the fields that are accessed should be materialized. > Analyze FieldAccessDescriptors and drop fields that are never accessed > -- > > Key: BEAM-4457 > URL: https://issues.apache.org/jira/browse/BEAM-4457 > Project: Beam > Issue Type: Sub-task > Components: io-java-gcp >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > > We can walk backwards through the graph, analyzing which fields are accessed. > When we find paths where many fields are never accessed, we can insert a > projection transform to drop those fields preemptively. This can save a lot > of resources in the case where many fields in the input are never accessed. > To do this, the FieldAccessDescriptor information must be added to the > portability protos. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (BEAM-9241) Fix inconsistent nullability mapping for Protobuf to Schema
[ https://issues.apache.org/jira/browse/BEAM-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-9241 started by Alex Van Boxel. > Fix inconsistent nullability mapping for Protobuf to Schema > --- > > Key: BEAM-9241 > URL: https://issues.apache.org/jira/browse/BEAM-9241 > Project: Beam > Issue Type: Bug > Components: extensions-java-protobuf >Affects Versions: 2.18.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > > Fix the nullability issues with protobuf to schema mapping > * Proto3 primitive types should be *not* nullable. > * Proto2 required types should be *not* nullable. > * Proto2 optional should also be *not* nullable as having an optional value > doesn't mean it has no value. The spec states it has the optional value. > * Arrays should be *not* nullable, as proto arrays always have an empty > array when no value is set. > * Maps should be *not* nullable, as proto maps always have an empty map when > no value is set. > * Elements in an array should be *not* nullable, as nulls are not allowed in > an array. > * Names and Values should be *not* nullable, as nulls are not allowed. > * Rows are nullable, as messages are nullable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9241) Fix inconsistent nullability mapping for Protobuf to Schema
[ https://issues.apache.org/jira/browse/BEAM-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9241: - Status: Open (was: Triage Needed) > Fix inconsistent nullability mapping for Protobuf to Schema > --- > > Key: BEAM-9241 > URL: https://issues.apache.org/jira/browse/BEAM-9241 > Project: Beam > Issue Type: Bug > Components: extensions-java-protobuf >Affects Versions: 2.18.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > > Fix the nullability issues with protobuf to schema mapping > * Proto3 primitive types should be *not* nullable. > * Proto2 required types should be *not* nullable. > * Proto2 optional should also be *not* nullable as having an optional value > doesn't mean it has no value. The spec states it has the optional value. > * Arrays should be *not* nullable, as proto arrays always have an empty > array when no value is set. > * Maps should be *not* nullable, as proto maps always have an empty map when > no value is set. > * Elements in an array should be *not* nullable, as nulls are not allowed in > an array. > * Names and Values should be *not* nullable, as nulls are not allowed. > * Rows are nullable, as messages are nullable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9241) Fix inconsistent nullability mapping for Protobuf to Schema
Alex Van Boxel created BEAM-9241: Summary: Fix inconsistent nullability mapping for Protobuf to Schema Key: BEAM-9241 URL: https://issues.apache.org/jira/browse/BEAM-9241 Project: Beam Issue Type: Bug Components: extensions-java-protobuf Affects Versions: 2.18.0 Reporter: Alex Van Boxel Assignee: Alex Van Boxel Fix For: 2.20.0 Fix the nullability issues with protobuf to schema mapping * Proto3 primitive types should be *not* nullable. * Proto2 required types should be *not* nullable. * Proto2 optional should also be *not* nullable as having an optional value doesn't mean it has no value. The spec states it has the optional value. * Arrays should be *not* nullable, as proto arrays always have an empty array when no value is set. * Maps should be *not* nullable, as proto maps always have an empty map when no value is set. * Elements in an array should be *not* nullable, as nulls are not allowed in an array. * Names and Values should be *not* nullable, as nulls are not allowed. * Rows are nullable, as messages are nullable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9037) Instant and duration as logical type
[ https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9037: - Fix Version/s: (was: 2.19.0) 2.20.0 > Instant and duration as logical type > - > > Key: BEAM-9037 > URL: https://issues.apache.org/jira/browse/BEAM-9037 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > The proto schema includes Timestamp and Duration with nano precision. The > logical types should be promoted to the core logical types, so they can be > handled on various IOs as standard mandatory conversions. > This means that the logical type should not use the proto-specific Timestamp and > Duration but the Java 8 Instant and Duration. > See discussion in the design document: > [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-9113) Protobuf NanosType serialisation issues
[ https://issues.apache.org/jira/browse/BEAM-9113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel closed BEAM-9113. > Protobuf NanosType serialisation issues > -- > > Key: BEAM-9113 > URL: https://issues.apache.org/jira/browse/BEAM-9113 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > The NanosType has 2 known issues: > * Schema serialisation expects the getArgument to not return a null value > * UUID of the base type will not be (de)serialised as it is static -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9113) Protobuf NanosType serialisation issues
[ https://issues.apache.org/jira/browse/BEAM-9113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-9113. -- Fix Version/s: 2.20.0 Resolution: Fixed Resolved by the general logical type for Instant and Duration > Protobuf NanosType serialisation issues > -- > > Key: BEAM-9113 > URL: https://issues.apache.org/jira/browse/BEAM-9113 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.20.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > The NanosType has 2 known issues: > * Schema serialisation expects the getArgument to not return a null value > * UUID of the base type will not be (de)serialised as it is static -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9113) Protobuf NanosType serialisation issues
Alex Van Boxel created BEAM-9113: Summary: Protobuf NanosType serialisation issues Key: BEAM-9113 URL: https://issues.apache.org/jira/browse/BEAM-9113 Project: Beam Issue Type: Bug Components: sdk-java-core Reporter: Alex Van Boxel Assignee: Alex Van Boxel The NanosType has 2 known issues: * Schema serialisation expects the getArgument to not return a null value * UUID of the base type will not be (de)serialised as it is static -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9054) Row.toString with Logical Type are different for RowWithGetters and RowWithStorage
[ https://issues.apache.org/jira/browse/BEAM-9054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008131#comment-17008131 ] Alex Van Boxel commented on BEAM-9054: -- [~reuvenlax] : I could change this, but I have no strong preference as to the best representation for the logical type: the base type or the logical type itself. > Row.toString with Logical Type are different for RowWithGetters and > RowWithStorage > -- > > Key: BEAM-9054 > URL: https://issues.apache.org/jira/browse/BEAM-9054 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Alex Van Boxel >Priority: Major > > Row.toString with Logical Type are different for RowWithGetters and > RowWithStorage with equivalent schemas. Behaviour for: > * RowWithGetters will show the .toString() representation of the logical type > * RowWithStorage will show the base type > This should be one or the other -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9054) Row.toString with Logical Type are different for RowWithGetters and RowWithStorage
[ https://issues.apache.org/jira/browse/BEAM-9054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9054: - Issue Type: Bug (was: Task) > Row.toString with Logical Type are different for RowWithGetters and > RowWithStorage > -- > > Key: BEAM-9054 > URL: https://issues.apache.org/jira/browse/BEAM-9054 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Alex Van Boxel >Priority: Major > > Row.toString with Logical Type are different for RowWithGetters and > RowWithStorage with equivalent schemas. Behaviour for: > * RowWithGetters will show the .toString() representation of the logical type > * RowWithStorage will show the base type > This should be one or the other -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9054) Row.toString with Logical Type are different for RowWithGetters and RowWithStorage
Alex Van Boxel created BEAM-9054: Summary: Row.toString with Logical Type are different for RowWithGetters and RowWithStorage Key: BEAM-9054 URL: https://issues.apache.org/jira/browse/BEAM-9054 Project: Beam Issue Type: Task Components: sdk-java-core Reporter: Alex Van Boxel Row.toString with Logical Type are different for RowWithGetters and RowWithStorage with equivalent schemas. Behaviour for: * RowWithGetters will show the .toString() representation of the logical type * RowWithStorage will show the base type This should be one or the other -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9044) Convert protobuf options to Schema options
Alex Van Boxel created BEAM-9044: Summary: Convert protobuf options to Schema options Key: BEAM-9044 URL: https://issues.apache.org/jira/browse/BEAM-9044 Project: Beam Issue Type: Task Components: extensions-java-protobuf Reporter: Alex Van Boxel Assignee: Alex Van Boxel Protobuf has a rich metadata system called options. This system is fully typed and matches Beam's Schema Option system. For now we can only convert the following protobuf options: * File Options -> _Beam doesn't have this concept_ * Message Options -> *Beam Schema Options* * Field Options -> *Beam Schema Options* * Enum Options -> _This can only be done when logical type options are available_ * EnumValue Options -> _This can only be done when logical type options are available_ * Service Options -> _Beam doesn't have this concept_ * Method Options -> _Beam doesn't have this concept_ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9037) Instant and duration as logical type
[ https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-9037. -- Fix Version/s: 2.19.0 Resolution: Fixed > Instant and duration as logical type > - > > Key: BEAM-9037 > URL: https://issues.apache.org/jira/browse/BEAM-9037 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.19.0 > > Time Spent: 10m > Remaining Estimate: 0h > > The proto schema includes Timestamp and Duration with nano precision. The > logical types should be promoted to the core logical types, so they can be > handled on various IOs as standard mandatory conversions. > This means that the logical type should not use the proto-specific Timestamp and > Duration but the Java 8 Instant and Duration. > See discussion in the design document: > [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (BEAM-9037) Instant and duration as logical type
[ https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-9037 started by Alex Van Boxel. > Instant and duration as logical type > - > > Key: BEAM-9037 > URL: https://issues.apache.org/jira/browse/BEAM-9037 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The proto schema includes Timestamp and Duration with nano precision. The > logical types should be promoted to the core logical types, so they can be > handled on various IOs as standard mandatory conversions. > This means that the logical type should not use the proto-specific Timestamp and > Duration but the Java 8 Instant and Duration. > See discussion in the design document: > [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9037) Instant and duration as logical type
[ https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9037: - Summary: Instant and duration as logical type (was: Promote proto logical type and duration to the core logical types) > Instant and duration as logical type > - > > Key: BEAM-9037 > URL: https://issues.apache.org/jira/browse/BEAM-9037 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > > The proto schema includes Timestamp and Duration with nanosecond precision. The > logical types should be promoted to the core logical types, so they can be > handled on various IOs as standard mandatory conversions. > This means that the logical type should not use the proto-specific Timestamp and > Duration but the Java 8 Instant and Duration. > See discussion in the design document: > [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9037) Promote proto logical type and duration to the core logical types
[ https://issues.apache.org/jira/browse/BEAM-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9037: - Description: The proto schema includes Timestamp and Duration with nanosecond precision. The logical types should be promoted to the core logical types, so they can be handled on various IOs as standard mandatory conversions. This means that the logical type should not use the proto-specific Timestamp and Duration but the Java 8 Instant and Duration. See discussion in the design document: [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr] was: The proto schema includes Timestamp and Duration with nanosecond precision. The logical types should be promoted to the core logical types, so they can be handled on various IOs as standard mandatory conversions. See discussion in the design document: [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr] > Promote proto logical type and duration to the core logical types > - > > Key: BEAM-9037 > URL: https://issues.apache.org/jira/browse/BEAM-9037 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > > The proto schema includes Timestamp and Duration with nanosecond precision. The > logical types should be promoted to the core logical types, so they can be > handled on various IOs as standard mandatory conversions. > This means that the logical type should not use the proto-specific Timestamp and > Duration but the Java 8 Instant and Duration. > See discussion in the design document: > [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9037) Promote proto logical type and duration to the core logical types
Alex Van Boxel created BEAM-9037: Summary: Promote proto logical type and duration to the core logical types Key: BEAM-9037 URL: https://issues.apache.org/jira/browse/BEAM-9037 Project: Beam Issue Type: Task Components: sdk-java-core Reporter: Alex Van Boxel Assignee: Alex Van Boxel The proto schema includes Timestamp and Duration with nanosecond precision. The logical types should be promoted to the core logical types, so they can be handled on various IOs as standard mandatory conversions. See discussion in the design document: [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr] -- This message was sent by Atlassian Jira (v8.3.4#803005)
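The ticket above is about carrying nanosecond-precision timestamps and durations as Java 8 types. As a rough illustration only, the sketch below maps a seconds/nanos pair (the shape a proto-style Timestamp or Duration carries) onto java.time.Instant and java.time.Duration. The class NanoTimeSketch and its method names are hypothetical and are not taken from Beam or from the Protobuf library; only the java.time calls are real JDK API.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration; not part of Beam or Protobuf. */
final class NanoTimeSketch {
  private NanoTimeSketch() {}

  /** Builds an Instant from epoch seconds plus a nanosecond adjustment. */
  static Instant toInstant(long seconds, int nanos) {
    return Instant.ofEpochSecond(seconds, nanos);
  }

  /** Builds a Duration from whole seconds plus a nanosecond adjustment. */
  static Duration toDuration(long seconds, int nanos) {
    return Duration.ofSeconds(seconds, nanos);
  }

  public static void main(String[] args) {
    // Assumed sample values; 1_577_836_800 is 2020-01-01T00:00:00Z.
    Instant instant = toInstant(1_577_836_800L, 123_456_789);
    Duration duration = toDuration(90L, 500);

    // Both types keep the nanosecond component intact.
    System.out.println(instant.getEpochSecond() + "s + " + instant.getNano() + "ns");
    System.out.println(duration.getSeconds() + "s + " + duration.getNano() + "ns");
  }
}
```

Both java.time types store seconds and nanoseconds separately, so the sub-millisecond part of the value survives the conversion.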
[jira] [Resolved] (BEAM-9035) Typed options for Row Schema and Fields
[ https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-9035. -- Fix Version/s: 2.19.0 Resolution: Fixed Ready for review > Typed options for Row Schema and Fields > --- > > Key: BEAM-9035 > URL: https://issues.apache.org/jira/browse/BEAM-9035 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.19.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > This is the first issue of a multipart commit: this ticket implements the > basic infrastructure of options on row and field. > Full explanation: > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to metadata, options would be added to > fields, logical types and rows. With options, schema converters can add > options/annotations/decorators that were in the original schema; this context > can be used in the rest of the pipeline for specific transformations or to > augment the end schema in the target output. > Examples of options are: > * informational: like the source of the data, ... > * drive decisions further in the pipeline: flatten a row into another, > rename a field, ... > * influence something in the output: like cluster index, primary key, ... > * logical type information > An option is a key/typed-value combination. The advantages of having > typed values are: > * Having strongly typed options would give a *portable way of Logical Types* > to have structured information that could be shared across different languages. > * This could keep the type intact when mapping from formats that have > strongly typed options (example: Protobuf). > This is part of a multi-ticket implementation. 
The following tickets are > related: > # Typed options for Row Schema and Fields > # Convert Proto Options to Beam Schema options > # Convert Avro extra information for Beam string options > # Replace meta data with Logical Type options > # Extract meta data in Calcite SQL to Beam options > # Extract meta data in Zeta SQL to Beam options > # Add java example of using option in a transform > This feature is discussed with Reuven Lax, Brian Hulette -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (BEAM-9035) Typed options for Row Schema and Fields
[ https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-9035 started by Alex Van Boxel. > Typed options for Row Schema and Fields > --- > > Key: BEAM-9035 > URL: https://issues.apache.org/jira/browse/BEAM-9035 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > This is the first issue of a multipart commit: this ticket implements the > basic infrastructure of options on row and field. > Full explanation: > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to metadata, options would be added to > fields, logical types and rows. With options, schema converters can add > options/annotations/decorators that were in the original schema; this context > can be used in the rest of the pipeline for specific transformations or to > augment the end schema in the target output. > Examples of options are: > * informational: like the source of the data, ... > * drive decisions further in the pipeline: flatten a row into another, > rename a field, ... > * influence something in the output: like cluster index, primary key, ... > * logical type information > An option is a key/typed-value combination. The advantages of having > typed values are: > * Having strongly typed options would give a *portable way of Logical Types* > to have structured information that could be shared across different languages. > * This could keep the type intact when mapping from formats that have > strongly typed options (example: Protobuf). > This is part of a multi-ticket implementation. 
The following tickets are > related: > # Typed options for Row Schema and Fields > # Convert Proto Options to Beam Schema options > # Convert Avro extra information for Beam string options > # Replace meta data with Logical Type options > # Extract meta data in Calcite SQL to Beam options > # Extract meta data in Zeta SQL to Beam options > # Add java example of using option in a transform > This feature is discussed with Reuven Lax, Brian Hulette -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9035) Typed options for Row Schema and Fields
[ https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9035: - Status: Open (was: Triage Needed) > Typed options for Row Schema and Fields > --- > > Key: BEAM-9035 > URL: https://issues.apache.org/jira/browse/BEAM-9035 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > This is the first issue of a multipart commit: this ticket implements the > basic infrastructure of options on row and field. > Full explanation: > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to metadata, options would be added to > fields, logical types and rows. With options, schema converters can add > options/annotations/decorators that were in the original schema; this context > can be used in the rest of the pipeline for specific transformations or to > augment the end schema in the target output. > Examples of options are: > * informational: like the source of the data, ... > * drive decisions further in the pipeline: flatten a row into another, > rename a field, ... > * influence something in the output: like cluster index, primary key, ... > * logical type information > An option is a key/typed-value combination. The advantages of having > typed values are: > * Having strongly typed options would give a *portable way of Logical Types* > to have structured information that could be shared across different languages. > * This could keep the type intact when mapping from formats that have > strongly typed options (example: Protobuf). > This is part of a multi-ticket implementation. 
The following tickets are > related: > # Typed options for Row Schema and Fields > # Convert Proto Options to Beam Schema options > # Convert Avro extra information for Beam string options > # Replace meta data with Logical Type options > # Extract meta data in Calcite SQL to Beam options > # Extract meta data in Zeta SQL to Beam options > # Add java example of using option in a transform > This feature is discussed with Reuven Lax, Brian Hulette -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9035) Typed options for Row Schema and Fields
[ https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9035: - Description: This is the first issue of a multipart commit: this ticket implements the basic infrastructure of options on row and field. Full explanation: Introduce the concept of Options in Beam Schemas to add extra context to fields and schemas. In contrast to metadata, options would be added to fields, logical types and rows. With options, schema converters can add options/annotations/decorators that were in the original schema; this context can be used in the rest of the pipeline for specific transformations or to augment the end schema in the target output. Examples of options are: * informational: like the source of the data, ... * drive decisions further in the pipeline: flatten a row into another, rename a field, ... * influence something in the output: like cluster index, primary key, ... * logical type information An option is a key/typed-value combination. The advantages of having typed values are: * Having strongly typed options would give a *portable way of Logical Types* to have structured information that could be shared across different languages. * This could keep the type intact when mapping from formats that have strongly typed options (example: Protobuf). This is part of a multi-ticket implementation. The following tickets are related: # Typed options for Row Schema and Fields # Convert Proto Options to Beam Schema options # Convert Avro extra information for Beam string options # Replace meta data with Logical Type options # Extract meta data in Calcite SQL to Beam options # Extract meta data in Zeta SQL to Beam options # Add java example of using option in a transform This feature is discussed with Reuven Lax, Brian Hulette was: This is the first issue of a multipart commit: Introduce the concept of Options in Beam Schemas to add extra context to fields and schemas. 
In contrast to metadata, options would be added to fields, logical types and rows. With options, schema converters can add options/annotations/decorators that were in the original schema; this context can be used in the rest of the pipeline for specific transformations or to augment the end schema in the target output. Examples of options are: * informational: like the source of the data, ... * drive decisions further in the pipeline: flatten a row into another, rename a field, ... * influence something in the output: like cluster index, primary key, ... * logical type information An option is a key/typed-value combination. The advantages of having typed values are: * Having strongly typed options would give a *portable way of Logical Types* to have structured information that could be shared across different languages. * This could keep the type intact when mapping from formats that have strongly typed options (example: Protobuf). This is part of a multi-ticket implementation. The following tickets are related: # Typed options for Row Schema and Fields # Convert Proto Options to Beam Schema options # Convert Avro extra information for Beam string options # Replace meta data with Logical Type options # Extract meta data in Calcite SQL to Beam options # Extract meta data in Zeta SQL to Beam options # Add java example of using option in a transform This feature is discussed with Reuven Lax, Brian Hulette > Typed options for Row Schema and Fields > --- > > Key: BEAM-9035 > URL: https://issues.apache.org/jira/browse/BEAM-9035 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > > This is the first issue of a multipart commit: this ticket implements the > basic infrastructure of options on row and field. > Full explanation: > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. 
In contrast to metadata, options would be added to > fields, logical types and rows. With options, schema converters can add > options/annotations/decorators that were in the original schema; this context > can be used in the rest of the pipeline for specific transformations or to > augment the end schema in the target output. > Examples of options are: > * informational: like the source of the data, ... > * drive decisions further in the pipeline: flatten a row into another, > rename a field, ... > * influence something in the output: like cluster index, primary key, ... > * logical type information > An option is a key/typed-value combination. The advantages of having > typed values are: > * Having strongly typed options would give a *portable way of Logical Types* > to have structured information that could be s
[jira] [Updated] (BEAM-9035) Typed options for Row Schema and Fields
[ https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9035: - Description: This is the first issue of a multipart commit: Introduce the concept of Options in Beam Schemas to add extra context to fields and schemas. In contrast to metadata, options would be added to fields, logical types and rows. With options, schema converters can add options/annotations/decorators that were in the original schema; this context can be used in the rest of the pipeline for specific transformations or to augment the end schema in the target output. Examples of options are: * informational: like the source of the data, ... * drive decisions further in the pipeline: flatten a row into another, rename a field, ... * influence something in the output: like cluster index, primary key, ... * logical type information An option is a key/typed-value combination. The advantages of having typed values are: * Having strongly typed options would give a *portable way of Logical Types* to have structured information that could be shared across different languages. * This could keep the type intact when mapping from formats that have strongly typed options (example: Protobuf). This is part of a multi-ticket implementation. The following tickets are related: # Typed options for Row Schema and Fields # Convert Proto Options to Beam Schema options # Convert Avro extra information for Beam string options # Replace meta data with Logical Type options # Extract meta data in Calcite SQL to Beam options # Extract meta data in Zeta SQL to Beam options # Add java example of using option in a transform This feature is discussed with Reuven Lax, Brian Hulette was: Introduce the concept of Options in Beam Schemas to add extra context to fields and schemas. In contrast to metadata, options would be added to fields, logical types and rows. 
With options, schema converters can add options/annotations/decorators that were in the original schema; this context can be used in the rest of the pipeline for specific transformations or to augment the end schema in the target output. Examples of options are: * informational: like the source of the data, ... * drive decisions further in the pipeline: flatten a row into another, rename a field, ... * influence something in the output: like cluster index, primary key, ... * logical type information An option is a key/typed-value combination. The advantages of having typed values are: * Having strongly typed options would give a *portable way of Logical Types* to have structured information that could be shared across different languages. * This could keep the type intact when mapping from formats that have strongly typed options (example: Protobuf). This is part of a multi-ticket implementation. The following tickets are related: # Typed options for Row Schema and Fields # Convert Proto Options to Beam Schema options # Convert Avro extra information for Beam string options # Replace meta data with Logical Type options # Extract meta data in Calcite SQL to Beam options # Extract meta data in Zeta SQL to Beam options # Add java example of using option in a transform This feature is discussed with Reuven Lax, Brian Hulette > Typed options for Row Schema and Fields > --- > > Key: BEAM-9035 > URL: https://issues.apache.org/jira/browse/BEAM-9035 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > > This is the first issue of a multipart commit: > > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to metadata, options would be added to > fields, logical types and rows. 
With options, schema converters can add > options/annotations/decorators that were in the original schema; this context > can be used in the rest of the pipeline for specific transformations or to > augment the end schema in the target output. > Examples of options are: > * informational: like the source of the data, ... > * drive decisions further in the pipeline: flatten a row into another, > rename a field, ... > * influence something in the output: like cluster index, primary key, ... > * logical type information > An option is a key/typed-value combination. The advantages of having > typed values are: > * Having strongly typed options would give a *portable way of Logical Types* > to have structured information that could be shared across different languages. > * This could keep the type intact when mapping from formats that have > strongly typed options (example: Protobuf). > This is part of a multi-ticket implementation. The following tickets are > related:
[jira] [Updated] (BEAM-9035) Typed options for Row Schema and Fields
[ https://issues.apache.org/jira/browse/BEAM-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-9035: - Description: Introduce the concept of Options in Beam Schemas to add extra context to fields and schemas. In contrast to metadata, options would be added to fields, logical types and rows. With options, schema converters can add options/annotations/decorators that were in the original schema; this context can be used in the rest of the pipeline for specific transformations or to augment the end schema in the target output. Examples of options are: * informational: like the source of the data, ... * drive decisions further in the pipeline: flatten a row into another, rename a field, ... * influence something in the output: like cluster index, primary key, ... * logical type information An option is a key/typed-value combination. The advantages of having typed values are: * Having strongly typed options would give a *portable way of Logical Types* to have structured information that could be shared across different languages. * This could keep the type intact when mapping from formats that have strongly typed options (example: Protobuf). This is part of a multi-ticket implementation. The following tickets are related: # Typed options for Row Schema and Fields # Convert Proto Options to Beam Schema options # Convert Avro extra information for Beam string options # Replace meta data with Logical Type options # Extract meta data in Calcite SQL to Beam options # Extract meta data in Zeta SQL to Beam options # Add java example of using option in a transform This feature is discussed with Reuven Lax, Brian Hulette was: Introduce the concept of Options in Beam Schemas to add extra context to fields and schemas. In contrast to metadata, options would be added to fields, logical types and rows. 
With options, schema converters can add options/annotations/decorators that were in the original schema; this context can be used in the rest of the pipeline for specific transformations or to augment the end schema in the target output. Examples of options are: * informational: like the source of the data, ... * drive decisions further in the pipeline: flatten a row into another, rename a field, ... * influence something in the output: like cluster index, primary key, ... An option is a key/typed-value combination. The advantages of having typed values are: * Having strongly typed options would give a portable way of Logical Types to have structured information that could be shared across different languages. * This could keep the type intact when mapping from formats that have strongly typed options (example: Protobuf). This is part of a multi-ticket implementation. The following tickets are related: # Typed options for Row Schema and Fields # Convert Proto Options to Beam Schema options # Convert Avro extra information for Beam string options # Replace meta data with Logical Type options # Extract meta data in Calcite SQL to Beam options # Extract meta data in Zeta SQL to Beam options This feature is discussed with Reuven Lax, Brian Hulette > Typed options for Row Schema and Fields > --- > > Key: BEAM-9035 > URL: https://issues.apache.org/jira/browse/BEAM-9035 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > > Introduce the concept of Options in Beam Schemas to add extra context to > fields and schemas. In contrast to metadata, options would be added to > fields, logical types and rows. With options, schema converters can add > options/annotations/decorators that were in the original schema; this context > can be used in the rest of the pipeline for specific transformations or to > augment the end schema in the target output. 
> Examples of options are: > * informational: like the source of the data, ... > * drive decisions further in the pipeline: flatten a row into another, > rename a field, ... > * influence something in the output: like cluster index, primary key, ... > * logical type information > An option is a key/typed-value combination. The advantages of having > typed values are: > * Having strongly typed options would give a *portable way of Logical Types* > to have structured information that could be shared across different languages. > * This could keep the type intact when mapping from formats that have > strongly typed options (example: Protobuf). > This is part of a multi-ticket implementation. The following tickets are > related: > # Typed options for Row Schema and Fields > # Convert Proto Options to Beam Schema options > # Convert Avro extra information for Beam string options > # Replace meta data with Logi
[jira] [Created] (BEAM-9035) Typed options for Row Schema and Fields
Alex Van Boxel created BEAM-9035: Summary: Typed options for Row Schema and Fields Key: BEAM-9035 URL: https://issues.apache.org/jira/browse/BEAM-9035 Project: Beam Issue Type: Task Components: sdk-java-core Reporter: Alex Van Boxel Assignee: Alex Van Boxel Introduce the concept of Options in Beam Schemas to add extra context to fields and schemas. In contrast to metadata, options would be added to fields, logical types and rows. With options, schema converters can add options/annotations/decorators that were in the original schema; this context can be used in the rest of the pipeline for specific transformations or to augment the end schema in the target output. Examples of options are: * informational: like the source of the data, ... * drive decisions further in the pipeline: flatten a row into another, rename a field, ... * influence something in the output: like cluster index, primary key, ... An option is a key/typed-value combination. The advantages of having typed values are: * Having strongly typed options would give a portable way of Logical Types to have structured information that could be shared across different languages. * This could keep the type intact when mapping from formats that have strongly typed options (example: Protobuf). This is part of a multi-ticket implementation. The following tickets are related: # Typed options for Row Schema and Fields # Convert Proto Options to Beam Schema options # Convert Avro extra information for Beam string options # Replace meta data with Logical Type options # Extract meta data in Calcite SQL to Beam options # Extract meta data in Zeta SQL to Beam options This feature is discussed with Reuven Lax, Brian Hulette -- This message was sent by Atlassian Jira (v8.3.4#803005)
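To make the "key/typed-value combination" described in the ticket above more concrete, here is a minimal, self-contained sketch. It is not the Beam API: the class TypedOptionsSketch, the OptionType enum and the option names ("source", "cluster_index", "primary_key") are all hypothetical, chosen only to mirror the examples listed in the ticket. The point it tries to show is that each option keeps a declared type next to its value, so a mismatch is rejected when the option is set rather than later in the pipeline.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical sketch of key/typed-value options; not the Beam API. */
final class TypedOptionsSketch {
  /** Value types supported by this sketch, each tied to a Java class. */
  enum OptionType {
    STRING(String.class),
    INT64(Long.class),
    BOOLEAN(Boolean.class);

    final Class<?> javaClass;

    OptionType(Class<?> javaClass) {
      this.javaClass = javaClass;
    }
  }

  /** An option value stored together with its declared type. */
  private static final class TypedValue {
    final OptionType type;
    final Object value;

    TypedValue(OptionType type, Object value) {
      this.type = type;
      this.value = value;
    }
  }

  private final Map<String, TypedValue> options = new LinkedHashMap<>();

  /** Stores an option, rejecting values that do not match the declared type. */
  TypedOptionsSketch set(String name, OptionType type, Object value) {
    if (!type.javaClass.isInstance(value)) {
      throw new IllegalArgumentException(
          "Option '" + name + "' is declared as " + type + " but got: " + value);
    }
    options.put(name, new TypedValue(type, value));
    return this;
  }

  /** Returns the declared type of an option. */
  OptionType typeOf(String name) {
    return lookup(name).type;
  }

  /** Returns the option value cast to the expected Java class. */
  <T> T get(String name, Class<T> expected) {
    return expected.cast(lookup(name).value);
  }

  private TypedValue lookup(String name) {
    TypedValue typed = options.get(name);
    if (typed == null) {
      throw new NoSuchElementException("No option named " + name);
    }
    return typed;
  }

  public static void main(String[] args) {
    TypedOptionsSketch options =
        new TypedOptionsSketch()
            .set("source", OptionType.STRING, "orders.proto")
            .set("cluster_index", OptionType.INT64, 2L)
            .set("primary_key", OptionType.BOOLEAN, true);

    long clusterIndex = options.get("cluster_index", Long.class);
    System.out.println(options.typeOf("cluster_index") + " " + clusterIndex);
  }
}
```

A converter for a strongly typed source format could, under these assumptions, copy each source option into such a structure without flattening it to a string first.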
[jira] [Commented] (BEAM-8174) BigQueryIO clustering documentation is incorrect and lacking
[ https://issues.apache.org/jira/browse/BEAM-8174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978167#comment-16978167 ] Alex Van Boxel commented on BEAM-8174: -- I've removed the fix version on this > BigQueryIO clustering documentation is incorrect and lacking > > > Key: BEAM-8174 > URL: https://issues.apache.org/jira/browse/BEAM-8174 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.15.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Trivial > Labels: documentation > Original Estimate: 2h > Remaining Estimate: 2h > > I noticed that the Javadoc of the clustering feature in BigQueryIO is largely a > copy/paste from the timestamp method. This needs to be corrected. > The Clustering option should also be added to the BigQueryIO page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-8174) BigQueryIO clustering documentation is incorrect and lacking
[ https://issues.apache.org/jira/browse/BEAM-8174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-8174: - Fix Version/s: (was: 2.17.0) > BigQueryIO clustering documentation is incorrect and lacking > > > Key: BEAM-8174 > URL: https://issues.apache.org/jira/browse/BEAM-8174 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.15.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Trivial > Labels: documentation > Original Estimate: 2h > Remaining Estimate: 2h > > I noticed that the Javadoc of the clustering feature in BigQueryIO is largely a > copy/paste from the timestamp method. This needs to be corrected. > The Clustering option should also be added to the BigQueryIO page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-7274: - Status: Open (was: Triage Needed) > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 7h 50m > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-7274: - Fix Version/s: (was: 2.17.0) > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Time Spent: 7h 50m > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-7274 started by Alex Van Boxel. > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 7h 50m > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel reopened BEAM-7274: -- Reopening as new comments on the PR still need to be resolved > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 7h 50m > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-8218) Implement Apache PulsarIO
Alex Van Boxel created BEAM-8218: Summary: Implement Apache PulsarIO Key: BEAM-8218 URL: https://issues.apache.org/jira/browse/BEAM-8218 Project: Beam Issue Type: Task Components: io-ideas Reporter: Alex Van Boxel Assignee: Alex Van Boxel Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO could be beneficial. [https://pulsar.apache.org/|https://pulsar.apache.org/en/] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-7274: - Fix Version/s: (was: 2.16.0) 2.17.0 Moved to 2.17.0 > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 6h 50m > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-5967) ProtoCoder doesn't support DynamicMessage
[ https://issues.apache.org/jira/browse/BEAM-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-5967: - Fix Version/s: (was: 2.16.0) 2.17.0 Moved to 2.17.0 > ProtoCoder doesn't support DynamicMessage > - > > Key: BEAM-5967 > URL: https://issues.apache.org/jira/browse/BEAM-5967 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Affects Versions: 2.8.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.17.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > The ProtoCoder makes some assumptions about static message classes being > available. DynamicMessage doesn't satisfy some of them, mainly because the > proto schema is defined at runtime and not at compile time. > Does it make sense to make a special coder for DynamicMessage or to build it > into the normal ProtoCoder? > Here is an example of the assumption being made in the current Codec: > {code:java} > try { > @SuppressWarnings("unchecked") > T protoMessageInstance = (T) > protoMessageClass.getMethod("getDefaultInstance").invoke(null); > @SuppressWarnings("unchecked") > Parser<T> tParser = (Parser<T>) protoMessageInstance.getParserForType(); > memoizedParser = tParser; > } catch (IllegalAccessException | InvocationTargetException | > NoSuchMethodException e) { > throw new IllegalArgumentException(e); > } > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
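The assumption in the quoted snippet can be reproduced without protobuf on the classpath. The sketch below is only an illustration: `GeneratedLike` and `DynamicLike` are hypothetical stand-ins, not real protobuf classes. It assumes that a generated message class exposes a static no-argument `getDefaultInstance()`, while `DynamicMessage` only offers `getDefaultInstance(Descriptor)`, so the no-argument reflective lookup cannot succeed for it.

```java
import java.lang.reflect.Method;

// Hypothetical stand-ins for illustration; these are not the protobuf classes.
final class DefaultInstanceLookup {

  /** Mimics a generated message: static, no-argument getDefaultInstance(). */
  public static final class GeneratedLike {
    public static GeneratedLike getDefaultInstance() {
      return new GeneratedLike();
    }
  }

  /** Mimics DynamicMessage: the default instance needs a runtime descriptor. */
  public static final class DynamicLike {
    public static DynamicLike getDefaultInstance(String descriptor) {
      return new DynamicLike();
    }
  }

  /** Returns true if the no-argument reflective lookup finds and invokes the method. */
  static boolean hasNoArgDefaultInstance(Class<?> messageClass) {
    try {
      Method method = messageClass.getMethod("getDefaultInstance");
      return method.invoke(null) != null;
    } catch (ReflectiveOperationException e) {
      // NoSuchMethodException lands here when only a descriptor-taking overload exists.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(hasNoArgDefaultInstance(GeneratedLike.class));
    System.out.println(hasNoArgDefaultInstance(DynamicLike.class));
  }
}
```

A coder for dynamic messages would therefore need the descriptor supplied at construction time rather than discovered through the class alone.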
[jira] [Updated] (BEAM-8174) BigQueryIO clustering documentation is incorrect and lacking
[ https://issues.apache.org/jira/browse/BEAM-8174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-8174: - Status: Open (was: Triage Needed) > BigQueryIO clustering documentation is incorrect and lacking > > > Key: BEAM-8174 > URL: https://issues.apache.org/jira/browse/BEAM-8174 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.15.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Trivial > Labels: documentation > Fix For: 2.17.0 > > Original Estimate: 2h > Remaining Estimate: 2h > > I noticed that the Java doc of the clustering feature in BigQueryIO is more a > copy/paste from the timestamp method. This needs to be corrected. > The Clustering option should also be added to the BigQueryIO page. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work started] (BEAM-8174) BigQueryIO clustering documentation is incorrect and lacking
[ https://issues.apache.org/jira/browse/BEAM-8174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-8174 started by Alex Van Boxel. > BigQueryIO clustering documentation is incorrect and lacking > > > Key: BEAM-8174 > URL: https://issues.apache.org/jira/browse/BEAM-8174 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.15.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Trivial > Labels: documentation > Fix For: 2.17.0 > > Original Estimate: 2h > Remaining Estimate: 2h > > I noticed that the Java doc of the clustering feature in BigQueryIO is more a > copy/paste from the timestamp method. This needs to be corrected. > The Clustering option should also be added to the BigQueryIO page. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (BEAM-8174) BigQueryIO clustering documentation is incorrect and lacking
Alex Van Boxel created BEAM-8174: Summary: BigQueryIO clustering documentation is incorrect and lacking Key: BEAM-8174 URL: https://issues.apache.org/jira/browse/BEAM-8174 Project: Beam Issue Type: Bug Components: io-java-gcp Affects Versions: 2.15.0 Reporter: Alex Van Boxel Assignee: Alex Van Boxel Fix For: 2.17.0 I noticed that the Java doc of the clustering feature in BigQueryIO is more a copy/paste from the timestamp method. This needs to be corrected. The Clustering option should also be added to the BigQueryIO page. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Resolved] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-7274. -- Fix Version/s: 2.16.0 Resolution: Fixed > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Fix For: 2.16.0 > > Time Spent: 6h 20m > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920033#comment-16920033 ] Alex Van Boxel commented on BEAM-7274: -- PR ready for review > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Time Spent: 6h > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Resolved] (BEAM-5967) ProtoCoder doesn't support DynamicMessage
[ https://issues.apache.org/jira/browse/BEAM-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-5967. -- Fix Version/s: 2.16.0 Resolution: Fixed Object equality is now handled by ProtoDomain. Upgradability is tested from 2.14.0 to 2.16.0-SNAPSHOT. Waiting for reviewers. > ProtoCoder doesn't support DynamicMessage > - > > Key: BEAM-5967 > URL: https://issues.apache.org/jira/browse/BEAM-5967 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Affects Versions: 2.8.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.16.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > The ProtoCoder makes some assumptions about static message classes being > available. DynamicMessage doesn't satisfy some of them, mainly because the > proto schema is defined at runtime and not at compile time. > Does it make sense to make a special coder for DynamicMessage or to build it > into the normal ProtoCoder? > Here is an example of the assumption being made in the current Codec: > {code:java} > try { > @SuppressWarnings("unchecked") > T protoMessageInstance = (T) > protoMessageClass.getMethod("getDefaultInstance").invoke(null); > @SuppressWarnings("unchecked") > Parser<T> tParser = (Parser<T>) protoMessageInstance.getParserForType(); > memoizedParser = tParser; > } catch (IllegalAccessException | InvocationTargetException | > NoSuchMethodException e) { > throw new IllegalArgumentException(e); > } > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Closed] (BEAM-7312) SchemaProvider can't be used with dynamic types
[ https://issues.apache.org/jira/browse/BEAM-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel closed BEAM-7312. Fix Version/s: 2.14.0 Resolution: Won't Fix This is not the right mechanism for handling dynamic types. Closing with Won't Fix. > SchemaProvider can't be used with dynamic types > --- > > Key: BEAM-7312 > URL: https://issues.apache.org/jira/browse/BEAM-7312 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.14.0 > > > The Javadoc comment of SchemaProvider hints at getting schemas from an > external system. But as the provider only has access to the Java type, this is > in general impossible: > Say you have two dynamic types, for example Avro records; as Java types both are > GenericRecord. With the current interface it is impossible to tell the two > dynamic types apart. > To get information from an external system, I propose extending the > Provider interface with an extra parameter: a string holding a URN. > The URN could indicate, for example: > * Pub/Sub subscription/topic > * Kafka topic > * whatever... -- This message was sent by Atlassian Jira (v8.3.2#803003)
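The proposal in this ticket (closed as Won't Fix) can be pictured with a small, self-contained sketch. Everything here is hypothetical: `UrnSchemaLookup`, `GenericRecordLike` and the URN strings are invented for illustration, and schemas are reduced to plain strings. It only shows why a lookup keyed by Java type alone cannot separate two dynamic types, while a (type, URN) key can.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of the proposed extra URN parameter; not a Beam API.
final class UrnSchemaLookup {

  /** Stand-in for a dynamic record type such as Avro's GenericRecord. */
  public static final class GenericRecordLike {}

  // Schema descriptions keyed by Java type name plus URN.
  private final Map<String, String> schemas = new HashMap<>();

  void register(Class<?> type, String urn, String schema) {
    schemas.put(key(type, urn), schema);
  }

  /** Looks up a schema using both the Java type and the URN of its source. */
  Optional<String> schemaFor(Class<?> type, String urn) {
    return Optional.ofNullable(schemas.get(key(type, urn)));
  }

  private static String key(Class<?> type, String urn) {
    return type.getName() + "|" + urn;
  }

  public static void main(String[] args) {
    UrnSchemaLookup lookup = new UrnSchemaLookup();
    // Both sources produce the same Java type; only the URN tells them apart.
    lookup.register(GenericRecordLike.class, "urn:example:kafka:orders", "order_id, amount");
    lookup.register(GenericRecordLike.class, "urn:example:pubsub:clicks", "user_id, url");
    System.out.println(lookup.schemaFor(GenericRecordLike.class, "urn:example:kafka:orders"));
    System.out.println(lookup.schemaFor(GenericRecordLike.class, "urn:example:pubsub:clicks"));
  }
}
```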
[jira] [Resolved] (BEAM-7999) BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
[ https://issues.apache.org/jira/browse/BEAM-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-7999. -- Fix Version/s: 2.16.0 Resolution: Fixed PR merged > BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly > --- > > Key: BEAM-7999 > URL: https://issues.apache.org/jira/browse/BEAM-7999 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.14.0, 2.15.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.16.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Using the new readTableRowsWithSchema to make a copy of a table (simple > operation), parsing the timestamp in the table doesn't work as it assumes a > Double value. BigQuery outputs a string like "2019-08-16 00:12:00.123456 > UTC". This isn't handled. > *Reproducible:* > with this table > {code:java} > INSERT `research.alex.in1` (row_id, f_int64, f_timestamp) > VALUES > (1, 1, '2019-08-16 00:12:00 UTC'), > (2, 2, '2019-08-16 00:12:00.123 UTC'), > (3, 3, '2019-08-16 00:12:00.123456 UTC') > {code} > do a copy operation: > {code:java} > pipeline > .apply( > BigQueryIO.readTableRowsWithSchema() > .from("research:alex.in1") > //.withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ) > ) > .apply(ParDo.of(new Inspect())) > .apply( > BigQueryIO.writeTableRows() > > .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED) > .withMethod(BigQueryIO.Write.Method.FILE_LOADS) > .useBeamSchema() > .to("research:alex.out4")); > {code} > -- This message was sent by Atlassian Jira (v8.3.2#803003)
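The three timestamp strings in the reproduction differ only in their fractional seconds (none, milliseconds, microseconds). As a rough sketch, and not the fix that was merged for this ticket, the following standalone code shows one way such strings could be parsed with `java.time`. The class name `BigQueryTimestampSketch` is hypothetical, and the sketch assumes the value always ends in the literal ` UTC` with at most six fractional digits, as in the examples above.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

// Hypothetical helper for illustration; not part of BigQueryIO.
final class BigQueryTimestampSketch {

  // Date and time, an optional fraction of 0 to 6 digits, then the literal " UTC".
  private static final DateTimeFormatter FORMAT =
      new DateTimeFormatterBuilder()
          .appendPattern("uuuu-MM-dd HH:mm:ss")
          .appendFraction(ChronoField.NANO_OF_SECOND, 0, 6, true)
          .appendLiteral(" UTC")
          .toFormatter(Locale.ROOT);

  /** Parses a string such as "2019-08-16 00:12:00.123456 UTC" into an Instant. */
  static Instant parse(String text) {
    return LocalDateTime.parse(text, FORMAT).toInstant(ZoneOffset.UTC);
  }

  public static void main(String[] args) {
    System.out.println(parse("2019-08-16 00:12:00 UTC"));
    System.out.println(parse("2019-08-16 00:12:00.123 UTC"));
    System.out.println(parse("2019-08-16 00:12:00.123456 UTC"));
  }
}
```

Note that `java.time.Instant` keeps the microseconds, whereas a millisecond-precision type would drop the last three digits of the third value.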
[jira] [Closed] (BEAM-7999) BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
[ https://issues.apache.org/jira/browse/BEAM-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel closed BEAM-7999. > BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly > --- > > Key: BEAM-7999 > URL: https://issues.apache.org/jira/browse/BEAM-7999 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.14.0, 2.15.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.16.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Using the new readTableRowsWithSchema to make a copy of a table (simple > operation), parsing the timestamp in the table doesn't work as it assumes a > Double value. BigQuery outputs a string like "2019-08-16 00:12:00.123456 > UTC". This isn't handled. > *Reproducible:* > with this table > {code:java} > INSERT `research.alex.in1` (row_id, f_int64, f_timestamp) > VALUES > (1, 1, '2019-08-16 00:12:00 UTC'), > (2, 2, '2019-08-16 00:12:00.123 UTC'), > (3, 3, '2019-08-16 00:12:00.123456 UTC') > {code} > do a copy operation: > {code:java} > pipeline > .apply( > BigQueryIO.readTableRowsWithSchema() > .from("research:alex.in1") > //.withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ) > ) > .apply(ParDo.of(new Inspect())) > .apply( > BigQueryIO.writeTableRows() > > .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED) > .withMethod(BigQueryIO.Write.Method.FILE_LOADS) > .useBeamSchema() > .to("research:alex.out4")); > {code} > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Closed] (BEAM-7426) FieldSpecifierNotationLexer should support underscore as field character
[ https://issues.apache.org/jira/browse/BEAM-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel closed BEAM-7426. Part of 2.14 release > FieldSpecifierNotationLexer should support underscore as field character > > > Key: BEAM-7426 > URL: https://issues.apache.org/jira/browse/BEAM-7426 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.14.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Underscore is a commonly used word delimiter in field names, but the current > FieldSpecifierNotationLexer only supports alphanumeric characters in field > names. > The upcoming Protobuf schema support will emit underscores in the field > names, so field names should support underscores. -- This message was sent by Atlassian Jira (v8.3.2#803003)
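The change requested here amounts to widening the character class accepted for field names. The ticket does not show the lexer rules themselves, so the regular expressions below are only stand-ins (the class `FieldNameSketch` and both patterns are hypothetical) that contrast an alphanumeric-only rule with one that also admits underscores.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the lexer's field-name rule; not the Beam grammar.
final class FieldNameSketch {

  // Rule as described in the ticket: letters and digits only.
  static final Pattern ALPHANUMERIC = Pattern.compile("[A-Za-z0-9]+");

  // Widened rule: underscore is accepted as a field-name character.
  static final Pattern WITH_UNDERSCORE = Pattern.compile("[A-Za-z0-9_]+");

  /** Returns true if the whole name matches the given rule. */
  static boolean isFieldName(Pattern rule, String name) {
    return rule.matcher(name).matches();
  }

  public static void main(String[] args) {
    // A typical Protobuf-style field name with an underscore.
    System.out.println(isFieldName(ALPHANUMERIC, "user_id"));
    System.out.println(isFieldName(WITH_UNDERSCORE, "user_id"));
  }
}
```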
[jira] [Resolved] (BEAM-7426) FieldSpecifierNotationLexer should support underscore as field character
[ https://issues.apache.org/jira/browse/BEAM-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel resolved BEAM-7426. -- Fix Version/s: 2.14.0 Resolution: Fixed > FieldSpecifierNotationLexer should support underscore as field character > > > Key: BEAM-7426 > URL: https://issues.apache.org/jira/browse/BEAM-7426 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Fix For: 2.14.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Underscore is a commonly used word delimiter in field names, but the current > FieldSpecifierNotationLexer only supports alphanumeric characters in field > names. > The upcoming Protobuf schema support will emit underscores in the field > names, so field names should support underscores. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (BEAM-7274) Protobuf Beam Schema support
[ https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913097#comment-16913097 ] Alex Van Boxel commented on BEAM-7274: -- Picking this up again now that I'm back from holidays. Hopefully we can then get the pull request out in a reasonable time. > Protobuf Beam Schema support > > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Minor > Time Spent: 4h 50m > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work started] (BEAM-7999) BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
[ https://issues.apache.org/jira/browse/BEAM-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-7999 started by Alex Van Boxel. > BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly > --- > > Key: BEAM-7999 > URL: https://issues.apache.org/jira/browse/BEAM-7999 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.14.0, 2.15.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Using the new readTableRowsWithSchema to make a copy of a table (simple > operation), parsing the timestamp in the table doesn't work as it assumes a > Double value. BigQuery outputs a string like "2019-08-16 00:12:00.123456 > UTC". This isn't handled. > *Reproducible:* > with this table > {code:java} > INSERT `research.alex.in1` (row_id, f_int64, f_timestamp) > VALUES > (1, 1, '2019-08-16 00:12:00 UTC'), > (2, 2, '2019-08-16 00:12:00.123 UTC'), > (3, 3, '2019-08-16 00:12:00.123456 UTC') > {code} > do a copy operation: > {code:java} > pipeline > .apply( > BigQueryIO.readTableRowsWithSchema() > .from("research:alex.in1") > //.withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ) > ) > .apply(ParDo.of(new Inspect())) > .apply( > BigQueryIO.writeTableRows() > > .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED) > .withMethod(BigQueryIO.Write.Method.FILE_LOADS) > .useBeamSchema() > .to("research:alex.out4")); > {code} > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-7999) BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
[ https://issues.apache.org/jira/browse/BEAM-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-7999: - Issue Type: Bug (was: Task) > BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly > --- > > Key: BEAM-7999 > URL: https://issues.apache.org/jira/browse/BEAM-7999 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.14.0, 2.15.0 >Reporter: Alex Van Boxel >Assignee: Alex Van Boxel >Priority: Major > > Using the new readTableRowsWithSchema to make a copy of a table (simple > operation), parsing the timestamp in the table doesn't work as it assumes a > Double value. BigQuery outputs a string like "2019-08-16 00:12:00.123456 > UTC". This isn't handled. > *Reproducible:* > with this table > {code:java} > INSERT `research.alex.in1` (row_id, f_int64, f_timestamp) > VALUES > (1, 1, '2019-08-16 00:12:00 UTC'), > (2, 2, '2019-08-16 00:12:00.123 UTC'), > (3, 3, '2019-08-16 00:12:00.123456 UTC') > {code} > do a copy operation: > {code:java} > pipeline > .apply( > BigQueryIO.readTableRowsWithSchema() > .from("research:alex.in1") > //.withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ) > ) > .apply(ParDo.of(new Inspect())) > .apply( > BigQueryIO.writeTableRows() > > .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED) > .withMethod(BigQueryIO.Write.Method.FILE_LOADS) > .useBeamSchema() > .to("research:alex.out4")); > {code} > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (BEAM-7999) BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
Alex Van Boxel created BEAM-7999: Summary: BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly Key: BEAM-7999 URL: https://issues.apache.org/jira/browse/BEAM-7999 Project: Beam Issue Type: Task Components: io-java-gcp Affects Versions: 2.14.0, 2.15.0 Reporter: Alex Van Boxel Assignee: Alex Van Boxel Using the new readTableRowsWithSchema to make a copy of a table (simple operation), parsing the timestamp in the table doesn't work as it assumes a Double value. BigQuery outputs a string like "2019-08-16 00:12:00.123456 UTC". This isn't handled. *Reproducible:* with this table {code:java} INSERT `research.alex.in1` (row_id, f_int64, f_timestamp) VALUES (1, 1, '2019-08-16 00:12:00 UTC'), (2, 2, '2019-08-16 00:12:00.123 UTC'), (3, 3, '2019-08-16 00:12:00.123456 UTC') {code} do a copy operation: {code:java} pipeline .apply( BigQueryIO.readTableRowsWithSchema() .from("research:alex.in1") //.withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ) ) .apply(ParDo.of(new Inspect())) .apply( BigQueryIO.writeTableRows() .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED) .withMethod(BigQueryIO.Write.Method.FILE_LOADS) .useBeamSchema() .to("research:alex.out4")); {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (BEAM-7518) Protobuf Schema: Introduce logical type for Timestamp, Duration and other
Alex Van Boxel created BEAM-7518: Summary: Protobuf Schema: Introduce logical type for Timestamp, Duration and other Key: BEAM-7518 URL: https://issues.apache.org/jira/browse/BEAM-7518 Project: Beam Issue Type: Task Components: sdk-java-core Reporter: Alex Van Boxel Assignee: Alex Van Boxel The Protobuf Schema provider has lossy conversions for some Proto types. Introduce Logical Types for: Timestamp, Duration and Unsigned Int64 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
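The ticket does not spell out which conversions lose information, so the sketch below rests on two assumptions: that a Proto Timestamp carries seconds plus nanoseconds and gets mapped to a millisecond-precision value, and that an unsigned 64-bit value gets stored in a signed `long`. The class `LossyConversionSketch` is hypothetical and uses only the JDK; it is meant to show the kind of loss a dedicated logical type would avoid, not how the Protobuf schema provider is implemented.

```java
import java.time.Instant;

// Hypothetical illustration of lossy conversions; not the Beam implementation.
final class LossyConversionSketch {

  /** Truncates a (seconds, nanos) pair to millisecond precision. */
  static Instant toMillisPrecision(long seconds, int nanos) {
    long millis = Instant.ofEpochSecond(seconds, nanos).toEpochMilli();
    return Instant.ofEpochMilli(millis);
  }

  /** Renders the 64 bits of a long as an unsigned decimal number. */
  static String asUnsigned(long bits) {
    return Long.toUnsignedString(bits);
  }

  public static void main(String[] args) {
    // The sub-millisecond digits (456789 ns) are lost in the conversion.
    System.out.println(toMillisPrecision(1L, 123_456_789).getNano());
    // The same bits read as signed and as unsigned give different numbers.
    System.out.println(-1L + " vs " + asUnsigned(-1L));
  }
}
```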
[jira] [Issue Comment Deleted] (BEAM-4455) Provide automatic schema registration for Protos
[ https://issues.apache.org/jira/browse/BEAM-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Van Boxel updated BEAM-4455: - Comment: was deleted (was: Protobuf support is almost ready for PR. I'll be submitting it under this ticket and close BEAM-7274 as duplicate. That means assigning this ticket to me though. Agreed?) > Provide automatic schema registration for Protos > > > Key: BEAM-4455 > URL: https://issues.apache.org/jira/browse/BEAM-4455 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Shehzaad Nakhoda >Priority: Major > > Need to make sure this is a compatible change -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4455) Provide automatic schema registration for Protos
[ https://issues.apache.org/jira/browse/BEAM-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848189#comment-16848189 ] Alex Van Boxel commented on BEAM-4455: -- BEAM-7274 will be used for the implementation of the schema support, and this ticket for the integration. It is best to keep the two concerns separate. > Provide automatic schema registration for Protos > > > Key: BEAM-4455 > URL: https://issues.apache.org/jira/browse/BEAM-4455 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Shehzaad Nakhoda >Priority: Major > > Need to make sure this is a compatible change -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7426) FieldSpecifierNotationLexer should support underscore as field character
Alex Van Boxel created BEAM-7426: Summary: FieldSpecifierNotationLexer should support underscore as field character Key: BEAM-7426 URL: https://issues.apache.org/jira/browse/BEAM-7426 Project: Beam Issue Type: Improvement Components: sdk-java-core Reporter: Alex Van Boxel Assignee: Alex Van Boxel Underscore is a commonly used word delimiter in field names, but the current FieldSpecifierNotationLexer only supports alphanumeric characters in field names. The upcoming Protobuf schema support will emit underscores in the field names, so field names should support underscores. -- This message was sent by Atlassian JIRA (v7.6.3#76005)