[
https://issues.apache.org/jira/browse/BEAM-7274?focusedWorklogId=319773&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-319773
]
ASF GitHub Bot logged work on BEAM-7274:
----------------------------------------
Author: ASF GitHub Bot
Created on: 27/Sep/19 19:51
Start Date: 27/Sep/19 19:51
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on pull request #8690:
[BEAM-7274] Implement the Protobuf schema provider
URL: https://github.com/apache/beam/pull/8690#discussion_r329227136
##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java
##########
@@ -554,6 +555,12 @@ public Builder withFieldValueGetters(
return this;
}
+ /** The FieldValueGetters will handle the conversion for Arrays, Maps and
Rows. */
+ public Builder withFieldValueGettersHandleCollections(boolean
collectionHandledByGetter) {
+ this.collectionHandledByGetter = collectionHandledByGetter;
+ return this;
+ }
Review comment:
@reuvenlax is this the last remaining concern you have for this PR?
https://github.com/apache/beam/pull/8690#issuecomment-497695744 is a good
reference for the motivation of this as well.
I'm not crazy about the way this is implemented since it's adding state to
`RowWithGetter` that will get checked every time a collection field is
accessed. I can't think of a better way to do it without some non-trivial
refactoring though. Some ideas:
1. Rather than storing `collectionHandledByGetter` in `RowWithGetters` and
changing behavior based on it, have two alternate `RowWithGetters`
implementations, one with the special handling for collections and one without.
I think this is still an improvement over the separate `ProtoRow` class that
was rejected since it's not explicitly tied to a particular `SchemaProvider`,
in fact it sounds like it could be re-used for Avro `GenericRecord` instances.
1. Move the special logic for collection getters into those
`SchemaProvider`/`FieldValueGetter` implementations that need it, and make
`RowWithGetters` always behave as if `collectionHandledByGetter` is true. I
think this could be a cleaner approach? But it's challenging because that
special logic stores cached values that wouldn't be appropriate to move into
the `FieldValueGetter` implementations. Maybe a `SchemaProvider` could have
some way to indicate that a certain row is expensive to access and should be
cached in `RowWithGetters`?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 319773)
Time Spent: 7h (was: 6h 50m)
> Protobuf Beam Schema support
> ----------------------------
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-core
> Reporter: Alex Van Boxel
> Assignee: Alex Van Boxel
> Priority: Minor
> Fix For: 2.17.0
>
> Time Spent: 7h
> Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)