[ https://issues.apache.org/jira/browse/BEAM-7274?focusedWorklogId=319773&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-319773 ]
ASF GitHub Bot logged work on BEAM-7274: ---------------------------------------- Author: ASF GitHub Bot Created on: 27/Sep/19 19:51 Start Date: 27/Sep/19 19:51 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #8690: [BEAM-7274] Implement the Protobuf schema provider URL: https://github.com/apache/beam/pull/8690#discussion_r329227136 ########## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ########## @@ -554,6 +555,12 @@ public Builder withFieldValueGetters( return this; } + /** The FieldValueGetters will handle the conversion for Arrays, Maps and Rows. */ + public Builder withFieldValueGettersHandleCollections(boolean collectionHandledByGetter) { + this.collectionHandledByGetter = collectionHandledByGetter; + return this; + } Review comment: @reuvenlax is this the last remaining concern you have for this PR? https://github.com/apache/beam/pull/8690#issuecomment-497695744 is a good reference for the motivation of this as well. I'm not crazy about the way this is implemented since it's adding state to `RowWithGetter` that will get checked every time a collection field is accessed. I can't think of a better way to do it without some non-trivial refactoring though. Some ideas: 1. Rather than storing `collectionHandledByGetter` in `RowWithGetters` and changing behavior based on it, have two alternate `RowWithGetters` implementations, one with the special handling for collections and one without. I think this is still an improvement over the separate `ProtoRow` class that was rejected since it's not explicitly tied to a particular `SchemaProvider`, in fact it sounds like it could be re-used for Avro `GenericRecord` instances. 1. Move the special logic for collection getters into those `SchemaProvider`/`FieldValueGetter` implementations that need it, and make `RowWithGetters` always behave as if `collectionHandledByGetter` is true. I think this could be a cleaner approach? But it's challenging because that special logic stores cached values that wouldn't be appropriate to move into the `FieldValueGetter` implementations. Maybe a `SchemaProvider` could have some way to indicate that a certain row is expensive to access and should be cached in `RowWithGetters`? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking ------------------- Worklog Id: (was: 319773) Time Spent: 7h (was: 6h 50m) > Protobuf Beam Schema support > ---------------------------- > > Key: BEAM-7274 > URL: https://issues.apache.org/jira/browse/BEAM-7274 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core > Reporter: Alex Van Boxel > Assignee: Alex Van Boxel > Priority: Minor > Fix For: 2.17.0 > > Time Spent: 7h > Remaining Estimate: 0h > > Add support for the new Beam Schema to the Protobuf extension. -- This message was sent by Atlassian Jira (v8.3.4#803005)