[ https://issues.apache.org/jira/browse/BEAM-3437?focusedWorklogId=86479&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-86479 ]
ASF GitHub Bot logged work on BEAM-3437:
----------------------------------------

Author: ASF GitHub Bot
Created on: 02/Apr/18 05:08
Start Date: 02/Apr/18 05:08
Worklog Time Spent: 10m
Work Description: reuvenlax commented on a change in pull request #4964: [BEAM-3437] Introduce Schema class, and use it in BeamSQL
URL: https://github.com/apache/beam/pull/4964#discussion_r178487905

##########
File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/transform/BeamAggregationTransforms.java
##########
@@ -177,21 +171,30 @@ public AggregationAdaptor(List<AggregateCall> aggregationCalls, RowType sourceRo
         int refIndexKey = call.getArgList().get(0);
         int refIndexValue = call.getArgList().get(1);
+        FieldTypeDescriptor keyDescriptor =
+            sourceSchema.getField(refIndexKey).getTypeDescriptor();
         BeamSqlInputRefExpression sourceExpKey = new BeamSqlInputRefExpression(
-            CalciteUtils.getFieldCalciteType(sourceRowType, refIndexKey), refIndexKey);
+            CalciteUtils.toSqlTypeName(keyDescriptor.getType(), keyDescriptor.getMetadata()),
+            refIndexKey);
+
+        FieldTypeDescriptor valueDescriptor =
+            sourceSchema.getField(refIndexValue).getTypeDescriptor();
         BeamSqlInputRefExpression sourceExpValue = new BeamSqlInputRefExpression(
-            CalciteUtils.getFieldCalciteType(sourceRowType, refIndexValue), refIndexValue);
+            CalciteUtils.toSqlTypeName(valueDescriptor.getType(), valueDescriptor.getMetadata()),
+            refIndexValue);
         sourceFieldExps.add(KV.of(sourceExpKey, sourceExpValue));
       } else {
         int refIndex = call.getArgList().size() > 0 ? call.getArgList().get(0) : 0;
+        FieldTypeDescriptor typeDescriptor = sourceSchema.getField(refIndex).getTypeDescriptor();

Review comment:
<!--thread_id:cc_178376529_t; commit:79c95678e593da730ba0472b77304ec1f916245e; resolved:0-->
> **akedin** wrote:
> It's unclear where we're supposed to be using `FieldTypeDescriptor` vs `FieldType`. Can they be combined?
> So that, for example, all fields in `FieldType` become instances of `FieldTypeDescriptor`. Do we need both?

FieldType is an enum that identifies the type of a row field. FieldTypeDescriptor contains the extra information needed to resolve the type (e.g. the type of the component element or the schema of the row).

I think it is possible to merge the classes (since Java enums are just classes), but I think it's better to maintain a clean separation between primitive field types and the recursive spec needed to resolve a type. Also, while it's OK to add some extra convenience functionality to an enum class, making it a fully recursive type seems like an abuse of enums.

For reference, the two classes are _roughly_ equivalent to SqlTypeName and RelDataType in Calcite. Would it be clearer if I renamed FieldType -> TypeName?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 86479)
Time Spent: 3h 10m (was: 3h)

> Support schema in PCollections
> ------------------------------
>
> Key: BEAM-3437
> URL: https://issues.apache.org/jira/browse/BEAM-3437
> Project: Beam
> Issue Type: Wish
> Components: beam-model
> Reporter: Jean-Baptiste Onofré
> Assignee: Jean-Baptiste Onofré
> Priority: Major
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> As discussed with some people in the team, it would be great to add schema
> support in {{PCollections}}. It will allow us:
> 1. To expect some data type in {{PTransforms}}
> 2. Improve some runners with additional features (I'm thinking about Spark
> runner with data frames for instance).
> A technical draft document has been created:
> https://docs.google.com/document/d/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc/edit?disco=AAAABhykQIs&ts=5a203b46&usp=comment_email_document
> I also started a PoC on a branch, I will update this Jira with a "discussion"
> PR.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
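
To make the separation argued for in the review comment above concrete, here is a minimal, self-contained sketch. It is only an illustration: the names FieldType and FieldTypeDescriptor are borrowed from the comment, but every constant, constructor, and method below is an assumption made for this example and is not the API proposed in pull request #4964 (for instance, the getMetadata() call seen in the diff is not modelled). The enum stays a flat list of type names, and the descriptor carries the recursive part (the component type of an array, the field types of a nested row).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class FieldTypeSketch {

  /** Flat enum: names a kind of type and carries no nested structure (hypothetical constants). */
  public enum FieldType {
    INT32, STRING, ARRAY, ROW;

    /** Small convenience method, the kind of extra that still fits on an enum. */
    public boolean isComposite() {
      return this == ARRAY || this == ROW;
    }
  }

  /** Recursive descriptor: a FieldType plus whatever is needed to resolve it (hypothetical shape). */
  public static final class FieldTypeDescriptor {
    private final FieldType type;
    private final FieldTypeDescriptor componentType;   // set only for ARRAY
    private final List<FieldTypeDescriptor> rowFields; // non-empty only for ROW

    private FieldTypeDescriptor(
        FieldType type, FieldTypeDescriptor componentType, List<FieldTypeDescriptor> rowFields) {
      this.type = type;
      this.componentType = componentType;
      this.rowFields = rowFields;
    }

    /** Descriptor for a primitive type; composite types need the factories below. */
    public static FieldTypeDescriptor of(FieldType type) {
      if (type.isComposite()) {
        throw new IllegalArgumentException(type + " needs a component type or row fields");
      }
      return new FieldTypeDescriptor(type, null, Collections.emptyList());
    }

    public static FieldTypeDescriptor arrayOf(FieldTypeDescriptor component) {
      return new FieldTypeDescriptor(FieldType.ARRAY, component, Collections.emptyList());
    }

    public static FieldTypeDescriptor rowOf(List<FieldTypeDescriptor> fields) {
      return new FieldTypeDescriptor(FieldType.ROW, null, fields);
    }

    public FieldType getType() {
      return type;
    }

    /** Walks the descriptor recursively; this is the part an enum constant cannot express. */
    public String describe() {
      switch (type) {
        case ARRAY:
          return "ARRAY<" + componentType.describe() + ">";
        case ROW:
          return "ROW<"
              + rowFields.stream().map(FieldTypeDescriptor::describe).collect(Collectors.joining(", "))
              + ">";
        default:
          return type.name();
      }
    }
  }

  public static void main(String[] args) {
    FieldTypeDescriptor nested =
        FieldTypeDescriptor.arrayOf(
            FieldTypeDescriptor.rowOf(
                Arrays.asList(
                    FieldTypeDescriptor.of(FieldType.INT32),
                    FieldTypeDescriptor.arrayOf(FieldTypeDescriptor.of(FieldType.STRING)))));
    System.out.println(nested.describe()); // prints ARRAY<ROW<INT32, ARRAY<STRING>>>
  }
}
```

In this sketch an array of rows needs a distinct descriptor instance for each nesting, which a fixed set of enum constants cannot provide; that is the sense in which the enum plays the role of a type name and the descriptor the role of a resolved type.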