[ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15751325#comment-15751325
 ] 

Jark Wu commented on FLINK-5280:
--------------------------------

Hi [~ivan.mushketyk],

Yes, you are right. In your case, the {{fieldIndexes}} of the POJO TableSource 
are not obvious. But we can use {{getFieldsNames}} and {{getReturnType}} to 
generate the {{fieldIndexes}}, so the new {{getFieldMapping}} still duplicates 
{{getFieldsNames}}, am I right?
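
To illustrate the idea, here is a minimal sketch in plain Java with no Flink 
dependency. The class {{FieldIndexSketch}} and its method are hypothetical 
names, and it assumes the physical field order of the return type is available 
as a simple array of names. The table's field names are then looked up in that 
array to produce the indexes:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Flink: derives field indexes by name lookup.
final class FieldIndexSketch {

    /**
     * For each table field name, finds its position among the field names
     * of the underlying type (e.g. a POJO's fields in their physical order).
     */
    static int[] fieldIndexes(String[] tableFieldNames, String[] typeFieldNames) {
        List<String> typeFields = Arrays.asList(typeFieldNames);
        int[] indexes = new int[tableFieldNames.length];
        for (int i = 0; i < tableFieldNames.length; i++) {
            int idx = typeFields.indexOf(tableFieldNames[i]);
            if (idx < 0) {
                // A table field that the type does not contain cannot be mapped.
                throw new IllegalArgumentException(
                    "Unknown field: " + tableFieldNames[i]);
            }
            indexes[i] = idx;
        }
        return indexes;
    }
}
```

With such a lookup, a separate mapping method would only repeat information 
that the field names and the return type already carry.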

I don't know much about {{GenericRecord}}; maybe [~fhueske] can answer your 
question. Does {{GenericRecord}} have an immutable schema, or can the schema 
change with every record?

IMO, the TableSource interface can be simplified to this:

{code:java}
trait TableSource[T] {
  /**
   * Returns this table source's row type. The returned RowTypeInfo is a
   * composite type and can have nested types whose fields describe the
   * names and types of the columns in this table.
   */
  def getReturnType: RowTypeInfo
}
{code} 

{{getReturnType}} is required to return a RowTypeInfo, which describes the 
top-level field names and types; the field types may themselves be nested. 
That way TableSource can support nested data. Currently, however, RowTypeInfo 
doesn't support custom field names, so we should fix that first.

The original {{getNumberOfFields}}, {{getFieldsNames}}, and {{getFieldTypes}} 
methods in {{TableSource}} could then be removed, as they can be derived from 
the returned RowTypeInfo. In the end, it would be similar to Calcite's 
{{Table}} interface, which effectively only has a single 
{{RelDataType getRowType(RelDataTypeFactory typeFactory)}} 
method to implement.
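
A rough sketch of how the three methods could be derived from the row type, 
written in plain Java with stand-in types. {{SketchRowType}} and 
{{SketchTableSource}} are hypothetical and are not Flink's RowTypeInfo or 
TableSource; the sketch assumes a row type that already carries custom field 
names and whose field types may be nested row types:

```java
import java.util.List;

// Hypothetical stand-in for a row type with named fields; NOT Flink's RowTypeInfo.
final class SketchRowType {
    final List<String> names;
    // Each entry is either a simple type name (a String) or a nested SketchRowType.
    final List<Object> types;

    SketchRowType(List<String> names, List<Object> types) {
        if (names.size() != types.size()) {
            throw new IllegalArgumentException("names and types differ in length");
        }
        this.names = names;
        this.types = types;
    }
}

// Hypothetical simplified source: implementers only provide the row type.
interface SketchTableSource {
    SketchRowType getReturnType();

    // The former interface methods become derived helpers.
    default int getNumberOfFields() {
        return getReturnType().names.size();
    }

    default String[] getFieldsNames() {
        return getReturnType().names.toArray(new String[0]);
    }

    default Object[] getFieldTypes() {
        return getReturnType().types.toArray();
    }
}
```

Only the top-level fields are listed by the derived helpers; a nested field 
shows up as a single column whose type is itself a row type.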

What do you think? [~fhueske] [~ivan.mushketyk]


> Extend TableSource to support nested data
> -----------------------------------------
>
>                 Key: FLINK-5280
>                 URL: https://issues.apache.org/jira/browse/FLINK-5280
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>    Affects Versions: 1.2.0
>            Reporter: Fabian Hueske
>            Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface currently only supports the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported, such as Avro, JSON, Parquet, and ORC. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)