[
https://issues.apache.org/jira/browse/FLINK-37656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mika Naylor updated FLINK-37656:
--------------------------------
Description:
The Java Table API has long supported {{TableEnvironment.fromValues}}, which
creates a table from a set of values, much like the {{VALUES}} clause in SQL.
The Python Table API has a similar function, {{from_elements}}, which has a
similar API surface but behaves differently and does not translate to a
{{VALUES}} clause.
When given a set of Python objects, {{from_elements}} does not turn them into a
{{VALUES}} clause. Instead, it serializes the values to an Avro file, which is
then deserialized into an Avro source table. When the input is a set of
{{Expression}}s, however, it does behave the same as {{fromValues}}.
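To make the implicit behaviour concrete, here is a rough model of the input-type dispatch described above. It is a sketch only: {{Expression}} below is a hypothetical stand-in class rather than PyFlink's, {{choose_path}} is an invented name, and the assumption that a single {{Expression}} among the inputs is enough to select the {{VALUES}} path is mine, not something this ticket states.

```python
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass(frozen=True)
class Expression:
    """Hypothetical stand-in for a Table API expression (not PyFlink's class)."""
    text: str


def choose_path(elements: Sequence[Any]) -> str:
    """Return the code path a from_elements-style call would take.

    Assumption: any Expression among the inputs selects the VALUES path;
    plain Python objects fall through to the Avro file round trip.
    """
    if any(isinstance(e, Expression) for e in elements):
        return "values"
    return "avro-file"
```

The caller cannot tell from the method name which path is taken; only the element types decide it.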
It would be useful to have an explicit {{from_values}} method, rather than an
implicit code path within {{from_elements}} that is selected by input type.
Like the Java method, it would accept either a set of {{Expression}}s or a set
of supported Python objects, which are translated into {{Expression}}s. This is
especially useful in Flink contexts/environments where the
serialization-to-file -> deserialization-from-file step cannot be done.
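As an illustration of what embedding the values in the query means, the following sketch renders plain Python rows as a SQL {{VALUES}} clause string. It is not the proposed implementation (which would translate objects into {{Expression}}s rather than SQL text), and both helper names are hypothetical. Only a few scalar types are handled, and non-finite floats are not considered.

```python
from typing import Any, Iterable, Sequence


def to_sql_literal(value: Any) -> str:
    """Render one Python scalar as a SQL literal (hypothetical helper)."""
    if value is None:
        return "NULL"
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # Escape embedded single quotes by doubling them.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def to_values_clause(rows: Iterable[Sequence[Any]]) -> str:
    """Render rows as 'VALUES (...), (...)' with no file round trip."""
    rendered = [
        "(" + ", ".join(to_sql_literal(v) for v in row) + ")" for row in rows
    ]
    if not rendered:
        raise ValueError("a VALUES clause needs at least one row")
    return "VALUES " + ", ".join(rendered)
```

Because every row ends up inside the query text, this approach only suits small inputs.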
{{from_elements}} would then remain useful for large datasets that you would
not want to embed in the query, while {{from_values}} would cover cases where
embedding is not a problem and the serialization/deserialization step can be
avoided.
> Introduce FromValues in Python Table API
> ----------------------------------------
>
> Key: FLINK-37656
> URL: https://issues.apache.org/jira/browse/FLINK-37656
> Project: Flink
> Issue Type: New Feature
> Components: API / Python
> Reporter: Mika Naylor
> Assignee: Mika Naylor
> Priority: Major