Mika Naylor created FLINK-37656:
-----------------------------------
Summary: Introduce FromValues in Python Table API
Key: FLINK-37656
URL: https://issues.apache.org/jira/browse/FLINK-37656
Project: Flink
Issue Type: New Feature
Components: API / Python
Reporter: Mika Naylor
Assignee: Mika Naylor
The Java Table API has long supported {{TableEnvironment.fromValues}}, which
creates a table from a set of values, similar to the {{VALUES}} clause in SQL.
The Python Table API has a similar function called {{from_elements}}, which
has a similar API surface but different behaviour, and doesn't translate to a
{{VALUES}} clause.
The {{from_elements}} method, when given a set of Python objects, doesn't
transform it into a {{VALUES}} clause but instead serializes the values to an
Avro file, which is then deserialized into an Avro source table. When the
function is given a set of {{Expression}}s, however, it does behave the same
as {{fromValues}}.
It would be useful to have an explicit {{from_values}} method, rather than an
implicit one within {{from_elements}} that is selected by the input types. It
would behave like the Java one: it can take a set of {{Expression}}s, or a set
of supported Python objects which are translated into {{Expression}}s. This
would be especially useful in Flink contexts/environments where the
serialize-to-file -> deserialize-from-file step cannot be done.
{{from_elements}} would then still be useful for large datasets which you
wouldn't want to embed in the query, and {{from_values}} for cases where this
embedding doesn't present an issue, avoiding the serialization/deserialization
step.
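
To illustrate what "embedding the values in the query" could mean, here is a
minimal pure-Python sketch that renders rows of Python objects as a SQL
{{VALUES}} clause. It is not the PyFlink implementation and does not use any
Flink API: the helper names {{sql_literal}} and {{values_clause}} are
hypothetical, and the set of supported types (None, bool, int, float, str) is
an assumption chosen to keep the example small.

```python
import math


def sql_literal(value):
    """Render one Python value as a SQL literal (hypothetical helper)."""
    if value is None:
        return "NULL"
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        # NaN and infinity have no plain numeric literal, so reject them.
        if not math.isfinite(value):
            raise ValueError("non-finite floats are not supported")
        return repr(value)
    if isinstance(value, str):
        # Escape embedded single quotes by doubling them.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError("unsupported value type: " + type(value).__name__)


def values_clause(rows):
    """Render a non-empty list of equal-length rows as a VALUES clause."""
    rows = [tuple(row) for row in rows]
    if not rows:
        raise ValueError("at least one row is required")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of columns")
    rendered = (
        "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in rows
    )
    return "VALUES " + ", ".join(rendered)


if __name__ == "__main__":
    print(values_clause([(1, "Hi"), (2, "Hello")]))
```

Because every value ends up inline in the query text, nothing is written to or
read from a file, which is the property wanted for small inputs. The same
property makes this approach a poor fit for large datasets.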
--
This message was sent by Atlassian Jira
(v8.20.10#820010)