Fabian Hueske created FLINK-8950:
------------------------------------

             Summary: "Materialize" Tables to avoid recomputation.
                 Key: FLINK-8950
                 URL: https://issues.apache.org/jira/browse/FLINK-8950
             Project: Flink
          Issue Type: New Feature
          Components: Table API & SQL
    Affects Versions: 1.5.0
            Reporter: Fabian Hueske

Currently, {{Table}} objects of the Table API / SQL are treated like virtual views: all relational operators applied to them are recorded and only translated when a {{Table}} is emitted to a {{TableSink}} or converted into a {{DataSet}} or {{DataStream}}. If a {{Table}} is accessed twice, the (sub-)query that it represents is translated twice into a {{DataSet}} or {{DataStream}} program and hence also executed twice, which is inefficient.

Currently, the only way to avoid this is to convert the {{Table}} into a {{DataSet}} or {{DataStream}}, which causes the optimizer to generate a plan, and then to register the result back as a new {{Table}}.

We should offer a method to internally "materialize" a {{Table}} object, i.e., to optimize it, generate a plan, and register the plan as an internal table. All queries and operations evaluated on the materialized {{Table}} would then start from the same {{DataSet}} or {{DataStream}}, so that it is not computed multiple times.
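
To make the recomputation problem concrete, here is a small toy model in plain Python. It is not the Flink API: the {{Table}} class, its {{map}}, {{collect}} and {{materialize}} methods, and the call counter are all hypothetical names invented for illustration. The toy also caches the computed rows, which only approximates the proposal of sharing one translated {{DataSet}} or {{DataStream}} between queries.

```python
class Table:
    """Toy stand-in for a lazily evaluated table (hypothetical, not Flink)."""

    def __init__(self, source, ops=()):
        self._source = source    # zero-argument callable returning the input rows
        self._ops = tuple(ops)   # recorded operators, applied only on collect()

    def map(self, fn):
        # Record the operator; nothing is evaluated yet (virtual view).
        return Table(self._source, self._ops + (fn,))

    def collect(self):
        # "Translate and execute": replay every recorded operator from the source.
        rows = list(self._source())
        for op in self._ops:
            rows = [op(row) for row in rows]
        return rows

    def materialize(self):
        # Evaluate the recorded operators once and start a new table from the result.
        rows = self.collect()
        return Table(lambda: rows)


calls = {"expensive": 0}

def expensive(x):
    # Stands in for a costly sub-query; counts how often it runs.
    calls["expensive"] += 1
    return x * x

base = Table(lambda: [1, 2, 3]).map(expensive)

# Virtual view: each downstream query replays the shared sub-query.
a = base.map(lambda x: x + 1).collect()
b = base.map(lambda x: x * 10).collect()
virtual_calls = calls["expensive"]

# Materialized: the shared sub-query runs once, both queries start from its result.
calls["expensive"] = 0
mat = base.materialize()
c = mat.map(lambda x: x + 1).collect()
d = mat.map(lambda x: x * 10).collect()
materialized_calls = calls["expensive"]

print(virtual_calls, materialized_calls)
```

In this sketch the shared sub-query runs once per downstream query in the virtual-view case and only once in total after {{materialize()}}, while both variants return the same rows.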