[ 
https://issues.apache.org/jira/browse/FLINK-39632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramin Gharib updated FLINK-39632:
---------------------------------
    Description: 
h2. Background

{{Table}} already exposes ergonomic shortcuts for the built-in changelog PTFs:

{code:java}
Table result = cdcStream.fromChangelog();
Table appendOnly = updatingTable.toChangelog();
{code}

These are equivalent to calling {{.process("FROM_CHANGELOG", ...)}} / 
{{.process("TO_CHANGELOG", ...)}}, but read more naturally and don't require 
the user to remember the function name as a string.

h2. Problem

When the user wants set semantics ({{PARTITION BY}}), the equivalent path 
today goes through the generic {{PartitionedTable#process(String, Object...)}}:

{code:java}
Table result = cdcStream
    .partitionBy($("id"))
    .process("FROM_CHANGELOG");   // string-based, no compile-time check
{code}

This is awkward and inconsistent with the row-semantics API: the function name 
is a string literal, so a typo is only caught at runtime.

h2. Proposal

Add {{fromChangelog(Expression...)}} and {{toChangelog(Expression...)}} to 
{{PartitionedTable}} (alongside the existing {{orderBy}}, {{process}}, 
{{asArgument}}):

{code:java}
Table result = cdcStream
    .partitionBy($("id"))
    .fromChangelog();             // convenience method, mirrors Table#fromChangelog

Table appendOnly = updatingTable
    .partitionBy($("id"))
    .toChangelog();
{code}

These would internally delegate to 
{{process(BuiltInFunctionDefinitions.FROM_CHANGELOG.getName(), arguments)}} / 
{{...TO_CHANGELOG...}}, exactly as the row-semantics {{Table#fromChangelog}} 
/ {{Table#toChangelog}} already do.
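
The delegation described above could be sketched roughly as follows. This is only an illustration of the pattern, not Flink code: {{PartitionedTableSketch}}, {{partitionedBy}}, the string constants, and the {{String}} return type are hypothetical stand-ins so that the example compiles on its own. The actual change would presumably take {{Expression}} arguments, return a {{Table}}, and read the function names from {{BuiltInFunctionDefinitions}}.

```java
public class ChangelogShortcutSketch {

    // Hypothetical stand-ins for the built-in function names
    // (assumption: the real code would use BuiltInFunctionDefinitions.*.getName()).
    public static final String FROM_CHANGELOG = "FROM_CHANGELOG";
    public static final String TO_CHANGELOG = "TO_CHANGELOG";

    /** Hypothetical stand-in for PartitionedTable; returns a String instead of a Table. */
    public interface PartitionedTableSketch {

        /** Generic, string-based entry point (mirrors the shape of process(String, Object...)). */
        String process(String functionName, Object... arguments);

        /** Convenience method: forwards to process(...) with the fixed function name. */
        default String fromChangelog(Object... arguments) {
            return process(FROM_CHANGELOG, arguments);
        }

        /** Convenience method: forwards to process(...) with the fixed function name. */
        default String toChangelog(Object... arguments) {
            return process(TO_CHANGELOG, arguments);
        }
    }

    /** Builds a toy partitioned table that only records which function was invoked. */
    public static PartitionedTableSketch partitionedBy(String key) {
        return (functionName, arguments) ->
                functionName + "(PARTITION BY " + key + ", args=" + arguments.length + ")";
    }

    public static void main(String[] args) {
        PartitionedTableSketch table = partitionedBy("id");
        System.out.println(table.fromChangelog());
        System.out.println(table.toChangelog());
    }
}
```

Because the shortcuts only forward to {{process}}, they add no behavior of their own. The function name is simply fixed in one place instead of being typed by each caller.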

> Add fromChangelog() / toChangelog() convenience methods to PartitionedTable
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-39632
>                 URL: https://issues.apache.org/jira/browse/FLINK-39632
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>            Reporter: Ramin Gharib
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
