[ 
https://issues.apache.org/jira/browse/FLINK-30204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng updated FLINK-30204:
--------------------------------
    Description: 
Currently, Table Store sinks write and compact data files in the same job. 
While this implementation is sufficient and more economical for most users, 
some users may expect higher or steadier write throughput.

We decided to support creating separated compact jobs for Table Store. This 
will bring us the following advantages:
* Write jobs can concentrate solely on writing files, so their throughput will 
be higher and steadier.
* If only one compact job is created for each table, no commit conflicts will 
occur.

The structure of a separated compact job is sketched out as follows:
* A compact job should have three vertices: one source vertex, one sink 
(compactor) vertex and one commit vertex.
* The source vertex is responsible for generating records containing partitions 
and buckets to be compacted.
* The sink vertex accepts records containing partitions and buckets, and 
compacts these buckets.
* The commit vertex commits the changes from the sink vertex. It is possible 
that the user mistakenly creates other compact jobs, so commit conflicts may 
still occur. However, as compact changes are optional, this commit vertex will 
commit changes in an at-most-once style; that is, a change that fails to 
commit is dropped rather than retried.
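
The three-vertex structure above can be illustrated with a minimal in-memory 
sketch. This is plain Java, not the Table Store or Flink API: every name in it 
(CompactJobSketch, CompactTask, CompactChange, sampleTable, the file names) is 
hypothetical, and the conflict check is only an assumed stand-in for however 
the real commit detects a conflicting compact job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CompactJobSketch {

    /** Record emitted by the source vertex: one bucket of one partition to compact. */
    static final class CompactTask {
        final String partition;
        final int bucket;

        CompactTask(String partition, int bucket) {
            this.partition = partition;
            this.bucket = bucket;
        }
    }

    /** Change produced by the sink (compactor) vertex: replace `before` with `after`. */
    static final class CompactChange {
        final CompactTask task;
        final List<String> before;
        final String after;

        CompactChange(CompactTask task, List<String> before, String after) {
            this.task = task;
            this.before = before;
            this.after = after;
        }
    }

    /** Source vertex: emit a task for every bucket that holds more than one file. */
    static List<CompactTask> source(Map<String, Map<Integer, List<String>>> table) {
        List<CompactTask> tasks = new ArrayList<>();
        table.forEach((partition, buckets) ->
                buckets.forEach((bucket, files) -> {
                    if (files.size() > 1) {
                        tasks.add(new CompactTask(partition, bucket));
                    }
                }));
        return tasks;
    }

    /** Sink (compactor) vertex: merge the bucket's current files into one new file. */
    static CompactChange compact(Map<String, Map<Integer, List<String>>> table, CompactTask task) {
        List<String> before = new ArrayList<>(table.get(task.partition).get(task.bucket));
        String after = "compacted-" + task.partition + "-" + task.bucket;
        return new CompactChange(task, before, after);
    }

    /** Commit vertex: try once; on conflict drop the change instead of retrying. */
    static boolean commit(Map<String, Map<Integer, List<String>>> table, CompactChange change) {
        List<String> current = table.get(change.task.partition).get(change.task.bucket);
        if (!current.containsAll(change.before)) {
            return false; // assumed conflict rule: the input files are gone, so skip without retry
        }
        current.removeAll(change.before);
        current.add(change.after);
        return true;
    }

    /** Hypothetical table state: partition -> bucket -> data file names. */
    static Map<String, Map<Integer, List<String>>> sampleTable() {
        Map<Integer, List<String>> buckets = new TreeMap<>();
        buckets.put(0, new ArrayList<>(List.of("f1", "f2", "f3")));
        buckets.put(1, new ArrayList<>(List.of("f4")));
        Map<String, Map<Integer, List<String>>> table = new TreeMap<>();
        table.put("dt=2022-11-24", buckets);
        return table;
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, List<String>>> table = sampleTable();
        for (CompactTask task : source(table)) {
            CompactChange change = compact(table, task);
            System.out.println("first commit:  " + commit(table, change));
            System.out.println("second commit: " + commit(table, change));
        }
        System.out.println(table);
    }
}
```

In this sketch the second commit of the same change stands in for a 
mistakenly created second compact job: its input files are already gone, so 
the change is skipped and the table stays valid, which is acceptable only 
because compact changes are optional.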

  was:
Currently table store sinks will write and compact data files from the same 
job. While this implementation is enough and more economical for most users, 
some user may expect higher or more steady write throughput.

We decided to support creating separated compact jobs for Table Store. This 
will bring us the following advantages:
* Write jobs can concentrate only on writing files. Their throughput will be 
higher and more steady.
* By creating only one compact job for each table, no commit conflicts will 
occur.


> Table Store support separated compact jobs
> ------------------------------------------
>
>                 Key: FLINK-30204
>                 URL: https://issues.apache.org/jira/browse/FLINK-30204
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table Store
>    Affects Versions: table-store-0.3.0
>            Reporter: Caizhi Weng
>            Assignee: Caizhi Weng
>            Priority: Major
>             Fix For: table-store-0.3.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
