Hello, I’m new to Flink. I’m trying to write a source for Pub/Sub Lite
which is a partition based Pub/Sub product, and I have a few questions.

1.

I saw that there are two sets of interfaces used in existing sources: The
RichSourceFunction, and the set of interfaces from FLIP-27. It seems like
the Source interfaces are preferred for new sources, but I wanted to be
sure.

2.

I’m having a little bit of trouble working out how when the
currentParallelism returned by the SplitEnumeratorContext [1] can change,
and how a source should react to that.

For context, I’m currently thinking about single partitions as “splits”, so
a source would have an approximately constant number of splits which each
has an potentially unbounded amount of work (at least in continuous mode).
Each split will be assigned to some SourceReader by the split enumerator.
If the value of currentParallelism changes, it seems like I’ll need to find
a way to redistribute my partitions over SourceReaders, or else I'll end up
with an unbalanced distribution of partitions to SourceReaders.

I looked at the docs on elastic scaling [2], and it seems like when the
parallelism of the source changes, the source will be checkpointed and
restored. I think this would mean all the SourceReaders get restarted, and
their splits are returned to the SplitEnumerator for reassignment. Is this
approximately correct?

[1]
https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/connector/source/SplitEnumeratorContext.html#currentParallelism--

[2]
https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/elastic_scaling/

Reply via email to