Hi Kezhu,

Thanks for your detailed comments on the Hybrid Source. I have gone through
your points and respond to each of them below:

1. Would it be possible to use the Hybrid Source to switch/chain multiple
homogeneous sources?

Yes. "HybridSource" supports switching/chaining multiple homogeneous
sources, as long as each of them has its own implementation of
"SwitchableSource" and "SwitchableSplitEnumerator". "HybridSource" does not
restrict whether the sources it consists of are homogeneous. From the
user's perspective, the user only adds the "SwitchableSource" instances to
"HybridSource" and leaves the smooth migration to "HybridSource".
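
To illustrate the chaining, here is a minimal sketch. All type and method
names in it (SwitchableSourceSketch, DirectorySource, HybridSourceSketch,
addSource) are hypothetical stand-ins that I made up for this mail, not the
interfaces proposed in the FLIP:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for the proposed "SwitchableSource".
interface SwitchableSourceSketch {
    String describe();
}

// Two instances of this one class form a chain of homogeneous sources.
final class DirectorySource implements SwitchableSourceSketch {
    private final String path;

    DirectorySource(String path) {
        this.path = path;
    }

    @Override
    public String describe() {
        return "dir:" + path;
    }
}

// Hypothetical stand-in for "HybridSource": it only records the chain order
// and does not care whether the chained sources are of the same type.
final class HybridSourceSketch {
    private final List<SwitchableSourceSketch> chain = new ArrayList<>();

    HybridSourceSketch addSource(SwitchableSourceSketch source) {
        chain.add(source);
        return this;
    }

    List<SwitchableSourceSketch> sources() {
        return Collections.unmodifiableList(chain);
    }
}
```

A user would then write something like
new HybridSourceSketch().addSource(new DirectorySource("/data/2020"))
.addSource(new DirectorySource("/data/2021")), and the sources are read in
the order in which they were added.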

2. "setStartState" is actually a reposition operation for the next source
to start at job runtime?

IMO, "setStartState" determines the initial position of the new source for
smooth migration; it is not a reposition operation. More importantly, the
"State" mentioned here refers to the start and end positions for reading a
source.

3. This conversion should be an implementation detail of the next source,
not a converter function, in my opinion?

The state conversion is of course an implementation detail, and it is part
of the switching mechanism. That mechanism should still give users an
interface for the conversion, and this interface is the converter function.
What's more, once users have implemented a "SwitchableSource" and added it
to the Hybrid Source, they don't need to implement another
"SwitchableSource" for a different conversion. From the user's perspective,
users can define different converter functions and create the
"SwitchableSource" to add to "HybridSource"; there is no need to implement
a new Source for each converter function.
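
A small sketch of this point, assuming the converter is passed in as a plain
java.util.function.Function. The class ConvertingSourceSketch and its
methods are hypothetical names used only for illustration, not part of the
proposal:

```java
import java.util.function.Function;

// Hypothetical source whose start position is an offset. How the previous
// source's end state maps to that offset is supplied by the user, so the
// source class itself is written only once.
final class ConvertingSourceSketch<PrevEndT> {
    private final Function<PrevEndT, Long> converter;
    private long startOffset;

    ConvertingSourceSketch(Function<PrevEndT, Long> converter) {
        this.converter = converter;
    }

    // Called at switch time with the end state of the upstream source.
    void startFrom(PrevEndT previousEndState) {
        this.startOffset = converter.apply(previousEndState);
    }

    long startOffset() {
        return startOffset;
    }
}
```

The same class can be reused with different converters, for example one
that parses an offset out of a file name
(name -> Long.parseLong(name.substring(name.lastIndexOf('-') + 1))) and
another that maps a timestamp in milliseconds to seconds
(millis -> millis / 1000).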

4. No configurable start-position. In this situation the combination of the
above three joints is a no-op, and "HybridSource" is a chain of sources
with pre-configured start-positions?

Indeed, there is currently no configurable start-position, but such a
configuration could be included in the feature. Users could use the
"SwitchableSplitEnumerator#setStartState" interface or configuration
parameters to configure the start-position.

5. I wonder whether end-position is a must and how it could be useful for
end users in a generic-enough source?

The "getEndState" interface is used for the smooth migration scenario and
may return null if it is not needed. In the Hybrid Source mechanism, this
interface is required for switching between the chained sources; otherwise
there is no way to get the end-position of the upstream source. In summary,
Hybrid Source needs to be able to set the start position and get the end
position of each source; otherwise there is no point in building a Hybrid
Source.
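
Roughly, the switching step could be sketched like this. Only the method
names "setStartState" and "getEndState" come from the proposal; the
interface shape, the two enumerator classes and SourceSwitcherSketch are
hypothetical and simplified for this mail:

```java
import java.util.function.Function;

// Hypothetical, simplified view of "SwitchableSplitEnumerator".
interface SwitchableEnumeratorSketch<StartT, EndT> {
    void setStartState(StartT startState);

    EndT getEndState();
}

// Bounded upstream source: reports the last timestamp it has read.
final class BoundedFileEnumeratorSketch implements SwitchableEnumeratorSketch<Void, Long> {
    private final long lastTimestampMillis;

    BoundedFileEnumeratorSketch(long lastTimestampMillis) {
        this.lastTimestampMillis = lastTimestampMillis;
    }

    @Override
    public void setStartState(Void ignored) {
        // First source in the chain: nothing to set.
    }

    @Override
    public Long getEndState() {
        return lastTimestampMillis;
    }
}

// Unbounded downstream source: starts from a timestamp and has no end.
final class StreamEnumeratorSketch implements SwitchableEnumeratorSketch<Long, Void> {
    private Long startTimestampMillis;

    @Override
    public void setStartState(Long startState) {
        this.startTimestampMillis = startState;
    }

    @Override
    public Void getEndState() {
        return null; // end state not needed for the last source
    }

    Long startTimestampMillis() {
        return startTimestampMillis;
    }
}

final class SourceSwitcherSketch {
    // End state of the finished source -> converter -> start state of the next one.
    static <E, S> void switchOver(
            SwitchableEnumeratorSketch<?, E> finished,
            Function<E, S> converter,
            SwitchableEnumeratorSketch<S, ?> next) {
        next.setStartState(converter.apply(finished.getEndState()));
    }
}
```

Without "getEndState" on the finished source, switchOver would have nothing
to feed into the converter, which is why both interfaces are needed.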

6. Is it possible for the converter function to do blocking operations? How
to respond to a checkpoint request when switching split enumerators across
sources? Does the end-position or start-position need to be stored in
checkpoint state or not?

The converter function simply converts the state of the upstream source to
the state of the downstream source; it does not perform blocking
operations. A checkpoint request that arrives while switching split
enumerators across sources is handled by sending the corresponding
"SourceEvent" to the coordinator. The end-position and start-position don't
need to be stored in checkpoint state; a source only needs to implement the
"getEndState" interface for the end-position.

Best,
Nicholas Jiang


