Hi Becket,
You are right that what we had in mind for
ExecutionConfig/CheckpointConfig etc. is option b) from your email.
In the context of FLIP-54, those objects are not Configurables. What
we understand as a Configurable in FLIP-54 is a simple POJO that
is stored under a single key, such as the examples from the ML
thread (Host) or from the design doc (CacheFile). So when configuring
the host, a user can provide it like this:
connector.host: address:localhost, port:1234
rather than
connector.host.address: localhost
connector.host.port: 1234
This is important especially if one wants to configure lists of such
objects:
connector.hosts: address:localhost,port:1234;address:localhost,port:4567
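To make the single-key format concrete, here is a minimal sketch of how
such values could be parsed. It is written in Python for brevity and is
not Flink code. The Host class and both function names are hypothetical,
and the separators (":" between field name and value, "," between
fields, ";" between list entries) are inferred from the examples above
only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Host:
    """Hypothetical POJO-like object stored under a single config key."""
    address: str
    port: int


def parse_host(text: str) -> Host:
    """Parse one object such as 'address:localhost, port:1234'."""
    fields = {}
    for pair in text.split(","):
        # Split on the first ':' only, so a value may itself contain ':'.
        name, _, value = pair.partition(":")
        fields[name.strip()] = value.strip()
    return Host(address=fields["address"], port=int(fields["port"]))


def parse_hosts(text: str) -> List[Host]:
    """Parse a ';'-separated list of objects stored under one key."""
    return [parse_host(item) for item in text.split(";") if item.strip()]
```

With this sketch, the value of connector.hosts above would parse into
two Host objects, one per ";"-separated entry.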
The intention was definitely not to store whole complex objects, such as
ExecutionConfig, CheckpointConfig etc., that contain multiple different
options. Maybe it makes sense to call it ConfigObject as Aljoscha
suggested? What do you think? Would that make it more understandable?
For the initialization/configuration of objects such as ExecutionConfig
and CheckpointConfig, you may have a look at FLIP-59[1], where we
suggest adding a configure method to those classes and pretty much
describe the process you outline in your last message.
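As a rough illustration of that idea (a Python sketch, not the FLIP-59
design itself): the class reads its own individual keys from a flat
configuration and leaves everything else untouched. The key names and
the default values below are made up for the example.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical key names, chosen only for this example.
MAX_PARALLELISM = "execution.max-parallelism"
LATENCY_TRACKING_INTERVAL = "execution.latency-tracking-interval"


@dataclass
class ExecutionConfig:
    """Stand-in for a config class holding several independent options."""
    max_parallelism: int = -1
    latency_tracking_interval: int = 0

    def configure(self, configuration: Mapping[str, str]) -> None:
        """Overwrite only the fields whose keys are present."""
        if MAX_PARALLELISM in configuration:
            self.max_parallelism = int(configuration[MAX_PARALLELISM])
        if LATENCY_TRACKING_INTERVAL in configuration:
            self.latency_tracking_interval = int(
                configuration[LATENCY_TRACKING_INTERVAL]
            )
```

Keys that are absent keep their current values, so configure can be
applied on top of settings made programmatically.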
Best,
Dawid
[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-59%3A+Enable+execution+configuration+from+Configuration+object
On 04/09/2019 03:37, Becket Qin wrote:
Hi Timo, Dawid and Aljoscha,
Thanks for clarifying the goals. It is very helpful to understand the
motivation here. It would be great to add them to the FLIP wiki.
I agree that the current FLIP design achieves the two goals it wants to
achieve. But I am trying to see whether the current approach is the most
reasonable one.
Please let me check if I understand this correctly. From end users'
perspective, they will do the following when they want to configure their
Flink Jobs.
1. Create a Configuration instance, and call setters of Configuration
with the ConfigOptions defined in different components.
2. The Configuration created in step 1 will be passed around, and each
component will just extract its own options from it.
3. ExecutionConfig, CheckpointConfig (and other Config classes) will
become Configurables, which are responsible for extracting the
configuration values from the Configuration set by users in step 1.
What confused me was how, in step 1, users are going to set the
configs for the ExecutionConfig / CheckpointConfig. There may be two
ways:
a) Users will call setConfigurable(ExecutionConfigConfigurableOption,
"config1:v1,config2:v2,config3:v3"), i.e. the entire ExecutionConfig is
exposed as a Configurable to the users.
b) Users will call setInteger(MAX_PARALLELISM, 1),
setInteger(LATENCY_TRACKING_INTERVAL, 1000), etc. This means users will
set individual ConfigOptions for the ExecutionConfig, and they do not see
ExecutionConfig as a Configurable.
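To spell out the difference between a) and b), here is a small Python
sketch (not Flink code). The Configuration class is a minimal stand-in,
and all key names are hypothetical.

```python
class Configuration:
    """Minimal stand-in for a flat key/value configuration (not Flink's class)."""

    def __init__(self) -> None:
        self._values = {}

    def set_string(self, key: str, value: str) -> None:
        self._values[key] = value

    def set_integer(self, key: str, value: int) -> None:
        self._values[key] = int(value)

    def keys(self):
        return sorted(self._values)


# Hypothetical key names, chosen only for this example.
EXECUTION_CONFIG = "execution.config"
MAX_PARALLELISM = "execution.max-parallelism"
LATENCY_TRACKING_INTERVAL = "execution.latency-tracking-interval"

# a) The whole ExecutionConfig is exposed as one object under one key.
option_a = Configuration()
option_a.set_string(
    EXECUTION_CONFIG, "max-parallelism:1,latency-tracking-interval:1000"
)

# b) Each setting is its own option with its own key.
option_b = Configuration()
option_b.set_integer(MAX_PARALLELISM, 1)
option_b.set_integer(LATENCY_TRACKING_INTERVAL, 1000)
```

In a) the configuration holds a single key whose value must be parsed
again; in b) it holds one individually addressable key per setting.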
I assume we are following b). Then do we need to expose Configurable to
the users in this FLIP? My concern is that the Configurable may be
related to other mechanisms, such as plugins, which we have not really
thought through in this FLIP.
I know Becket at least has some thoughts about immutability and loading
objects via the configuration but maybe they could be put into a
follow-up FLIP if they are needed.
I am perfectly fine with leaving something out of the scope of this FLIP
for later FLIPs. But I think it is important to avoid introducing
something in this FLIP that will shortly be changed by the follow-up
FLIPs.
Thanks,
Jiangjie (Becket) Qin
On Tue, Sep 3, 2019 at 8:47 PM Aljoscha Krettek <aljos...@apache.org> wrote:
Hi,
I think it’s important to keep in mind the original goals of this FLIP
and not let the scope grow indefinitely. As I recall it, the goals are:
- Extend the ConfigOption system enough to allow the Table API to
configure options that are right now only available on
CheckpointingOptions, ExecutionConfig, and StreamExecutionEnvironment.
We also want to do this without manually having to “forward” all the
available configuration options by introducing equivalent setters in
the Table API.
- Do the above while keeping in mind that eventually we want to allow
users to configure everything from either the flink-conf.yaml, via
command line parameters, or via a Configuration.
I think the FLIP achieves this, with the added side goals of making
validation a part of ConfigOptions, making them type safe, and making
the validation constraints documentable (via automatic doc generation).
All this without breaking backwards compatibility, if I’m not mistaken.
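As an illustration of those side goals, here is a minimal Python sketch
of an option that carries its type, a validation constraint, and a
human-readable description of that constraint. This is not the
ConfigOption API proposed in the FLIP; the class layout, method names,
key, and default value are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Generic, Type, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ConfigOption(Generic[T]):
    """Hypothetical typed option: key, type, default, documented constraint."""
    key: str
    type_: Type[T]
    default: T
    constraint: Callable[[T], bool]
    constraint_doc: str

    def read(self, raw: Dict[str, Any]) -> T:
        """Return the converted, validated value, or the default if unset."""
        if self.key not in raw:
            return self.default
        value = self.type_(raw[self.key])  # type conversion, e.g. int("4")
        if not self.constraint(value):
            raise ValueError(
                f"{self.key}: expected {self.constraint_doc}, got {value!r}"
            )
        return value

    def describe(self) -> str:
        """One documentation line, as a doc generator might emit it."""
        return (
            f"{self.key} ({self.type_.__name__}, "
            f"default {self.default!r}): {self.constraint_doc}"
        )


# Hypothetical option; key and default are made up for this example.
MAX_PARALLELISM = ConfigOption(
    "execution.max-parallelism", int, 128, lambda v: v > 0, "a positive integer"
)
```

Because the constraint and its description live on the option itself,
the same definition can drive both validation and generated docs.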
I think we should first agree on what the basic goals are so that we can
quickly converge to consensus on this FLIP, because it blocks other
people/work. Among other things, FLIP-59 depends on this. What are other
opinions that people have? I know Becket at least has some thoughts
about immutability and loading objects via the configuration but maybe
they could be put into a follow-up FLIP if they are needed.
Also, I had one thought on the interaction of this FLIP-54 and FLIP-59
when it comes to naming. I think eventually it makes sense to have a
common interface for things that are configurable from a Configuration
(FLIP-59 introduces the first batch of this). It seems natural to call
this interface Configurable. That’s a problem for this FLIP-54 because
we also introduce a Configurable. Maybe the thing that we introduce
here should be called ConfigObject or ConfigStruct to highlight that it
has a narrower focus and is really only a POJO for holding a bunch of
config options that have to go together. What do you think?
Best,
Aljoscha
On 3. Sep 2019, at 14:08, Timo Walther <twal...@apache.org> wrote:
Hi Danny,
Yes, this FLIP also covers all the building blocks we need for the
unification of the DDL properties.
Regards,
Timo
On 03.09.19 13:45, Danny Chan wrote:
with the new SQL DDL
based on properties as well as more connectors and formats coming up,
unified configuration becomes more important
I couldn't agree more. Do you think we can unify the config option key
format here for all the DDL properties?
Best,
Danny Chan
On Aug 16, 2019 at 10:12 PM +0800, dev@flink.apache.org wrote:
with the new SQL DDL
based on properties as well as more connectors and formats coming up,
unified configuration becomes more important