Hi all,

I would also like to give a +1 for supporting lists as config options
with the delimeter being a parameter (if we cannot find a consensus).
To some extent the current codebase has already solved the issue by
already having lists as options, but the problem is that so far there
was no principled way of doing it so everyone has his/her own way of
encoding such lists. For example, we have:

CoreOptions.TMP_DIRS: separated by ",", "|", or the system's path separator
Whatever uses the NetUtils.getPortRangeFromString(), such as
    QueryableStateOptions.PROXY_PORT_RANGE and
    QueryableStateOptions.SERVER_PORT_RANGE : it can be a port:
"9123", a range of ports: "50100-50200", or a list of ranges and
ports: "50100-50200,50300-50400,51234"
dynamicproperties in the FlinkYarnSessionCLI: which uses @@ as
delimeter. This is not strictly a config option, but it would be nice
to aim for unified user experience in CLI and config files.

Cheers,
Kostas

On Thu, Sep 5, 2019 at 8:42 AM Becket Qin <becket....@gmail.com> wrote:
>
> Hi Dawid,
>
> Thanks a lot for the clarification. Got it now. A few more thoughts:
>
> 1. Naming.
> I agree that if the name of "Configurable" is a little misleading if we
> just want to use it to save POJOs. It would probably help to just name it
> something like "ConfigPojo".
>
> 2. Flat config map v.s. structured config map.
> From user's perspective, I personally find a flat config map is usually
> easier to understand than a structured config format. But it is just my
> personal opinion and up for debate.
>
> Taking the Host and CachedFile as examples, personally I think the
> following format is more concise and user friendly:
>
> Host: 192.168.0.1:1234 (single config)
> Hosts: 192.168.0.1:1234, 192.168.0.2:5678 (list of configs)
>
> CachedFile: path:file:flag (single config)
> CachedFile: path1:file1:flag1, path2:file2:flag2 (list config)
>
> Maybe for complicate POJOs the full K-V pair would be necessary, but it
> looks we are trying to avoid such complicated POJOs to begin with. Even if
> a full K-V is needed, a List<Map<String, String>> format would also be
> almost equivalent to the current design.
>
> 3. The necessity of the POJO class in Configuration / ConfigOption system.
> I can see the convenience of have a POJO (or ConfigObject) type supported
> in the Configuration / ConfigOption. However, one thing to notice is that
> API wise, the ConfigurableFactory can return arbitrary type of class
> instead of just POJO. This can easily be misused or abused in cases such as
> plugins. And the current API design will force such plugins to implement
> methods like toConfiguration() which is a little awkward.
>
> Given that 1) there will not be many such Pojo classes and 2) these POJO
> classes are defined by Flink, I am thinking that one alternative approach
> is to just have the constructors to take the configuration String (or list
> of string) and parse that. This will avoid a few complications in this FLIP.
>   a) No need to have the ConfigurableFactory class
>   b) No need to have the toConfiguration() implementation. So there is just
> one way to set values in the Configuration instance.
>   c) The Configuration / ConfigOption does not have to also deal with the
> Object creation. Instead, they will simply focus on configuration itself.
>
> Thanks for the patient discussion. I don't want to block this FLIP further,
> so I am fine to go with the current design with changing the name of
> Configurable to something like ConfigPojo in order to avoid misuse as much
> as possible.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Wed, Sep 4, 2019 at 5:50 PM Dawid Wysakowicz <dwysakow...@apache.org>
> wrote:
>
> > Hi Becket,
> >
> > You are right, that what we had in mind for
> > ExecutionConfig/CheckpointConfig etc. is the option b) from your email.
> > In the context of the FLIP-54, those objects are not Configurable. What
> > we understood as a Configurable by the FLIP-54 are a simple pojos, that
> > are stored under a single key. Such as the examples either from the ML
> > thread (Host) or from the design doc (CacheFile). So when configuring
> > the host user can provide a host like this:
> >
> > connector.host: address:localhost, port:1234
> >
> > rather than
> >
> > connector.host.address: localhost
> >
> > connector.host.port: 1234
> >
> > This is important especially if one wants to configure lists of such
> > objects:
> >
> > connector.hosts: address:localhost,port:1234;address:localhost,port:4567
> >
> > The intention was definitely not to store whole complex objects, such as
> > ExecutionConfig, CheckpointConfig etc. that contain multiple different
> > options Maybe it makes sense to call it ConfigObject as Aljosha
> > suggested? What do you think? Would that make it more understandable?
> >
> > For the initialization/configuration of objects such as ExecutionConfig,
> > CheckpointConfig you may have a look at FLIP-59[1] where we suggest to
> > add a configure method to those classes and we pretty much describe the
> > process you outline in the last message.
> >
> > Best,
> >
> > Dawid
> >
> > [1]
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-59%3A+Enable+execution+configuration+from+Configuration+object
> >
> > On 04/09/2019 03:37, Becket Qin wrote:
> > > Hi Timo, Dawid and Aljoscha,
> > >
> > > Thanks for clarifying the goals. It is very helpful to understand the
> > > motivation here. It would be great to add them to the FLIP wiki.
> > >
> > > I agree that the current FLIP design achieves the two goals it wants to
> > > achieve. But I am trying to see is if the current approach is the most
> > > reasonable approach.
> > >
> > > Please let me check if I understand this correctly. From end users'
> > > perspective, they will do the following when they want to configure their
> > > Flink Jobs.
> > > 1. Create a Configuration instance, and call setters of Configuration
> > with
> > > the ConfigOptions defined in different components.
> > > 2. The Configuration created in step 1 will be passed around, and each
> > > component will just exact their own options from it.
> > > 3. ExecutionConfig, CheckpointConfig (and other Config classes) will
> > become
> > > a Configurable, which is responsible for extracting the configuration
> > > values from the Configuration set by users in step 1.
> > >
> > > The confusion I had was that in step 1, how users are going to set the
> > > configs for the ExecutionConfig / CheckpointConfig? There may be two
> > ways:
> > > a) Users will call setConfigurable(ExectionConfigConfigurableOption,
> > > "config1:v1,config2:v2,config3:v3"), i.e. the entire ExecutionConfig is
> > > exposed as a Configurable to the users.
> > > b) Users will call setInteger(MAX_PARALLELISM, 1),
> > > setInteger(LATENCY_TRACKING_INTERVAL, 1000), etc.. This means users will
> > > set individual ConfigOptions for the ExecutionConfig. And they do not see
> > > ExecutionConfig as a Configurable.
> > >
> > > I assume we are following b), then do we need to expose Configurable to
> > the
> > > users in this FLIP? My concern is that the Configurable may be related to
> > > other mechanism such as plugin which we have not really thought through
> > in
> > > this FLIP.
> > >
> > > I know Becket at least has some thoughts about immutability and loading
> > >> objects via the configuration but maybe they could be put into a
> > follow-up
> > >> FLIP if they are needed.
> > > I am perfectly fine to leave something out of the scope of this FLIP to
> > > later FLIPs. But I think it is important to avoid introducing something
> > in
> > > this FLIP that will be shortly changed by the follow-up FLIPs.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Tue, Sep 3, 2019 at 8:47 PM Aljoscha Krettek <aljos...@apache.org>
> > wrote:
> > >
> > >> Hi,
> > >>
> > >> I think it’s important to keep in mind the original goals of this FLIP
> > and
> > >> not let the scope grow indefinitely. As I recall it, the goals are:
> > >>
> > >>  - Extend the ConfigOption system enough to allow the Table API to
> > >> configure options that are right now only available on
> > >> CheckpointingOptions, ExecutionConfig, and StreamExecutionEnvironment.
> > We
> > >> also want to do this without manually having to “forward” all the
> > available
> > >> configuration options by introducing equivalent setters in the Table API
> > >>
> > >>  - Do the above while keeping in mind that eventually we want to allow
> > >> users to configure everything from either the flink-conf.yaml, vie
> > command
> > >> line parameters, or via a Configuration.
> > >>
> > >> I think the FLIP achieves this, with the added side goals of making
> > >> validation a part of ConfigOptions, making them type safe, and making
> > the
> > >> validation constraints documentable (via automatic doc generation.) All
> > >> this without breaking backwards compatibility, if I’m not mistaken.
> > >>
> > >> I think we should first agree what the basic goals are so that we can
> > >> quickly converge to consensus on this FLIP because it blocks other
> > >> people/work. Among other things FLIP-59 depends on this. What are other
> > >> opinions that people have? I know Becket at least has some thoughts
> > about
> > >> immutability and loading objects via the configuration but maybe they
> > could
> > >> be put into a follow-up FLIP if they are needed.
> > >>
> > >> Also, I had one thought on the interaction of this FLIP-54 and FLIP-59
> > >> when it comes to naming. I think eventually it makes sense to have a
> > common
> > >> interface for things that are configurable from a Configuration (FLIP-59
> > >> introduces the first batch of this). It seems natural to call this
> > >> interface Configurable. That’s a problem for this FLIP-54 because we
> > also
> > >> introduce a Configurable. Maybe the thing that we introduce here should
> > be
> > >> called ConfigObject or ConfigStruct to highlight that it has a more
> > narrow
> > >> focus and is really only a POJO for holding a bunch of config options
> > that
> > >> have to go together. What do you think?
> > >>
> > >> Best,
> > >> Aljoscha
> > >>
> > >>> On 3. Sep 2019, at 14:08, Timo Walther <twal...@apache.org> wrote:
> > >>>
> > >>> Hi Danny,
> > >>>
> > >>> yes, this FLIP covers all the building blocks we need also for
> > >> unification of the DDL properties.
> > >>> Regards,
> > >>> Timo
> > >>>
> > >>>
> > >>> On 03.09.19 13:45, Danny Chan wrote:
> > >>>>> with the new SQL DDL
> > >>>> based on properties as well as more connectors and formats coming up,
> > >>>> unified configuration becomes more important
> > >>>>
> > >>>> I Cann’t agree more, do you think we can unify the config options key
> > >> format here for all the DDL properties ?
> > >>>> Best,
> > >>>> Danny Chan
> > >>>> 在 2019年8月16日 +0800 PM10:12,dev@flink.apache.org,写道:
> > >>>>> with the new SQL DDL
> > >>>>> based on properties as well as more connectors and formats coming up,
> > >>>>> unified configuration becomes more important
> > >>>
> > >>
> >
> >

Reply via email to