Hi everyone,

I updated the FLIP proposal one more time as mentioned in the voting thread. If there are no objections, I will start a new voting thread tomorrow at 9am Berlin time.

Thanks,
Timo


On 22.08.19 14:19, Timo Walther wrote:
Hi everyone,

thanks for all the feedback we have received online and offline. It showed that many people support the idea of evolving the Flink configuration functionality. I'm almost sure that this FLIP will not solve all issues but at least will improve the current status.

We've updated the document and replaced the Correlation part with the concept of a ConfigOptionGroup that can provide all available options of a group plus custom group validators for eager validation. For now, this eager group validation will only be used at certain locations in the Flink code but it prepares for maybe validating the entire global configuration before submitting a job in the future.

Please take another look if you find time. I hope we can proceed with the voting process if there are no objections.

Regards,
Timo

Am 19.08.19 um 12:54 schrieb Timo Walther:
Hi Stephan,

thanks for your suggestions. Let me give you some background about the decisions made in this FLIP:

1. Goal: The FLIP is labelled "evolve" not "rework" because we did not want to change the entire configuration infrastructure. Both for backwards-compatibility reasons and the amount of work that would be required to update all options. If our goal is to rework the configuration option entirely, I might suggest to switch to JSON format with JSON schema and JSON validator. However, setting properties in a CLI or web interface becomes more tricky the more nested structures are allowed.

2. Class-based Options: The current ConfigOption<T> class is centered around Java classes where T is determined by the default value. The FLIP just makes this more explicit by offering an explicit `intType()` method etc. The current design of validators centered around Java classes makes it possible to have typical domain validators baked by generics as you suggested. If we introduce types such as "quantity with measure and unit" we still need to get a class out of this option at the end, so why changing a proven concept?

3. List Options: The `isList` prevents having arbitrary nesting. As Dawid mentioned, we kept human readability in mind. For every atomic option like "key=12" can be represented by a list "keys=12;13". But we don't want to go further; esp. no nesting. A dedicated list option would start making this more complicated such as "ListOption(ObjectOption(ListOption(IntOption, ...), StringOption(...)))", do we want that?

4. Correlation: The correlation part is one of the suggestions that I like least in the document. We can also discuss removing it entirely, but I think it solves the use case of relating options with each other in a flexible way right next to the actual option. Instead of being hidden in some component initialization, we should put it close to the option to also perform validation eagerly instead of failing at runtime when the option is accessed the first time.

Regards,
Timo


Am 18.08.19 um 23:32 schrieb Stephan Ewen:
A "List Type" sounds like a good direction to me.

The comment on the type system was a bit brief, I agree. The idea is to see
if something like that can ease validation. Especially the correlation
system seems quite complex (proxies to work around order of initialization).

For example, let's assume we don't think primarily about "java types" but would define types as one of the following (just examples, haven't thought
all the details through):

   (a) category type: implies string, and a fix set of possible values.
Those would be passes and naturally make it into the docs and validation.
Maps to a String or Enum in Java.

   (b) numeric integer type: implies long (or optionally integer, if we want
to automatically check overflow / underflow). would take typical domain
validators, like non-negative, etc.

   (c) numeric real type: same as above (double or float)

   (d) numeric interval type: either defined as an interval, or references
other parameter by key. validation by valid interval.

   (e) quantity: a measure and a unit. separately parsable. The measure's type could be any of the numeric types above, with same validation rules.

With a system like the above, would we still correlation validators? Are there still cases that we need to catch early (config loading) or are the remaining cases sufficiently rare and runtime or setup specific, that it is
fine to handle them in component initialization?


On Sun, Aug 18, 2019 at 6:36 PM Dawid Wysakowicz <dwysakow...@apache.org>
wrote:

Hi Stephan,

Thank you for your opinion.

Actually list/composite types are the topics we spent the most of the
time. I understand that from a perspective of a full blown type system,
a field like isList may look weird. Please let me elaborate a bit more
on the reason behind it though. Maybe we weren't clear enough about it
in the FLIP. The key feature of all the conifg options is that they must
have a string representation as they might come from a configuration
file. Moreover it must be a human readable format, so that the values
might be manually adjusted. Having that in mind we did not want to add a
support of an arbitrary nesting and we decided to allow for lists only
(and flat objects - I think though in the current design there is a
mistake around the Configurable interface). I think though you have a
point here and it would be better to have a ListConfigOption instead of
this field. Does it make sense to you?

As for the second part of your message. I am not sure if I understood
it. The validators work with parse/deserialized values from
Configuration that means they can be bound to the generic parameter of
Configuration. You can have a RangeValidator<? extends
Comparable/Number>. I don't think the type hierarchy in the ConfigOption has anything to do with the validation logic. Could you elaborate a bit
more what did you mean?

Best,

Dawid

On 18/08/2019 16:42, Stephan Ewen wrote:
I like the idea of enhancing the configuration and to do early
validation.
I feel that some of the ideas in the FLIP seem a bit ad hoc, though. For example, having a boolean "isList" is a clear indication of not having
thought through the type/category system.
Also, having a more clear category system makes validation simpler.

For example, I have seen systems distinguishing between numeric
parameters
(valid ranges), category parameters (set of possible values), quantities like duration and memory size (need measure and unit), which results in
an
elegant system for validation.


On Fri, Aug 16, 2019 at 5:22 PM JingsongLee <lzljs3620...@aliyun.com
.invalid>
wrote:

+1 to this, thanks Timo and Dawid for the design.
This allows the currently cluttered configuration of various
  modules to be unified.
This is also first step of one of the keys to making new unified
TableEnvironment available for production.

Previously, we did encounter complex configurations, such as
specifying the skewed values of column in DDL. The skew may
  be a single field or a combination of multiple fields. So the
  configuration is very troublesome. We used JSON string to
  configure it.

Best,
Jingsong Lee



------------------------------------------------------------------
From:Jark Wu <imj...@gmail.com>
Send Time:2019年8月16日(星期五) 16:44
To:dev <dev@flink.apache.org>
Subject:Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

Thanks for starting this design Timo and Dawid,

Improving ConfigOption has been hovering in my mind for a long time.
We have seen the benefit when developing blink configurations and
connector
properties in 1.9 release.
Thanks for bringing it up and make such a detailed design.
I will leave my thoughts and comments there.

Cheers,
Jark


On Fri, 16 Aug 2019 at 22:30, Zili Chen <wander4...@gmail.com> wrote:

Hi Timo,

It looks interesting. Thanks for preparing this FLIP!

Client API enhancement benefit from this evolution which
hopefully provides a better view of configuration of Flink.
In client API enhancement, we likely make the deployment
of cluster and submission of job totally defined by configuration.

Will take a look at the document in days.

Best,
tison.


Timo Walther <twal...@apache.org> 于2019年8月16日周五 下午10:12写道:

Hi everyone,

Dawid and I are working on making parts of ExecutionConfig and
TableConfig configurable via config options. This is necessary to make all properties also available in SQL. Additionally, with the new SQL
DDL
based on properties as well as more connectors and formats coming up,
unified configuration becomes more important.

We need more features around string-based configuration in the future, which is why Dawid and I would like to propose FLIP-54 for evolving
the
ConfigOption and Configuration classes:



https://docs.google.com/document/d/1IQ7nwXqmhCy900t2vQLEL3N2HIdMg-JO8vTzo1BtyKU/edit
In summary it adds:
- documented types and validation
- more common types such as memory size, duration, list
- simple non-nested object types

Looking forward to your feedback,
Timo




Reply via email to