Hi, Rui Fan. Thanks for this FLIP.
I think the key of LOCAL_NUMBER_TASK_MANAGER is better as 'minicluster.number-of-taskmanagers' or 'minicluster.taskmanager-number' instead of 'minicluster.number-taskmanager'. Best, Hang Xuannan Su <suxuanna...@gmail.com> 于2023年12月27日周三 12:40写道: > Hi Xintong and Rui, > > Thanks for the quick feedback and the suggestions. > > > 1. I think the default value for `TASK_MANAGER_LOG_PATH_KEY` should be > "no > > default". > > I have considered both ways of describing the default value. However, > I found out that some of the configurations, such as `web.tmpdir`, put > `System.getProperty()` in the default value [1]. Some are putting the > description in the default value column[2]. So I just picked the first > one. I am fine with either way, so long as they are consistent. WDYT? > > > 3. Simply saying "getting / setting value with string key is discouraged" > > in JavaDoc of get/setString is IMHO a bit confusing. People may have the > > question why would we keep the discouraged interfaces at all. I would > > suggest the following: > > ``` > > We encourage users and developers to always use ConfigOption for getting > / > > setting the configurations if possible, for its rich description, type, > > default-value and other supports. The string-key-based getter / setter > > should only be used when ConfigOption is not applicable, e.g., the key is > > programmatically generated in runtime. > > ``` > > The suggested comment looks good to me. Thanks for the suggestion. I > will update the comment in the FLIP. > > > 2. So I wonder if we can simply mark them as deprecated and remove in > 2.0. > > After some investigation, it turns out those options of input/output > format are only publicly exposed in the DataSet docs[2], which is > deprecated. Thus, marking them as deprecated and removed in Flink 2.0 > looks fine to me. > > > @Rui > > > Configuration has a `public <T> T get(ConfigOption<T> option)` method. > > Could we remove all `Xxx getXxx(ConfigOption<Xxx> configOption)` methods? > > +1 Only keep the get(ConfigOption<T> option), > getOptional(ConfigOption<T> option), and set(ConfigOption<T> option, T > value). > > Best, > Xuannan > > [1] > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#web-tmpdir > [2] > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#kubernetes-container-image-ref > [3] > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/dataset/overview/#data-sources > > > > > On Tue, Dec 26, 2023 at 8:47 PM Xintong Song <tonysong...@gmail.com> > wrote: > > > > > > > > Configuration has a `public <T> T get(ConfigOption<T> option)` method. > > > Could we remove all `Xxx getXxx(ConfigOption<Xxx> configOption)` > methods? > > > > > > > > Note: all `public void setXxx(ConfigOption<Xxx> key, Xxx value)` methods > > > can be replaced with `public <T> Configuration set(ConfigOption<T> > option, > > > T value)` as well. > > > > > > +1 > > > > > > Best, > > > > Xintong > > > > > > > > On Tue, Dec 26, 2023 at 8:44 PM Xintong Song <tonysong...@gmail.com> > wrote: > > > > > These features don't have a public option, but they work. I'm not sure > > >> whether these features are used by some advanced users. > > >> Actually, I think some of them are valuable! For example: > > >> > > >> - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE > > >> allows users to define the start command of the yarn container. > > >> - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG allows > > >> flink job reads all files under the directory even if it has nested > > >> directories. > > >> > > >> This FLIP focuses on the refactor option, I'm afraid these features > are > > >> used > > >> in some production and removing these features will affect some flink > > >> jobs. > > >> So I prefer to keep these features, WDTY? > > >> > > > > > > First of all, I don't think we should support any knobs that users can > > > only learn how to use from reading Flink's internal codes. From this > > > perspective, for existing string-keyed knobs that are not mentioned in > any > > > public documentation, yes we can argue that they are functioning, but > we > > > can also argue that they are not really exposed to users. That means > > > migrating them to ConfigOption is not a pure refactor, but would make > > > something that used to be hidden from users now exposed to users. For > such > > > options, I personally would lean toward not exposing them. If we > consider > > > them as already exposed, then process-wise there's no problem in > > > deprecating some infrequently-used options and removing them in a major > > > version bump, and if they are proved needed later we can add them back > > > anytime. On the other hand, if we consider them as not yet exposed, > then > > > removing them later would be a breaking change. > > > > > > > > > Secondly, I don't really come up with any cases where users need to > tune > > > these knobs. E.g., why would we allow users to customize the yarn > container > > > start command while we already provide `env.java.opts`? And what would > be > > > the problem if Flink just doesn't support nested files? And even worse, > > > such knobs may provide chances for users to shoot themself in the foot. > > > E.g., what if %jvmmem% is missing from a user-provided container start > > > command? Admittedly, there might be a small fraction of advanced users > that > > > know how to use these knobs. However, those users usually have their > own > > > custom fork of Flink, and it should not be a big problem for them to > build > > > such abilities by themselves. > > > > > > > > > Taking a step back, if we decide that some of the mentioned knobs are > > > really useful, we should at least provide enough descriptions to help > users > > > understand when and how to use these options. E.g., the current > description > > > for yarn container command template is far from enough, which does not > > > explain what the placeholders mean, what happens if some placeholders > are > > > missing or if an unknown placeholder is provided. > > > > > > > > > Best, > > > > > > Xintong > > > > > > > > > > > > On Tue, Dec 26, 2023 at 7:39 PM Rui Fan <1996fan...@gmail.com> wrote: > > > > > >> In addition to what is written in FLIP, I found some methods of > > >> Configuration > > >> are not necessary. And I wanna hear more thoughts from all of you. > > >> > > >> Configuration has a `public <T> T get(ConfigOption<T> option)` method. > > >> Could we remove all `Xxx getXxx(ConfigOption<Xxx> configOption)` > methods? > > >> Such as: > > >> - public int getInteger(ConfigOption<Integer> configOption) > > >> - public String getString(ConfigOption<String> configOption) > > >> - public long getLong(ConfigOption<Long> configOption) > > >> - etc > > >> > > >> I prefer users and flink use `get(ConfigOption<T> option)` instead of > > >> `getXxx(ConfigOption<Xxx> configOption)` based on some reasons: > > >> > > >> 1. All callers can replace it directly without any extra effort. > > >> `T get(ConfigOption<T> option)` method can replace all > > >> `Xxx getXxx(ConfigOption<Xxx> configOption)` methods directly. > > >> 2. Callers can call get directly, and users or flink developers > > >> don't need to care about should they call getInteger or getString. > > >> 3. Flink code is easier to maintain. > > >> 4. `T get(ConfigOption<T> option)` is designed later than > > >> `Xxx getXxx(ConfigOption<Xxx> configOption)`, I guess if > > >> `T get(ConfigOption<T> option)` is designed first, > > >> all `Xxx getXxx(ConfigOption<Xxx> configOption)` methods > > >> aren't needed. > > >> > > >> Note: all `public void setXxx(ConfigOption<Xxx> key, Xxx value)` > methods > > >> can be replaced with `public <T> Configuration set(ConfigOption<T> > option, > > >> T value)` as well. > > >> > > >> If they can be marked as deprecated, only `public Xxx > > >> getXxx(ConfigOption<Xxx> configOption, Xxx overrideDefault)` > > >> is keeped after this FLIP. And the rest of getXxx or setXxx can be > removed > > >> in 2.0. > > >> > > >> Looking forward to everyone's feedback and suggestions, thank you! > > >> > > >> Best, > > >> Rui > > >> > > >> On Tue, Dec 26, 2023 at 7:15 PM Rui Fan <1996fan...@gmail.com> wrote: > > >> > > >> > Thanks Xintong for the quick feedback and the good suggestions! > > >> > > > >> > > 1. I think the default value for `TASK_MANAGER_LOG_PATH_KEY` > should be > > >> > "no > > >> > > default". We can explain in the description that if not > configured, > > >> > > `System.getProperty("log.file")` will be used, but that is not a > > >> default > > >> > > value. > > >> > > > >> > Explain it in the description is fine for me. > > >> > > > >> > > 2. So I wonder if we can simply mark them as deprecated and > remove in > > >> > 2.0. > > >> > > > >> > These features don't have a public option, but they work. I'm not > sure > > >> > whether these features are used by some advanced users. > > >> > Actually, I think some of them are valuable! For example: > > >> > > > >> > - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE > > >> > allows users to define the start command of the yarn container. > > >> > - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG allows > > >> > flink job reads all files under the directory even if it has > nested > > >> > directories. > > >> > > > >> > This FLIP focuses on the refactor option, I'm afraid these features > are > > >> > used > > >> > in some production and removing these features will affect some > flink > > >> jobs. > > >> > So I prefer to keep these features, WDTY? > > >> > > > >> > > 3. Simply saying "getting / setting value with string key is > > >> discouraged" > > >> > > in JavaDoc of get/setString is IMHO a bit confusing. People may > have > > >> the > > >> > > question why would we keep the discouraged interfaces at all. I > would > > >> > > suggest the following: > > >> > > ``` > > >> > > We encourage users and developers to always use ConfigOption for > > >> getting > > >> > / > > >> > > setting the configurations if possible, for its rich description, > > >> type, > > >> > > default-value and other supports. The string-key-based getter / > setter > > >> > > should only be used when ConfigOption is not applicable, e.g., the > > >> key is > > >> > > programmatically generated in runtime. > > >> > > ``` > > >> > > > >> > Suggested comment is good for me, and I'd like to hear the thought > from > > >> > Xuannan > > >> > who wrote the original comment. > > >> > > > >> > Best, > > >> > Rui > > >> > > > >> > On Tue, Dec 26, 2023 at 5:26 PM Xintong Song <tonysong...@gmail.com > > > > >> > wrote: > > >> > > > >> >> Thanks for the efforts, Rui and Xuannan. > > >> >> > > >> >> I think it's a good idea to migrate string-key configuration > accesses > > >> to > > >> >> ConfigOption-s in general. > > >> >> > > >> >> I have a few suggestions / questions regarding the FLIP. > > >> >> > > >> >> 1. I think the default value for `TASK_MANAGER_LOG_PATH_KEY` > should be > > >> "no > > >> >> default". We can explain in the description that if not configured, > > >> >> `System.getProperty("log.file")` will be used, but that is not a > > >> default > > >> >> value. > > >> >> > > >> >> 2. I wonder if the following string-keys can be simply removed? > They > > >> are > > >> >> neither set by Flink, nor documented anywhere (AFAIK) so that users > > >> know > > >> >> how to set them. All of them were introduced a long time ago, > require > > >> >> significant knowledge and familiarity about Flink internals and low > > >> level > > >> >> details in order to use, and some of them are even private. So I > > >> wonder if > > >> >> we can simply mark them as deprecated and remove in 2.0. > > >> >> - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE > > >> >> - FileInputFormat.FILE_PARAMETER_KEY > > >> >> - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG > > >> >> - FileOutputFormat.FILE_PARAMETER_KEY > > >> >> - BinaryInputFormat.BLOCK_SIZE_PARAMETER_KEY > > >> >> - BinaryOutputFormat.BLOCK_SIZE_PARAMETER_KEY > > >> >> > > >> >> 3. Simply saying "getting / setting value with string key is > > >> discouraged" > > >> >> in JavaDoc of get/setString is IMHO a bit confusing. People may > have > > >> the > > >> >> question why would we keep the discouraged interfaces at all. I > would > > >> >> suggest the following: > > >> >> ``` > > >> >> We encourage users and developers to always use ConfigOption for > > >> getting / > > >> >> setting the configurations if possible, for its rich description, > type, > > >> >> default-value and other supports. The string-key-based getter / > setter > > >> >> should only be used when ConfigOption is not applicable, e.g., the > key > > >> is > > >> >> programmatically generated in runtime. > > >> >> ``` > > >> >> > > >> >> WDYT? > > >> >> > > >> >> Best, > > >> >> > > >> >> Xintong > > >> >> > > >> >> > > >> >> > > >> >> On Tue, Dec 26, 2023 at 4:12 PM Rui Fan <1996fan...@gmail.com> > wrote: > > >> >> > > >> >> > Hi all, > > >> >> > > > >> >> > Xuannan(cced) and I would like to start a discussion on FLIP-405: > > >> >> > FLIP-405: Migrate string configuration key to ConfigOption[1]. > > >> >> > > > >> >> > As Flink progresses to 2.0, we want to enhance the user > experience > > >> >> > with the existing configuration. In FLIP-77, Flink introduced > > >> >> ConfigOption > > >> >> > with DataType and strongly encourage users to utilize > ConfigOption > > >> >> > instead of string keys for accessing and setting Flink > > >> configurations. > > >> >> > Presently, many string configuration keys have been deprecated > and > > >> >> > replaced with ConfigOptions; however, some string configuration > > >> >> > keys are still in use. > > >> >> > > > >> >> > To ensure a better experience with the existing configuration in > > >> Flink > > >> >> > 2.0, > > >> >> > this FLIP will migrate all user-facing string configuration keys > to > > >> >> > ConfigOptions. > > >> >> > Additionally, we want to modify the Configuration infrastructure > to > > >> >> > promote the use of ConfigOption over string configuration keys > > >> >> > among developers and users. It's mentioned in a preview > thread[2]. > > >> >> > > > >> >> > Looking forward to everyone's feedback and suggestions, thank > you! > > >> >> > > > >> >> > [1] https://cwiki.apache.org/confluence/x/6Yr5E > > >> >> > [2] > https://lists.apache.org/thread/zzsf7glfcdjcjm1hfo1xdwc6jp37nb3m > > >> >> > > > >> >> > Best, > > >> >> > Rui > > >> >> > > > >> >> > > >> > > > >> > > > >