Hi Magnus, Thanks for taking a look.
On Mon, Jul 16, 2018, at 11:43, Magnus Edenhill wrote: > Thanks for driving this KIP, Colin. > > I agree with Dong that a new similar modifyConfigs API (and protocol API) > is confusing and that > we should try to extend the current alterConfigs interface to support the > incremental mode instead, > deprecating the non-incremental mode in the process. In the longer term, I think that the non-incremental mode should definitely go away, and not be an option at all. That's why I don't think of this KIP as "adding more options to AlterConfigs" but as getting rid of a broken API. I've described a lot of reasons why non-incremental mode is broken. I've also described why the brokenness is subtle and an easy trap for newbies to fall into. Hopefully everyone agrees that getting rid of non-incremental mode completely should be the eventual goal. I do not think that having a different name for modifyConfigs is confusing. "Deleting all the configs and then setting some designated ones" is a very different operation from "modifying some configurations". Giving the two operations different names expresses the fact that they really are very different. Would it be less confusing if the new function were called alterConfigsIncremental rather than modifyConfigs? I think it's important to have a new name for the new function. If the names are the same, how can we even explain to users which API they should or should not use? "Use the three argument overload, or the two argument overload where the Options class is not the final parameter" That is not user-friendly. You could say that some of the overloads would be deprecated. However, my experience as a Hadoop developer is that most users simply don't care about deprecation warnings. They will use autocomplete in their IDEs and use whatever function seems to have the parameters they need. Hadoop and Kafka themselves use plenty of deprecated APIs. But somehow we expect that our users have much more respect for @deprecated than we ourselves do. I would further argue that function overloads in Java are intended to provide additional parameters, not to fundamentally change the semantics of a function. If you have two functions int addTwoNumbers(int a, int b) and int addTwoNumbers(int a, int b, boolean verbose), they should both add together two numbers. And a user should be able to expect that the plain old addTwoNumbers is equivalent to either addTwoNumbers(verbose=true) or addTwoNumbers(verbose=false), not a totally different operation. Every time programmers violate this contract, it inevitably leads to misunderstanding. One example is how in HDFS there are multiple function overloads for renaming a file. Depending on which one you call, you will get either RENAME or RENAME2, which have different semantics. I think RENAME2 has a different set of return codes surrounding "destination exists" conditions, among other things. Of course users have no idea of whether they're getting RENAME or RENAME2 unless they're developers. It's not obvious from the function call, which is named "rename" in both cases, just with different function parameters. So the whole thing is just a huge source of confusion and user pain. Another thing to consider is that since function overloads are also not an option in C or Go, we need a different solution for those languages anyway. A separate function name works well for this. > > Another thing that I believe is missing is a per-config-entry operation > type, namely SET (ovewrite), APPEND or DELETE. > The current proposal only has SET (changes) and DELETE (removals) > semantics, but I understand there are configuration properties (such as SASL > auth) where > it should be possible to APPEND to a configuration property, otherwise we'll > have the same > non-atomic read-modify-write cycle problems as with the current API. > Instead of providing two sets of config: changes and removals, I think > it might be better to just have one set where each Config entry has > the operation type set, this also voids any confusion on what happens > if a property is included in both changes,removals sets. That's a very good point. I guess the idea is that APPEND would add a new entry in the comma-separated (or other delimiter-separated) list of the config key? That would require per-key support, since not all configuration keys have the same delimiter. That's probably not too difficult, though. There are probably also lots of keys where APPEND makes no sense and should be rejected. For example, APPENDing to a configuration controlling the number of active threads for a subsystem does not make sense. Also, if we have APPEND, we probably also want SUBTRACT, right? best, Colin > > Regards, > Magnus > > 2018-07-16 20:23 GMT+02:00 Dong Lin <[email protected]>: > > > Hey Colin, > > > > Thanks much for the explanation. Yeah it makes sense to deprecate the > > existing non-incremental mode completely. LGTM. I just have two minor > > comments. > > > > I am not too strong with the following argument but it may be useful to > > just put it here for discussion. I am still wondering whether we can just > > overload alterConfigs(...) instead of using modifyConfigs(...). In the > > "Rejected Alternatives" section it is said that this approach does not > > allow us to deprecate the non-incremental mode. Not sure why it is the > > case. Can you explain a bit more? > > > > Regarding whether this approach is surprising and baffling to users, > > personally I feel that with the existing approach in the KIP, a new name > > such as modifyConfigs(...) does make it very explicit that this API is > > different from the existing alterConfigs(...). But it does not really tell > > user how they are different and users will have to read the Java doc to > > figure this out. On the other hand, if we just overload the > > alterConfigs(...) and deprecate the existing non-incremental > > alterConfigs(...), it would also make it reasonable clear to user that the > > two methods are different. Commonly-used IDE such as IntellIj would show > > that one API is deprecated and the other is not. And user would then read > > the Java doc to understand the difference. So the difference between these > > two approaches in the short term is probably not much. And in the long term > > it might be preferred to use "alter" instead of "modify". > > > > Another minor comment: should we include specify in the "Compatibility, > > Deprecation, and Migration Plan" section that we intend to deprecate the > > existing API? And do we plan to deprecate the > > AlterConfigsRequest/AlterConfigsResponse > > as well? The latter may be important to non-Kafka projects that have > > implemented AdminClient interface. > > > > > > Thanks, > > Dong > > > > > > On Mon, Jul 16, 2018 at 9:43 AM, Colin McCabe <[email protected]> wrote: > > > > > On Fri, Jul 13, 2018, at 23:20, Dong Lin wrote: > > > > Hey Colin, > > > > > > > > Thanks for the KIP! > > > > > > > > It seems that the AlterConfigsResult is pretty much the same as > > > > ModifyConfigsResult. Instead of adding ModifyConfigs API and > > deprecating > > > > AlterConfigs API, would it be simpler to just add API alterConfigs( > > > > Map<ConfigResource, Config> changes, Set<ConfigResource> removals, > > final > > > > ModifyConfigsOptions options)? > > > > > > > > Currently we use the word "alter" in method names such as > > > > "alterReplicaLogDirs" and "alterCheckpointDir". So it is probably more > > > > preferred to keep using the word "alter" instead of "modify" if > > > posssible. > > > > And if we can overload the alterConfigs(...) API to allow incremental > > > > change, it might make sense to keep the existing > > > > alterConfigs(Map<ConfigResource, > > > > Config> configs) for users who simply want to overwrite the entire > > > configs. > > > > > > If we have two functions with these type signatures: > > > > > > > AlterConfigsResult alterConfigs(Map<ConfigResource, Config> changes); > > > > AlterConfigsResult alterConfigs(Map<ConfigResource, Config> changes, > > > Set<ConfigResource> removals); > > > > > > It will be extremely surprising, even baffling, to users that the second > > > function overload makes incremental changes, whereas the first function > > > overload clears the entire configuration before applying changes. Just > > > looking at the type signatures (which is all most developers will look > > at, > > > especially if they're using IDE autocomplete), you would not expect such > > a > > > radical difference between them. You would expect the second one to work > > > just like the first, except maybe it would also perform some removals. > > > > > > Calling the two functions different names is good because it reflects the > > > fact that they are very different. > > > > > > > And those user would not have to make code change due to API > > deprecation. > > > > What do you think? > > > > > > alterConfigs needs to be deprecated, though. Code using alterConfigs is > > > almost certainly buggy because of the possibility of simultaneous > > > read-modify-write cycles, and the fact that some configs can't be read. > > > > > > best, > > > Colin > > > > > > > > > > > Thanks, > > > > Dong > > > > > > > > On Wed, Jul 11, 2018 at 10:54 AM, Colin McCabe <[email protected]> > > > wrote: > > > > > > > > > Hi all, > > > > > > > > > > Previously, we discussed some issues with alterConfigs here on the > > > mailing > > > > > list, and eventually came to the conclusion that the RPC as > > implemented > > > > > can't be used for a shell command modifying configurations. > > > > > > > > > > I wrote up a small KIP to fix the issues with the RP Please take a > > > look > > > > > at https://cwiki.apache.org/confluence/display/KAFKA/KIP- > > > > > 339%3A+Create+a+new+ModifyConfigs+API > > > > > > > > > > best, > > > > > Colin > > > > > > > > > >
