Todd Lipcon has posted comments on this change. Change subject: WIP: Expose a way to set "advanced" non-types scan options ......................................................................
Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/6624/4//COMMIT_MSG Commit Message: PS4, Line 15: without "polluting" the api and in such a way that we can remove : support for this in the future. > I am opposed to introducing an API that's weakly structured in this way. I agree with the first portion of the soap box - I would prefer an API that uses named enum constants, like: SetAdvancedOption(PAD_TIMESTAMP_16_BYTES); so that we get the benefits of type safety, documentation, etc. On the second part of the soapbox, I don't think we should constrain ourselves that all APIs need to have an equal stability/compatibility guarantee. As I think we all agree, maintaining a stable API is costly since it involves testing, generates back-compat code concerns, etc. That's why we always try to make the APIs as narrow as possible, think through them up front, etc. That said, I think there's a case to be made that there are some advanced APIs where we can save a lot of cost (docs, compat, etc) by signing up for a lighter compatibility contract. It does force consumers to make the choice to tighten the compatibility matrix (eg Impala 2.9 would only support Kudu 1.4 or whatever), but so long as this restriction is advertised up front, I think that's OK. The consumer can then weigh the advantages of using the API (in this case, performance) against the disadvantages (having to tightly couple versions). So the remaining question is what the decision framework is for whether an API should be stable/documented/etc vs unstable/etc. I think for me it comes down to how many consumers we expect to have, and how advanced they are. In the case of a query engine like Impala building an integration with Kudu, we already expect them to be closely following Kudu development (e.g every release we add new predicate types which Impala adds pushdown for). And we expect them to be very advanced consumers (they already rely on the byte format of our cells to get good performance). So adding one more advanced API that they need to dig into the code a bit to use isn't problematic in my view. Meanwhile, users who are more concerned with ease of use and version flexibility (probably the vast majority of users) can stick with our nice documented stable API. Even if we documented and provided guarantees for this new one, I doubt most users would want to use it. Thoughts? -- To view, visit http://gerrit.cloudera.org:8080/6624 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I043b6514dc5fc307fc9c94eb41f3ae79796ba273 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro Alves <dral...@apache.org> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-HasComments: Yes