IMO if we want to have a constant class for this, we should mark it as IA.Public. IA.LP(Config) just means the class itself can be referenced in config files.
Bryan Beaudreault <[email protected]> 于2022年3月25日周五 04:25写道: > Yes but I don't see a major issue with promoting to LP(Config)? I thought > the agreement here was that LP(Config) was basically "exposed for config > constants", but maybe I misunderstood. > > Also ConnectionConfiguration is a relatively basic and lightweight wrapper > around configs. I think it's a perfect example of what this LP(Config) API > could look like. > > On Thu, Mar 24, 2022 at 3:47 PM Nick Dimiduk <[email protected]> wrote: > > > One sticking point: ConnectionConfiguration and > > AsyncConnectionConfiguration are both IA.Private. > > > > On Thu, Mar 24, 2022 at 17:21 Bryan Beaudreault > > <[email protected]> wrote: > > > > > I realized there already exists a good candidate for this in > hbase-client > > > -- ConnectionConfiguration. My latest commit in > > > https://github.com/apache/hbase/pull/4180 > > <https://github.com/apache/hbase/pull/4180> > > adds a new config constant there > > > and marks it as LP(Config), but I'd also be happy to revert that part > of > > > the PR and instead handle that in a dedicated jira for this topic if > > > desired. > > > > > > On Thu, Mar 24, 2022 at 11:08 AM Bryan Beaudreault < > > > [email protected]> > > > wrote: > > > > > > > Thank you both for the input here! > > > > > > > > It seems like we've come to the conclusion that it would be ok to: > > > > > > > > - Update docs around LP(CONFIG) to say that it also encompasses some > > > > module-aggregated classes which hold config constants > > > > - Maybe document this convention elsewhere for contributors as well, > > > > though need to look into where (guidance welcome, but will look > around) > > > > - I can create the first constant class in hbase-client for the > > use-cases > > > > in my jira. I also have another config constant I've added in > > > > https://github.com/apache/hbase/pull/4180 > > <https://github.com/apache/hbase/pull/4180> > > prior to this discussion, so > > > > may want to use that class there. > > > > - Look into updates to tooling to audit the LP(CONFIG) classes > > > > > > > > If this sounds good, I'll create a JIRA to track the work. The first > 3 > > > > todos will be relatively easy, and I'll look into options for the > audit > > > tool > > > > > > > > > > > > On Mon, Mar 21, 2022 at 12:14 PM Andrew Purtell < > > > [email protected]> > > > > wrote: > > > > > > > >> Agreed the definition of LP(CONFIG) would need to be tweaked. > > > >> > > > >> We do not quite have enough support for analyzing configuration key > > set > > > >> changes in the current API audit tool. Removal of a public constant > > > field > > > >> from an LP annotated class would be flagged but a modification of > the > > > >> constant would not. However it is a OSS project consisting of Perl > > > scripts > > > >> that might accept a contribution or at least could be patched or > > > extended. > > > >> Or we can build a new audit tool for the purpose, which is what I > > would > > > >> recommend. The Perl based tool shells out to javah and is quite > > > expensive, > > > >> and on some platforms, depending on comparison, can fail due to > > command > > > >> line length limits. A Java tool would likely be more efficient at > > > >> processing Java class annotations and introspecting string > constants. > > > >> > > > >> > > > > >> > On Mar 21, 2022, at 8:52 AM, Nick Dimiduk <[email protected]> > > > wrote: > > > >> > > > > >> > On Mon, Mar 21, 2022 at 4:32 PM Andrew Purtell < > > > >> [email protected]> > > > >> > wrote: > > > >> > > > > >> >> Although collecting all configuration keys into a single file is > > > >> >> definitely an anti-pattern I’m not sure the same is true of > package > > > or > > > >> >> Maven module level aggregation classes marked LP(CONFIG).Somewhat > > > like > > > >> >> DFSConfigKeys but geared toward our API/release auditing. > > > >> >> > > > >> >> This would seem virtuous for a couple of reasons. Relevant > > > >> configuration > > > >> >> key constants for the package or module would be grouped in a > well > > > >> known > > > >> >> place for users and developers alike. The LP(CONFIG) designation > > > would > > > >> >> require developers to think about deprecation cycle if > > contemplating > > > a > > > >> >> change, thus providing some back pressure against snap decisions. > > Or, > > > >> if > > > >> >> not then, then at release candidate evaluation time, user > > > configuration > > > >> >> breaking changes could be caught be a release automation tool > that > > > >> diffs > > > >> >> LP(CONFIG) annotated classes. Something like this would improve > the > > > >> state > > > >> >> of configuration key management quite dramatically, because > > currently > > > >> it’s > > > >> >> ad hoc. > > > >> >> > > > >> > > > > >> > I am supportive of trying such an effort. We'll need to tweak the > > > >> meaning > > > >> > of LP(CONFIG) as we define it currently, but that can be done. I > > don't > > > >> know > > > >> > what our current tools do or assume regarding this annotation. I > > think > > > >> > there is something custom happening in the compatibility reports > > that > > > we > > > >> > generate as part of each RC. > > > >> > > > > >> >>> On Mar 16, 2022, at 10:46 AM, Bryan Beaudreault > > > >> <[email protected]> > > > >> >>> wrote: > > > >> >>> > > > >> >>> Thanks for your detailed response, Nick! > > > >> >>> > > > >> >>>> I think that none of my comments address your intended topic: > how > > > do > > > >> we > > > >> >>> publish our configuration points as an API that can be consumed > by > > > >> user > > > >> >>> applications? (Do I have that correct?) > > > >> >>> > > > >> >>> This is a good summary, and I appreciate the other > > > >> >> thoughts/clarifications > > > >> >>> as well. I also realize this is probably hard to get perfect and > > any > > > >> >> choice > > > >> >>> must be weighed against the effort necessary to change/maintain. > > > >> >>> > > > >> >>> One example I know is Hadoop/HDFS, and I bet some on this list > > have > > > >> much > > > >> >>> more knowledge of that project's history than I do. For HDFS > they > > > have > > > >> >>> DFSConfigKeys which in my experience does seem to include most > > > >> configs. I > > > >> >>> believe they even have unit tests which verify that all configs > in > > > the > > > >> >>> various site.xml files are represented in code. In more recent > > > >> versions > > > >> >>> they have split that class up into smaller groupings, for > example > > > >> >>> DfsClientConf and the various inner classes there. > > > >> >>> > > > >> >>> In a vacuum, from a code design perspective, I'm not commenting > on > > > >> >> whether > > > >> >>> that's a good or bad pattern. I also don't know of the politics > of > > > the > > > >> >>> project or what sorts of pain points they've discovered in that > > > >> pattern > > > >> >>> over the years. But from *user's perspective*, this is a handy > way > > > to > > > >> >>> handle things in my opinion. > > > >> >>> > > > >> >>> At my company, in general we try to avoid "magic strings" [1] > and > > > >> instead > > > >> >>> always try to use constants. We can and do define our own > > constants > > > to > > > >> >> try > > > >> >>> to mirror some of the "private" magic strings in the hbase > client. > > > >> This > > > >> >> is > > > >> >>> better than nothing but even better would be to use > hbase-provided > > > >> >>> constants so that we can build more defensive applications, > using > > > the > > > >> >>> compiler to verify that the configs we reference still do > > anything. > > > >> >>> > > > >> >>> I unfortunately can't speak to the original issues with > HConstants > > > >> that > > > >> >>> turned it into an anti-pattern. What I do notice is there are > > > >> definitely > > > >> >>> examples in the hbase codebase of duplicated config strings, one > > of > > > >> which > > > >> >>> is called out in one of the jiras I linked in my original email. > > > These > > > >> >> are > > > >> >>> just bugs waiting to happen in my opinion, either for hbase > itself > > > or > > > >> for > > > >> >>> users which may reference them. > > > >> >>> > > > >> >>> [1] https://deviq.com/antipatterns/magic-strings > > <https://deviq.com/antipatterns/magic-strings> > > > >> <https://deviq.com/antipatterns/magic-strings > > <https://deviq.com/antipatterns/magic-strings> > > > > > > >> >>>> On Wed, Mar 16, 2022 at 10:52 AM Nick Dimiduk < > > [email protected] > > > > > > > >> >> wrote: > > > >> >>>> > > > >> >>>> Hi Bryan, > > > >> >>>> > > > >> >>>> Thanks for bringing this up. > > > >> >>>> > > > >> >>>> I agree with Duo (and I think we have this settled as > > project-wide > > > >> >>>> consensus) that HConstants is/was an anti-pattern, that we are > > > >> actively > > > >> >>>> against adding new fields there, and opportunistically removing > > > >> fields > > > >> >> when > > > >> >>>> we can. Further, the documented meaning of the > > > >> >>>> HBaseInterfaceAudience.CONFIG field is "Denotes class names > that > > > >> appear > > > >> >> in > > > >> >>>> user facing configuration files", so this isn't really > > appropriate > > > >> for > > > >> >>>> marking a field that exposes a configuration key to user > > > >> applications. I > > > >> >>>> will also note that there appears to be two categories of > tunable > > > >> >>>> parameters -- configuration points that we expect users to > tweak > > > are > > > >> >>>> catalogued and documented in the book [0] and everything else > is > > > >> left to > > > >> >>>> the obscurity of code-grep. > > > >> >>>> > > > >> >>>> While we are actively squashing use of fields in HConstants, I > > > don't > > > >> >> know > > > >> >>>> that we have proposed some alternative to the user community. > For > > > my > > > >> >> part, > > > >> >>>> when I write and review code that involves configuration keys, > I > > > >> >> generally > > > >> >>>> implement the key constant string as a private field in an > > > >> appropriate > > > >> >>>> class, and the unit test coverage for that configuration key > > > >> replicates > > > >> >> the > > > >> >>>> string in the test. My reasoning being that the string is a > part > > of > > > >> our > > > >> >>>> public API and making a change to the public API should be > > detected > > > >> from > > > >> >>>> the unit test. I also have (on occasion) gone out of my way to > > > write > > > >> >> about > > > >> >>>> the configuration keys in the package or class-level javadoc. > > > >> >>>> > > > >> >>>> I think that none of my comments address your intended topic: > how > > > do > > > >> we > > > >> >>>> publish our configuration points as an API that can be consumed > > by > > > >> user > > > >> >>>> applications? (Do I have that correct?) > > > >> >>>> > > > >> >>>> I am of the mind that we don't need/want an API of > > configurations ; > > > >> we > > > >> >> want > > > >> >>>> a catalogue, i.e., what has been started in our book. Perhaps > > > >> >> accompanied > > > >> >>>> by/generated from an authoritative hbase-defaults.xml file. In > > > fact, > > > >> we > > > >> >>>> already do generate from hbase-default.xml, the result is [1] > ; I > > > >> don't > > > >> >>>> believe it is authoritative. > > > >> >>>> > > > >> >>>> If we did have an AP thoughI, what would be better than the > > > >> HConstants > > > >> >>>> approach of key-strings as public fields ? What if we had a > > > >> >>>> ConfigurationBuilder type of class, which had methods tied to > > > >> >> configuration > > > >> >>>> keys? I would think that such a globally applicable class would > > > have > > > >> the > > > >> >>>> same maintenance issues as HConstants. But what if we had some > > kind > > > >> of > > > >> >>>> ConfigurationSetter class, perhaps per package, that performed > > this > > > >> >>>> function? That might be maintainable for us and useful for > users. > > > >> >>>> > > > >> >>>> I'm keen to hear what other ideas are out there, or better, > > > examples > > > >> and > > > >> >>>> counter-examples from other projects. > > > >> >>>> > > > >> >>>> Thanks, > > > >> >>>> Nick > > > >> >>>> > > > >> >>>> [0]: > https://hbase.apache.org/book.html#important_configurations > > <https://hbase.apache.org/book.html#important_configurations> > > > >> <https://hbase.apache.org/book.html#important_configurations > > <https://hbase.apache.org/book.html#important_configurations> > > > > > > >> >>>> <https://hbase.apache.org/book.html#important_configurations > > <https://hbase.apache.org/book.html#important_configurations> > > > >> <https://hbase.apache.org/book.html#important_configurations > > <https://hbase.apache.org/book.html#important_configurations> > > > > > > >> > > > > >> >>>> [1]: > > > https://hbase.apache.org/book.html#hbase_default_configurations > > <https://hbase.apache.org/book.html#hbase_default_configurations> > > > >> <https://hbase.apache.org/book.html#hbase_default_configurations > > <https://hbase.apache.org/book.html#hbase_default_configurations> > > > > > > >> >>>> < > https://hbase.apache.org/book.html#hbase_default_configurations > > <https://hbase.apache.org/book.html#hbase_default_configurations> > > > >> <https://hbase.apache.org/book.html#hbase_default_configurations > > <https://hbase.apache.org/book.html#hbase_default_configurations> > > > > > > >> > > > > >> >>>> > > > >> >>>> On Tue, Mar 15, 2022 at 4:28 PM Bryan Beaudreault > > > >> >>>> <[email protected]> wrote: > > > >> >>>> > > > >> >>>>> Hi devs, > > > >> >>>>> > > > >> >>>>> As a major user of hbase, my company has thousands of clients > > > >> deployed > > > >> >>>>> which use the hbase client to connect to a variety of hbase > > > >> clusters. > > > >> >> We > > > >> >>>>> have a common library which handles configuring all clients by > > > >> setting > > > >> >> up > > > >> >>>>> the Configuration object prior to creating a Connection. Our > > > library > > > >> >> sets > > > >> >>>>> configurations using the various configs in HConstants, but > > there > > > >> are > > > >> >>>> also > > > >> >>>>> a bunch of configs which don't exist in HConstants. For these > we > > > >> have > > > >> >>>>> hardcoded config strings in our client. > > > >> >>>>> > > > >> >>>>> We're now working on an hbase upgrade and need to go through > our > > > >> client > > > >> >>>>> library and check how the configs may have changed in the new > > > >> version. > > > >> >>>> This > > > >> >>>>> is relatively easy to do for those HConstants cases -- configs > > may > > > >> be > > > >> >>>>> marked @Deprecated which will show up in one's editor, they > may > > be > > > >> >>>> removed > > > >> >>>>> entirely which would show up is a compile error, and otherwise > > one > > > >> can > > > >> >>>>> easily click through or bring up the javadoc. For the others > > that > > > >> don't > > > >> >>>>> exist in HConstants, we need to go manually search the hbase > > > >> codebase > > > >> >> for > > > >> >>>>> those strings. > > > >> >>>>> > > > >> >>>>> Without doing this painstaking manual process, we would > > > potentially > > > >> >>>> deploy > > > >> >>>>> the upgraded client with configs which are no longer used or > > > >> deprecated > > > >> >>>> by > > > >> >>>>> the hbase client. For those using HConstants, this is > > immediately > > > >> >> obvious > > > >> >>>>> because the HConstant field may have been removed. This is a > > clear > > > >> >>>>> indication of needing to investigate the config. In this case > > it's > > > >> >>>>> preferred to face the compile failure because it's clearer > than > > > >> having > > > >> >>>>> something silently disappear or change. > > > >> >>>>> > > > >> >>>>> I opened 3 jiras to move some fields to HConstants, but got > some > > > >> >>>> reasonable > > > >> >>>>> pushback from Duo: > > > >> >>>>> > > > >> >>>>> https://issues.apache.org/jira/browse/HBASE-26845 > > <https://issues.apache.org/jira/browse/HBASE-26845> > > > >> <https://issues.apache.org/jira/browse/HBASE-26845 > > <https://issues.apache.org/jira/browse/HBASE-26845> > > > > > > >> >>>> <https://issues.apache.org/jira/browse/HBASE-26845 > > <https://issues.apache.org/jira/browse/HBASE-26845> > > > >> <https://issues.apache.org/jira/browse/HBASE-26845 > > <https://issues.apache.org/jira/browse/HBASE-26845> > > > > > > >> > > > > >> >>>>> https://issues.apache.org/jira/browse/HBASE-26846 > > <https://issues.apache.org/jira/browse/HBASE-26846> > > > >> <https://issues.apache.org/jira/browse/HBASE-26846 > > <https://issues.apache.org/jira/browse/HBASE-26846> > > > > > > >> >>>> <https://issues.apache.org/jira/browse/HBASE-26846 > > <https://issues.apache.org/jira/browse/HBASE-26846> > > > >> <https://issues.apache.org/jira/browse/HBASE-26846 > > <https://issues.apache.org/jira/browse/HBASE-26846> > > > > > > >> > > > > >> >>>>> https://issues.apache.org/jira/browse/HBASE-26847 > > <https://issues.apache.org/jira/browse/HBASE-26847> > > > >> <https://issues.apache.org/jira/browse/HBASE-26847 > > <https://issues.apache.org/jira/browse/HBASE-26847> > > > > > > >> >>>> <https://issues.apache.org/jira/browse/HBASE-26847 > > <https://issues.apache.org/jira/browse/HBASE-26847> > > > >> <https://issues.apache.org/jira/browse/HBASE-26847 > > <https://issues.apache.org/jira/browse/HBASE-26847> > > > > > > >> > > > > >> >>>>> > > > >> >>>>> Duo's pushback is that HConstants is an anti-pattern and these > > > >> configs > > > >> >>>> are > > > >> >>>>> not part of our public API. I can agree that a catch-all > > constants > > > >> >> class > > > >> >>>>> might be an anti-pattern, but would argue that consolidating > > > configs > > > >> >>>> there > > > >> >>>>> is very useful for end-users. I can also potentially agree > that > > > >> >> exposing > > > >> >>>>> these as part of our public API might limit the flexibility of > > > >> >>>> development > > > >> >>>>> due to compatibility constraints about IA.Public. > > > >> >>>>> > > > >> >>>>> To me it seems odd to add a configuration, whose whole point > is > > to > > > >> make > > > >> >>>>> something tuneable, but then bury it in a private class > despite > > > >> having > > > >> >>>> real > > > >> >>>>> implications for how the application runs. If a configuration > is > > > not > > > >> >>>> meant > > > >> >>>>> to be tuned, it shouldn't be a configuration at all. Otherwise > > it > > > >> >> should > > > >> >>>> be > > > >> >>>>> exposed for reference. > > > >> >>>>> > > > >> >>>>> I'm wondering if there is some compromise we can achieve which > > > >> makes it > > > >> >>>>> easier for end-users to integrate with tunable configs. > > > >> >>>>> > > > >> >>>>> One can imagine a large project to clean up all of our configs > > > under > > > >> >> some > > > >> >>>>> new class with maybe IA.LimitedPrivate(CONFIG), but I fear > > making > > > >> >> perfect > > > >> >>>>> (needing to migrate all configs) the enemy of good. > > > >> >>>>> > > > >> >>>>> A better option might be to make those classes which expose > > > configs > > > >> >>>>> LimitedPrivate(CONFIG) -- for example AsyncProcess and > > > >> >>>>> ConnectionImplementation. That might be the most incremental > > > change > > > >> we > > > >> >>>>> could make. We could handle this on a case-by-case basis. > > > >> >>>>> > > > >> >>>>> Does anyone have any thoughts? > > > >> >>>>> > > > >> >>>> > > > >> >> > > > >> > > > > > > > > > >
