Hi Bryan,

Thanks for bringing this up.

I agree with Duo (and I think we have this settled as project-wide
consensus) that HConstants is/was an anti-pattern, that we are actively
against adding new fields there, and opportunistically removing fields when
we can. Further, the documented meaning of the
HBaseInterfaceAudience.CONFIG field is "Denotes class names that appear in
user facing configuration files", so this isn't really appropriate for
marking a field that exposes a configuration key to user applications. I
will also note that there appears to be two categories of tunable
parameters -- configuration points that we expect users to tweak are
catalogued and documented in the book [0] and everything else is left to
the obscurity of code-grep.

While we are actively squashing use of fields in HConstants, I don't know
that we have proposed some alternative to the user community. For my part,
when I write and review code that involves configuration keys, I generally
implement the key constant string as a private field in an appropriate
class, and the unit test coverage for that configuration key replicates the
string in the test. My reasoning being that the string is a part of our
public API and making a change to the public API should be detected from
the unit test. I also have (on occasion) gone out of my way to write about
the configuration keys in the package or class-level javadoc.

I think that none of my comments address your intended topic: how do we
publish our configuration points as an API that can be consumed by user
applications? (Do I have that correct?)

I am of the mind that we don't need/want an API of configurations ; we want
a catalogue, i.e., what has been started in our book. Perhaps accompanied
by/generated from an authoritative hbase-defaults.xml file. In fact, we
already do generate from hbase-default.xml, the result is [1] ; I don't
believe it is authoritative.

If we did have an AP thoughI, what would be better than the HConstants
approach of key-strings as public fields ? What if we had a
ConfigurationBuilder type of class, which had methods tied to configuration
keys? I would think that such a globally applicable class would have the
same maintenance issues as HConstants. But what if we had some kind of
ConfigurationSetter class, perhaps per package, that performed this
function? That might be maintainable for us and useful for users.

I'm keen to hear what other ideas are out there, or better, examples and
counter-examples from other projects.

Thanks,
Nick

[0]: https://hbase.apache.org/book.html#important_configurations
[1]: https://hbase.apache.org/book.html#hbase_default_configurations

On Tue, Mar 15, 2022 at 4:28 PM Bryan Beaudreault
<[email protected]> wrote:

> Hi devs,
>
> As a major user of hbase, my company has thousands of clients deployed
> which use the hbase client to connect to a variety of hbase clusters. We
> have a common library which handles configuring all clients by setting up
> the Configuration object prior to creating a Connection. Our library sets
> configurations using the various configs in HConstants, but there are also
> a bunch of configs which don't exist in HConstants. For these we have
> hardcoded config strings in our client.
>
> We're now working on an hbase upgrade and need to go through our client
> library and check how the configs may have changed in the new version. This
> is relatively easy to do for those HConstants cases -- configs may be
> marked @Deprecated which will show up in one's editor, they may be removed
> entirely which would show up is a compile error, and otherwise one can
> easily click through or bring up the javadoc. For the others that don't
> exist in HConstants, we need to go manually search the hbase codebase for
> those strings.
>
> Without doing this painstaking manual process, we would potentially deploy
> the upgraded client with configs which are no longer used or deprecated by
> the hbase client. For those using HConstants, this is immediately obvious
> because the HConstant field may have been removed. This is a clear
> indication of needing to investigate the config. In this case it's
> preferred to face the compile failure because it's clearer than having
> something silently disappear or change.
>
> I opened 3 jiras to move some fields to HConstants, but got some reasonable
> pushback from Duo:
>
> https://issues.apache.org/jira/browse/HBASE-26845
> https://issues.apache.org/jira/browse/HBASE-26846
> https://issues.apache.org/jira/browse/HBASE-26847
>
> Duo's pushback is that HConstants is an anti-pattern and these configs are
> not part of our public API. I can agree that a catch-all constants class
> might be an anti-pattern, but would argue that consolidating configs there
> is very useful for end-users.  I can also potentially agree that exposing
> these as part of our public API might limit the flexibility of development
> due to compatibility constraints about IA.Public.
>
> To me it seems odd to add a configuration, whose whole point is to make
> something tuneable, but then bury it in a private class despite having real
> implications for how the application runs. If a configuration is not meant
> to be tuned, it shouldn't be a configuration at all. Otherwise it should be
> exposed for reference.
>
> I'm wondering if there is some compromise we can achieve which makes it
> easier for end-users to integrate with tunable configs.
>
> One can imagine a large project to clean up all of our configs under some
> new class with maybe IA.LimitedPrivate(CONFIG), but I fear making perfect
> (needing to migrate all configs) the enemy of good.
>
> A better option might be to make those classes which expose configs
> LimitedPrivate(CONFIG) -- for example AsyncProcess and
> ConnectionImplementation. That might be the most incremental change we
> could make. We could handle this on a case-by-case basis.
>
> Does anyone have any thoughts?
>

Reply via email to