Is the Hadoop Credentials API protected by their compatibility guidelines?
If not, can we convince them to support it formally?

On Thu, Aug 25, 2022 at 17:15 Andrew Purtell <andrew.purt...@gmail.com>
wrote:

> We can import the code from our sister Apache project if long term
> dependency and compatibility are concerns. The concerns apply in both
> directions:
>
> Depend, do not import: Hadoop may break us with a change that we have to
> incorporate by updating to a new minimum version, probably because of a
> security issue. How likely is this? I don’t think the SPI has changed much.
> Cannot do a decent review at this time, on phone only.
>
> Import: If we do not watch the Hadoop implementation, then our semantics
> may diverge from theirs in a way that is confusing and inconvenient for
> operators who have to manage credentials at both Hadoop and HBase layers.
>
> > On Aug 25, 2022, at 11:00 AM, 张铎 <palomino...@gmail.com> wrote:
> >
> > +1
> >
> > But I'm still not sure whether we should just use the code in Hadoop,
> > or use the same mechanism but write (copy) the code on our own.
> >
> > On Thu, Aug 25, 2022 at 22:13, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> >>
> >> I agree the Credential SPI provided by Hadoop is direct and expedient.
> >>
> >> I would ask that a patch integrating it, if this is the selected
> >> approach, also add support to bin/hbase so that “hbase credential …”
> >> command line support is available and identical to that provided by the
> >> Hadoop bin script. This is for convenience and also a concession to users
> >> that ship HBase binaries/packages disaggregated from Hadoop ones.
> >>
> >>>> On Aug 25, 2022, at 9:50 AM, Andor Molnar <an...@apache.org> wrote:
> >>>
> >>> As Bryan mentioned there's a nice, extensible API already available in
> >>> Hadoop, the Credentials API.
> >>>
> >>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html
> >>>
> >>> I think that would be the quickest and easiest approach to resolve this
> >>> problem. Are there any objections or downsides to that?
> >>>
> >>> Andor
> >>>
> >>>
> >>>
> >>>> On Tue, 2022-08-23 at 21:39 +0800, 张铎(Duo Zhang) wrote:
> >>>> Maybe we could introduce two configs for a password: password-provider
> >>>> and password-provider-parameter.
> >>>>
> >>>> The default implementation would be FileBasedPasswordProvider, where the
> >>>> parameter is just a file location. There would also be an
> >>>> EnvVarPasswordProvider, where the parameter is the name of the
> >>>> environment variable.
> >>>>
> >>>> And users could also implement their own providers.
> >>>>
> >>>> WDYT?
> >>>>
> >>>> On Tue, Aug 23, 2022 at 21:12, Bryan Beaudreault <bbeaudrea...@hubspot.com.invalid> wrote:
> >>>>> I agree that it seems insecure to put it directly into hbase-site.xml.
> >>>>> Another reason is the RS UI, which (helpfully) can print the entire
> >>>>> site configuration. We’d need to make sure the password is excluded
> >>>>> from that, but it is better to remove it from the site XML altogether.
> >>>>>
> >>>>> That said, we already have the concept of keystore passwords in HBase:
> >>>>> search our refguide for "password" and you'll see two examples, for
> >>>>> JMX SSL and for encryption-at-rest. Both cases seem to take the
> >>>>> approach of allowing an explicit password or a password file. Another
> >>>>> example we can take inspiration from is the Hadoop Credentials API [1],
> >>>>> which allows specifying the password by environment variable or
> >>>>> password file. Searching around for other open-source projects, these
> >>>>> options seem to be the most common for the keystore password. See
> >>>>> Cassandra [2] and ZooKeeper [3] as further examples.
> >>>>>
> >>>>> Elastic takes the approach of allowing "secure settings" [4], which
> >>>>> are stored in a separate keystore managed via the
> >>>>> elasticsearch-keystore command. This just pushes the problem down a
> >>>>> level, as that keystore needs to be password protected as well. In
> >>>>> that case, you are expected to provide the path to a password file
> >>>>> using an environment variable at startup [5]. This approach is very
> >>>>> similar to the Hadoop Credentials API.
> >>>>>
> >>>>> Personally, I think we should go with the password file path
> >>>>> approach. This gives a lot of flexibility; for example, one could
> >>>>> delete the file after startup, as mentioned in [5]. I like the idea of
> >>>>> providing a decryption interface option for advanced users, but I
> >>>>> think we still need to provide an option that doesn't require writing
> >>>>> a bunch of code.
> >>>>>
> >>>>> Alternatively, I think a case could be made for unifying on Hadoop's
> >>>>> Credential API. IMO, if we did that, it should be a separate
> >>>>> initiative, since we'd probably want to unify our existing keystore
> >>>>> configurations into it, and it would probably need a major version
> >>>>> release as a result.
> >>>>>
> >>>>> [1] https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html
> >>>>> [2] https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/configuration/secureSSLClientToNode.html
> >>>>> [3] https://zookeeper.apache.org/doc/r3.8.0/zookeeperAdmin.html (search keyStore.password)
> >>>>> [4] https://www.elastic.co/guide/en/elasticsearch/reference/master/secure-settings.html
> >>>>> [5] https://www.elastic.co/guide/en/elasticsearch/reference/current/rpm.html#rpm-running-systemd
> >>>>>
> >>>>> On Mon, Aug 22, 2022 at 11:33 PM 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
> >>>>>
> >>>>>> In real production deployments, we usually store an encrypted
> >>>>>> password in the configuration file and decrypt it after loading
> >>>>>> in order to actually use it.
> >>>>>>
> >>>>>> How the decryption is done depends on the environment. On cloud
> >>>>>> VMs, you can usually use an encryption service to decrypt the
> >>>>>> password. On K8s, you can mount the key using a secret.
> >>>>>>
> >>>>>> So maybe we should abstract a decryption interface, so that users
> >>>>>> could implement it on their own to find a suitable way to decrypt the
> >>>>>> encrypted password?
> >>>>>>
> >>>>>> On Tue, Aug 23, 2022 at 05:55, Andor Molnar <an...@apache.org> wrote:
> >>>>>>
> >>>>>>> Hi team,
> >>>>>>>
> >>>>>>> Netty TLS support is now merged into the master and branch-2
> >>>>>>> branches. Currently, keystore/truststore passwords can only be
> >>>>>>> stored in hbase-site.xml, which is not the best approach from a
> >>>>>>> security perspective.
> >>>>>>>
> >>>>>>> In the docs review, Sergey Soldatov mentioned
> >>>>>>> (https://github.com/apache/hbase/pull/4717/files#r951768699) an
> >>>>>>> approach in HDFS where the password can be stored in special files
> >>>>>>> or in environment variables.
> >>>>>>>
> >>>>>>> Sergey, would you please point me to the details of that
> >>>>>>> implementation? Sounds like it would be acceptable for HBase
> >>>>>>> too.
> >>>>>>>
> >>>>>>> Is there any other idea that folks could recommend?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Andor
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>
>
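
For reference, the password-provider / password-provider-parameter idea quoted above could look roughly like the sketch below. This is only an illustration of the proposal as described in the thread, not code from HBase or Hadoop: the PasswordProvider interface, the resolve helper, the "example.keystore" key prefix, and the use of a plain Map in place of a real configuration object are all assumptions made for the example. Only the two provider names and the two config suffixes come from the thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; none of these types exist in HBase or Hadoop.
final class PasswordProviders {

  /** Resolves a password from one provider-specific parameter. */
  public interface PasswordProvider {
    char[] getPassword(String parameter);
  }

  /** The parameter is a path to a file whose trimmed content is the password. */
  public static final class FileBasedPasswordProvider implements PasswordProvider {
    @Override
    public char[] getPassword(String parameter) {
      try {
        byte[] raw = Files.readAllBytes(Paths.get(parameter));
        return new String(raw, StandardCharsets.UTF_8).trim().toCharArray();
      } catch (IOException e) {
        throw new UncheckedIOException("Cannot read password file " + parameter, e);
      }
    }
  }

  /** The parameter is the name of an environment variable holding the password. */
  public static final class EnvVarPasswordProvider implements PasswordProvider {
    private final Function<String, String> env;

    public EnvVarPasswordProvider() {
      this(System::getenv);
    }

    // The lookup is injectable so the provider can be exercised without real env vars.
    public EnvVarPasswordProvider(Function<String, String> env) {
      this.env = env;
    }

    @Override
    public char[] getPassword(String parameter) {
      String value = env.apply(parameter);
      if (value == null) {
        throw new IllegalStateException("Environment variable not set: " + parameter);
      }
      return value.toCharArray();
    }
  }

  /** Reads the two (assumed) config keys under a prefix and asks the provider for the password. */
  public static char[] resolve(Map<String, String> conf, String prefix) {
    // Falls back to the file-based provider, as suggested in the thread.
    String className = conf.getOrDefault(prefix + ".password-provider",
        FileBasedPasswordProvider.class.getName());
    String parameter = conf.get(prefix + ".password-provider-parameter");
    if (parameter == null) {
      throw new IllegalArgumentException("Missing " + prefix + ".password-provider-parameter");
    }
    try {
      // User-supplied providers only need a public no-arg constructor.
      PasswordProvider provider = (PasswordProvider)
          Class.forName(className).getDeclaredConstructor().newInstance();
      return provider.getPassword(parameter);
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("Cannot instantiate " + className, e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("keystore-password", ".txt");
    Files.write(file, "changeit\n".getBytes(StandardCharsets.UTF_8));
    Map<String, String> conf = new HashMap<>();
    conf.put("example.keystore.password-provider-parameter", file.toString());
    char[] password = resolve(conf, "example.keystore");
    System.out.println("Resolved a password of length " + password.length);
    Files.delete(file);
  }
}
```

Returning char[] rather than String follows the usual Java convention for passwords, so callers can zero the array after use. Whether something like this is worth having alongside, or instead of, the Hadoop Credentials API is exactly the open question in this thread.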
