Is the Hadoop Credentials API protected by their compatibility guidelines? If not, can we convince them to support it formally?
On Thu, Aug 25, 2022 at 17:15 Andrew Purtell <andrew.purt...@gmail.com> wrote:

> We can import the code from our sister Apache project if long-term
> dependency and compatibility are concerns. The concerns apply in both
> directions:
>
> Depend, do not import: Hadoop may break us with a change that we have to
> incorporate by updating to a new minimum version, probably because of a
> security issue. How likely is this? I don’t think the SPI has changed much.
> Cannot do a decent review at this time, on phone only.
>
> Import: If we do not watch the Hadoop implementation, then our semantics
> may diverge from theirs in a way that is confusing and inconvenient for
> operators who have to manage credentials at both Hadoop and HBase layers.
>
> > On Aug 25, 2022, at 11:00 AM, 张铎 <palomino...@gmail.com> wrote:
> >
> > +1
> >
> > But I'm still not sure whether we should just use the code in hadoop,
> > or we should just use the mechanism but write (copy) the code on our
> > own?
> >
> > Andrew Purtell <andrew.purt...@gmail.com> wrote on Thu, Aug 25, 2022 at 22:13:
> >>
> >> I agree the Credential SPI provided by Hadoop is direct and expedient.
> >>
> >> I would ask that a patch integrating it, if this is the selected
> >> approach, should also add support to bin/hbase so “hbase credential …”
> >> command line support is available and identical to that provided by the
> >> Hadoop bin script. This is for convenience and also a concession to users
> >> that ship HBase binaries/packages disaggregated from Hadoop ones.
> >>
> >>>> On Aug 25, 2022, at 9:50 AM, Andor Molnar <an...@apache.org> wrote:
> >>>
> >>> As Bryan mentioned there's a nice, extensible API already available in
> >>> Hadoop, the Credentials API.
> >>>
> >>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html
> >>>
> >>> I think that would be the quickest and easiest approach to resolve this
> >>> problem. Is there any objection or downside of that?
> >>>
> >>> Andor
> >>>
> >>>> On Tue, 2022-08-23 at 21:39 +0800, 张铎(Duo Zhang) wrote:
> >>>> Maybe we could introduce two configs for a password, password-provider
> >>>> and password-provider-parameter
> >>>>
> >>>> The default implementation is FileBasedPasswordProvider, where the
> >>>> parameter is just a location. And also an EnvVarPasswordProvider,
> >>>> where the parameter is the name of the environment variable.
> >>>>
> >>>> And users could also implement their own providers.
> >>>>
> >>>> WDYT?
> >>>>
> >>>> Bryan Beaudreault <bbeaudrea...@hubspot.com.invalid> wrote on Tue, Aug 23, 2022 at 21:12:
> >>>>> I agree that it seems insecure to put it directly into the
> >>>>> hbase-site.xml. Another reason is due to the RS UI which (helpfully)
> >>>>> can print the entire site configuration. We’d need to make sure the
> >>>>> password is excluded from that, but better to remove it from site xml
> >>>>> altogether.
> >>>>>
> >>>>> That said, we already have the concept of keystore passwords in
> >>>>> hbase -- search our refguide for "password" and you'll see two
> >>>>> examples: for jmx ssl and for encryption-at-rest. Both cases seem to
> >>>>> take the approach of allowing an explicit password or a password file.
> >>>>> Another example we can take inspiration from is Hadoop Credentials
> >>>>> API[1] which allows specifying by environment variable or password
> >>>>> file. Searching around for other opensource projects, these options
> >>>>> seem to be the most common for the keystore password. See Cassandra[2]
> >>>>> and Zookeeper[3] as further examples.
> >>>>>
> >>>>> Elastic takes the approach of allowing "secure settings" [4], which
> >>>>> are stored in a separate keystore managed via elasticsearch-keystore
> >>>>> command. This just pushes the problem down a level, as that keystore
> >>>>> needs to be password protected as well.
> >>>>> In which case, you are expected to provide the path to a password
> >>>>> file using an environment variable at startup [5]. This approach is
> >>>>> very similar to Hadoop Credentials API.
> >>>>>
> >>>>> Personally I think we should go with the password file path approach.
> >>>>> This gives a lot of flexibility, for example one could delete it after
> >>>>> startup like mentioned in [5]. I like the idea of providing a
> >>>>> decryption interface option for advanced users, but I think we still
> >>>>> need to provide an option which doesn't require writing a bunch of
> >>>>> code.
> >>>>>
> >>>>> Alternatively I think a case could be made for unifying on Hadoop's
> >>>>> Credential API. IMO, if we did that, it should be a separate
> >>>>> initiative since we'd probably want to unify our existing keystore
> >>>>> configurations into it and it'd probably need a major version release
> >>>>> as a result.
> >>>>>
> >>>>> [1] https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html
> >>>>> [2] https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/configuration/secureSSLClientToNode.html
> >>>>> [3] https://zookeeper.apache.org/doc/r3.8.0/zookeeperAdmin.html
> >>>>>     (search keyStore.password)
> >>>>> [4] https://www.elastic.co/guide/en/elasticsearch/reference/master/secure-settings.html
> >>>>> [5] https://www.elastic.co/guide/en/elasticsearch/reference/current/rpm.html#rpm-running-systemd
> >>>>>
> >>>>> On Mon, Aug 22, 2022 at 11:33 PM 张铎(Duo Zhang) <palomino...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> In real production deployment, usually we will store an encrypted
> >>>>>> password in the configuration file, and then decrypt it after
> >>>>>> loading, to actually use it.
> >>>>>>
> >>>>>> And how to get the decryption will depend on the environment.
> >>>>>> On cloud VMs, usually you can use an encryption service to decrypt
> >>>>>> the password. On K8s, you can mount the key using secret.
> >>>>>>
> >>>>>> So maybe we should abstract a decryption interface, so users could
> >>>>>> implement it on their own to find a suitable way to decrypt the
> >>>>>> encrypted password?
> >>>>>>
> >>>>>> Andor Molnar <an...@apache.org> wrote on Tue, Aug 23, 2022 at 05:55:
> >>>>>>
> >>>>>>> Hi team,
> >>>>>>>
> >>>>>>> Netty TLS support is now merged into master and branch-2 branches.
> >>>>>>> Currently keystore/truststore passwords can only be stored in
> >>>>>>> hbase-site.xml which is not the best approach from security
> >>>>>>> perspective.
> >>>>>>>
> >>>>>>> In the docs review Sergey Soldatov mentioned
> >>>>>>> (https://github.com/apache/hbase/pull/4717/files#r951768699)
> >>>>>>> an approach in HDFS where password can be stored in special files
> >>>>>>> or in environment variables.
> >>>>>>>
> >>>>>>> Sergey, would you please point me to the details of that
> >>>>>>> implementation? Sounds like it would be acceptable for HBase too.
> >>>>>>>
> >>>>>>> Is there any other idea that folks could recommend?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Andor
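
For illustration only, the password-provider idea floated in this thread (a provider setting plus a provider parameter, with file-based and environment-variable implementations, and room for user-written providers) might look roughly like the sketch below. Only the names FileBasedPasswordProvider and EnvVarPasswordProvider come from the thread; the PasswordProvider interface, the PasswordProviders holder class, the create() helper, and the choice to load custom providers by class name are assumptions made for this sketch. None of this is existing HBase or Hadoop code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.function.Function;

// Hypothetical sketch: these types do not exist in HBase or Hadoop.
class PasswordProviders {

  /** Assumed SPI: turns a provider-specific parameter into a password. */
  interface PasswordProvider {
    char[] getPassword(String parameter) throws IOException;
  }

  /** The parameter is the location of a file that holds the password. */
  static final class FileBasedPasswordProvider implements PasswordProvider {
    @Override
    public char[] getPassword(String location) throws IOException {
      byte[] bytes = Files.readAllBytes(Paths.get(location));
      // Trim so that a trailing newline in the file is not part of the password.
      return new String(bytes, StandardCharsets.UTF_8).trim().toCharArray();
    }
  }

  /** The parameter is the name of an environment variable. */
  static final class EnvVarPasswordProvider implements PasswordProvider {
    private final Function<String, String> env;

    EnvVarPasswordProvider() {
      this(System::getenv);
    }

    // Lets callers (for example tests) supply a substitute environment lookup.
    EnvVarPasswordProvider(Function<String, String> env) {
      this.env = env;
    }

    @Override
    public char[] getPassword(String variableName) throws IOException {
      String value = env.apply(variableName);
      if (value == null) {
        throw new IOException("Environment variable is not set: " + variableName);
      }
      return value.toCharArray();
    }
  }

  /**
   * Instantiates the provider named by the (assumed) password-provider setting.
   * An unset value falls back to the file-based default; any other value is
   * treated as a class name, which is how user-written providers would plug in.
   */
  static PasswordProvider create(String providerClassName) throws IOException {
    if (providerClassName == null || providerClassName.isEmpty()) {
      return new FileBasedPasswordProvider();
    }
    try {
      return Class.forName(providerClassName)
          .asSubclass(PasswordProvider.class)
          .getDeclaredConstructor()
          .newInstance();
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IOException("Cannot load password provider " + providerClassName, e);
    }
  }
}
```

A caller would read the two settings, call create() with the provider value, and pass the parameter value to getPassword(). Returning char[] rather than String mirrors common keystore APIs and lets the caller zero the array after use. Whether HBase should carry such an SPI itself or reuse Hadoop's Credentials API is exactly the open question in this thread.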