[ 
https://issues.apache.org/jira/browse/HIVE-25445?focusedWorklogId=639344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-639344
 ]

ASF GitHub Bot logged work on HIVE-25445:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Aug/21 16:45
            Start Date: 18/Aug/21 16:45
    Worklog Time Spent: 10m 
      Work Description: nrg4878 commented on a change in pull request #2584:
URL: https://github.com/apache/hive/pull/2584#discussion_r691332964



##########
File path: 
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/GenericJdbcDatabaseAccessor.java
##########
@@ -364,6 +366,16 @@ protected Properties 
getConnectionPoolProperties(Configuration conf) throws Exce
       passwd = Utilities.getPasswdFromKeystore(keystore, key);
     }
 
+    if (passwd == null) {

Review comment:
       same here. the password store is last in the order of attempting to find 
a password. Should this be reversed? if it is reversed, are there any backward 
compatibility issues? (I dont think there are but just asking)

##########
File path: 
jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/conf/JdbcStorageConfigManager.java
##########
@@ -77,6 +81,15 @@ public static void copySecretsToJob(Properties props, 
Map<String, String> jobSec
       String key = props.getProperty(CONFIG_PWD_KEY);
       passwd = Utilities.getPasswdFromKeystore(keystore, key);
     }
+    if (passwd == null) {

Review comment:
       What is the order of precedence here? if user specifies password in 
properties, also specifies a keystore with password and password store URI, 
looks like we will pick it up from the properties which is the least secure 
means of specifying it. Should we reverse the order and look for password store 
first? just thinking out loud.
   or should we prevent the user from specifying multiple means.  

##########
File path: 
ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.secrets.SecretSource
##########
@@ -0,0 +1 @@
+org.apache.hadoop.hive.ql.secrets.AWSSecretsManagerSecretSource

Review comment:
       this adds a hard dependency on the aws-sdk-bundle. Should we make this 
configuration based?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/secrets/URISecretSource.java
##########
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.secrets;
+
+import org.apache.curator.utils.CloseableUtils;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+
+import java.io.IOException;
+import java.net.URI;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.ServiceLoader;
+
+/**
+ * Class provides a way to load passwords using a URI. The secret sources are 
discovered using java service loader.
+ * The
+ */
+public class URISecretSource {
+  private static final URISecretSource INSTANCE = new URISecretSource();
+
+  private final Map<String, SecretSource> sourcesMap = new HashMap<>();
+  private URISecretSource() {
+    // Find all the registered secretsource.
+    ServiceLoader.load(SecretSource.class).forEach(this::register);
+
+    // Cleanup resources.
+    Runtime.getRuntime().addShutdownHook(new Thread(this::close));
+  }
+
+  public static URISecretSource getInstance() {
+    return INSTANCE;
+  }
+
+  public String getPasswordFromUri(URI uri) throws IOException, HiveException {
+    SecretSource source = sourcesMap.get(uri.getScheme());
+    if (source == null) {
+      throw new HiveException("Cannot fine secret source for scheme: " + 
uri.getScheme());
+    }
+    return source.getSecret(uri);
+  }
+
+  public void register(SecretSource source) {
+    SecretSource oldSource = sourcesMap.put(source.getURIScheme(), source);
+    if (oldSource != null) {
+      throw new RuntimeException("Two sources for same scheme: " + 
source.getURIScheme() +  " [" +
+          source.getClass().getName() + " and " + 
oldSource.getClass().getName() + "]");
+    }
+  }
+
+  public void removeForTest(SecretSource source) {

Review comment:
       this method should not have a public scope right? something that only a 
test should be able to invoke ?

##########
File path: ql/pom.xml
##########
@@ -164,6 +164,15 @@
       <artifactId>hive-storage-api</artifactId>
     </dependency>
     <!-- inter-project -->
+    <dependency>
+      <groupId>com.amazonaws</groupId>
+      <artifactId>aws-java-sdk-bundle</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>com.amazonaws.secretsmanager</groupId>
+      <artifactId>aws-secretsmanager-caching-java</artifactId>

Review comment:
       again a compile time dependency which gets included into any uber jars 
we build.

##########
File path: pom.xml
##########
@@ -267,6 +269,28 @@
   <dependencyManagement>
     <dependencies>
       <!-- dependencies are always listed in sorted order by groupId, 
artifactId -->
+      <dependency>
+        <groupId>com.amazonaws</groupId>

Review comment:
       So seems like the only class we use from aws-java-sdk-bundle is the 
SecretCache. Is this correct? if so, I am not very familiar with this Cache but 
how is this different from a map based cache ? 
   Because of the usage of this class, we are having to declare this as a 
compile time dependency. I was wondering if we could make this a runtime 
dependency so we dont have to include this jar in our shipped binaries. Users 
using this feature need to a) configure a hive property to specify the 
SecretsManager class implementation(s) b) add the jars to the hive lib for 
runtime availability of the classes.
   
   I am trying to avoid adding classloader footprint if possible.

##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/secrets/AWSSecretsManagerSecretSource.java
##########
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.secrets;
+
+import com.amazonaws.secretsmanager.caching.SecretCache;
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.JsonNode;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+
+import java.io.IOException;
+import java.net.URI;
+
+/**
+ * Implementation of SecretSource which loads secrets from AWS Secrets Manager.
+ * The format of the uri is "aws-sm:///{key-name-or-arn}"
+ * It uses aws secrets cache sdk to fetch and refresh the secret, the 
environment must be setup so that the default
+ * client can load the secret else it will fail.
+ * It expects the secret fetched to be a json string with "password" as the 
key for password, this is default for
+ * redshift, rds or external database configs. It does not make use of any 
other fields.
+ */
+public class AWSSecretsManagerSecretSource implements SecretSource {
+  // Do not create SecretCache here, it fails to initialize in non-aws aware 
environments.
+  private volatile SecretCache cache = null;
+  private final ObjectMapper mapper = new ObjectMapper();
+
+  /**
+   * @return Fixed string aws-sm.
+   */
+  @Override
+  public String getURIScheme() {
+    return "aws-sm";
+  }
+
+  /**
+   * This load the secret from aws-secrets manager.
+   * @param uri The uri should be of format: aws-sm:///{key-arn-or-name}
+   * @return The secret fetched from AWS.
+   * @throws IOException
+   */
+  @Override
+  public String getSecret(URI uri) throws IOException {

Review comment:
       generally, it a good idea not to pass around secrets/passwords as 
strings as java String is immutable. They are visible as clear text values in a 
heap dump. So byte[] are preferred. Unfortunately, the JDBC spec and most other 
datasource spec accept the value as a string. So they would have to be 
converted to string at some point. But we should try to narrow the scope as 
much as possible. Is there a method that returns the value as a byte[] instead 
of a String?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 639344)
    Time Spent: 0.5h  (was: 20m)

> Enable JdbcStorageHandler to get password from AWS Secrets Service.
> -------------------------------------------------------------------
>
>                 Key: HIVE-25445
>                 URL: https://issues.apache.org/jira/browse/HIVE-25445
>             Project: Hive
>          Issue Type: New Feature
>          Components: HiveServer2
>            Reporter: Harish JP
>            Assignee: Harish JP
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, password for JdbcStorageHandler can be set only via the password 
> field or keystore. This Jira is to add framework to fetch password from any 
> source and implement AWS Secrets Manager as a source.
>  
> The approach takes is to use a new table property dbcp.password.uri which 
> will be used if password and keyfile are not available.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to