Alexandre Linte created HIVE-13819:
--------------------------------------

             Summary: Read & eXecute permissions on a database allow ALTERing it.
                 Key: HIVE-13819
                 URL: https://issues.apache.org/jira/browse/HIVE-13819
             Project: Hive
          Issue Type: Bug
          Components: Authorization
    Affects Versions: 1.2.1
         Environment: Hadoop 2.7.2, Hive 1.2.1, Kerberos.
            Reporter: Alexandre Linte


Hi,

As the owner of a Hive database, I can modify the database metadata even 
though I only have read and execute permissions on the Hive database 
directory in HDFS.
I expected not to be able to modify this metadata.

Context:
- The Hive database is configured with the Storage Based Authorization (SBA) model.
- Hive client-side authorization is disabled.
- Metastore-side security is enabled.
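
For reference, here are the three hive-site.xml properties that implement 
this setup, excerpted from the full configuration at the end of this report:
{noformat}
<property>
    <name>hive.security.authorization.enabled</name>
    <value>false</value>
</property>

<property>
    <name>hive.security.metastore.authorization.manager</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
</property>

<property>
    <name>hive.metastore.pre.event.listeners</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
</property>
{noformat}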

Permission configuration:
{noformat}
dr-x--x---   - hive9990    hive9990             0 2016-05-20 17:10 /path/to/hive/warehouse/p09990.db
{noformat}
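
The listing above corresponds to mode 510: the owner hive9990 has read and 
execute but no write permission on the database directory. As a sanity check, 
HDFS itself denies the owner a direct write; a minimal reproduction sketch, 
assuming the same warehouse path as above:
{noformat}
# As an HDFS superuser, reproduce the permissions shown in the listing.
hdfs dfs -chmod 510 /path/to/hive/warehouse/p09990.db

# As hive9990, attempt a direct write into the directory; with no write
# bit for the owner, this should fail with a permission error.
hdfs dfs -touchz /path/to/hive/warehouse/p09990.db/_probe
{noformat}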

ALTER DATABASE command, run as the hive9990 user:
{noformat}
hive (p09990)>  ALTER DATABASE p09990 SET DBPROPERTIES ('comment'='database altered');
OK
Time taken: 0.277 seconds
hive (p09990)> DESCRIBE DATABASE EXTENDED p09990;
OK
p09990          hdfs://path/to/hive/warehouse/p09990.db        hdfs    USER    {comment=database altered}
{noformat}
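
Since Storage Based Authorization derives privileges from file system 
permissions, ALTER DATABASE (a metadata write) would be expected to require 
write permission on the database directory, just as the storage checks for 
drop operations do (see hive.metastore.authorization.storage.checks below). 
For comparison, a probe that should be rejected in this same session, while 
the ALTER above is not:
{noformat}
hive (p09990)> -- expected to be denied under SBA: CREATE TABLE requires write
hive (p09990)> -- permission on the database directory, which dr-x--x--- does
hive (p09990)> -- not grant
hive (p09990)> CREATE TABLE sba_probe (id INT);
{noformat}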

Configuration of hive-site.xml on the metastore:
{noformat}
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
 
  <property>
      <name>hive.security.authorization.enabled</name>
      <value>false</value>
      <description>enable or disable the Hive client authorization</description>
  </property>

  <property>
      <name>hive.security.metastore.authorization.manager</name>
      <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
      <description>authorization manager class name to be used in the metastore 
for authorization.
      The user defined authorization class should implement interface 
org.apache.hadoop.hive.ql.security.authorization.HiveMetastoreAuthorizationProvider.
      </description>
  </property>

  <property>
      <name>hive.metastore.pre.event.listeners</name>
      <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
      <description>This turns on metastore-side security.
      </description>
  </property>

  <property>
      <name>hive.security.metastore.authorization.auth.reads</name>
      <value>true</value>
      <description>If this is true, the metastore authorizer authorizes read 
actions on database and table.
      </description>
  </property>

  <property>
      <name>hive.security.authorization.manager</name>
      <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
      <description>The Hive client authorization manager class name.
  The user defined authorization class should implement interface 
org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider.
      </description>
  </property>

  <property>
      <name>hive.security.authorization.createtable.owner.grants</name>
      <value>ALL</value>
      <description>the privileges automatically granted to the owner whenever a 
table gets created. 
       An example like "select,drop" will grant select and drop privilege to 
the owner of the table</description>
  </property>

  <property>
      <name>hive.users.in.admin.role</name>
      <value>hdfs</value>
      <description>Comma separated list of users who are in admin role for 
bootstrapping.
    More users can be added in ADMIN role later.</description>
  </property>

  <property>
      <name>hive.metastore.warehouse.dir</name>
      <value>/path/to/hive/warehouse/</value>
      <description>location of default database for the warehouse</description>
  </property>

  <property>
      <name>hive.cli.print.current.db</name>
      <value>true</value>
      <description>Whether to include the current database in the Hive 
prompt.</description>
  </property>

  <property>
      <name>hive.metastore.uris</name>
      <value>thrift://hiveserver2http01:9083</value>
      <description>Thrift uri for the remote metastore. Used by metastore 
client to connect to remote metastore.</description>
  </property>

  <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
      <description>JDBC Driver</description>
  </property>

  <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://hivedb01/metastore</value>
      <description>JDBC connect string for a JDBC metastore</description>
  </property>

  <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>metastore</value>
      <description>username to use against metastore database</description>
  </property>

  <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>********</value>
      <description>password to use against metastore database</description>
  </property>

  <property>
      <name>datanucleus.autoCreateSchema</name>
      <value>false</value>
      <description>creates necessary schema on a startup if one doesn't exist. 
set this to false, after creating it once</description>
  </property>

  <property>
      <name>hive.metastore.authorization.storage.checks</name>
      <value>true</value>
      <description>Should the metastore do authorization checks against the 
underlying storage
  for operations like drop-partition (disallow the drop-partition if the user in
  question doesn't have permissions to delete the corresponding directory
  on the storage).</description>
  </property>

  <property>
      <name>hive.metastore.sasl.enabled</name>
      <value>true</value>
      <description>If true, the metastore thrift interface will be secured with 
SASL. Clients must authenticate with Kerberos.</description>
  </property>

  <property>
      <name>hive.metastore.kerberos.keytab.file</name>
      <value>/path/to/metastore.keytab</value>
      <description>The path to the Kerberos Keytab file containing the 
metastore thrift server's service principal.</description>
  </property>

  <property>
      <name>hive.metastore.kerberos.principal</name>
      <value>primary/instance@realm</value>
      <description>The service principal for the metastore thrift server. The 
special string _HOST will be replaced automatically with the correct host 
name.</description>
  </property>

  <property>
      <name>hive.server2.max.start.attempts</name>
      <value>30</value>
      <description>This number of times HiveServer2 will attempt to start 
before exiting, sleeping 60 seconds between retries. The default of 30 will 
keep trying for 30 minutes.</description>
  </property>

  <property>
      <name>hive.server2.transport.mode</name>
      <value>binary</value>
      <description>Server transport mode. "binary" or "http".</description>
  </property>

  <property>
      <name>hive.server2.thrift.http.port</name>
      <value>10001</value>
      <description>Port number when in HTTP mode.</description>
  </property>

  <property>
      <name>hive.server2.thrift.http.path</name>
      <value>bdcorp</value>
      <description>Path component of URL endpoint when in HTTP 
mode.</description>
  </property>

  <property>
      <name>hive.server2.use.SSL</name>
      <value>false</value>
      <description>Set this to true for using SSL encryption in 
HiveServer2</description>
  </property>

  <property>
      <name>hive.server2.keystore.path</name>
      <value></value>
      <description>SSL certificate keystore location</description>
  </property>

  <property>
      <name>hive.server2.keystore.password</name>
      <value></value>
      <description>SSL certificate keystore password.</description>
  </property>

  <property>
      <name>hive.server2.authentication.pam.services</name>
      <value></value>
      <description>List of the underlying pam services that should be used when 
auth type is PAM.
  A file with the same name must exist in /etc/pam.d</description>
  </property>

  <property>
      <name>hive.server2.thrift.min.worker.threads</name>
      <value>5</value>
      <description>Minimum number of Thrift worker threads</description>
  </property>

  <property>
      <name>hive.server2.thrift.max.worker.threads</name>
      <value>500</value>
      <description>Maximum number of Thrift worker threads</description>
  </property>

  <property>
      <name>hive.server2.thrift.worker.keepalive.time</name>
      <value>60</value>
      <description>Keepalive time (in seconds) for an idle worker thread. 
    When number of workers > min workers, excess threads are killed after this 
time interval.
      </description>
  </property>

  <property>
      <name>hive.server2.thrift.http.cookie.auth.enabled</name>
      <value>true</value>
      <description>When true, HiveServer2 in HTTP transport mode will use 
cookie based authentication mechanism.
      </description>
  </property>

  <property>
      <name>hive.server2.thrift.http.cookie.max.age</name>
      <value>86400s</value>
      <description>Maximum age in seconds for server side cookie used by 
HiveServer2 in HTTP mode.
      </description>
  </property>

  <property>
      <name>hive.server2.thrift.http.cookie.path</name>
      <value></value>
      <description>Path for the HiveServer2 generated cookies.
      </description>
  </property>

  <property>
      <name>hive.server2.thrift.http.cookie.domain</name>
      <value></value>
      <description>Domain for the HiveServer2 generated cookies.
      </description>
  </property>

  <property>
      <name>hive.server2.thrift.http.cookie.is.secure</name>
      <value>true</value>
      <description>Secure attribute of the HiveServer2 generated cookie.
      </description>
  </property>

  <property>
      <name>hive.server2.thrift.http.cookie.is.httponly</name>
      <value>true</value>
      <description>HttpOnly attribute of the HiveServer2 generated cookie.
      </description>
  </property>

  <property>
      <name>hive.server2.async.exec.threads</name>
      <value>100</value>
      <description>Number of threads in the async thread pool for 
HiveServer2</description>
  </property>

  <property>
      <name>hive.server2.async.exec.shutdown.timeout</name>
      <value>10</value>
      <description>Time (in seconds) for which HiveServer2 shutdown will wait 
for async
  threads to terminate</description>
  </property>

  <property>
      <name>hive.server2.async.exec.keepalive.time</name>
      <value>10</value>
      <description>Time (in seconds) that an idle HiveServer2 async thread 
(from the thread pool) will wait
  for a new task to arrive before terminating</description>
  </property>

  <property>
      <name>hive.server2.long.polling.timeout</name>
      <value>5000</value>
      <description>Time in milliseconds that HiveServer2 will wait, before 
responding to asynchronous calls that use long polling</description>
  </property>

  <property>
      <name>hive.server2.async.exec.wait.queue.size</name>
      <value>100</value>
      <description>Size of the wait queue for async thread pool in HiveServer2.
  After hitting this limit, the async thread pool will reject new 
requests.</description>
  </property>

  <property>
      <name>hive.server2.thrift.port</name>
      <value>10000</value>
      <description>Port number of HiveServer2 Thrift interface.
  Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>
  </property>

  <property>
      <name>hive.server2.thrift.bind.host</name>
      <value>hiveserver2http01</value>
      <description>Bind host on which to run the HiveServer2 Thrift interface.
  Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>
  </property>

  <property>
      <name>hive.server2.authentication</name>
      <value>KERBEROS</value>
      <description>
    Client authentication types.
       NONE: no authentication check
       LDAP: LDAP/AD based authentication
       KERBEROS: Kerberos/GSSAPI authentication
       CUSTOM: Custom authentication provider
               (Use with property hive.server2.custom.authentication.class)
       PAM: Pluggable authentication module.
      </description>
  </property>

  <property>
      <name>hive.server2.custom.authentication.class</name>
      <value></value>
      <description>
    Custom authentication class. Used when property
    'hive.server2.authentication' is set to 'CUSTOM'. Provided class
    must be a proper implementation of the interface
    org.apache.hive.service.auth.PasswdAuthenticationProvider. HiveServer2
    will call its Authenticate(user, passed) method to authenticate requests.
    The implementation may optionally extend Hadoop's
    org.apache.hadoop.conf.Configured class to grab Hive's Configuration object.
      </description>
  </property>

  <property>
      <name>hive.server2.authentication.kerberos.principal</name>
      <value>primary/instance@realm</value>
      <description>
    Kerberos server principal
      </description>
  </property>

  <property>
      <name>hive.server2.authentication.kerberos.keytab</name>
      <value>/path/to/hiveserver2.keytab</value>
      <description>
    Kerberos keytab file for server principal
      </description>
  </property>

  <property>
      <name>hive.server2.authentication.spnego.principal</name>
      <value>primary/instance@realm</value>
      <description>
    SPNego service principal, optional,
    typical value would look like HTTP/_h...@example.com
    SPNego service principal would be used by hiveserver2 when kerberos 
security is enabled
    and HTTP transport mode is used.
    This needs to be set only if SPNEGO is to be used in authentication.
      </description>
  </property>

  <property>
      <name>hive.server2.authentication.spnego.keytab</name>
      <value>/path/to/spnego.keytab</value>
      <description>
    keytab file for SPNego principal, optional,
    typical value would look like /etc/security/keytabs/spnego.service.keytab,
    This keytab would be used by hiveserver2 when kerberos security is enabled
    and HTTP transport mode is used.
    This needs to be set only if SPNEGO is to be used in authentication.
    SPNego authentication would be honored only if valid
    hive.server2.authentication.spnego.principal
    and
    hive.server2.authentication.spnego.keytab
    are specified
      </description>
  </property>

  <property>
      <name>hive.server2.authentication.ldap.url</name>
      <value>setindatabag</value>
      <description>
    LDAP connection URL
      </description>
  </property>

  <property>
      <name>hive.server2.authentication.ldap.baseDN</name>
      <value>setindatabag</value>
      <description>
    LDAP base DN
      </description>
  </property>

  <property>
      <name>hive.server2.enable.doAs</name>
      <value>true</value>
      <description>
   Setting this property to true will have HiveServer2 execute
    Hive operations as the user making the calls to it.
      </description>
  </property>

  <property>
      <name>hive.execution.engine</name>
      <value>mr</value>
      <description>
    Chooses execution engine. Options are: mr (Map reduce, default) or tez 
(hadoop 2 only)
      </description>
  </property>

  <property>
      <name>hive.mapjoin.optimized.hashtable</name>
      <value>true</value>
      <description>Whether Hive should use a memory-optimized hash table for 
MapJoin. 
    Only works on Tez, because memory-optimized hash table cannot be serialized.
      </description>
  </property>

  <property>
      <name>hive.mapjoin.optimized.hashtable.wbsize</name>
      <value>10485760</value>
      <description>Optimized hashtable (see hive.mapjoin.optimized.hashtable) 
uses a chain of buffers to store data. 
    This is one buffer size. Hashtable may be slightly faster if this is 
larger, 
    but for small joins unnecessary memory will be allocated and then trimmed.
      </description>
  </property>

  <property>
      <name>hive.prewarm.enabled</name>
      <value>false</value>
      <description>
    Enables container prewarm for tez (hadoop 2 only)
      </description>
  </property>

  <property>
      <name>hive.prewarm.numcontainers</name>
      <value>10</value>
      <description>
    Controls the number of containers to prewarm for tez (hadoop 2 only)
      </description>
  </property>

  <property>
      <name>hive.server2.table.type.mapping</name>
      <value>CLASSIC</value>
      <description>
   This setting reflects how HiveServer2 will report the table types for JDBC 
and other
   client implementations that retrieve the available tables and supported 
table types
     HIVE : Exposes Hive's native table types like MANAGED_TABLE, 
EXTERNAL_TABLE, VIRTUAL_VIEW
     CLASSIC : More generic types like TABLE and VIEW
      </description>
  </property>

  <property>
      <name>hive.server2.thrift.sasl.qop</name>
      <value>auth</value>
      <description>Sasl QOP value; Set it to one of following values to enable 
higher levels of
     protection for HiveServer2 communication with clients.
      "auth" - authentication only (default)
      "auth-int" - authentication plus integrity protection
      "auth-conf" - authentication plus integrity and confidentiality protection
     This is applicable only if HiveServer2 is configured to use Kerberos 
authentication.
      </description>
  </property>

  <property>
      <name>hive.tez.container.size</name>
      <value>-1</value>
      <description>By default tez will spawn containers of the size of a 
mapper. This can be used to overwrite.</description>
  </property>

  <property>
      <name>hive.tez.java.opts</name>
      <value></value>
      <description>By default tez will use the java opts from map tasks. This 
can be used to overwrite.</description>
  </property>

  <property>
      <name>hive.tez.log.level</name>
      <value>INFO</value>
      <description>
    The log level to use for tasks executing as part of the DAG.
    Used only if hive.tez.java.opts is used to configure java opts.
      </description>
  </property>

  <property>
      <name>hive.tez.smb.number.waves</name>
      <value>1</value>
      <description>The number of waves in which to run the SMB 
(sort-merge-bucket) join. 
    Account for cluster being occupied. Ideally should be 1 wave.
      </description>
  </property>

  <property>
      <name>hive.tez.cpu.vcores</name>
      <value>-1</value>
      <description>By default Tez will ask for however many CPUs MapReduce is 
configured to use per container. 
    This can be used to overwrite the default.
      </description>
  </property>

  <property>
      <name>hive.tez.auto.reducer.parallelism</name>
      <value>false</value>
      <description>Turn on Tez' auto reducer parallelism feature. When enabled, 
Hive will still estimate data sizes and set parallelism estimates. 
    Tez will sample source vertices' output sizes and adjust the estimates at 
runtime as necessary.
      </description>
  </property>

  <property>
      <name>hive.auto.convert.join</name>
      <value>true</value>
      <description>
      </description>
  </property>

  <property>
      <name>hive.auto.convert.join.noconditionaltask</name>
      <value>true</value>
      <description>
      </description>
  </property>

  <property>
      <name>hive.auto.convert.join.noconditionaltask.size</name>
      <value>1</value>
      <description>
      </description>
  </property>

  <property>
      <name>hive.vectorized.execution.enabled</name>
      <value>true</value>
      <description>This flag should be set to true to enable vectorized mode of 
query execution. The default value is false.
      </description>
  </property>

  <property>
      <name>hive.vectorized.execution.reduce.enabled</name>
      <value>false</value>
      <description>This flag should be set to true to enable vectorized mode of 
the reduce-side of query execution. The default value is true.
      </description>
  </property>

  <property>
      <name>hive.cbo.enable</name>
      <value>true</value>
      <description>When true, the cost based optimizer, which uses the Calcite 
framework, will be enabled.
      </description>
  </property>

  <property>
      <name>hive.fetch.task.conversion</name>
      <value>more</value>
      <description>Some select queries can be converted to a single FETCH task, 
minimizing latency. 
    Currently the query should be single sourced not having any subquery and 
should not have any aggregations or distincts 
    (which incur RS – ReduceSinkOperator, requiring a MapReduce task), lateral 
views and joins.
      </description>
  </property>

  <property>
      <name>hive.fetch.task.conversion.threshold</name>
      <value>1073741824</value>
      <description>Input threshold (in bytes) for applying 
hive.fetch.task.conversion. 
    If target table is native, input length is calculated by summation of file 
lengths. 
    If it's not native, the storage handler for the table can optionally 
implement the org.apache.hadoop.hive.ql.metadata.InputEstimator interface. 
    A negative threshold means hive.fetch.task.conversion is applied without 
any input length threshold.
      </description>
  </property>

  <property>
      <name>hive.fetch.task.aggr</name>
      <value>false</value>
      <description>Aggregation queries with no group-by clause (for example, 
select count(*) from src) execute final aggregations in a single reduce task.
    If this parameter is set to true, Hive delegates the final aggregation 
stage to a fetch task, possibly decreasing the query time.
      </description>
  </property>

  <property>
      <name>hive.spark.job.monitor.timeout</name>
      <value>60</value>
      <description>Timeout for job monitor to get Spark job state.
      </description>
  </property>

  <property>
      <name>hive.spark.client.future.timeout</name>
      <value>60</value>
      <description>Timeout for requests from Hive client to remote Spark driver.
      </description>
  </property>

  <property>
      <name>hive.spark.client.connect.timeout</name>
      <value>1000</value>
      <description>Timeout for remote Spark driver in connecting back to Hive 
client.
      </description>
  </property>

  <property>
      <name>hive.spark.client.channel.log.level</name>
      <value></value>
      <description>Channel logging level for remote Spark driver. One of DEBUG, 
ERROR, INFO, TRACE, WARN. If unset, TRACE is chosen.
      </description>
  </property>

  <property>
      <name>hive.server2.tez.default.queues</name>
      <value></value>
      <description>
    A list of comma separated values corresponding to yarn queues of the same 
name.
    When hive server 2 is launched in tez mode, this configuration needs to be 
set
    for multiple tez sessions to run in parallel on the cluster.
      </description>
  </property>

  <property>
      <name>hive.server2.tez.sessions.per.default.queue</name>
      <value>1</value>
      <description>
    A positive integer that determines the number of tez sessions that should be
    launched on each of the queues specified by 
"hive.server2.tez.default.queues".
    Determines the parallelism on each queue.
      </description>
  </property>

  <property>
      <name>hive.server2.tez.initialize.default.sessions</name>
      <value>false</value>
      <description>
    This flag is used in hive server 2 to enable a user to use hive server 2 
without
    turning on tez for hive server 2. The user could potentially want to run 
queries
    over tez without the pool of sessions.
      </description>
  </property>

  <property>
      <name>hive.support.sql11.reserved.keywords</name>
      <value>true</value>
      <description>Whether to enable support for SQL2011 reserved keywords. 
When enabled, will support (part of) SQL2011 reserved keywords.
      </description>
  </property>

  <property>
      <name>hive.aux.jars.path</name>
      <value></value>
      <description>A comma separated list (with no spaces) of the jar 
files</description>
  </property>

</configuration>
{noformat}

Best regards.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
