Hey Alex,

This is indeed a bug. I have posted a patch for it at
https://issues.apache.org/jira/browse/HIVE-2616

Would you like to try it out and see if it works for you?

Ashutosh

On Tue, Nov 29, 2011 at 02:45, Alex Holmes <grep.a...@gmail.com> wrote:
> Running mysql as the metastore doesn't change the behavior of the HDFS
> operations, and more importantly who (the ugi) they are executed as.
>
> Does anyone have any thoughts as to why Hive HDFS operations are run
> as different users?
>
> Many thoughts,
> Alex
>
>
> On Tue, Nov 29, 2011 at 2:47 AM, Alexander C.H. Lorenz
> <wget.n...@googlemail.com> wrote:
> > Derby depends on a local filestore, for more flexibility and security I
> > suggest mysql as a metastore.
> > - Alex
> >
> > On Tue, Nov 29, 2011 at 3:06 AM, Alex Holmes <grep.a...@gmail.com>
> > wrote:
> >>
> >> Hi,
> >>
> >> I'm running Hive 0.7.1 with a remote metastore (Derby) on Hadoop 0.20.2.
> >>
> >> Is there a reason that CREATE and DROP commands when translated into
> >> HDFS operations are run as the remote Hive metastore user, but a LOAD
> >> is translated into HDFS operations that are executed as the Hive
> >> client user? If my understanding is correct, doesn't this mean that:
> >>
> >> 1. The Hive remote metastore must always be run as a superuser, which
> >> is arguably a security risk. If I run the Hive remote metastore as a
> >> non-superuser different from the Hive client user, then a LOAD DATA
> >> LOCAL (with the HDFS umask default of 022) creates a directory chmod'd
> >> 755, which doesn't give the Hive metastore user permissions to remove
> >> the directory in a subsequent DROP.
> >>
> >> 2. The Hive client must have write permissions on the initial table
> >> directory created by the CREATE command executed as the Hive remote
> >> metastore user. This would only work in cases where both the remote
> >> Hive metastore user and the client Hive user were the same user, or if
> >> the Hive client were a superuser. In my own testing the only way I
> >> could get this to work when they were different users (and not
> >> superusers) was in the application of a locally written patch which
> >> addresses HIVE-2504.
> >>
> >> Maybe I'm over-simplifying, but couldn't all the Hive remote metastore
> >> HDFS operations be run as the Hive client's user/group?
> >>
> >> Thanks,
> >> Alex
> >
> > --
> > Alexander Lorenz
> > http://mapredit.blogspot.com
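
For anyone reading this thread later, the permission problem in point 1 of the original message can be illustrated with a small sketch. This is only a simplified model of POSIX-style permission bits, not Hive or HDFS code: the function names, the user names ("client", "metastore") and the group names are all made up for illustration, and real HDFS checks involve the parent directory as well.

```python
import stat

def apply_umask(requested_mode: int, umask: int) -> int:
    """Clear the bits named in the umask from the requested mode."""
    return requested_mode & ~umask

def can_write(mode, owner, group, user, user_groups, superuser=False):
    """Simplified write check: pick the owner, group, or other class."""
    if superuser:
        return True  # a superuser bypasses the permission bits
    if user == owner:
        return bool(mode & stat.S_IWUSR)
    if group in user_groups:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)

# A directory requested as 777 with a umask of 022 ends up as 755.
dir_mode = apply_umask(0o777, 0o022)

# Hypothetical users: the directory is owned by the Hive client user,
# and the metastore runs as a different, non-superuser account.
metastore_can_drop = can_write(
    dir_mode, owner="client", group="clientgrp",
    user="metastore", user_groups={"hive"},
)
client_can_write = can_write(
    dir_mode, owner="client", group="clientgrp",
    user="client", user_groups={"clientgrp"},
)
```

Under these assumptions the metastore account falls into the "other" class, which has no write bit in 755, so only the owning client user or a superuser could modify the directory. That matches the behaviour described in the quoted message.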