[jira] Commented: (HIVE-1681) ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
[ https://issues.apache.org/jira/browse/HIVE-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917520#action_12917520 ]

Venkatesh S commented on HIVE-1681:
-----------------------------------

The query ran successfully with this patch. Thanks Carl. Appreciate if this can be committed quickly.

> ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
> ----------------------------------------------------------------------------------------------------------
>
> Key: HIVE-1681
> URL: https://issues.apache.org/jira/browse/HIVE-1681
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 0.5.0, 0.6.0, 0.7.0
> Reporter: Carl Steinbach
> Assignee: Carl Steinbach
> Attachments: HIVE-1681.1.patch.txt
>
>
> Here's the code for ObjectStore.commitTransaction() and
> ObjectStore.rollbackTransaction():
> {code}
>   public boolean commitTransaction() {
>     assert (openTrasactionCalls >= 1);
>     if (!currentTransaction.isActive()) {
>       throw new RuntimeException(
>           "Commit is called, but transaction is not active. Either there are"
>               + " mismatching open and close calls or rollback was called in the same trasaction");
>     }
>     openTrasactionCalls--;
>     if ((openTrasactionCalls == 0) && currentTransaction.isActive()) {
>       transactionStatus = TXN_STATUS.COMMITED;
>       currentTransaction.commit();
>     }
>     return true;
>   }
>
>   public void rollbackTransaction() {
>     if (openTrasactionCalls < 1) {
>       return;
>     }
>     openTrasactionCalls = 0;
>     if (currentTransaction.isActive()
>         && transactionStatus != TXN_STATUS.ROLLBACK) {
>       transactionStatus = TXN_STATUS.ROLLBACK;
>       // could already be rolled back
>       currentTransaction.rollback();
>     }
>   }
> {code}
> Now suppose a nested transaction throws an exception which results
> in the nested pseudo-transaction calling rollbackTransaction(). This causes
> rollbackTransaction() to roll back the actual transaction, as well as to set
> openTrasactionCalls=0 and transactionStatus = TXN_STATUS.ROLLBACK.
> Suppose also that this nested transaction squelches the original exception.
> In this case the stack will unwind and the caller will eventually try to
> commit the transaction by calling commitTransaction(), which will see that
> currentTransaction.isActive() returns FALSE and will throw a RuntimeException.
> The fix for this problem is that commitTransaction() needs to first check
> transactionStatus and return immediately if
> transactionStatus==TXN_STATUS.ROLLBACK.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
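The fix described in the HIVE-1681 summary above (check transactionStatus first and return immediately on ROLLBACK) can be sketched roughly as follows. This is only an illustration, not the attached patch: FakeTransaction is a hypothetical stand-in for the real transaction object, openTransaction() and the NO_STATE/OPEN values are assumed for the sake of a runnable example, and returning false from the guard is an assumption, since the issue only says "return immediately". Field names keep the spelling used in the quoted code.

```java
// Sketch of the guard proposed in HIVE-1681, under the assumptions stated above.
final class ObjectStoreTxnSketch {

    // COMMITED and ROLLBACK appear in the issue; NO_STATE and OPEN are assumed.
    enum TXN_STATUS { NO_STATE, OPEN, COMMITED, ROLLBACK }

    // Hypothetical stand-in for the underlying transaction object.
    static final class FakeTransaction {
        private boolean active;
        void begin() { active = true; }
        void commit() { active = false; }
        void rollback() { active = false; }
        boolean isActive() { return active; }
    }

    private final FakeTransaction currentTransaction = new FakeTransaction();
    private int openTrasactionCalls = 0;
    private TXN_STATUS transactionStatus = TXN_STATUS.NO_STATE;

    // Assumed helper: only the outermost call starts the real transaction.
    public boolean openTransaction() {
        openTrasactionCalls++;
        if (openTrasactionCalls == 1) {
            currentTransaction.begin();
            transactionStatus = TXN_STATUS.OPEN;
        }
        return currentTransaction.isActive();
    }

    public boolean commitTransaction() {
        // Proposed guard: a nested call already rolled back, so do not throw.
        if (transactionStatus == TXN_STATUS.ROLLBACK) {
            return false; // assumption: signal failure to the caller
        }
        if (openTrasactionCalls < 1 || !currentTransaction.isActive()) {
            throw new RuntimeException(
                "Commit is called, but transaction is not active.");
        }
        openTrasactionCalls--;
        if (openTrasactionCalls == 0) {
            transactionStatus = TXN_STATUS.COMMITED;
            currentTransaction.commit();
        }
        return true;
    }

    // Same logic as the rollbackTransaction() quoted in the issue.
    public void rollbackTransaction() {
        if (openTrasactionCalls < 1) {
            return;
        }
        openTrasactionCalls = 0;
        if (currentTransaction.isActive()
                && transactionStatus != TXN_STATUS.ROLLBACK) {
            transactionStatus = TXN_STATUS.ROLLBACK;
            currentTransaction.rollback();
        }
    }

    public static void main(String[] args) {
        ObjectStoreTxnSketch store = new ObjectStoreTxnSketch();
        store.openTransaction();      // outer transaction
        store.openTransaction();      // nested pseudo-transaction
        store.rollbackTransaction();  // nested failure, exception squelched
        System.out.println(store.commitTransaction()); // prints "false"
    }
}
```

With the guard in place, the outer caller's commitTransaction() reports the failure through its return value instead of hitting the RuntimeException; whether callers should then propagate an error is left open here.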
[jira] Commented: (HIVE-842) Authentication Infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914408#action_12914408 ]

Venkatesh S commented on HIVE-842:
----------------------------------

> Should the metastore always take HDFS actions as the user making the RPC?

Yes, metastore will run as a super-user (Hadoop proxy user) enabling DO AS operations and impersonate the target user while accessing data on HDFS.

> If we see that Hadoop Security is enabled, should we enable SASL on the
> metastore thrift server by default?

I'd think so.

> should there be an option whereby the metastore uses a keytab to authenticate
> to HDFS, but doesn't require users to authenticate to it?

Wouldn't this leave a hole as it currently exists?

> Authentication Infrastructure for Hive
> --------------------------------------
>
> Key: HIVE-842
> URL: https://issues.apache.org/jira/browse/HIVE-842
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Server Infrastructure
> Reporter: Edward Capriolo
> Assignee: Todd Lipcon
> Attachments: HiveSecurityThoughts.pdf
>
>
> This issue deals with the authentication (user name, password) infrastructure.
> Not the authorization components that specify what a user should be able to
> do.
[jira] Commented: (HIVE-842) Authentication Infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913706#action_12913706 ]

Venkatesh S commented on HIVE-842:
----------------------------------

Sounds good to me.

> Authentication Infrastructure for Hive
> --------------------------------------
>
> Key: HIVE-842
> URL: https://issues.apache.org/jira/browse/HIVE-842
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Server Infrastructure
> Reporter: Edward Capriolo
> Assignee: Todd Lipcon
> Attachments: HiveSecurityThoughts.pdf
>
>
> This issue deals with the authentication (user name, password) infrastructure.
> Not the authorization components that specify what a user should be able to
> do.
[jira] Commented: (HIVE-842) Authentication Infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913466#action_12913466 ]

Venkatesh S commented on HIVE-842:
----------------------------------

> * Do Hive tasks ever need to authenticate to the metastore? If so, we
> will have to build a delegation token system into Hive.

I learnt from Alan and Pradeep that Howl uses the commit task to talk to the metastore. Hence we'll have to build the delegation token system.

> Authentication Infrastructure for Hive
> --------------------------------------
>
> Key: HIVE-842
> URL: https://issues.apache.org/jira/browse/HIVE-842
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Server Infrastructure
> Reporter: Edward Capriolo
> Assignee: Todd Lipcon
> Attachments: HiveSecurityThoughts.pdf
>
>
> This issue deals with the authentication (user name, password) infrastructure.
> Not the authorization components that specify what a user should be able to
> do.
[jira] Commented: (HIVE-1476) Hive's metastore when run as a thrift service creates directories as the service user instead of the real user issuing create table/alter table etc.
[ https://issues.apache.org/jira/browse/HIVE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905470#action_12905470 ]

Venkatesh S commented on HIVE-1476:
-----------------------------------

@Todd, Thrift over HTTP transport (THRIFT-814) can use Kerberos over SPNEGO.

> Hive's metastore when run as a thrift service creates directories as the
> service user instead of the real user issuing create table/alter table etc.
> ---------------------------------------------------------------------------
>
> Key: HIVE-1476
> URL: https://issues.apache.org/jira/browse/HIVE-1476
> Project: Hadoop Hive
> Issue Type: Bug
> Affects Versions: 0.6.0, 0.7.0
> Reporter: Pradeep Kamath
> Attachments: HIVE-1476.patch, HIVE-1476.patch.2
>
>
> If the thrift metastore service is running as the user "hive" then all table
> directories as a result of create table are created as that user rather than
> the user who actually issued the create table command. This is different
> semantically from non-thrift mode (i.e. local mode) when clients directly
> connect to the metastore. In the latter case, directories are created as the
> real user. The thrift mode should do the same.
[jira] Commented: (HIVE-1171) Check Hadoop JAR shim dependencies into lib/
[ https://issues.apache.org/jira/browse/HIVE-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902575#action_12902575 ]

Venkatesh S commented on HIVE-1171:
-----------------------------------

Hadoop POMs are available now, and it would be good to specify hadoop.version and use only that to build. BTW, why is the resolution "Won't Fix"?

> Check Hadoop JAR shim dependencies into lib/
> --------------------------------------------
>
> Key: HIVE-1171
> URL: https://issues.apache.org/jira/browse/HIVE-1171
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Build Infrastructure
> Affects Versions: 0.6.0
> Reporter: Carl Steinbach
>
> In order to satisfy the shim dependencies we currently have
> Ivy configured to download four different versions, or 162Mb worth
> of Hadoop source tarballs from archive.apache.org. This includes
> a lot of junk that we don't actually need.
> We should instead pare this down to what we do need and check it
> into the Hive source tree (no one has complained about problems
> syncing the svn repository, and we already have a bunch of JARs
> checked into lib/).
> Once Hadoop POMs become available we should shift back to Ivy.
[jira] Commented: (HIVE-474) Support for distinct selection on two or more columns
[ https://issues.apache.org/jira/browse/HIVE-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902421#action_12902421 ]

Venkatesh S commented on HIVE-474:
----------------------------------

Any update on this?

> Support for distinct selection on two or more columns
> -----------------------------------------------------
>
> Key: HIVE-474
> URL: https://issues.apache.org/jira/browse/HIVE-474
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Alexis Rondeau
> Assignee: Mafish
> Attachments: hive-474.0.4.2rc.patch
>
>
> The ability to select distinct several, individual columns as by example:
> select count(distinct user), count(distinct session) from actions;
> Currently returns the following failure:
> FAILED: Error in semantic analysis: line 2:7 DISTINCT on Different Columns
> not Supported user
[jira] Commented: (HIVE-1562) CombineHiveInputFormat issues in minimr mode
[ https://issues.apache.org/jira/browse/HIVE-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900212#action_12900212 ]

Venkatesh S commented on HIVE-1562:
-----------------------------------

There is a bug in CombineFileInputFormat and you would need to patch hadoop core with HADOOP-5759. If you cannot patch core, you could patch the client side and use HADOOP-1938 which is what we are doing.

> CombineHiveInputFormat issues in minimr mode
> --------------------------------------------
>
> Key: HIVE-1562
> URL: https://issues.apache.org/jira/browse/HIVE-1562
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Joydeep Sen Sarma
>
> followup from HIVE-1523. This is probably because of CombineHiveInputFormat:
> ant -Dclustermode=miniMR -Dtestcase=TestCliDriver -Dqfile=sample10.q test
> insert overwrite table srcpartbucket partition(ds, hr) select * from srcpart where ds is not null and key < 10
> 2010-08-18 15:13:54,378 ERROR SessionState (SessionState.java:printError(277)) - PREHOOK: query: insert overwrite table srcpartbucket partition(ds, hr) select * from srcpart where ds is not null and key < 10
> 2010-08-18 15:13:54,379 ERROR SessionState (SessionState.java:printError(277)) - PREHOOK: type: QUERY
> 2010-08-18 15:13:54,379 ERROR SessionState (SessionState.java:printError(277)) - PREHOOK: Input: defa...@srcpart@ds=2008-04-08/hr=11
> 2010-08-18 15:13:54,379 ERROR SessionState (SessionState.java:printError(277)) - PREHOOK: Input: defa...@srcpart@ds=2008-04-08/hr=12
> 2010-08-18 15:13:54,379 ERROR SessionState (SessionState.java:printError(277)) - PREHOOK: Input: defa...@srcpart@ds=2008-04-09/hr=11
> 2010-08-18 15:13:54,379 ERROR SessionState (SessionState.java:printError(277)) - PREHOOK: Input: defa...@srcpart@ds=2008-04-09/hr=12
> 2010-08-18 15:13:54,704 WARN mapred.JobClient (JobClient.java:configureCommandLineOptions(539)) - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 2010-08-18 15:13:55,642 ERROR mapred.EagerTaskInitializationListener (EagerTaskInitializationListener.java:run(83)) - Job initialization failed:
> java.lang.IllegalArgumentException: Network location name contains /: /default-rack
>   at org.apache.hadoop.net.NodeBase.set(NodeBase.java:75)
>   at org.apache.hadoop.net.NodeBase.<init>(NodeBase.java:57)
>   at org.apache.hadoop.mapred.JobTracker.addHostToNodeMapping(JobTracker.java:2326)
>   at org.apache.hadoop.mapred.JobTracker.resolveAndAddToTopology(JobTracker.java:2320)
>   at org.apache.hadoop.mapred.JobInProgress.createCache(JobInProgress.java:343)
>   at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:440)
>   at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:81)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>   at java.lang.Thread.run(Thread.java:619)
> 2010-08-18 15:13:56,566 ERROR exec.MapRedTask (SessionState.java:printError(277)) - Ended Job = job_201008181513_0001 with errors
> 2010-08-18 15:13:56,597 ERROR ql.Driver (SessionState.java:printError(277)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
> See also: combine2.q
[jira] Commented: (HIVE-1541) More general dataflow execution backend
[ https://issues.apache.org/jira/browse/HIVE-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898812#action_12898812 ]

Venkatesh S commented on HIVE-1541:
-----------------------------------

Oozie should be a good candidate as well.

> More general dataflow execution backend
> ---------------------------------------
>
> Key: HIVE-1541
> URL: https://issues.apache.org/jira/browse/HIVE-1541
> Project: Hadoop Hive
> Issue Type: New Feature
> Reporter: Jeff Hammerbacher
>
> With the recent open source release of Mesos (http://github.com/mesos/mesos),
> experimentation at the query execution layer has become more feasible.
> Inspired by more general-purpose dataflow systems like Volcano, Dryad, and
> Dremel, it would be interesting to explore a more general-purpose dataflow
> execution system for Hive queries. One potential backend is the Hyracks
> project from UCI: http://code.google.com/p/hyracks.
[jira] Commented: (HIVE-1307) More generic and efficient merge method
[ https://issues.apache.org/jira/browse/HIVE-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894519#action_12894519 ]

Venkatesh S commented on HIVE-1307:
-----------------------------------

Hey Ning, any update on this issue would be greatly appreciated.

> More generic and efficient merge method
> ---------------------------------------
>
> Key: HIVE-1307
> URL: https://issues.apache.org/jira/browse/HIVE-1307
> Project: Hadoop Hive
> Issue Type: New Feature
> Affects Versions: 0.6.0
> Reporter: Ning Zhang
> Assignee: Ning Zhang
> Fix For: 0.7.0
>
> Attachments: HIVE-1307.0.patch
>
>
> Currently if hive.merge.mapfiles/mapredfiles=true, a new mapreduce job is
> created to read the input files and output to one reducer for merging. This MR
> job is created at compile time, with one MR job per partition. In the
> dynamic partition case, multiple partitions could be created at execution
> time, and generating the merging MR job at compile time is impossible.
> We should generalize the merge framework to allow multiple partitions;
> most of the time a map-only job should be sufficient if we use
> CombineHiveInputFormat.
[jira] Commented: (HIVE-1264) Make Hive work with Hadoop security
[ https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891002#action_12891002 ]

Venkatesh S commented on HIVE-1264:
-----------------------------------

This will work against 20.1xx branch. You need to include the 20.1xx hadoop dependency and it does compile and run. The interface contract does not change and hence not sure if I need to change the shim. UGI has changed in 20S and UnixUGI class is no more. Please suggest how to proceed with this incompatible change.

> Make Hive work with Hadoop security
> -----------------------------------
>
> Key: HIVE-1264
> URL: https://issues.apache.org/jira/browse/HIVE-1264
> Project: Hadoop Hive
> Issue Type: Improvement
> Reporter: Jeff Hammerbacher
> Attachments: HiveHadoop20S_patch.patch
>
>
[jira] Updated: (HIVE-1264) Make Hive work with Hadoop security
[ https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Venkatesh S updated HIVE-1264:
------------------------------

Attachment: HiveHadoop20S_patch.patch

> Make Hive work with Hadoop security
> -----------------------------------
>
> Key: HIVE-1264
> URL: https://issues.apache.org/jira/browse/HIVE-1264
> Project: Hadoop Hive
> Issue Type: Improvement
> Reporter: Jeff Hammerbacher
> Attachments: HiveHadoop20S_patch.patch
>
>
[jira] Updated: (HIVE-1264) Make Hive work with Hadoop security
[ https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Venkatesh S updated HIVE-1264:
------------------------------

Status: Patch Available (was: Open)

Patch for H20S

> Make Hive work with Hadoop security
> -----------------------------------
>
> Key: HIVE-1264
> URL: https://issues.apache.org/jira/browse/HIVE-1264
> Project: Hadoop Hive
> Issue Type: Improvement
> Reporter: Jeff Hammerbacher
>