[jira] [Commented] (HIVE-2055) Hive should add HBase classpath dependencies when available
[ https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13825647#comment-13825647 ]

Roman Shaposhnik commented on HIVE-2055:

Sorry for dropping by somewhat late, but it looks like you've got a pretty reasonable solution with mapredcp.

Hive should add HBase classpath dependencies when available
---

Key: HIVE-2055
URL: https://issues.apache.org/jira/browse/HIVE-2055
Project: Hive
Issue Type: Bug
Components: HBase Handler
Affects Versions: 0.10.0
Reporter: sajith v
Attachments: 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, HIVE-2055.patch

Created an external table in Hive which points to the HBase table. When I tried to query a column using the column name in the select clause, I got the following exception:
( java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, SQLState:42000)

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-2055) Hive should add HBase classpath dependencies when available
[ https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826244#comment-13826244 ]

Roman Shaposhnik commented on HIVE-2055:

[~ndimiduk] LGTM +1

Hive should add HBase classpath dependencies when available
---

Key: HIVE-2055
URL: https://issues.apache.org/jira/browse/HIVE-2055
Project: Hive
Issue Type: Bug
Components: HBase Handler
Affects Versions: 0.10.0
Reporter: sajith v
Assignee: Nick Dimiduk
Attachments: 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, HIVE-2055.patch

Created an external table in Hive which points to the HBase table. When I tried to query a column using the column name in the select clause, I got the following exception:
( java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, SQLState:42000)
[jira] [Updated] (HIVE-2757) hive can't find hadoop executor scripts without HADOOP_HOME set
[ https://issues.apache.org/jira/browse/HIVE-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated HIVE-2757:
---
Attachment: HIVE-2757-2.patch.txt

hive can't find hadoop executor scripts without HADOOP_HOME set
---

Key: HIVE-2757
URL: https://issues.apache.org/jira/browse/HIVE-2757
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.8.0
Reporter: Roman Shaposhnik
Attachments: HIVE-2757-2.patch.txt, HIVE-2757.D3075.1.patch, HIVE-2757.patch.txt, HIVE-2757.patch.txt, hive-2757.diff

The trouble is that in Hadoop 0.23 HADOOP_HOME has been deprecated. I think it would be really nice if bin/hive can be modified to capture the output of "which hadoop" and pass that as a property into the JVM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2757) hive can't find hadoop executor scripts without HADOOP_HOME set
[ https://issues.apache.org/jira/browse/HIVE-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271820#comment-13271820 ]

Roman Shaposhnik commented on HIVE-2757:

Attaching a patch that incorporates all the feedback from the reviews and has also been tested. Attaching here since FB is broken.

hive can't find hadoop executor scripts without HADOOP_HOME set
---

Key: HIVE-2757
URL: https://issues.apache.org/jira/browse/HIVE-2757
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.8.0
Reporter: Roman Shaposhnik
Attachments: HIVE-2757-2.patch.txt, HIVE-2757.D3075.1.patch, HIVE-2757.patch.txt, HIVE-2757.patch.txt, hive-2757.diff

The trouble is that in Hadoop 0.23 HADOOP_HOME has been deprecated. I think it would be really nice if bin/hive can be modified to capture the output of "which hadoop" and pass that as a property into the JVM.
[jira] [Updated] (HIVE-2757) hive can't find hadoop executor scripts without HADOOP_HOME set
[ https://issues.apache.org/jira/browse/HIVE-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated HIVE-2757:
---
Attachment: HIVE-2757.patch.txt

Sorry for the delay -- it seems I don't have much luck with running Hive unit tests on my machine :-( On the positive side -- I tested this patch against a real cluster and it works as expected.

hive can't find hadoop executor scripts without HADOOP_HOME set
---

Key: HIVE-2757
URL: https://issues.apache.org/jira/browse/HIVE-2757
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.8.0
Reporter: Roman Shaposhnik
Attachments: HIVE-2757.patch.txt, HIVE-2757.patch.txt, hive-2757.diff

The trouble is that in Hadoop 0.23 HADOOP_HOME has been deprecated. I think it would be really nice if bin/hive can be modified to capture the output of "which hadoop" and pass that as a property into the JVM.
[jira] [Updated] (HIVE-2757) hive can't find hadoop executor scripts without HADOOP_HOME set
[ https://issues.apache.org/jira/browse/HIVE-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated HIVE-2757:
---
Status: Patch Available (was: Open)

hive can't find hadoop executor scripts without HADOOP_HOME set
---

Key: HIVE-2757
URL: https://issues.apache.org/jira/browse/HIVE-2757
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.8.0
Reporter: Roman Shaposhnik
Attachments: HIVE-2757.patch.txt, HIVE-2757.patch.txt, hive-2757.diff

The trouble is that in Hadoop 0.23 HADOOP_HOME has been deprecated. I think it would be really nice if bin/hive can be modified to capture the output of "which hadoop" and pass that as a property into the JVM.
[jira] [Commented] (HIVE-2757) hive can't find hadoop executor scripts without HADOOP_HOME set
[ https://issues.apache.org/jira/browse/HIVE-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266715#comment-13266715 ]

Roman Shaposhnik commented on HIVE-2757:

@Ashutosh, I think at this point I'd go for the minimally invasive patch (which is attached). Potential improvements would include merging the stuff that Buddhika posted (and better yet, following up with a patch that migrates us to ProcessBuilder). However, I believe all of those things need to be handled in separate JIRAs.

@Edward, I agree with your comments on standardizing the environment. It is fine for Hive to keep depending on environment variables, but it also needs to have reasonable defaults.

hive can't find hadoop executor scripts without HADOOP_HOME set
---

Key: HIVE-2757
URL: https://issues.apache.org/jira/browse/HIVE-2757
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.8.0
Reporter: Roman Shaposhnik
Attachments: HIVE-2757.patch.txt, HIVE-2757.patch.txt, hive-2757.diff

The trouble is that in Hadoop 0.23 HADOOP_HOME has been deprecated. I think it would be really nice if bin/hive can be modified to capture the output of "which hadoop" and pass that as a property into the JVM.
[jira] [Commented] (HIVE-2757) hive can't find hadoop executor scripts without HADOOP_HOME set
[ https://issues.apache.org/jira/browse/HIVE-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261890#comment-13261890 ]

Roman Shaposhnik commented on HIVE-2757:

@Edward, not sure I understand -- at this point it is an unused variable and it serves no purpose. If, during the course of development of this patch, we find a use for it -- sure, we'll keep it. Does that make sense?

hive can't find hadoop executor scripts without HADOOP_HOME set
---

Key: HIVE-2757
URL: https://issues.apache.org/jira/browse/HIVE-2757
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.8.0
Reporter: Roman Shaposhnik
Attachments: HIVE-2757.patch.txt

The trouble is that in Hadoop 0.23 HADOOP_HOME has been deprecated. I think it would be really nice if bin/hive can be modified to capture the output of "which hadoop" and pass that as a property into the JVM.
[jira] [Commented] (HIVE-2757) hive can't find hadoop executor scripts without HADOOP_HOME set
[ https://issues.apache.org/jira/browse/HIVE-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261921#comment-13261921 ]

Roman Shaposhnik commented on HIVE-2757:

@Carl, the bin/hive shell script calling bin/hadoop is a bit of an orthogonal issue here, I think. The basic problem I'm trying to address in this JIRA is this: hive has at least 2 entry points Bigtop cares about for integration purposes:
# bin/hive
# org.apache.hadoop.hive.cli.CliDriver

Regardless of which one is used, though, hive Java code will end up exec'ing the hadoop launcher script when it comes to job submission. The reason I'm bringing up the 2 is simple: given that #2 exists and is used by things like Oozie, we can't rely on shell-level computation of the environment (like finding the hadoop executable and passing that to java code via a property, etc.).

What this JIRA is trying to accomplish is to push the logic of finding the hadoop executable script (if none is given) back to the java code, since because of #2 it is the only place where it can be done reliably. Makes sense?

hive can't find hadoop executor scripts without HADOOP_HOME set
---

Key: HIVE-2757
URL: https://issues.apache.org/jira/browse/HIVE-2757
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.8.0
Reporter: Roman Shaposhnik
Attachments: HIVE-2757.patch.txt

The trouble is that in Hadoop 0.23 HADOOP_HOME has been deprecated. I think it would be really nice if bin/hive can be modified to capture the output of "which hadoop" and pass that as a property into the JVM.
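
For illustration only, here is a minimal Java sketch of what "finding the hadoop executable in java code, if none is given" could look like. It is not the attached patch. The class name HadoopBinResolver, the lookup order (explicit setting, then HADOOP_HOME, then the bare command name) and the "hadoop" default are all assumptions made for this example.

```java
import java.nio.file.Paths;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Hive. */
final class HadoopBinResolver {
    private HadoopBinResolver() {}

    /**
     * Picks the hadoop launcher to exec.
     *
     * @param configured explicitly configured launcher path, may be null or blank
     * @param env        environment variables to consult (e.g. System.getenv())
     * @return the configured path, else $HADOOP_HOME/bin/hadoop, else the bare
     *         name "hadoop" (an assumed default, left for the OS to resolve)
     */
    static String resolve(String configured, Map<String, String> env) {
        // 1. An explicit setting always wins.
        if (configured != null && !configured.trim().isEmpty()) {
            return configured.trim();
        }
        // 2. Keep working with older Hadoop releases that still set HADOOP_HOME.
        String home = env.get("HADOOP_HOME");
        if (home != null && !home.trim().isEmpty()) {
            return Paths.get(home.trim(), "bin", "hadoop").toString();
        }
        // 3. Otherwise fall back to a default instead of failing outright.
        return "hadoop";
    }
}
```

A caller embedding the Hive classes (as Oozie does) might use it as `HadoopBinResolver.resolve(System.getProperty("hadoop.bin.path"), System.getenv())`; the property name here is an assumption for the example. Because the lookup lives in Java, it behaves the same whether the JVM was started by bin/hive or by another project.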
[jira] [Updated] (HIVE-2757) hive can't find hadoop executor scripts without HADOOP_HOME set
[ https://issues.apache.org/jira/browse/HIVE-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated HIVE-2757:
---
Attachment: HIVE-2757.patch.txt

Here's an example patch that is not meant for inclusion, but rather to generate discussion about whether such an approach would be acceptable.

Basically, the fundamental problem is that Hive java code can be used in other projects (like Oozie) and hence it can't rely on launcher shell scripts always passing the correct set of properties along based on querying the environment at shell level. This unique feature of Hive means that discovery of the Hadoop executor script has to happen at the level of Java code.

The patch contains a very naive attempt at doing that while maintaining backward compatibility with Hadoop 0.20.X and older releases. The most notable feature that is still missing is the ability to discover a Hadoop that's part of the user's PATH. Before I implement that, however, I'd like to ask whether exec'ing via ProcessBuilder wouldn't be a better option than me manually trying to parse PATH (error prone).

Please let me know what you think.

P.S. I have also taken the liberty of removing ConfVars.HADOOPCONF since I don't think it is used anymore.

hive can't find hadoop executor scripts without HADOOP_HOME set
---

Key: HIVE-2757
URL: https://issues.apache.org/jira/browse/HIVE-2757
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.8.0
Reporter: Roman Shaposhnik
Attachments: HIVE-2757.patch.txt

The trouble is that in Hadoop 0.23 HADOOP_HOME has been deprecated. I think it would be really nice if bin/hive can be modified to capture the output of "which hadoop" and pass that as a property into the JVM.
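
To make the "manually parsing PATH" option concrete, here is a rough Java sketch of such a lookup. It is an illustration under stated assumptions, not code from the patch: the class PathSearch and both method signatures are hypothetical, and the executability check is passed in as a predicate only so the logic can be exercised without touching the file system.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical helper for illustration; not part of Hive. */
final class PathSearch {
    private PathSearch() {}

    /**
     * Scans a PATH-style string for the first directory that holds an
     * executable with the given name.
     */
    static Optional<Path> find(String name, String pathValue, Predicate<Path> isExecutable) {
        if (pathValue == null) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            // POSIX shells treat an empty entry as the current directory;
            // this sketch simply skips it, which is one of the corner cases
            // that makes hand-rolled parsing error prone.
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, name);
            if (isExecutable.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    /** Convenience overload that checks the real PATH and the real file system. */
    static Optional<Path> find(String name) {
        return find(name, System.getenv("PATH"),
                p -> Files.isRegularFile(p) && Files.isExecutable(p));
    }
}
```

The alternative raised in the message avoids this code entirely: a ProcessBuilder started with a bare command name such as "hadoop" leaves the PATH lookup to the platform's process launcher, at the cost of only learning that the command is missing when start() fails.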
[jira] [Created] (HIVE-2379) Hive/HBase integration could be improved
Hive/HBase integration could be improved
---

Key: HIVE-2379
URL: https://issues.apache.org/jira/browse/HIVE-2379
Project: Hive
Issue Type: Improvement
Components: CLI, Clients
Affects Versions: 0.7.1, 0.8.0
Reporter: Roman Shaposhnik
Priority: Minor

For now, any Hive/HBase query requires the following jars to be explicitly added via hive's add jar command:

add jar /usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar;
add jar /usr/lib/hive/lib/hive-hbase-handler-0.7.0-cdh3u0.jar;
add jar /usr/lib/hive/lib/zookeeper-3.3.1.jar;
add jar /usr/lib/hive/lib/guava-r06.jar;

The longer term solution, perhaps, should be to have the code at submit time call hbase's TableMapReduceUtil.addDependencyJar(job, HBaseStorageHandler.class) to ship it in the distributed cache.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
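
The longer term idea above amounts to "find the jar that contains a given class and ship it with the job". As a rough illustration of the first half of that idea, the sketch below uses only the JDK to ask where a class was loaded from. It is not HBase's implementation, and the class name JarLocator is hypothetical.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.Optional;

/** Hypothetical helper for illustration; not HBase or Hive code. */
final class JarLocator {
    private JarLocator() {}

    /**
     * Returns the location (typically a jar or a classes directory) that the
     * given class was loaded from, if the JVM records one.
     */
    static Optional<URL> locate(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source == null) {
            // Classes loaded by the bootstrap loader (e.g. java.lang.String)
            // have no code source.
            return Optional.empty();
        }
        return Optional.ofNullable(source.getLocation());
    }
}
```

With something along these lines, submit-time code could start from a class such as the storage handler and derive the jar to add to the job, instead of asking users to type the add jar statements by hand.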
[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
[ https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated HIVE-2091:
---
Attachment: HIVE-2091-trunk.patch

Patch against trunk (the previous one was against the 0.7.0 branch)

Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
---

Key: HIVE-2091
URL: https://issues.apache.org/jira/browse/HIVE-2091
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Affects Versions: 0.7.0
Reporter: Roman Shaposhnik
Priority: Minor
Attachments: HIVE-2091-trunk.patch, HIVE-2091.patch

Currently these 2 query scripts generate non-deterministic output:
* ql/src/test/queries/clientpositive/rcfile_columnar.q
* ql/src/test/queries/clientpositive/join_filters.q

The suggestion is to use an ORDER BY statement.
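
As a small illustration of why an explicit ordering helps here: a test that compares query output against a stored expected result fails when the same rows come back in a different order. The plain Java sketch below mimics that situation. It is not Hive test code, and the class name GoldenFileCheck and the sample rows are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Hive test suite. */
final class GoldenFileCheck {
    private GoldenFileCheck() {}

    /** Returns a sorted copy of the rows, leaving the input list untouched. */
    static List<String> ordered(List<String> rows) {
        List<String> copy = new ArrayList<>(rows);
        Collections.sort(copy);
        return copy;
    }
}
```

Two runs that emit the rows "1\ta" and "2\tb" in opposite orders compare as different lists, but `ordered(runA).equals(ordered(runB))` holds. Fixing the order inside the query itself gives the expected-output comparison the same stability without any post-processing.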
[jira] [Created] (HIVE-2091) Test scripts need to be made deterministic in their output
Test scripts need to be made deterministic in their output
---

Key: HIVE-2091
URL: https://issues.apache.org/jira/browse/HIVE-2091
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Affects Versions: 0.7.0
Reporter: Roman Shaposhnik
Priority: Minor

Currently these 2 query scripts generate non-deterministic output:

The suggestion is to use a GROUP BY statement.
[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
[ https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated HIVE-2091:
---
Description:
Currently these 2 query scripts generate non-deterministic output:
* ql/src/test/queries/clientpositive/rcfile_columnar.q
* ql/src/test/queries/clientpositive/join_filters.q
The suggestion is to use a GROUP BY statement.

was:
Currently these 2 query scripts generate non-deterministic output:
The suggestion is to use a GROUP BY statement.

Summary: Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output (was: Test scripts need to be made deterministic in their output)

Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
---

Key: HIVE-2091
URL: https://issues.apache.org/jira/browse/HIVE-2091
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Affects Versions: 0.7.0
Reporter: Roman Shaposhnik
Priority: Minor

Currently these 2 query scripts generate non-deterministic output:
* ql/src/test/queries/clientpositive/rcfile_columnar.q
* ql/src/test/queries/clientpositive/join_filters.q

The suggestion is to use a GROUP BY statement.
[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
[ https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated HIVE-2091:
---
Status: Patch Available (was: Open)

Please take a look at the attached patch.

Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
---

Key: HIVE-2091
URL: https://issues.apache.org/jira/browse/HIVE-2091
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Affects Versions: 0.7.0
Reporter: Roman Shaposhnik
Priority: Minor

Currently these 2 query scripts generate non-deterministic output:
* ql/src/test/queries/clientpositive/rcfile_columnar.q
* ql/src/test/queries/clientpositive/join_filters.q

The suggestion is to use a GROUP BY statement.
[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
[ https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated HIVE-2091:
---
Attachment: HIVE-2091.patch

Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
---

Key: HIVE-2091
URL: https://issues.apache.org/jira/browse/HIVE-2091
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Affects Versions: 0.7.0
Reporter: Roman Shaposhnik
Priority: Minor
Attachments: HIVE-2091.patch

Currently these 2 query scripts generate non-deterministic output:
* ql/src/test/queries/clientpositive/rcfile_columnar.q
* ql/src/test/queries/clientpositive/join_filters.q

The suggestion is to use a GROUP BY statement.