[jira] Created: (HIVE-1918) Add export/import facilities to the hive system
Add export/import facilities to the hive system
-----------------------------------------------

Key: HIVE-1918
URL: https://issues.apache.org/jira/browse/HIVE-1918
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Krishna Kumar

This is an enhancement request to add export/import features to Hive. With this language extension, the user can export the data of a table - which may be located in different HDFS locations in the case of a partitioned table - as well as the metadata of the table into a specified output location. This output location can then be moved over to a different Hadoop/Hive instance and imported there. This should work independently of the source and target metastore DBMS used; for instance, between Derby and MySQL. For partitioned tables, the ability to export/import a subset of the partitions must be supported.

Howl will add more features on top of this: the ability to create/use the exported data even in the absence of Hive, using MR or Pig. Please see http://wiki.apache.org/pig/Howl/HowlImportExport for those details.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Build failed in Hudson: Hive-trunk-h0.20 #493
See https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/493/

--
[...truncated 21415 lines...]
    [junit] POSTHOOK: Output: default@srcbucket
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket1.txt' INTO TABLE srcbucket
    [junit] PREHOOK: type: LOAD
    [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket1.txt
    [junit] Loading data to table srcbucket
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket1.txt' INTO TABLE srcbucket
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket
    [junit] OK
    [junit] PREHOOK: query: CREATE TABLE srcbucket2(key int, value string) CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE
    [junit] PREHOOK: type: CREATETABLE
    [junit] POSTHOOK: query: CREATE TABLE srcbucket2(key int, value string) CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE
    [junit] POSTHOOK: type: CREATETABLE
    [junit] POSTHOOK: Output: default@srcbucket2
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket20.txt' INTO TABLE srcbucket2
    [junit] PREHOOK: type: LOAD
    [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket20.txt
    [junit] Loading data to table srcbucket2
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket20.txt' INTO TABLE srcbucket2
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket2
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket21.txt' INTO TABLE srcbucket2
    [junit] PREHOOK: type: LOAD
    [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket21.txt
    [junit] Loading data to table srcbucket2
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket21.txt' INTO TABLE srcbucket2
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket2
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket22.txt' INTO TABLE srcbucket2
    [junit] PREHOOK: type: LOAD
    [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket22.txt
    [junit] Loading data to table srcbucket2
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket22.txt' INTO TABLE srcbucket2
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket2
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket23.txt' INTO TABLE srcbucket2
    [junit] PREHOOK: type: LOAD
    [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket23.txt
    [junit] Loading data to table srcbucket2
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket23.txt' INTO TABLE srcbucket2
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket2
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt' INTO TABLE src
    [junit] PREHOOK: type: LOAD
    [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
    [junit] Loading data to table src
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt' INTO TABLE src
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@src
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv3.txt' INTO TABLE src1
    [junit] PREHOOK: type: LOAD
    [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv3.txt
    [junit] Loading data to table src1
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv3.txt' INTO TABLE src1
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@src1
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH
[jira] Commented: (HIVE-1211) Tapping logs from child processes
[ https://issues.apache.org/jira/browse/HIVE-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982917#action_12982917 ]

Carl Steinbach commented on HIVE-1211:
--------------------------------------

Can someone please take a look at this? Thanks!

Tapping logs from child processes
---------------------------------

Key: HIVE-1211
URL: https://issues.apache.org/jira/browse/HIVE-1211
Project: Hive
Issue Type: Improvement
Components: Logging
Reporter: bc Wong
Assignee: Carl Steinbach
Fix For: 0.7.0
Attachments: HIVE-1211-2.patch, HIVE-1211.1.patch, HIVE-1211.3.patch.txt, HIVE-1211.4.patch.txt, HIVE-1211.5.patch.txt, HIVE-1211.6.patch.txt

Stdout/stderr from child processes (e.g. {{MapRedTask}}) are redirected to the parent's stdout/stderr. There is little one can do to sort out which log is from which query.
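One way to make the interleaved child-process output attributable is to tap the child's stdout in the parent and tag each line with the id of the query that spawned it. A minimal sketch of that idea (the class, method names, and query id format here are hypothetical illustrations, not Hive's actual code or what the attached patches do):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class ChildLogTap {

    // Prefix a single log line with the id of the query that produced it,
    // so interleaved output from several children can be told apart.
    static String tag(String queryId, String line) {
        return "[" + queryId + "] " + line;
    }

    // Read the child's stdout line by line and forward each line, tagged,
    // to the parent's stdout instead of letting the streams mix untagged.
    static void tap(String queryId, Process child) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(child.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(tag(queryId, line));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a spawned child task such as a MapRedTask.
        Process child = new ProcessBuilder("echo", "map 100% reduce 100%").start();
        tap("query_42", child);
    }
}
```

This only illustrates the tagging idea; a real fix would also cover stderr and could equally be done through per-query log4j configuration rather than stream wrapping.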
[jira] Commented: (HIVE-1694) Accelerate query execution using indexes
[ https://issues.apache.org/jira/browse/HIVE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982999#action_12982999 ]

John Sichi commented on HIVE-1694:
----------------------------------

Is there an updated tree for this? I checked github but didn't see it. HIVE-1644 needs support for compiling internally-generated SQL into operators, so if you have that working, I'd like to point the Harvey Mudd folks at it when I talk to them tomorrow.

Accelerate query execution using indexes
----------------------------------------

Key: HIVE-1694
URL: https://issues.apache.org/jira/browse/HIVE-1694
Project: Hive
Issue Type: New Feature
Components: Indexing, Query Processor
Affects Versions: 0.7.0
Reporter: Nikhil Deshpande
Assignee: Nikhil Deshpande
Attachments: demo_q1.hql, demo_q2.hql, HIVE-1694_2010-10-28.diff

The index building patch (HIVE-417) is checked into trunk; this JIRA issue tracks supporting indexes in the Hive compiler and execution engine for SELECT queries. This is in reference to John's comment at https://issues.apache.org/jira/browse/HIVE-417?focusedCommentId=12884869&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12884869 on creating a separate JIRA issue for tracking index usage in optimizer query execution.

The aim of this effort is to use indexes to accelerate query execution (for a certain class of queries), e.g.:
- Filters and range scans (already being worked on by He Yongqiang as part of HIVE-417?)
- Joins (index based joins)
- Group By, Order By and other misc cases

The proposal is multi-step:
1. Building index based operators, compiler and execution engine changes
2. Optimizer enhancements (e.g. cost-based optimizer to compare and choose between index scans, full table scans etc.)

This JIRA initially focuses on the first step. This JIRA is expected to hold the information about index based plan and operator implementations for the above mentioned cases.
[jira] Updated: (HIVE-1903) Can't join HBase tables if one's name is the beginning of the other
[ https://issues.apache.org/jira/browse/HIVE-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1903:
---------------------------------
Component/s: HBase Handler

Can't join HBase tables if one's name is the beginning of the other
-------------------------------------------------------------------

Key: HIVE-1903
URL: https://issues.apache.org/jira/browse/HIVE-1903
Project: Hive
Issue Type: Bug
Components: HBase Handler
Reporter: Jean-Daniel Cryans
Assignee: John Sichi
Fix For: 0.7.0
Attachments: HIVE-1903.1.patch

I tried joining two tables, let's call them table and table_a, but I'm seeing an array of errors such as this:

{noformat}
java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
        at java.util.ArrayList.RangeCheck(ArrayList.java:547)
        at java.util.ArrayList.get(ArrayList.java:322)
        at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getRecordReader(HiveHBaseTableInputFormat.java:118)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:231)
{noformat}

The reason is that HiveInputFormat.pushProjectionsAndFilters matches the aliases with startsWith, so in my case the mappers for table_a were getting the columns from table as well as their own (and since it had fewer columns, it was trying to get one too far in the array). I don't know if just changing it to equals will fix it; my guess is it won't, since it may break RCFiles.
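The collision described above is easy to reproduce in isolation. A minimal sketch (a hypothetical helper for illustration, not the actual pushProjectionsAndFilters code) contrasting prefix matching of aliases with exact matching:

```java
public class AliasMatchDemo {

    // Mimics the reported behaviour: an alias matches an input if the
    // input's name merely starts with the alias, so "table" also
    // matches inputs that belong to "table_a".
    static boolean prefixMatch(String input, String alias) {
        return input.startsWith(alias);
    }

    // Exact comparison keeps "table" and "table_a" apart.
    static boolean exactMatch(String input, String alias) {
        return input.equals(alias);
    }

    public static void main(String[] args) {
        String input = "table_a";
        // Both aliases prefix-match table_a's input, so table_a's mapper
        // also picks up table's (shorter) column list.
        System.out.println(prefixMatch(input, "table"));    // true
        System.out.println(prefixMatch(input, "table_a"));  // true
        // Exact matching selects only the intended alias.
        System.out.println(exactMatch(input, "table"));     // false
        System.out.println(exactMatch(input, "table_a"));   // true
    }
}
```

As the report notes, a plain switch to equals may not be the right fix for the real code, since the matching there compares paths, not bare names; the sketch only shows why the prefix test conflates the two tables.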
[jira] Updated: (HIVE-1716) TestHBaseCliDriver failed in current trunk
[ https://issues.apache.org/jira/browse/HIVE-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1716:
---------------------------------
Component/s: HBase Handler

TestHBaseCliDriver failed in current trunk
------------------------------------------

Key: HIVE-1716
URL: https://issues.apache.org/jira/browse/HIVE-1716
Project: Hive
Issue Type: Bug
Components: HBase Handler
Affects Versions: 0.7.0
Reporter: Ning Zhang
Assignee: John Sichi
Fix For: 0.7.0

ant test -Dhadoop.version=0.20.0 -Dtestcase=TestHBaseCliDriver:

    [junit] org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
    [junit]     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:976)
    [junit]     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:625)
    [junit]     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:607)
    [junit]     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:738)
    [junit]     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:634)
    [junit]     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:601)
    [junit]     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:128)
    [junit]     at org.apache.hadoop.hive.hbase.HBaseTestSetup.setUpFixtures(HBaseTestSetup.java:87)
    [junit]     at org.apache.hadoop.hive.hbase.HBaseTestSetup.preTest(HBaseTestSetup.java:59)
    [junit]     at org.apache.hadoop.hive.hbase.HBaseQTestUtil.<init>(HBaseQTestUtil.java:31)
    [junit]     at org.apache.hadoop.hive.cli.TestHBaseCliDriver.setUp(TestHBaseCliDriver.java:43)
    [junit]     at junit.framework.TestCase.runBare(TestCase.java:125)
    [junit]     at junit.framework.TestResult$1.protect(TestResult.java:106)
    [junit]     at junit.framework.TestResult.runProtected(TestResult.java:124)
    [junit]     at junit.framework.TestResult.run(TestResult.java:109)
    [junit]     at junit.framework.TestCase.run(TestCase.java:118)
    [junit]     at junit.framework.TestSuite.runTest(TestSuite.java:208)
    [junit]     at junit.framework.TestSuite.run(TestSuite.java:203)