[jira] Commented: (HIVE-1071) Making RCFile concatenatable to reduce the number of files of the output
[ https://issues.apache.org/jira/browse/HIVE-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803025#action_12803025 ]

dhruba borthakur commented on HIVE-1071:
----------------------------------------

We could create an API in HDFS that concatenates a set of files into one file. The partial last block of each file would be zero-filled; this is required because all the blocks (except the last block) in a single HDFS file must have the same size. Once we have the above-mentioned HDFS API, we can merge a bunch of RC files into one single file without doing much physical IO. The RCFile format has to be such that it can safely ignore zero-filled areas in the middle of the file. Can it do this?

> Making RCFile concatenatable to reduce the number of files of the output
> ------------------------------------------------------------------------
>
> Key: HIVE-1071
> URL: https://issues.apache.org/jira/browse/HIVE-1071
> Project: Hadoop Hive
> Issue Type: Improvement
> Reporter: Zheng Shao
>
> Hive automatically determines the number of reducers most of the time. Sometimes, we create a lot of small files. Hive has an option to merge those small files through a map-reduce job.
> Dhruba has an idea that can fix this even faster: if we can make RCFile concatenatable, then we can simply tell the namenode to merge these files.
> Pros: this approach does not do any I/O, so it's faster.
> Cons: we have to zero-fill the files to make sure they can be concatenated (all blocks except the last have to be full HDFS blocks).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
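A minimal sketch of the zero-fill arithmetic such a concat API would rely on; the class and method names are illustrative, not part of HDFS or Hive:

```java
// Sketch only: computes how many zero bytes must be appended so that a
// file's partial last block becomes a full HDFS block, allowing the file
// to be concatenated with another without moving data.
public class BlockPadding {
    // Bytes of zero-fill needed to round fileSize up to a multiple of blockSize.
    public static long padBytes(long fileSize, long blockSize) {
        long remainder = fileSize % blockSize;
        return remainder == 0 ? 0 : blockSize - remainder;
    }
}
```

A file that already ends on a block boundary needs no padding, so concatenation of such files is pure metadata work on the namenode.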
[jira] Created: (HIVE-892) Hive command line unable to kill jobs when the command line is interrupted
Hive command line unable to kill jobs when the command line is interrupted
--------------------------------------------------------------------------

Key: HIVE-892
URL: https://issues.apache.org/jira/browse/HIVE-892
Project: Hadoop Hive
Issue Type: Bug
Reporter: dhruba borthakur
Assignee: dhruba borthakur

The Hadoop 0.20 version of the JT insists that a kill command submitted to the JT be sent via HTTP POST. The Hive ExecDriver submits the kill command via HTTP GET. This means that the hive client is unable to kill hadoop 0.20 jobs.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
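The fix the issue calls for boils down to issuing the kill request as an HTTP POST rather than a GET. A hedged sketch follows; the tracker URL and the `action=kill&jobid=` form parameters are assumptions for illustration, not Hive's actual ExecDriver code:

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Illustrative sketch of killing a hadoop job over HTTP POST.
public class KillJobRequest {
    // Form body carried by the POST; parameter names are hypothetical.
    public static String killBody(String jobId) {
        return "action=kill&jobid=" + jobId;
    }

    // Sends the kill as a POST; the 0.20 JT rejects the same request as a GET.
    public static int sendKill(String trackerUrl, String jobId) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(trackerUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.getOutputStream().write(killBody(jobId).getBytes(StandardCharsets.UTF_8));
        return conn.getResponseCode();
    }
}
```

Because only the request method and body placement change, the same code path works against job trackers that also accepted GET, which is why one patch can cover all hadoop versions.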
[jira] Updated: (HIVE-892) Hive command line unable to kill jobs when the command line is interrupted
[ https://issues.apache.org/jira/browse/HIVE-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-892:
----------------------------------

Attachment: killJobs.txt

Use the HTTP POST command to kill jobs. This should work for all versions of hadoop.

> Hive command line unable to kill jobs when the command line is interrupted
> --------------------------------------------------------------------------
>
> Key: HIVE-892
> URL: https://issues.apache.org/jira/browse/HIVE-892
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: killJobs.txt
>
> The Hadoop 0.20 version of the JT insists that a kill command submitted to the JT be sent via HTTP POST. The Hive ExecDriver submits the kill command via HTTP GET. This means that the hive client is unable to kill hadoop 0.20 jobs.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-892) Hive command line unable to kill jobs when the command line is interrupted
[ https://issues.apache.org/jira/browse/HIVE-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-892:
----------------------------------

Status: Patch Available (was: Open)

> Hive command line unable to kill jobs when the command line is interrupted
> --------------------------------------------------------------------------
>
> Key: HIVE-892
> URL: https://issues.apache.org/jira/browse/HIVE-892
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: killJobs.txt
>
> The Hadoop 0.20 version of the JT insists that a kill command submitted to the JT be sent via HTTP POST. The Hive ExecDriver submits the kill command via HTTP GET. This means that the hive client is unable to kill hadoop 0.20 jobs.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-682) add UDF concat_ws
[ https://issues.apache.org/jira/browse/HIVE-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur reassigned HIVE-682:
-------------------------------------

Assignee: Jonathan Chang

> add UDF concat_ws
> -----------------
>
> Key: HIVE-682
> URL: https://issues.apache.org/jira/browse/HIVE-682
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Namit Jain
> Assignee: Jonathan Chang
> Attachments: concat_ws.patch
>
> add UDF concat_ws
> look at http://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html for details

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
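For reference, MySQL's CONCAT_WS joins its arguments with the separator and skips NULL arguments (a NULL separator yields NULL). A minimal Java sketch of those semantics, not Hive's actual UDF class:

```java
import java.util.StringJoiner;

// Sketch of CONCAT_WS semantics per the MySQL reference manual.
public class ConcatWs {
    public static String concatWs(String sep, String... parts) {
        if (sep == null) return null;        // NULL separator yields NULL
        StringJoiner j = new StringJoiner(sep);
        for (String p : parts) {
            if (p != null) j.add(p);         // NULL arguments are skipped, not rendered
        }
        return j.toString();
    }
}
```

So `concatWs(",", "a", null, "b")` yields `"a,b"` rather than `"a,,b"`, which is the behavior the linked MySQL page specifies.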
[jira] Created: (HIVE-653) Limit job hive queries to conform to underlying hadoop capacities
Limit job hive queries to conform to underlying hadoop capacities
-----------------------------------------------------------------

Key: HIVE-653
URL: https://issues.apache.org/jira/browse/HIVE-653
Project: Hadoop Hive
Issue Type: Improvement
Components: Clients
Reporter: dhruba borthakur

The Hive client should match underlying cluster capacity against cost estimates for a newly submitted job. The newly submitted job should be rejected if it exceeds the capacity of the cluster.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-653) Limit job hive queries to conform to underlying hadoop capacities
[ https://issues.apache.org/jira/browse/HIVE-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732720#action_12732720 ]

dhruba borthakur commented on HIVE-653:
---------------------------------------

The Hadoop JobClient class could be extended to check the following:
1. The total number of mappers/reducers that a job can use
2. The total size of data that the job will process

These are not strictly related to hive but might be useful for most warehouse-type applications. Another solution would be to use the Hive pre-query hook to check these constraints.

> Limit job hive queries to conform to underlying hadoop capacities
> -----------------------------------------------------------------
>
> Key: HIVE-653
> URL: https://issues.apache.org/jira/browse/HIVE-653
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Clients
> Reporter: dhruba borthakur
>
> The Hive client should match underlying cluster capacity against cost estimates for a newly submitted job. The newly submitted job should be rejected if it exceeds the capacity of the cluster.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
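The pre-query capacity check suggested in the comment can be sketched as a pure admission function. Names and limits here are illustrative, not an actual Hive hook or JobClient extension:

```java
// Sketch of a capacity check a pre-query hook could apply: admit a job only
// if its estimated mapper count and input size fit within cluster limits.
public class CapacityCheck {
    public static boolean admit(int estMappers, long estInputBytes,
                                int maxMappers, long maxInputBytes) {
        return estMappers <= maxMappers && estInputBytes <= maxInputBytes;
    }
}
```

A rejected job would surface as a query-time error to the Hive client before any map-reduce work is launched.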
[jira] Updated: (HIVE-74) Hive can use CombineFileInputFormat when the input consists of many small files
[ https://issues.apache.org/jira/browse/HIVE-74?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-74:
---------------------------------

Attachment: hiveCombineSplit2.patch

This combines multiple blocks from files into a single split.

> Hive can use CombineFileInputFormat when the input consists of many small files
> -------------------------------------------------------------------------------
>
> Key: HIVE-74
> URL: https://issues.apache.org/jira/browse/HIVE-74
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Fix For: 0.4.0
> Attachments: hiveCombineSplit.patch, hiveCombineSplit.patch, hiveCombineSplit2.patch
>
> There are cases when the input to a Hive job is thousands of small files. In this case, there is a mapper for each file. Most of the overhead of spawning all these mappers can be avoided if Hive used the CombineFileInputFormat introduced via HADOOP-4565.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (HIVE-74) Hive can use CombineFileInputFormat when the input consists of many small files
[ https://issues.apache.org/jira/browse/HIVE-74?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701036#action_12701036 ]

dhruba borthakur edited comment on HIVE-74 at 4/20/09 8:43 PM:
---------------------------------------------------------------

This combines multiple blocks from files into a single split. All files of the same table are part of a single pool.

was (Author: dhruba):
This combines multiple blocks from files into a single split.

> Hive can use CombineFileInputFormat when the input consists of many small files
> -------------------------------------------------------------------------------
>
> Key: HIVE-74
> URL: https://issues.apache.org/jira/browse/HIVE-74
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Fix For: 0.4.0
> Attachments: hiveCombineSplit.patch, hiveCombineSplit.patch, hiveCombineSplit2.patch
>
> There are cases when the input to a Hive job is thousands of small files. In this case, there is a mapper for each file. Most of the overhead of spawning all these mappers can be avoided if Hive used the CombineFileInputFormat introduced via HADOOP-4565.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
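The combining strategy can be sketched independently of Hadoop: greedily pack file sizes into splits until a split reaches a maximum size, so thousands of small files need far fewer mappers. This illustrates the idea only; it is not the actual CombineFileInputFormat code from HADOOP-4565:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: group small files (represented by their sizes) into combined
// splits, each capped at maxSplitBytes. One mapper then runs per split
// instead of per file.
public class CombineSplits {
    public static List<List<Long>> combine(long[] fileSizes, long maxSplitBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long bytes = 0;
        for (long size : fileSizes) {
            // Start a new split when adding this file would exceed the cap.
            if (!current.isEmpty() && bytes + size > maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                bytes = 0;
            }
            current.add(size);
            bytes += size;
        }
        if (!current.isEmpty()) splits.add(current);
        return splits;
    }
}
```

The "pool" mentioned in the edited comment corresponds to restricting which files may be combined together; here, a single call to `combine` would be made per table's pool.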
[jira] Commented: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.
[ https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675768#action_12675768 ]

dhruba borthakur commented on HIVE-79:
--------------------------------------

This should go into trunk and not into any branch, right?

> Print number of rows inserted to table(s) when the query is finished.
> ---------------------------------------------------------------------
>
> Key: HIVE-79
> URL: https://issues.apache.org/jira/browse/HIVE-79
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Logging
> Reporter: Suresh Antony
> Assignee: Suresh Antony
> Priority: Minor
> Fix For: 0.2.0
> Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt
>
> It would be good to print the number of rows inserted into each table at the end of a query.
> insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10;
> This query can print something like:
> tab1 rows=100

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-83) Set up a continuous build of Hive with Hudson
[ https://issues.apache.org/jira/browse/HIVE-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672739#action_12672739 ]

dhruba borthakur commented on HIVE-83:
--------------------------------------

I have a Hudson account named dhruba. However, this account will be used by Johan to set up the Hive-Hudson builds.

> Set up a continuous build of Hive with Hudson
> ---------------------------------------------
>
> Key: HIVE-83
> URL: https://issues.apache.org/jira/browse/HIVE-83
> Project: Hadoop Hive
> Issue Type: Task
> Components: Build Infrastructure
> Reporter: Jeff Hammerbacher
>
> Other projects like Zookeeper and HBase are leveraging Apache's hosted Hudson server (http://hudson.zones.apache.org/hudson/view/HBase). Perhaps Hive should as well?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-232) metastore.warehouse configuration should use inherited hadoop configuration
[ https://issues.apache.org/jira/browse/HIVE-232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-232:
----------------------------------

Hadoop Flags: [Reviewed]
Status: Patch Available (was: Open)

> metastore.warehouse configuration should use inherited hadoop configuration
> ---------------------------------------------------------------------------
>
> Key: HIVE-232
> URL: https://issues.apache.org/jira/browse/HIVE-232
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Josh Ferguson
> Assignee: Prasad Chakka
> Attachments: hive-232.2.patch, hive-232.2.patch, hive-232.patch
>
> The hive.metastore.warehouse.dir configuration property in hive-*.xml needs to use the protocol, host, and port when it is inherited from the fs.name property in hadoop-site.xml. When it doesn't, and no protocol is found, a broad range of move operations where the source and target are both in the DFS will fail. Currently this can be worked around by prepending the protocol, host, and port of the hadoop nameserver to the value of the hive.metastore.warehouse.dir property.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
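The idea behind the fix, qualifying an unqualified warehouse path with the default filesystem's protocol, host, and port, can be sketched as follows. This is an illustration, not the actual patch:

```java
import java.net.URI;

// Sketch: if hive.metastore.warehouse.dir carries no scheme, prefix it with
// the default filesystem URI so that moves within DFS resolve correctly.
public class QualifyPath {
    public static String qualify(String warehouseDir, String defaultFs) {
        URI uri = URI.create(warehouseDir);
        if (uri.getScheme() != null) return warehouseDir;  // already qualified
        String fs = defaultFs.endsWith("/")
                ? defaultFs.substring(0, defaultFs.length() - 1) : defaultFs;
        return fs + warehouseDir;  // prepend protocol, host, and port
    }
}
```

With this qualification applied, a bare `/user/hive/warehouse` and a fully qualified `hdfs://host:port/user/hive/warehouse` compare as the same filesystem, so the move operations no longer fail.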
[jira] Updated: (HIVE-232) metastore.warehouse configuration should use inherited hadoop configuration
[ https://issues.apache.org/jira/browse/HIVE-232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-232:
----------------------------------

Resolution: Fixed
Fix Version/s: 0.2.0
Status: Resolved (was: Patch Available)

I just committed this. Thanks Prasad!

> metastore.warehouse configuration should use inherited hadoop configuration
> ---------------------------------------------------------------------------
>
> Key: HIVE-232
> URL: https://issues.apache.org/jira/browse/HIVE-232
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Josh Ferguson
> Assignee: Prasad Chakka
> Fix For: 0.2.0
> Attachments: hive-232.2.patch, hive-232.2.patch, hive-232.patch
>
> The hive.metastore.warehouse.dir configuration property in hive-*.xml needs to use the protocol, host, and port when it is inherited from the fs.name property in hadoop-site.xml. When it doesn't, and no protocol is found, a broad range of move operations where the source and target are both in the DFS will fail. Currently this can be worked around by prepending the protocol, host, and port of the hadoop nameserver to the value of the hive.metastore.warehouse.dir property.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-230) While loading a table from a query that returns empty data results in null pointer exception
[ https://issues.apache.org/jira/browse/HIVE-230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-230:
----------------------------------

Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

I just committed this. Thanks Prasad.

> While loading a table from a query that returns empty data results in null pointer exception
> --------------------------------------------------------------------------------------------
>
> Key: HIVE-230
> URL: https://issues.apache.org/jira/browse/HIVE-230
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Prasad Chakka
> Assignee: Prasad Chakka
> Fix For: 0.2.0
> Attachments: hive-230.2.patch, hive-230.patch, hive-230.patch
>
> If the select query returns zero rows then the insert will fail with a null pointer exception:
> INSERT OVERWRITE TABLE test_pc SELECT a.userid, a.ip FROM test_pc2 a WHERE (userid=595058415);
> 2009-01-13 10:16:21,396 ERROR exec.MoveTask (SessionState.java:printError(254)) - Failed with exception null
> java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:127)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:212)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:174)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:207)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:305)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:166)
> at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-220) incorrect log directory in TestMTQueries causing null pointer exception
[ https://issues.apache.org/jira/browse/HIVE-220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-220:
----------------------------------

Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

I just committed this. Thanks Prasad!

> incorrect log directory in TestMTQueries causing null pointer exception
> -----------------------------------------------------------------------
>
> Key: HIVE-220
> URL: https://issues.apache.org/jira/browse/HIVE-220
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Testing Infrastructure
> Reporter: Prasad Chakka
> Assignee: Prasad Chakka
> Priority: Critical
> Fix For: 0.2.0
> Attachments: hive-220.patch, hive-220.patch, hive-220.patch
>
> mistyped ')' on line 38 of TestMTQueries.java causing null pointer exception.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-84) MetaStore Client is not thread safe
[ https://issues.apache.org/jira/browse/HIVE-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-84:
---------------------------------

Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

I just committed this. Thanks Prasad!

> MetaStore Client is not thread safe
> -----------------------------------
>
> Key: HIVE-84
> URL: https://issues.apache.org/jira/browse/HIVE-84
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 0.2.0
> Environment: with patch for hive-77 - run: ant -lib ./testlibs -Dtestcase=TestMTQueries test
> Reporter: Joydeep Sen Sarma
> Assignee: Prasad Chakka
> Fix For: 0.2.0
> Attachments: hive-84.patch
>
> when running DDL Tasks in concurrent threads, the following exception trace is observed:
> java.sql.SQLIntegrityConstraintViolationException: The statement was aborted because it would have caused a duplicate key value in a unique or primary key constraint or unique index identified by 'UNIQUETABLE' defined on 'TBLS'.
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:207)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:209)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:174)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:185)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:210)
> at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:390)
> at org.apache.hadoop.hive.ql.QTestUtil$QTRunner.run(QTestUtil.java:681)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: javax.jdo.JDODataStoreException: Insert of object org.apache.hadoop.hive.metastore.model.mta...@3bc8d400 using statement INSERT INTO TBLS (TBL_ID,CREATE_TIME,DB_ID,RETENTION,TBL_NAME,SD_ID,OWNER,LAST_ACCESS_TIME) VALUES (?,?,?,?,?,?,?,?) failed : The statement was aborted because it would have caused a duplicate key value in a unique or primary key constraint or unique index identified by 'UNIQUETABLE' defined on 'TBLS'.
> NestedThrowables: java.sql.SQLIntegrityConstraintViolationException: The statement was aborted because it would have caused a duplicate key value in a unique or primary key constraint or unique index identified by 'UNIQUETABLE' defined on 'TBLS'.
> at org.jpox.jdo.JPOXJDOHelper.getJDOExceptionForJPOXException(JPOXJDOHelper.java:291)
> at org.jpox.jdo.AbstractPersistenceManager.jdoMakePersistent(AbstractPersistenceManager.java:671)
> at org.jpox.jdo.AbstractPersistenceManager.makePersistent(AbstractPersistenceManager.java:691)
> at org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:479)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:292)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:252)
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:205)
> ... 7 more
> Caused by: java.sql.SQLIntegrityConstraintViolationException: The statement was aborted because it would have caused a duplicate key value in a unique or primary key constraint or unique index identified by 'UNIQUETABLE' defined on 'TBLS'.
> at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
> at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
> at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
> at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
> at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source)
> at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source)
> at org.jpox.store.rdbms.SQLController.executeStatementUpdate(SQLController.java:396)
> at org.jpox.store.rdbms.request.InsertRequest.execute(InsertRequest.java:370)
> at org.jpox.store.rdbms.RDBMSPersistenceHandler.insertTable(RDBMSPersistenceHandler.java:157)
> at org.jpox.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:136)
> at org.jpox.state.JDOStateManagerImpl.internalMakePersistent(JDOStateManagerImpl.java:3082)
> at org.jpox.state.JDOStateManagerImpl.makePersistent(JDOStateManagerImpl.java:3062)
> at org.jpox.ObjectManagerImpl.persistObjectInternal(ObjectManagerImpl.java:1231)
> at org.jpox.ObjectManagerImpl.persistObject(ObjectManagerImpl.java:1077)
> at org.jpox.jdo.AbstractPersistenceManager.jdoMakePersistent(AbstractPersistenceManager.java:666)
[jira] Updated: (HIVE-48) Support JDBC connections for interoperability between Hive and RDBMS
[ https://issues.apache.org/jira/browse/HIVE-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-48:
---------------------------------

Status: Open (was: Patch Available)

I get compilation problems:

core-compile:
[javac] Compiling 10 source files to /mnt/vol/devrs004.snc1/dhruba/commithive/build/jdbc/classes
[javac] /mnt/vol/devrs004.snc1/dhruba/commithive/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveConnection.java:452: unreported exception java.sql.SQLException; must be caught or declared to be thrown
[javac] throw new SQLException("Method not supported");
[javac] ^
[javac] /mnt/vol/devrs004.snc1/dhruba/commithive/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveConnection.java:462: unreported exception java.sql.SQLException; must be caught or declared to be thrown
[javac] throw new SQLException("Method not supported");

> Support JDBC connections for interoperability between Hive and RDBMS
> --------------------------------------------------------------------
>
> Key: HIVE-48
> URL: https://issues.apache.org/jira/browse/HIVE-48
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Clients
> Reporter: YoungWoo Kim
> Assignee: Raghotham Murthy
> Priority: Minor
> Attachments: hadoop-4101.1.patch, hadoop-4101.2.patch, hadoop-4101.3.patch, hadoop-4101.4.patch, hive-48.5.patch, hive-48.6.patch, hive-48.7.patch
>
> In many DW and BI systems, the data are currently stored in an RDBMS such as Oracle, MySQL, or PostgreSQL for reporting, charting, etc. It would be useful to be able to import data from an RDBMS and export data to an RDBMS using JDBC connections. If Hive supported JDBC connections, it would be much easier to use 3rd-party DW/BI tools.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-48) Support JDBC connections for interoperability between Hive and RDBMS
[ https://issues.apache.org/jira/browse/HIVE-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-48:
---------------------------------

Status: Patch Available (was: Open)

> Support JDBC connections for interoperability between Hive and RDBMS
> --------------------------------------------------------------------
>
> Key: HIVE-48
> URL: https://issues.apache.org/jira/browse/HIVE-48
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Clients
> Reporter: YoungWoo Kim
> Assignee: Raghotham Murthy
> Priority: Minor
> Attachments: hadoop-4101.1.patch, hadoop-4101.2.patch, hadoop-4101.3.patch, hadoop-4101.4.patch, hive-48.5.patch, hive-48.6.patch, hive-48.7.patch, hive-48.8.patch
>
> In many DW and BI systems, the data are currently stored in an RDBMS such as Oracle, MySQL, or PostgreSQL for reporting, charting, etc. It would be useful to be able to import data from an RDBMS and export data to an RDBMS using JDBC connections. If Hive supported JDBC connections, it would be much easier to use 3rd-party DW/BI tools.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-48) Support JDBC connections for interoperability between Hive and RDBMS
[ https://issues.apache.org/jira/browse/HIVE-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-48:
---------------------------------

Resolution: Fixed
Fix Version/s: 0.2.0
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

I just committed this. Thanks Raghu and Michi.

> Support JDBC connections for interoperability between Hive and RDBMS
> --------------------------------------------------------------------
>
> Key: HIVE-48
> URL: https://issues.apache.org/jira/browse/HIVE-48
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Clients
> Reporter: YoungWoo Kim
> Assignee: Raghotham Murthy
> Priority: Minor
> Fix For: 0.2.0
> Attachments: hadoop-4101.1.patch, hadoop-4101.2.patch, hadoop-4101.3.patch, hadoop-4101.4.patch, hive-48.5.patch, hive-48.6.patch, hive-48.7.patch, hive-48.8.patch
>
> In many DW and BI systems, the data are currently stored in an RDBMS such as Oracle, MySQL, or PostgreSQL for reporting, charting, etc. It would be useful to be able to import data from an RDBMS and export data to an RDBMS using JDBC connections. If Hive supported JDBC connections, it would be much easier to use 3rd-party DW/BI tools.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-202) LINEAGE is not working for join queries
[ https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-202:
----------------------------------

Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

I just committed this. Thanks Suresh!

> LINEAGE is not working for join queries
> ---------------------------------------
>
> Key: HIVE-202
> URL: https://issues.apache.org/jira/browse/HIVE-202
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Clients
> Affects Versions: 0.2.0
> Environment: lineage is not working for join queries
> Reporter: Suresh Antony
> Assignee: Suresh Antony
> Priority: Minor
> Fix For: 0.2.0
> Attachments: patch_202.txt, patch_202.txt
>
> Lineage is not giving input tables in the case of join queries.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-196) Failure when doing 2 tests with the same user on the same machine
[ https://issues.apache.org/jira/browse/HIVE-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-196:
----------------------------------

Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

I just committed this. Thanks Ashish!

> Failure when doing 2 tests with the same user on the same machine
> -----------------------------------------------------------------
>
> Key: HIVE-196
> URL: https://issues.apache.org/jira/browse/HIVE-196
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Testing Infrastructure
> Reporter: Zheng Shao
> Assignee: Ashish Thusoo
> Attachments: patch-196.txt
>
> org.apache.hadoop.util.Shell$ExitCodeException: chmod: cannot access `/tmp/zshao/kv1.txt': No such file or directory
> We should make a unique directory for each of the test runs, instead of sharing /tmp/${username}.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-76) Column number mismatch between query and destination tables when alias.* expressions are present in the select list of a join
[ https://issues.apache.org/jira/browse/HIVE-76?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-76:
---------------------------------

Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

I just committed this. Thanks Ashish!

> Column number mismatch between query and destination tables when alias.* expressions are present in the select list of a join
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-76
> URL: https://issues.apache.org/jira/browse/HIVE-76
> Project: Hadoop Hive
> Issue Type: Bug
> Affects Versions: 0.20.0
> Reporter: Ashish Thusoo
> Assignee: Ashish Thusoo
> Fix For: 0.20.0
> Attachments: patch-76.txt, patch-76_1.txt
>
> Column number mismatch between query and destination tables when alias.* expressions are present in the select list of a join. The reason is a bug in how the row resolver is constructed in SemanticAnalyzer.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-72) wrong results if partition pruning not strict and no map-reduce job needed
[ https://issues.apache.org/jira/browse/HIVE-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HIVE-72:
---------------------------------

Resolution: Fixed
Fix Version/s: 0.20.0
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

I just committed this. Thanks Namit!

> wrong results if partition pruning not strict and no map-reduce job needed
> --------------------------------------------------------------------------
>
> Key: HIVE-72
> URL: https://issues.apache.org/jira/browse/HIVE-72
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Namit Jain
> Assignee: Namit Jain
> Fix For: 0.20.0
> Attachments: patch1.mapred.txt
>
> Suppose T is a table partitioned on ds, where ds is a string column. The following queries:
> SELECT a.* FROM T a WHERE a.ds=2008-09-08 LIMIT 1;
> SELECT a.* FROM T a WHERE a.ds=2008-11-10 LIMIT 1;
> return the first row from the first partition. This is because of the typecast to double. For a.ds=2008-01-01 or anything (a.ds=1), evaluate(Double, Double) is invoked at partition pruning. Since '2008-11-01' is not a valid double, it is converted to a null, and therefore the result of pruning returns null (unknown), not FALSE. All unknowns are also accepted, therefore all partitions are accepted, which explains this behavior. The filter is not invoked since it is a select * query, so no map-reduce job is started. We just turn off this optimization if pruning indicates that there can be unknown partitions.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
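The pruning behavior described in HIVE-72 is three-valued logic: a predicate over a partition value that cannot be cast to double evaluates to unknown, and a pruner must keep (not drop) unknown partitions. A small illustrative sketch of that logic, not Hive's actual evaluator:

```java
// Sketch of the three-valued comparison behind the bug: the string ds value
// is cast to double before comparing, and an uncastable value yields UNKNOWN.
public class PartitionPrune {
    public enum Tri { TRUE, FALSE, UNKNOWN }

    // Mimics evaluate(Double, Double) after ds has been typecast to double.
    public static Tri dsEquals(String partitionValue, double literal) {
        try {
            return Double.parseDouble(partitionValue) == literal ? Tri.TRUE : Tri.FALSE;
        } catch (NumberFormatException e) {
            return Tri.UNKNOWN;   // e.g. '2008-11-01' is not a valid double
        }
    }

    // A partition survives pruning unless the predicate is definitely FALSE.
    public static boolean keepPartition(Tri result) {
        return result != Tri.FALSE;
    }
}
```

Since every date-formatted partition value produces UNKNOWN, every partition survives pruning; with no map-reduce job (and hence no filter operator) running afterwards, the unfiltered first partition is what the LIMIT 1 query returns.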