[jira] [Commented] (HIVE-3472) Build An Analytical SQL Engine for MapReduce
[ https://issues.apache.org/jira/browse/HIVE-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474838#comment-13474838 ] Shengsheng Huang commented on HIVE-3472: @Lianhui Thanks for the comment. I read NexR's slides. There was also another interesting contribution on SQL window functions presented at Hadoop Summit 2012: http://www.slideshare.net/Hadoop_Summit/analytical-queries-with-hive. It looks to me that many people are aware that Hive needs improvement to better accommodate common OLAP requirements, and many of us share similar ideas about which areas to improve: for example, the SQL data type system, OLAP-oriented features (rank, rollup, window functions, etc.), nested scalar subqueries, and so on. It seems NexR did not open source their query planner (Hawk), which as I understand does most of the SQL syntax transformation, on GitHub (they did contribute a set of OLAP UDF implementations, though). We would like to see Hive evolve faster into a better open source tool for OLAP analytics, so we opened this JIRA to push this forward, and we're willing to contribute our efforts to open source. Build An Analytical SQL Engine for MapReduce Key: HIVE-3472 URL: https://issues.apache.org/jira/browse/HIVE-3472 Project: Hive Issue Type: New Feature Affects Versions: 0.10.0 Reporter: Shengsheng Huang Attachments: SQL-design.pdf While there are continuous efforts in extending Hive’s SQL support (e.g., see some recent examples such as HIVE-2005 and HIVE-2810), many widely used SQL constructs are still not supported in HiveQL, such as selecting from multiple tables, subqueries in WHERE clauses, etc. We propose to build a fully SQL-92 compatible engine (for MapReduce-based analytical query processing) as an extension to Hive.
The SQL frontend will co-exist with the HiveQL frontend; consequently, one can mix SQL and HiveQL statements in their queries (switching between HiveQL mode and SQL-92 mode using a “hive.ql.mode” parameter before each query statement). This way useful Hive extensions are still accessible to users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3564) hivetest.py: revision number and applied patch
[ https://issues.apache.org/jira/browse/HIVE-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3564: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed. Thanks Ivan. hivetest.py: revision number and applied patch Key: HIVE-3564 URL: https://issues.apache.org/jira/browse/HIVE-3564 Project: Hive Issue Type: Improvement Components: Testing Infrastructure Reporter: Ivan Gorbachev Assignee: Ivan Gorbachev Attachments: hive-3564.0.patch.txt A new option is needed for hivetest.py that shows the base revision number and the applied patch.
[jira] [Commented] (HIVE-3556) Test Path -> Alias for explain extended
[ https://issues.apache.org/jira/browse/HIVE-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475057#comment-13475057 ] Namit Jain commented on HIVE-3556: +1 Test Path -> Alias for explain extended Key: HIVE-3556 URL: https://issues.apache.org/jira/browse/HIVE-3556 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Gang Tim Liu Assignee: Gang Tim Liu Attachments: HIVE-3556.patch.1 The test framework masks the output of Path -> Alias for explain extended, which makes it impossible to verify that the output is right. The design is to add a new entry, Truncated Path -> Alias, to MapredWork. It has the same content as Path -> Alias except that the prefix, including the file scheme and temp dir, is removed. The following config will be used for prefix removal: METASTOREWAREHOUSE(hive.metastore.warehouse.dir, /user/hive/warehouse). This keeps Path -> Alias intact while also making it possible to test that its result is right. The first use case is to verify that the list bucketing query's result is right.
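The prefix removal described in HIVE-3556 can be illustrated with a small sketch. This is not the attached patch: the function names are hypothetical, and the rule used here (drop the scheme, then drop everything up to and including the warehouse directory) is only an assumption about what "prefix including file scheme and temp dir" means. The default directory is the value quoted in the issue.

```python
from urllib.parse import urlparse

# Default taken from the config quoted in the issue text.
WAREHOUSE_DIR = "/user/hive/warehouse"


def truncate_path(path: str, warehouse_dir: str = WAREHOUSE_DIR) -> str:
    """Drop the scheme and everything up to and including the warehouse dir."""
    # Remove the scheme and authority, e.g. "hdfs://namenode:8020".
    plain = urlparse(path).path
    marker = warehouse_dir.rstrip("/") + "/"
    idx = plain.find(marker)
    if idx == -1:
        # Not under the warehouse dir: only the scheme is removed.
        return plain
    return plain[idx + len(marker):]


def truncated_path_to_alias(path_to_alias: dict) -> dict:
    """Build the truncated map; the original map is left untouched."""
    return {truncate_path(p): aliases for p, aliases in path_to_alias.items()}
```

Because the truncated map is built as a separate dictionary, the original Path -> Alias entries stay intact, which mirrors the intent stated in the issue.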
[jira] [Updated] (HIVE-3276) optimize union sub-queries
[ https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3276: Attachment: hive.3276.12.patch optimize union sub-queries Key: HIVE-3276 URL: https://issues.apache.org/jira/browse/HIVE-3276 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3276.10.patch, hive.3276.11.patch, hive.3276.12.patch, HIVE-3276.1.patch, hive.3276.2.patch, hive.3276.3.patch, hive.3276.4.patch, hive.3276.5.patch, hive.3276.6.patch, hive.3276.7.patch, hive.3276.8.patch, hive.3276.9.patch It might be a good idea to optimize simple union queries containing map-reduce jobs in at least one of the sub-queries. For example, a query like: insert overwrite table T1 partition P1 select * from ( subq1 union all subq2 ) u; today creates 3 map-reduce jobs: one for subq1, another for subq2, and a final one for the union. It might be a good idea to optimize this. Instead of creating the union task, it might be simpler to create a move task (or something like a move task), where the outputs of the two sub-queries are moved to the final directory. This extends easily to more than 2 sub-queries in the union. It is very useful if there is a select * followed by a filesink after the union. It can be independently useful, and can also be used to optimize skewed joins: https://cwiki.apache.org/Hive/skewed-join-optimization.html. If there is a select or filter between the union and the filesink, the select and the filter can be moved before the union, and the follow-up job can still be removed.
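The move-task idea in HIVE-3276 can be pictured with a toy sketch on a local filesystem. It is not Hive code: the function name, the directory layout and the renaming scheme are all assumptions made for illustration. The point is only that when nothing but a select * sits between the union and the filesink, the "union" reduces to relocating each sub-query's output files into the final directory.

```python
import shutil
from pathlib import Path


def move_outputs(subquery_dirs, final_dir):
    """Move every output file of each sub-query into final_dir.

    Files get the sub-query index as a prefix (a hypothetical naming
    choice) so identically named part files do not collide.
    """
    final = Path(final_dir)
    final.mkdir(parents=True, exist_ok=True)
    moved = []
    for i, d in enumerate(subquery_dirs):
        for f in sorted(Path(d).iterdir()):
            if f.is_file():
                target = final / f"{i}_{f.name}"
                shutil.move(str(f), str(target))
                moved.append(target.name)
    return moved
```

The loop works for any number of input directories, which matches the remark in the issue that the approach extends to more than two sub-queries.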
[jira] [Commented] (HIVE-3276) optimize union sub-queries
[ https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475058#comment-13475058 ] Namit Jain commented on HIVE-3276: @Carl, comments addressed. optimize union sub-queries Key: HIVE-3276 URL: https://issues.apache.org/jira/browse/HIVE-3276 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3276.10.patch, hive.3276.11.patch, hive.3276.12.patch, HIVE-3276.1.patch, hive.3276.2.patch, hive.3276.3.patch, hive.3276.4.patch, hive.3276.5.patch, hive.3276.6.patch, hive.3276.7.patch, hive.3276.8.patch, hive.3276.9.patch It might be a good idea to optimize simple union queries containing map-reduce jobs in at least one of the sub-queries. For example, a query like: insert overwrite table T1 partition P1 select * from ( subq1 union all subq2 ) u; today creates 3 map-reduce jobs: one for subq1, another for subq2, and a final one for the union. It might be a good idea to optimize this. Instead of creating the union task, it might be simpler to create a move task (or something like a move task), where the outputs of the two sub-queries are moved to the final directory. This extends easily to more than 2 sub-queries in the union. It is very useful if there is a select * followed by a filesink after the union. It can be independently useful, and can also be used to optimize skewed joins: https://cwiki.apache.org/Hive/skewed-join-optimization.html. If there is a select or filter between the union and the filesink, the select and the filter can be moved before the union, and the follow-up job can still be removed.
[jira] [Commented] (HIVE-3433) Implement CUBE and ROLLUP operators in Hive
[ https://issues.apache.org/jira/browse/HIVE-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475060#comment-13475060 ] Namit Jain commented on HIVE-3433: [~kevinwilfong], addressed comments. Implement CUBE and ROLLUP operators in Hive Key: HIVE-3433 URL: https://issues.apache.org/jira/browse/HIVE-3433 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Sambavi Muthukrishnan Assignee: Namit Jain Attachments: hive.3433.1.patch, hive.3433.2.patch, hive.3433.3.patch
[jira] [Commented] (HIVE-3472) Build An Analytical SQL Engine for MapReduce
[ https://issues.apache.org/jira/browse/HIVE-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475080#comment-13475080 ] alex gemini commented on HIVE-3472: It seems there are two unrelated issues being discussed here: one about analytic functions, and one about SQL-92 compatibility. For analytic features, I think it's OK to build a new parser, since there is no standard and each vendor has its own special grammar. We can treat it like a new framework with a parameter that controls whether it is enabled, just like HIVE-896 did: we open a new session with a special shell command. A new feature with a lot of code integrated into Hive always changes many things, such as the metastore, execution engine, explain tree, etc., so I guess it's OK to have a parameter switch. But why are we using a PL/SQL parser, and what exactly are the analytical features we're talking about? The SQL-92 compatibility issue should be discussed in HIVE-3561. Build An Analytical SQL Engine for MapReduce Key: HIVE-3472 URL: https://issues.apache.org/jira/browse/HIVE-3472 Project: Hive Issue Type: New Feature Affects Versions: 0.10.0 Reporter: Shengsheng Huang Attachments: SQL-design.pdf While there are continuous efforts in extending Hive’s SQL support (e.g., see some recent examples such as HIVE-2005 and HIVE-2810), many widely used SQL constructs are still not supported in HiveQL, such as selecting from multiple tables, subqueries in WHERE clauses, etc. We propose to build a fully SQL-92 compatible engine (for MapReduce-based analytical query processing) as an extension to Hive. The SQL frontend will co-exist with the HiveQL frontend; consequently, one can mix SQL and HiveQL statements in their queries (switching between HiveQL mode and SQL-92 mode using a “hive.ql.mode” parameter before each query statement). This way useful Hive extensions are still accessible to users.
[jira] [Commented] (HIVE-3561) Build a full SQL-compliant parser for Hive
[ https://issues.apache.org/jira/browse/HIVE-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475092#comment-13475092 ] alex gemini commented on HIVE-3561: Why do we need a full SQL-compliant parser? What exactly are the features we're talking about? SQL-92 has features like temporary tables, call-level interface, scrolling cursors, etc. IMO we should discuss individual features instead of discussing whether we need a new parser or not. Build a full SQL-compliant parser for Hive Key: HIVE-3561 URL: https://issues.apache.org/jira/browse/HIVE-3561 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.10.0 Reporter: Shengsheng Huang To build a full SQL-compliant engine on Hive, we'll need a full SQL-compliant parser. The current Hive parser is missing a lot of grammar units from standard SQL. To support full SQL there are possibly four approaches: 1. Extend the existing Hive parser to support full SQL constructs. We need to modify the current Hive.g, add any missing grammar units, and resolve conflicts. 2. Reuse an existing open source SQL-compliant parser and extend it to support Hive extensions. We may need to adapt the Semantic Analyzers to the new AST structure. 3. Reuse an existing SQL-compliant parser and make it co-exist with the existing Hive parser. Both parsers share the same CliDriver interface. Use a query mode configuration to switch the query mode between SQL and HQL (this is the approach we're now using in the 0.9.0 demo project). 4. Reuse an existing SQL-compliant parser and make it co-exist with the existing Hive parser. Use a separate xxxCliDriver interface for standard SQL. Let's discuss which is the best approach.
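Approach 3 above, two parsers behind one driver with a query-mode setting choosing between them, can be sketched roughly as follows. Everything here is illustrative: the class and function names are hypothetical, the parsers are stand-ins that return tagged tuples instead of real ASTs, and the mode values "hql" and "sql" are assumptions. Only the parameter name hive.ql.mode comes from the HIVE-3472 description.

```python
from typing import Callable, Dict, Optional, Tuple


# Stand-in parsers: each returns a tagged tuple rather than a real AST.
def parse_hql(query: str) -> Tuple[str, str]:
    return ("hql-ast", query.strip())


def parse_sql(query: str) -> Tuple[str, str]:
    return ("sql-ast", query.strip())


# Assumed mode values; the issue does not spell them out.
PARSERS: Dict[str, Callable[[str], Tuple[str, str]]] = {
    "hql": parse_hql,
    "sql": parse_sql,
}


class ModeSwitchingDriver:
    """Single entry point that picks a parser from the session config."""

    def __init__(self) -> None:
        self.conf = {"hive.ql.mode": "hql"}

    def process_line(self, line: str) -> Optional[Tuple[str, str]]:
        line = line.strip().rstrip(";")
        if line.lower().startswith("set "):
            # "set key=value" updates the session config, nothing to parse.
            key, _, value = line[4:].partition("=")
            self.conf[key.strip()] = value.strip()
            return None
        mode = self.conf["hive.ql.mode"]
        if mode not in PARSERS:
            raise ValueError(f"unknown query mode: {mode}")
        return PARSERS[mode](line)
```

Under approach 4 the dispatch table would disappear and each parser would get its own driver class instead, at the cost of users no longer being able to mix the two dialects in one session.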
[jira] [Commented] (HIVE-3556) Test Path -> Alias for explain extended
[ https://issues.apache.org/jira/browse/HIVE-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475104#comment-13475104 ] Namit Jain commented on HIVE-3556: [~gangtimliu], can you load the new patch? Test Path -> Alias for explain extended Key: HIVE-3556 URL: https://issues.apache.org/jira/browse/HIVE-3556 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Gang Tim Liu Assignee: Gang Tim Liu Attachments: HIVE-3556.patch.1 The test framework masks the output of Path -> Alias for explain extended, which makes it impossible to verify that the output is right. The design is to add a new entry, Truncated Path -> Alias, to MapredWork. It has the same content as Path -> Alias except that the prefix, including the file scheme and temp dir, is removed. The following config will be used for prefix removal: METASTOREWAREHOUSE(hive.metastore.warehouse.dir, /user/hive/warehouse). This keeps Path -> Alias intact while also making it possible to test that its result is right. The first use case is to verify that the list bucketing query's result is right.
[jira] [Comment Edited] (HIVE-3556) Test Path -> Alias for explain extended
[ https://issues.apache.org/jira/browse/HIVE-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475104#comment-13475104 ] Namit Jain edited comment on HIVE-3556 at 10/12/12 4:14 PM: [~gangtimliu], can you load the new patch file? was (Author: namit): [~gangtimliu], can you load the new patch? Test Path -> Alias for explain extended Key: HIVE-3556 URL: https://issues.apache.org/jira/browse/HIVE-3556 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Gang Tim Liu Assignee: Gang Tim Liu Attachments: HIVE-3556.patch.1 The test framework masks the output of Path -> Alias for explain extended, which makes it impossible to verify that the output is right. The design is to add a new entry, Truncated Path -> Alias, to MapredWork. It has the same content as Path -> Alias except that the prefix, including the file scheme and temp dir, is removed. The following config will be used for prefix removal: METASTOREWAREHOUSE(hive.metastore.warehouse.dir, /user/hive/warehouse). This keeps Path -> Alias intact while also making it possible to test that its result is right. The first use case is to verify that the list bucketing query's result is right.
[jira] [Updated] (HIVE-3556) Test Path -> Alias for explain extended
[ https://issues.apache.org/jira/browse/HIVE-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-3556: Attachment: HIVE-3556.patch.2 Test Path -> Alias for explain extended Key: HIVE-3556 URL: https://issues.apache.org/jira/browse/HIVE-3556 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Gang Tim Liu Assignee: Gang Tim Liu Attachments: HIVE-3556.patch.1, HIVE-3556.patch.2 The test framework masks the output of Path -> Alias for explain extended, which makes it impossible to verify that the output is right. The design is to add a new entry, Truncated Path -> Alias, to MapredWork. It has the same content as Path -> Alias except that the prefix, including the file scheme and temp dir, is removed. The following config will be used for prefix removal: METASTOREWAREHOUSE(hive.metastore.warehouse.dir, /user/hive/warehouse). This keeps Path -> Alias intact while also making it possible to test that its result is right. The first use case is to verify that the list bucketing query's result is right.
[jira] [Commented] (HIVE-3556) Test Path -> Alias for explain extended
[ https://issues.apache.org/jira/browse/HIVE-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475160#comment-13475160 ] Gang Tim Liu commented on HIVE-3556: Yes, just loaded one. Thanks. Test Path -> Alias for explain extended Key: HIVE-3556 URL: https://issues.apache.org/jira/browse/HIVE-3556 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Gang Tim Liu Assignee: Gang Tim Liu Attachments: HIVE-3556.patch.1, HIVE-3556.patch.2 The test framework masks the output of Path -> Alias for explain extended, which makes it impossible to verify that the output is right. The design is to add a new entry, Truncated Path -> Alias, to MapredWork. It has the same content as Path -> Alias except that the prefix, including the file scheme and temp dir, is removed. The following config will be used for prefix removal: METASTOREWAREHOUSE(hive.metastore.warehouse.dir, /user/hive/warehouse). This keeps Path -> Alias intact while also making it possible to test that its result is right. The first use case is to verify that the list bucketing query's result is right.
[jira] [Created] (HIVE-3572) Expressions in distribute/cluster/order/sort by used to work sometimes
Kevin Wilfong created HIVE-3572: Summary: Expressions in distribute/cluster/order/sort by used to work sometimes Key: HIVE-3572 URL: https://issues.apache.org/jira/browse/HIVE-3572 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Namit Jain This query used to work: explain select key from src distribute by (key + 50); But after HIVE-3268 it fails during compilation.
[jira] [Created] (HIVE-3573) Wrap logic introduced in HIVE-3268 in a config
Kevin Wilfong created HIVE-3573: Summary: Wrap logic introduced in HIVE-3268 in a config Key: HIVE-3573 URL: https://issues.apache.org/jira/browse/HIVE-3573 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Namit Jain This patch introduces some code which can fundamentally break distribute/order/cluster/sort by if there is a bug (as was found). We should add a config around this code so administrators can quickly turn it off if needed.
[jira] [Updated] (HIVE-3570) Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr
[ https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3570: Attachment: HIVE-3570.D5985.1.patch satadru added you to the CC list for the revision HIVE-3570 [jira] Hive changes for Optr level stats. Reviewers: njain TEST PLAN Single box testing REVISION DETAIL https://reviews.facebook.net/D5985 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java To: JIRA Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr Key: HIVE-3570 URL: https://issues.apache.org/jira/browse/HIVE-3570 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.9.0 Reporter: Satadru Pan Assignee: Satadru Pan Priority: Minor Attachments: HIVE-3570.1.patch.txt, HIVE-3570.D5985.1.patch Requirement: collect operator-specific stats for Hive queries, using the counter framework available in Hive's Operator.java.
[jira] [Work started] (HIVE-3570) Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr
[ https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-3570 started by Satadru Pan. Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr Key: HIVE-3570 URL: https://issues.apache.org/jira/browse/HIVE-3570 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.9.0 Reporter: Satadru Pan Assignee: Satadru Pan Priority: Minor Attachments: HIVE-3570.1.patch.txt, HIVE-3570.D5985.1.patch Requirement: collect operator-specific stats for Hive queries, using the counter framework available in Hive's Operator.java.
[jira] [Updated] (HIVE-3570) Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr
[ https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satadru Pan updated HIVE-3570: Status: Patch Available (was: In Progress) Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr Key: HIVE-3570 URL: https://issues.apache.org/jira/browse/HIVE-3570 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.9.0 Reporter: Satadru Pan Assignee: Satadru Pan Priority: Minor Attachments: HIVE-3570.1.patch.txt, HIVE-3570.D5985.1.patch Requirement: collect operator-specific stats for Hive queries, using the counter framework available in Hive's Operator.java.
[jira] [Commented] (HIVE-3573) Wrap logic introduced in HIVE-3268 in a config
[ https://issues.apache.org/jira/browse/HIVE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475277#comment-13475277 ] Namit Jain commented on HIVE-3573: https://reviews.facebook.net/D6015 Wrap logic introduced in HIVE-3268 in a config Key: HIVE-3573 URL: https://issues.apache.org/jira/browse/HIVE-3573 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Namit Jain This patch introduces some code which can fundamentally break distribute/order/cluster/sort by if there is a bug (as was found). We should add a config around this code so administrators can quickly turn it off if needed.
[jira] [Updated] (HIVE-3573) Wrap logic introduced in HIVE-3268 in a config
[ https://issues.apache.org/jira/browse/HIVE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3573: Attachment: hive.3573.1.patch Wrap logic introduced in HIVE-3268 in a config Key: HIVE-3573 URL: https://issues.apache.org/jira/browse/HIVE-3573 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Namit Jain Attachments: hive.3573.1.patch This patch introduces some code which can fundamentally break distribute/order/cluster/sort by if there is a bug (as was found). We should add a config around this code so administrators can quickly turn it off if needed.
[jira] [Updated] (HIVE-3573) Revert HIVE-3268
[ https://issues.apache.org/jira/browse/HIVE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3573: Summary: Revert HIVE-3268 (was: Wrap logic introduced in HIVE-3268 in a config) Revert HIVE-3268 Key: HIVE-3573 URL: https://issues.apache.org/jira/browse/HIVE-3573 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Namit Jain Attachments: hive.3573.1.patch This patch introduces some code which can fundamentally break distribute/order/cluster/sort by if there is a bug (as was found). We should add a config around this code so administrators can quickly turn it off if needed.
[jira] [Updated] (HIVE-3573) Revert HIVE-3268
[ https://issues.apache.org/jira/browse/HIVE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3573: Description: This patch introduces some code which can break distribute/order/cluster/sort by. We should revert this code until it can be fixed (HIVE-3572). (was: This patch introduces some code which can fundamentally break distribute/order/cluster/sort by if there is a bug (as was found). We should add a config around this code so administrators can quickly turn it off if needed.) Revert HIVE-3268 Key: HIVE-3573 URL: https://issues.apache.org/jira/browse/HIVE-3573 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Namit Jain Attachments: hive.3573.1.patch This patch introduces some code which can break distribute/order/cluster/sort by. We should revert this code until it can be fixed (HIVE-3572).
[jira] [Commented] (HIVE-3573) Revert HIVE-3268
[ https://issues.apache.org/jira/browse/HIVE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475284#comment-13475284 ] Kevin Wilfong commented on HIVE-3573: If you plan to remove the config once the bug is fixed, we'd probably be better off just reverting it. Revert HIVE-3268 Key: HIVE-3573 URL: https://issues.apache.org/jira/browse/HIVE-3573 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Namit Jain Attachments: hive.3573.1.patch This patch introduces some code which can break distribute/order/cluster/sort by. We should revert this code until it can be fixed (HIVE-3572).
[jira] [Updated] (HIVE-3073) Hive List Bucketing - DML support
[ https://issues.apache.org/jira/browse/HIVE-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-3073: Summary: Hive List Bucketing - DML support (was: Hive List Bucketing - DML support (single column/manual load)) Hive List Bucketing - DML support Key: HIVE-3073 URL: https://issues.apache.org/jira/browse/HIVE-3073 Project: Hive Issue Type: New Feature Components: SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu If a Hive table column has skewed keys, query performance on non-skewed keys is always impacted. The Hive List Bucketing feature will address this: https://cwiki.apache.org/Hive/listbucketing.html This JIRA issue will track the DML changes for the feature: 1. single skewed column 2. manual load of data
[jira] [Updated] (HIVE-3573) Revert HIVE-3268
[ https://issues.apache.org/jira/browse/HIVE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3573: Attachment: hive.3573.2.patch Revert HIVE-3268 Key: HIVE-3573 URL: https://issues.apache.org/jira/browse/HIVE-3573 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Namit Jain Attachments: hive.3573.1.patch, hive.3573.2.patch This patch introduces some code which can break distribute/order/cluster/sort by. We should revert this code until it can be fixed (HIVE-3572).
[jira] [Updated] (HIVE-3573) Revert HIVE-3268
[ https://issues.apache.org/jira/browse/HIVE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3573: Status: Patch Available (was: Open) Revert HIVE-3268 Key: HIVE-3573 URL: https://issues.apache.org/jira/browse/HIVE-3573 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Namit Jain Attachments: hive.3573.1.patch, hive.3573.2.patch This patch introduces some code which can break distribute/order/cluster/sort by. We should revert this code until it can be fixed (HIVE-3572).
[jira] [Created] (HIVE-3574) Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN)
Jeremy A. Lucas created HIVE-3574: Summary: Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN) Key: HIVE-3574 URL: https://issues.apache.org/jira/browse/HIVE-3574 Project: Hive Issue Type: Improvement Components: Query Processor, SQL Affects Versions: 0.9.0, 0.8.1, 0.8.0, 0.7.1, 0.7.0, 0.6.0, 0.5.0, 0.4.1, 0.4.0, 0.3.0, 0.10.0, 0.9.1 Environment: All environments would be affected by this Reporter: Jeremy A. Lucas Priority: Minor Having Hive submit generated jobs to an M/R cluster via the MapReduce API would allow for potentially greater compatibility across platforms, in addition to allowing these jobs to be run easily against pseudo-clusters in tests (think MiniMRCluster). This kind of change could involve something as simple as using a Hadoop Configuration object with a generic ToolRunner or something similar to run jobs. Specifically, this kind of change would most likely occur in the execute() method of org.apache.hadoop.hive.ql.exec.MapRedTask (note that this class already has access to a JobConf object as well, which could itself serve as a Configuration object).
[jira] [Updated] (HIVE-3574) Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN)
[ https://issues.apache.org/jira/browse/HIVE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy A. Lucas updated HIVE-3574: Description: Having Hive submit generated jobs to an M/R cluster via the MapReduce API would allow for potentially greater compatibility across platforms, in addition to allowing these jobs to be run easily against pseudo-clusters in tests (think MiniMRCluster). This kind of change could involve something as simple as using a Hadoop Configuration object with a generic ToolRunner or something similar to run jobs. Specifically, this kind of change would most likely occur in the execute() method of org.apache.hadoop.hive.ql.exec.MapRedTask. was: Having Hive submit generated jobs to an M/R cluster via the MapReduce API would allow for potentially greater compatibility across platforms, in addition to allowing these jobs to be run easily against pseudo-clusters in tests (think MiniMRCluster). This kind of change could involve something as simple as using a Hadoop Configuration object with a generic ToolRunner or something similar to run jobs. Specifically, this kind of change would most likely occur in the execute() method of org.apache.hadoop.hive.ql.exec.MapRedTask (note that this class already has access to a JobConf object as well, which could itself serve as a Configuration object). Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN) Key: HIVE-3574 URL: https://issues.apache.org/jira/browse/HIVE-3574 Project: Hive Issue Type: Improvement Components: Query Processor, SQL Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0, 0.7.0, 0.7.1, 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.9.1 Environment: All environments would be affected by this Reporter: Jeremy A. Lucas Priority: Minor Labels: feature, test
[jira] [Commented] (HIVE-3570) Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr
[ https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475379#comment-13475379 ] Phabricator commented on HIVE-3570: njain has commented on the revision HIVE-3570 [jira] Hive changes for Optr level stats. Please add a test. Add a simple hook which prints the number of hashed rows. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java:1000 NUM_INPUT_ROWS is present in every operator. You don't need it again. REVISION DETAIL https://reviews.facebook.net/D5985 To: njain, satadru Cc: JIRA Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr Key: HIVE-3570 URL: https://issues.apache.org/jira/browse/HIVE-3570 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.9.0 Reporter: Satadru Pan Assignee: Satadru Pan Priority: Minor Attachments: HIVE-3570.1.patch.txt, HIVE-3570.D5985.1.patch Requirement: Collect Operator specific stats for hive queries. Use the counter framework available in Hive Operator.java to accomplish that.
[jira] [Updated] (HIVE-3570) Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr
[ https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3570: Status: Open (was: Patch Available) Comments on phabricator Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr Key: HIVE-3570 URL: https://issues.apache.org/jira/browse/HIVE-3570 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.9.0 Reporter: Satadru Pan Assignee: Satadru Pan Priority: Minor Attachments: HIVE-3570.1.patch.txt, HIVE-3570.D5985.1.patch Requirement: Collect Operator specific stats for hive queries. Use the counter framework available in Hive Operator.java to accomplish that.
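To make the hash-in/hash-out idea in this issue concrete, here is a minimal sketch in plain Java. It is not Hive code and does not use Hive's Operator counter framework: the class HashGroupByCounters and its methods are hypothetical, and the reading of "hash-in" as rows fed into the hash table and "hash-out" as aggregated entries emitted from it is an assumption, since the issue does not define the terms.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not Hive's GroupByOperator.
public class HashGroupByCounters {
    // In-memory hash aggregation: key -> running sum.
    private final Map<String, Long> sums = new LinkedHashMap<>();
    private long hashIn;   // assumed meaning: rows fed into the hash table
    private long hashOut;  // assumed meaning: aggregated entries emitted

    // Aggregates one input row and counts it as hashed in.
    public void process(String key, long value) {
        hashIn++;
        sums.merge(key, value, Long::sum);
    }

    // Emits all aggregated entries, counts them as hashed out, and clears the table.
    public Map<String, Long> flush() {
        Map<String, Long> out = new LinkedHashMap<>(sums);
        hashOut += out.size();
        sums.clear();
        return out;
    }

    public long getHashIn() { return hashIn; }
    public long getHashOut() { return hashOut; }

    public static void main(String[] args) {
        HashGroupByCounters op = new HashGroupByCounters();
        op.process("a", 1);
        op.process("b", 2);
        op.process("a", 3);
        System.out.println(op.flush());
        // A simple "hook" in the spirit of the review comment: print the counters.
        System.out.println("hash-in=" + op.getHashIn() + " hash-out=" + op.getHashOut());
    }
}
```

The ratio of the two counters gives a rough idea of how much the hash aggregation reduces the data, which is one reason such counters could be useful per operator.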
[jira] [Updated] (HIVE-3574) Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN)
[ https://issues.apache.org/jira/browse/HIVE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy A. Lucas updated HIVE-3574: Description: The current behavior of the MapRedTask is to start a process that invokes the hadoop jar command, passing each additional jobconf property as an argument to this Hadoop CLI. Having Hive submit generated jobs to an M/R cluster via the MapReduce API would allow for potentially greater compatibility across platforms, in addition to allowing these jobs to be run easily against pseudo-clusters in tests (think MiniMRCluster). This kind of change could involve something as simple as using a Hadoop Configuration object with a generic ToolRunner or something similar to run jobs. Specifically, this kind of change would most likely occur in the execute() method of org.apache.hadoop.hive.ql.exec.MapRedTask. was: Having Hive submit generated jobs to an M/R cluster via the MapReduce API would allow for potentially greater compatibility across platforms, in addition to allowing these jobs to be run easily against pseudo-clusters in tests (think MiniMRCluster). This kind of change could involve something as simple as using a Hadoop Configuration object with a generic ToolRunner or something similar to run jobs. Specifically, this kind of change would most likely occur in the execute() method of org.apache.hadoop.hive.ql.exec.MapRedTask. Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN) Key: HIVE-3574 URL: https://issues.apache.org/jira/browse/HIVE-3574 Project: Hive Issue Type: Improvement Components: Query Processor, SQL Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0, 0.7.0, 0.7.1, 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.9.1 Environment: All environments would be affected by this Reporter: Jeremy A. Lucas Priority: Minor Labels: feature, test
[jira] [Commented] (HIVE-3518) QTestUtil side-effects
[ https://issues.apache.org/jira/browse/HIVE-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475393#comment-13475393 ] Namit Jain commented on HIVE-3518: -- +1 QTestUtil side-effects -- Key: HIVE-3518 URL: https://issues.apache.org/jira/browse/HIVE-3518 Project: Hive Issue Type: Bug Components: Testing Infrastructure, Tests Reporter: Ivan Gorbachev Assignee: Navis Attachments: HIVE-3518.D5865.1.patch, HIVE-3518.D5865.2.patch, metadata_export_drop.q It seems that QTestUtil has side-effects. This test ([^metadata_export_drop.q]) causes failure of other tests on cleanup stage: {quote} Exception: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:../build/ql/test/data/exports/HIVE-3427/src.2012-09-28-11-38-17 org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:../build/ql/test/data/exports/HIVE-3427/src.2012-09-28-11-38-17 at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:845) at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:821) at org.apache.hadoop.hive.ql.QTestUtil.cleanUp(QTestUtil.java:445) at org.apache.hadoop.hive.ql.QTestUtil.shutdown(QTestUtil.java:300) at org.apache.hadoop.hive.cli.TestCliDriver.tearDown(TestCliDriver.java:87) at junit.framework.TestCase.runBare(TestCase.java:140) at junit.framework.TestResult$1.protect(TestResult.java:110) at junit.framework.TestResult.runProtected(TestResult.java:128) at junit.framework.TestResult.run(TestResult.java:113) at junit.framework.TestCase.run(TestCase.java:124) at junit.framework.TestSuite.runTest(TestSuite.java:232) at junit.framework.TestSuite.run(TestSuite.java:227) at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130) at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38) at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:460) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196) Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:../build/ql/test/data/exports/HIVE-3427/src.2012-09-28-11-38-17 at org.apache.hadoop.fs.Path.initialize(Path.java:140) at org.apache.hadoop.fs.Path.init(Path.java:132) at org.apache.hadoop.fs.ProxyFileSystem.swizzleParamPath(ProxyFileSystem.java:56) at org.apache.hadoop.fs.ProxyFileSystem.mkdirs(ProxyFileSystem.java:214) at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:183) at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1120) at org.apache.hadoop.hive.ql.parse.MetaDataExportListener.export_meta_data(MetaDataExportListener.java:81) at org.apache.hadoop.hive.ql.parse.MetaDataExportListener.onEvent(MetaDataExportListener.java:106) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:1024) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table(HiveMetaStore.java:1185) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:566) at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:839) ... 17 more Caused by: java.net.URISyntaxException: Relative path in absolute URI: file:../build/ql/test/data/exports/HIVE-3427/src.2012-09-28-11-38-17 at java.net.URI.checkPath(URI.java:1787) at java.net.URI.init(URI.java:735) at org.apache.hadoop.fs.Path.initialize(Path.java:137) ... 28 more {quote} Flushing 'hive.metastore.pre.event.listeners' into empty string solves the issue. 
While debugging, I found that this property wasn't reset for other tests after it was set in metadata_export_drop.q. How to reproduce: {code} ant test -Dtestcase=TestCliDriver -Dqfile=metadata_export_drop.q,<some test>.q {code} where <some test>.q means any test which contains a CREATE statement, for example sample10.q.
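The innermost cause in the trace above, URISyntaxException: Relative path in absolute URI, can be reproduced with the JDK alone. The following is a minimal sketch, not Hive or Hadoop code: the class and method names (RelativePathUriDemo, tryBuild) are made up for illustration, and the path is shortened from the one in the report. It only shows that java.net.URI rejects a scheme combined with a path that does not start with '/'.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; reproduces only the java.net.URI behavior seen in the trace.
public class RelativePathUriDemo {
    // Builds a URI from a scheme and a path.
    // Returns the exception message on failure, or null if the URI is valid.
    static String tryBuild(String scheme, String path) {
        try {
            new URI(scheme, null, path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Relative path plus a scheme: rejected.
        // prints "Relative path in absolute URI: file:../build/ql/test/data/exports"
        System.out.println(tryBuild("file", "../build/ql/test/data/exports"));

        // Absolute path plus a scheme: accepted, so this prints "null".
        System.out.println(tryBuild("file", "/build/ql/test/data/exports"));
    }
}
```

This is consistent with the report: the export location is relative (file:../build/...), so any later drop-table call that still triggers the export listener fails in the same way.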
[jira] [Commented] (HIVE-3518) QTestUtil side-effects
[ https://issues.apache.org/jira/browse/HIVE-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475394#comment-13475394 ] Phabricator commented on HIVE-3518: njain has accepted the revision HIVE-3518 [jira] QTestUtil side-effects. REVISION DETAIL https://reviews.facebook.net/D5865 BRANCH DPAL-1907 To: JIRA, njain, navis QTestUtil side-effects Key: HIVE-3518 URL: https://issues.apache.org/jira/browse/HIVE-3518 Project: Hive Issue Type: Bug Components: Testing Infrastructure, Tests Reporter: Ivan Gorbachev Assignee: Navis Attachments: HIVE-3518.D5865.1.patch, HIVE-3518.D5865.2.patch, metadata_export_drop.q
[jira] [Assigned] (HIVE-3471) Implement grouping sets/grouping_id in hive
[ https://issues.apache.org/jira/browse/HIVE-3471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Gorbachev reassigned HIVE-3471: Assignee: Ivan Gorbachev (was: Namit Jain) Implement grouping sets/grouping_id in hive --- Key: HIVE-3471 URL: https://issues.apache.org/jira/browse/HIVE-3471 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Ivan Gorbachev -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3574) Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN)
[ https://issues.apache.org/jira/browse/HIVE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475405#comment-13475405 ] Ashutosh Chauhan commented on HIVE-3574: +1 on the idea. I have also been hit by this. Reliance on the hadoop script to launch MR jobs is not cool. Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN) Key: HIVE-3574 URL: https://issues.apache.org/jira/browse/HIVE-3574 Project: Hive Issue Type: Improvement Components: Query Processor, SQL Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0, 0.7.0, 0.7.1, 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.9.1 Environment: All environments would be affected by this Reporter: Jeremy A. Lucas Priority: Minor Labels: feature, test
[jira] [Created] (HIVE-3575) maintain dependency between views/view partitions and tables/partitions
Namit Jain created HIVE-3575: Summary: maintain dependency between views/view partitions and tables/partitions Key: HIVE-3575 URL: https://issues.apache.org/jira/browse/HIVE-3575 Project: Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Namit Jain -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3575) maintain dependency between views/view partitions and tables/partitions
[ https://issues.apache.org/jira/browse/HIVE-3575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475420#comment-13475420 ] Namit Jain commented on HIVE-3575: Hive supports both partitioned and unpartitioned views. Let us consider a specific example:
create table T (key string, value string) partitioned by (ds string, hr string);
insert overwrite table T partition (ds='1', hr='1') ...;
..
insert overwrite table T partition (ds='1', hr='24') ...;
T is a table partitioned by date and hour, and Tview is a view which conceptually denotes the table T partitioned by ds.
create view Tview (key string, value string) partitioned by (ds string) as select key, value, ds from T;
When all the hourly partitions are created for a day (ds='1'), the corresponding partition can be added to Tview:
alter view Tview add partition (ds='1');
There is an implicit dependency between Tview@ds=1 and T@ds=1/hr=1, T@ds=1/hr=2, T@ds=1/hr=24, but that dependency is not captured anywhere in the metastore. It would be useful to explicitly create that dependency. This dependency can be used for all kinds of auditing purposes. The table's partition T@ds=1/hr=1 cannot be dropped unless the view partition Tview@ds=1 is dropped.
maintain dependency between views/view partitions and tables/partitions Key: HIVE-3575 URL: https://issues.apache.org/jira/browse/HIVE-3575 Project: Hive Issue Type: New Feature Components: Metastore, Query Processor Reporter: Namit Jain
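The dependency described in this comment could be modeled roughly as follows. This is only an illustrative sketch: ViewPartitionDependencies and its methods are hypothetical names, not metastore APIs, and partitions are represented as plain strings in the Tview@ds=1 and T@ds=1/hr=1 notation used in the comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model; a real implementation would live in the metastore.
public class ViewPartitionDependencies {
    // View partition -> table partitions it depends on.
    private final Map<String, Set<String>> tablePartsByViewPart = new HashMap<>();

    // Records that a view partition depends on the given table partitions.
    public void addViewPartition(String viewPartition, Collection<String> tablePartitions) {
        tablePartsByViewPart
            .computeIfAbsent(viewPartition, k -> new LinkedHashSet<>())
            .addAll(tablePartitions);
    }

    // Dropping a view partition removes its recorded dependencies.
    public void dropViewPartition(String viewPartition) {
        tablePartsByViewPart.remove(viewPartition);
    }

    // Lists the view partitions that depend on a table partition (useful for auditing).
    public List<String> dependentsOf(String tablePartition) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : tablePartsByViewPart.entrySet()) {
            if (e.getValue().contains(tablePartition)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // A table partition may be dropped only if no view partition depends on it.
    public boolean canDropTablePartition(String tablePartition) {
        return dependentsOf(tablePartition).isEmpty();
    }

    public static void main(String[] args) {
        ViewPartitionDependencies deps = new ViewPartitionDependencies();
        deps.addViewPartition("Tview@ds=1",
            Arrays.asList("T@ds=1/hr=1", "T@ds=1/hr=2", "T@ds=1/hr=24"));

        System.out.println(deps.canDropTablePartition("T@ds=1/hr=1")); // prints "false"
        deps.dropViewPartition("Tview@ds=1");
        System.out.println(deps.canDropTablePartition("T@ds=1/hr=1")); // prints "true"
    }
}
```

Keeping the mapping keyed by view partition makes dropping a view partition cheap, at the cost of a scan when checking a table partition; a reverse index would be the obvious alternative if such checks were frequent.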
[jira] [Commented] (HIVE-3573) Revert HIVE-3268
[ https://issues.apache.org/jira/browse/HIVE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475422#comment-13475422 ] Kevin Wilfong commented on HIVE-3573: +1 Thanks Namit Revert HIVE-3268 Key: HIVE-3573 URL: https://issues.apache.org/jira/browse/HIVE-3573 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Namit Jain Attachments: hive.3573.1.patch, hive.3573.2.patch The HIVE-3268 patch introduces some code which can break distribute/order/cluster/sort by. We should revert this code until it can be fixed (HIVE-3572).
[jira] [Commented] (HIVE-3574) Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN)
[ https://issues.apache.org/jira/browse/HIVE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475437#comment-13475437 ] Andrew Look commented on HIVE-3574: +1. I find it to be a much more robust and flexible approach to rely on what's in the classpath rather than what's installed on the system. Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN) Key: HIVE-3574 URL: https://issues.apache.org/jira/browse/HIVE-3574 Project: Hive Issue Type: Improvement Components: Query Processor, SQL Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0, 0.7.0, 0.7.1, 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.9.1 Environment: All environments would be affected by this Reporter: Jeremy A. Lucas Priority: Minor Labels: feature, test
[jira] [Commented] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13475454#comment-13475454 ] Alan Gates commented on HIVE-2935: -- Carl, we'd like to start helping get these patches in shape for commit to the trunk. If we start posting patches on top of the existing patches we'll have a mess. Does it make sense to commit these quickly to a branch so we can collaborate and then merge them into trunk when they're solid? Implement HiveServer2 - Key: HIVE-2935 URL: https://issues.apache.org/jira/browse/HIVE-2935 Project: Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Labels: HiveServer2 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3377) ant model-jar command fails in metastore
[ https://issues.apache.org/jira/browse/HIVE-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475526#comment-13475526 ] Krish commented on HIVE-3377: If someone could share how Eclipse was set up, that would be great. I followed the steps below, but the last command fails with an error message saying that get-test does not exist in the project hive. Please help.
$ ant clean package eclipse-files
$ cd metastore
$ ant model-jar
$ cd ..
$ ant gen-test --- Error with the command
ant model-jar command fails in metastore Key: HIVE-3377 URL: https://issues.apache.org/jira/browse/HIVE-3377 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0 Reporter: Vandana Ayyalasomayajula Priority: Minor Labels: build Running the ant model-jar command to set up the Eclipse dev environment, following the wiki page https://cwiki.apache.org/Hive/gettingstarted-eclipsesetup.html, fails with the following message:
BUILD FAILED
**/workspace/hive-trunk/metastore/build.xml:22: The following error occurred while executing this line:
**/workspace/hive-trunk/build-common.xml:112: Problem: failed to create task or type osfamily
Cause: The name is undefined.
Action: Check the spelling.
Action: Check that any custom tasks/types have been declared.
Action: Check that any presetdef/macrodef declarations have taken place.